Understandable! Apologies if I came across overly chastising towards you specifically.
Intent vs. impact is pretty hard in this case. Part of my response comes from conversations with friends in STEM fields about the impact the male-centric nature of the space (comp sci especially) has on them, particularly with how much men self-reinforce that position. It truly is an exclusionary space for them.
I hadn't read many of the comments in this thread yet, and there are some well-thought-out discussions here too, which I'm glad to see.
That's the solution I went with. I use Proxmox to run a Windows VM that hosts Ollama; that VM can also be used for gaming on the off chance an LLM isn't loaded (it usually is). I stick with a single 3090 because of the power load of my two servers on top of my [many] HDDs; the extra draw of a second card isn't something I want to worry about.
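If anyone wants to script the "is the GPU free to game?" check, here's a minimal Python sketch against Ollama's /api/ps endpoint, which lists the models currently in VRAM. The VM address is a placeholder; 11434 is Ollama's default port.

```python
import requests

# Placeholder LAN address for the Windows VM running Ollama.
OLLAMA_URL = "http://192.168.1.50:11434"

def loaded_models() -> list[str]:
    """Return names of models Ollama currently has resident (/api/ps)."""
    resp = requests.get(f"{OLLAMA_URL}/api/ps", timeout=5)
    resp.raise_for_status()
    return [m["name"] for m in resp.json().get("models", [])]

if __name__ == "__main__":
    models = loaded_models()
    if models:
        print("GPU busy with:", ", ".join(models))
    else:
        print("Nothing loaded; safe to game.")
```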
I point to that machine through LiteLLM,* which is in turn fronted by nginx configured to allow only local IPs. Those two run in a different VM that hosts most of my Docker containers.
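For a rough picture of what a call through that chain looks like, here's a hedged Python sketch of a LAN client hitting the OpenAI-compatible /v1/chat/completions route that LiteLLM's proxy exposes, with nginx in front. The hostname, key, and model alias are made-up examples, not my actual config.

```python
import requests

# Placeholder address for the nginx VM; nginx proxies this to LiteLLM
# and rejects anything that isn't a local IP.
PROXY_URL = "http://192.168.1.60/v1/chat/completions"
API_KEY = "sk-local-example"  # hypothetical LiteLLM virtual key

resp = requests.post(
    PROXY_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "llama3",  # whatever alias LiteLLM maps to the Ollama model
        "messages": [{"role": "user", "content": "Hello from the LAN"}],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

The nice part of this layout is that every client speaks one API shape to LiteLLM, and nginx enforces the "LAN only" rule in a single place.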
*I found that using Ollama and Open WebUI together causes the model to get unloaded, since they send slightly different calls. LiteLLM reduces that variance.
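One mitigation worth trying (my guess at a fix, not something I've confirmed): send the same keep_alive with every call so divergent clients don't trigger a reload. Ollama's /api/chat accepts a keep_alive field, and a negative value keeps the model loaded indefinitely.

```python
import requests

OLLAMA_URL = "http://192.168.1.50:11434"  # same placeholder VM address

# Keeping the payload options identical across clients is the point here:
# differing parameters are what make Ollama reload the model.
payload = {
    "model": "llama3",  # placeholder model name
    "messages": [{"role": "user", "content": "ping"}],
    "stream": False,
    "keep_alive": -1,  # keep the model resident indefinitely
}
resp = requests.post(f"{OLLAMA_URL}/api/chat", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```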