Now that you've all tried it ... ChatGPT web traffic falls 10%
(www.theregister.com)
Look into llama.cpp: it's a single C++ program that runs quantized models (basically models whose weights are stored with less precision; you don't really need a full 16 or 32 bits per weight, and 4 to 8 bits often works well enough). As for models to run on it, there are so many, but I think WizardLM is pretty good.
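To give a rough idea of what "less precision" means, here's a minimal sketch of 8-bit quantization in plain Python. This is only an illustration of the general idea, not llama.cpp's actual format (which works on blocks of weights and has several schemes); the function names and the sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] plus one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero list, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [round(w / scale) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


# Hypothetical weights, just for illustration.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Worst-case rounding error is about half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now fits in one byte instead of two or four, at the cost of a small rounding error per weight.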