I replaced Google Home with Home Assistant and a local LLM, and I'm not looking back (www.xda-developers.com)

submitted 1 week ago by cm0002@lemy.lol to c/smarthomes@feddit.uk



[-] red_bull_of_juarez@lemmy.dbzer0.com 9 points 1 week ago

I am most interested in how to set up and integrate the local LLM for this, particularly the hardware requirements. Sadly, the article doesn't have any details on that.

[-] ragebutt@lemmy.dbzer0.com 7 points 1 week ago

https://community.home-assistant.io/t/local-llm-for-dummies/769407 For the LLM part

https://www.home-assistant.io/voice_control/voice_remote_local_assistant/ For the speech transcription part

VRAM matters more than raw GPU speed. If the model fits in VRAM, a slower GPU is workable. The GPU determines token generation speed, but if the model is too big for VRAM and gets partially offloaded to RAM/CPU, performance drops considerably. A 13-billion-parameter LLM will run significantly better on a slower GPU with 12 GB of VRAM than on a faster GPU with 6 GB. 7B models are pretty capable (6-8 GB VRAM), but going to 13B (8-10 GB) or 32B (20+ GB) is a notable step up in capability each time, though 32B is largely impractical for this use case (unless you don't mind dropping $1300-$2500 on a 24 GB 3090 or 4090 and paying to power it).
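To put rough numbers on the VRAM point: the weights alone take about (parameters × bits per weight ÷ 8) bytes, and the runtime needs extra room for the context cache and buffers. Here is a minimal Python sketch of that arithmetic. The function names and the 1.5 GB overhead figure are assumptions for illustration, not measured values, so treat the result as a ballpark only.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """Rough check: weights plus an ASSUMED fixed overhead must fit in VRAM."""
    return weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


# Compare the model sizes mentioned above at 4-bit quantization (an assumption;
# other quantization levels change the result proportionally).
for size in (7, 13, 32):
    print(f"{size}B @ 4-bit: ~{weights_gb(size, 4):.1f} GB of weights, "
          f"fits in 12 GB: {fits_in_vram(size, 4, 12)}")
```

Real usage also depends on the quantization format and context length, so leave some headroom beyond whatever this estimate suggests.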

[-] red_bull_of_juarez@lemmy.dbzer0.com 3 points 1 week ago

Thanks. I saved your comment to look into this. Running a local Alexa is a dream of mine.

[-] terranoid@lemmy.cafe 4 points 1 week ago* (last edited 1 week ago)

Local LLM hardware requirements depend heavily on the model. You should try out Ollama and see which models work well on whatever hardware you're testing on.

You will want to look up what quantized models are too.
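One way to compare models on your own hardware is to ask a running Ollama server for a single completion and read the timing fields it returns. This is a rough sketch and makes some assumptions: Ollama is listening on its default local port, the model has already been pulled, and the response carries `eval_count` and `eval_duration` (in nanoseconds) as described in the Ollama API docs. The helper names and the test prompt are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def tokens_per_second(result: dict) -> float:
    """Generation speed from the eval_count / eval_duration (nanoseconds) fields."""
    return result["eval_count"] / (result["eval_duration"] / 1e9)


def benchmark(model: str, prompt: str = "Turn off the kitchen lights.") -> float:
    """Send one non-streaming request to a running Ollama server and report tokens/s."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return tokens_per_second(json.load(response))


# Example (needs a running server; the model name is a placeholder):
# print(benchmark("some-model:tag"))
```

Running it once per model, and once per quantization of the same model, gives a quick feel for what your GPU can handle.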

[-] andrew0@lemmy.dbzer0.com 5 points 1 week ago

Don't use Ollama. If you're on Windows, try LM Studio instead and look into HuggingFace models rather than relying on the Ollama repository. Much better.

[-] dontbelievethis@sh.itjust.works 2 points 1 week ago

Why is it better?

[-] andrew0@lemmy.dbzer0.com 3 points 1 week ago* (last edited 1 week ago)

Dubious open-source practices on the part of the Ollama devs. Other than that, LM Studio uses the latest stable llama.cpp rather than the one developed by Ollama, which brings significant speed improvements. You also get a better understanding of what model you're deploying by skipping Ollama and looking into the HF repository instead. For example, Ollama states that they're serving DeepSeek-R1, but pulling it gives you a distilled 8B version that is not the actual DeepSeek-R1 (671B parameters) one would have expected.
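If you do end up with a model pulled through Ollama, you can at least check what you actually got. The sketch below assumes the `/api/show` endpoint and the `details` fields (`family`, `parameter_size`, `quantization_level`) work as described in the Ollama API docs. The helper names are hypothetical, and older server versions may expect a different request key.

```python
import json
import urllib.request


def describe(details: dict) -> str:
    """Summarize the 'details' block of an /api/show response in one line."""
    return " ".join(
        str(details.get(key, "?"))
        for key in ("family", "parameter_size", "quantization_level")
    )


def show_model(model: str, host: str = "http://localhost:11434") -> str:
    """Ask a running Ollama server what a local model tag really contains."""
    payload = json.dumps({"model": model}).encode()
    request = urllib.request.Request(
        f"{host}/api/show", data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        info = json.load(response)
    return describe(info.get("details", {}))


# Example (needs a running server; the tag is a placeholder):
# print(show_model("some-model:tag"))
```

A reported parameter size in the single-digit billions for something labelled as a 671B model is the mismatch described above.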

I get that it might be easier to use, but you won't learn much by using it. Even worse, the competition performs better and offers similar out-of-the-box capabilities.

this post was submitted on 07 Jun 2026

79 points (100.0% liked)
