79
submitted 4 days ago by cm0002@lemy.lol to c/smarthomes@feddit.uk
[-] ragebutt@lemmy.dbzer0.com 7 points 4 days ago

For the LLM part: https://community.home-assistant.io/t/local-llm-for-dummies/769407

For the speech transcription part: https://www.home-assistant.io/voice_control/voice_remote_local_assistant/

VRAM matters more than raw GPU speed. If the model fits in VRAM, a slower GPU is workable. The GPU determines token generation speed, but if the model is too big for VRAM and gets partially offloaded to RAM/CPU, performance drops considerably. A 13-billion-parameter LLM will run significantly better on a slower GPU with 12 GB of VRAM than on a faster GPU with 6 GB. 7B models are pretty capable (6-8 GB VRAM), but going to 13B (8-10 GB) or 32B (20+ GB) each brings a notable improvement in capability, though 32B is largely impractical for this use case (unless you don't mind dropping $1300-$2500 on a 24 GB 3090 or 4090 and paying to power it).
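If it helps, here is a rough sketch of that rule of thumb in Python. The GB ranges are just the ballpark figures from this comment, not measurements, and real usage depends on quantization and context length. The names (`VRAM_RANGES_GB`, `classify`, `check`) and the three labels are made up for illustration: "fits" means at or above the top of the quoted range, "tight" means inside it, and "offload" means below it, so part of the model would likely spill to RAM/CPU.

```python
# Ballpark VRAM ranges (GB) quoted in the comment above, not measured values.
# Each entry: model size -> (low end, high end); None means open-ended ("20+").
VRAM_RANGES_GB = {
    "7B": (6, 8),
    "13B": (8, 10),
    "32B": (20, None),
}


def classify(vram_gb, low, high):
    """Label how a model with the given VRAM range relates to a card's VRAM."""
    if vram_gb < low:
        return "offload"  # below the range: expect partial offload to RAM/CPU
    if high is not None and vram_gb < high:
        return "tight"    # inside the range: may or may not fit entirely
    return "fits"         # at or above the top of the quoted range


def check(vram_gb):
    """Return a label per model size for a card with vram_gb of VRAM."""
    return {
        size: classify(vram_gb, low, high)
        for size, (low, high) in VRAM_RANGES_GB.items()
    }


if __name__ == "__main__":
    for card_vram in (6, 12, 24):
        print(card_vram, "GB:", check(card_vram))
```

With these assumed ranges, a 12 GB card comes out as "fits" for 7B and 13B and "offload" for 32B, while a 6 GB card is "tight" for 7B and "offload" for everything larger, which lines up with the 12 GB vs 6 GB comparison above.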

Thanks. I saved your comment to look into this. Running a local Alexa is a dream of mine.

this post was submitted on 07 Jun 2026
79 points (100.0% liked)
