submitted 5 months ago by Lugh@futurology.today to c/futurology@futurology.today

25 comments

[-] DavidGarcia@feddit.nl 6 points 5 months ago

For the small ones running on a GPU, a couple hundred watts while generating. For the large ones, somewhere between 10 and 100 times that.

With specialty hardware, maybe 10x less.

[-] pennomi@lemmy.world 3 points 5 months ago

A lot of the smaller LLMs don’t require a GPU at all - they run just fine on a normal consumer CPU.

[-] copygirl 3 points 5 months ago

Wouldn't running on a CPU (while possible) make it less energy efficient, though?

[-] pennomi@lemmy.world 3 points 5 months ago

It depends. A lot of LLMs are memory-constrained. If you’re constantly thrashing the GPU memory it can be both slower and less efficient.
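A rough way to see when a model becomes memory-constrained is to estimate the size of its weights and compare that to the available memory. This is only a back-of-the-envelope sketch: the function names, the 7B parameter count and the 8 GB memory figure are assumptions picked for illustration, and it ignores the KV cache and activations, which need extra room on top.

```python
def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def fits_in_memory(n_params: float, bytes_per_param: float, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory (ignores KV cache and activations)."""
    return weights_gb(n_params, bytes_per_param) <= memory_gb


# Hypothetical example: a 7B-parameter model on a device with 8 GB of memory.
N_PARAMS = 7e9      # assumed model size
MEMORY_GB = 8.0     # assumed GPU memory

fp16_size = weights_gb(N_PARAMS, 2.0)    # 16-bit weights: 2 bytes per parameter
int4_size = weights_gb(N_PARAMS, 0.5)    # 4-bit weights: half a byte per parameter

print(f"16-bit: {fp16_size:.1f} GB, fits: {fits_in_memory(N_PARAMS, 2.0, MEMORY_GB)}")
print(f"4-bit:  {int4_size:.1f} GB, fits: {fits_in_memory(N_PARAMS, 0.5, MEMORY_GB)}")
```

With these made-up numbers the 16-bit weights don't fit and the 4-bit ones do, which is the kind of case where spilling out of GPU memory starts to hurt.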

[-] DavidGarcia@feddit.nl 1 points 5 months ago

Yeah, but it's 10x slower, at speeds that just don't work for many use cases. When you compare energy consumption per token, there isn't much difference.
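To make the per-token comparison concrete: a watt is a joule per second, so energy per token is just power draw divided by tokens per second. A minimal sketch follows. The wattages and speeds are made-up illustrative numbers, not measurements, chosen so the CPU draws 10x less power and runs 10x slower.

```python
def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy per generated token: (joules / second) / (tokens / second)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return power_watts / tokens_per_second


# Hypothetical figures for illustration only (not benchmarks).
gpu_energy = joules_per_token(power_watts=300.0, tokens_per_second=50.0)
cpu_energy = joules_per_token(power_watts=30.0, tokens_per_second=5.0)

print(f"GPU: {gpu_energy:.1f} J/token")
print(f"CPU: {cpu_energy:.1f} J/token")
```

Under those assumptions both come out to the same energy per token. The CPU just takes ten times as long to get there.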

[-] kippinitreal@lemmy.world 2 points 5 months ago

Good god. Thanks for the info.

this post was submitted on 01 Dec 2024

49 points (100.0% liked)
