Menlo Park, CA (upload.wikimedia.org)
submitted 1 year ago by egeres@lemmy.ml to c/fuck_cars@lemmy.ml
[-] egeres@lemmy.ml 3 points 1 year ago

I can't believe I'll get excited about phone specs again 🙌🏻✨. Do you think it would be possible to parallelize computation across several phones to run inference on transformer models? I assume it's not worth it, since you would need to transfer a ton of data between devices to run attention at every layer, but the llama people have pulled off so many tricks at this point...
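
For a rough feel of the data-transfer worry, here is a back-of-the-envelope sketch, not a measurement. It compares two ways of splitting a model across phones: handing off the hidden state once per device boundary (a layer-wise split) versus syncing inside every layer (a within-layer split). The function names, the two-syncs-per-layer figure, and the model shape in the usage lines are all illustrative assumptions, and the within-layer estimate simply counts one activation-sized message per sync rather than modeling a real all-reduce.

```python
def activation_bytes(hidden_size: int, bytes_per_value: int = 2) -> int:
    """Size in bytes of one token's hidden-state vector (2 bytes ~ fp16)."""
    return hidden_size * bytes_per_value


def per_token_traffic(
    hidden_size: int,
    num_layers: int,
    num_devices: int,
    syncs_per_layer: int = 2,  # assumption: one sync after attention, one after the MLP
    bytes_per_value: int = 2,
) -> dict:
    """Rough per-token transfer estimate for two hypothetical split strategies."""
    act = activation_bytes(hidden_size, bytes_per_value)

    # Layer-wise split: each device owns a contiguous slice of layers, so the
    # hidden state crosses the network once per device boundary.
    pipeline_transfers = num_devices - 1

    # Within-layer split: every layer is shared by all devices, so they have to
    # sync at each sync point. Counted here as one activation-sized message per
    # sync, which is a simplification.
    tensor_transfers = num_layers * syncs_per_layer

    return {
        "pipeline": {"transfers": pipeline_transfers, "bytes": pipeline_transfers * act},
        "tensor": {"transfers": tensor_transfers, "bytes": tensor_transfers * act},
    }


# Illustrative shape only (hidden size 4096, 32 layers), spread over 4 phones.
estimate = per_token_traffic(hidden_size=4096, num_layers=32, num_devices=4)
print(estimate["pipeline"])  # {'transfers': 3, 'bytes': 24576}
print(estimate["tensor"])    # {'transfers': 64, 'bytes': 524288}
```

Under these assumptions the per-token payload is kilobytes rather than gigabytes, so the number of network round trips per token may hurt more than the raw volume. Processing a whole prompt at once would multiply the byte counts by the prompt length.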
