13

Conducting deep web searches and gathering sources is one of the main things I've been using LLMs for. How far away are we from being able to self-host something like Claude's web search capabilities? Or even just a service where I'd pay with my money instead of my data?

you are viewing a single comment's thread
view the rest of the comments
[-] TropicalDingdong@lemmy.world 5 points 2 days ago

Do you have a walk through for setup?

I'm on the strix halo 128 gb variant and while I got ollama working fine, i haven't gotten any of these multi headed setups working

[-] vapeloki@lemmy.world 5 points 2 days ago

I am on Gentoo for it, but everything with a decent rocm should work.

Have a look for llama-swap, that handles multi head endpoints.

Also, as you are on a big board, you can quantize yourself, as the BF16 version of qwen has only 72gb.

I will try and post a full writeup next days. But feel free to dm me, if you need some guidance on quantize or more.

I am using this fork currently: https://github.com/charlie12345/ROCmFPX

Stuff happens fast currently, so may be worth to wait a week or two ig you need something super stable, but if you are up for experimenting, that's the way to go

[-] TropicalDingdong@lemmy.world 3 points 2 days ago

THis is great, thanks. I'm on the z-13 and needed to use it for a work project, which is wrapping up soon. I'm planning on re-building it as a locally hosted agent support machine.

[-] Shimitar@downonthestreet.eu 2 points 2 days ago

Great man! Gentoo lover and long time addicted here.... Keep it the good work!

this post was submitted on 21 Jun 2026
13 points (100.0% liked)

Self Hosted - Self-hosting your services.

20016 readers
4 users here now

A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.

Rules

Important

Cross-posting

If you see a rule-breaker please DM the mods!

founded 5 years ago
MODERATORS