Chatbots Make Terrible Doctors, New Study Finds
(www.404media.co)
Funny how people overlook that bit en route to dunk on LLMs.
If anything, that 90% result supports the idea that Garbage In = Garbage Out. I imagine a properly used domain-tuned medical model with structured inputs could exceed those results in some diagnostic settings (task-dependent).
IIRC, the 2024 Nobel Prize in Chemistry was won on the basis of using ML expert system to investigate protein folding. ML != LLM, but at the same time, let's not throw the baby out with the bathwater.
EDIT: for the lulz, I posted my above comment into my locally hosted bespoke LLM. It politely called out my bullshit (AlphaFold is technically not an expert system, and I didn't cite my source for the Med-PaLM 2 claims). As shown, not all LLMs are tuned to be sycophantic yes-men; there might be a sliver of hope yet lol.
The statement contains a mix of plausible claims and minor logical inconsistencies. The core idea—that expert systems using ML can outperform simple LLMs in specific tasks—is reasonable.
However, the claim that "a properly used expert system LLM (Med-PALM-2) is even better than 90% accurate in differentials" is unsupported by the provided context and overreaches from the general "Garbage In = Garbage Out" principle.
Additionally, the assertion that the 2024 Nobel Prize in Chemistry was won "on the basis of using ML expert system to investigate protein folding" is factually incorrect; the prize was awarded for AI-assisted protein folding prediction, not an ML expert system per se.
Confidence: medium | Source: Mixed