Don't Claude Me (dialecticaldispatches.substack.com)

submitted 2 days ago by yogthos@lemmy.ml to c/technology@lemmy.ml

[-] hoshikarakitaridia@lemmy.world 3 points 2 days ago

While the observations are true, the article's characterizations of them are completely wrong.

What's plausible is that an AI genuinely changes the information it gives based not only on what you say but also on how you say it.

LLMs work in associative thinking patterns. People who speak in a similar way often know about the same amount on specific topics. And because AIs are lords of the common and average, these broad-stroke patterns are just regurgitated back at us.

It's just like racism in policing: black people often land in prison. And a part of that is racism on its face: police think less of black people.

But another big part is obscured systemic racism: if you're less educated or poorer, you have a higher chance of doing criminal things. And black people generally have less access to good education or wealth. It's not causality, but it's an indicator and a noticeable, patterned correlation.

And I think this is exactly what we see here. The AI hasn't specifically been trained to be classist and racist, but it's just throwing those patterns back at us and finally visualizing underlying classism and racism in our real world.

AIs sure do a lot of bad, but in this case, the bad thing already happened before AI became involved. At least that's my humble opinion.

[-] yogthos@lemmy.ml 3 points 2 days ago

Yes, the model reflects the biases already baked into the training data, and the pidgin example is almost certainly the model regurgitating classist, racist patterns from its corpus, not a developer explicitly telling it to mock villagers. However, the broader point here is about systemic inequality showing up in AI output.

The claim of intent is based on the fact that Claude straight up refused to answer certain factual questions for users who identified as Iranian or Russian, while cheerfully answering the same questions for Americans. That can't be hand-waved away as a statistical correlation between dialect and knowledge. That's a hard refusal trigger almost certainly put there by safety/alignment tuning, RLHF filters, or some geopolitical compliance rules nobody knows about. Someone decided that users from those countries shouldn't get those answers.

So there are two different things happening. One is that the model has passive bias where it learns toxic associations from training data. But the other is active gating where the model is instructed, directly or indirectly, to withhold information based on user demographics. The refusal case clearly shows that there is deliberate choice in whom the model will give answers to.

And the most important aspect of all this is that we cannot reliably know what the reason for a particular behavior is because closed models make it impossible to tell which mechanism is at work. Hence why open and inspectable models are the only way to audit this stuff. The prescription of openness and local control makes sense regardless of whether the harm is passive or active.

[-] krolden@lemmy.ml 1 points 1 day ago

You're not supposed to be asking LLMs for medical advice. This article feels like it's encouraging that behavior.

[-] yogthos@lemmy.ml 2 points 1 day ago

Not really, I just used an example of the kind of fuckery that would be possible given that people do ask LLMs for medical advice. Whether they should or not is a separate question.

this post was submitted on 05 Jun 2026

11 points (100.0% liked)
