LLMs have a strong bias against use of African American English
(arstechnica.com)
Warning: I've edited the comment that you're replying to. I'm saying this for the sake of transparency, as you're clearly quoting the earlier version.
The key here is that AAVE is not written, but AAE is. That "V" is for vernacular, it excludes written English by definition.
Now, I'm not sure if those white kids are using AAE or simply borrowing things from AAE into their written English. I simply don't have data on that.
Varieties merging or splitting is rarely the result of just more contact between people; it's all about identity. If things are happening as you described them, it's simply that those white kids stopped seeing black people as "the others", to see them as "part of the same group as us".
Yeah. But most people "write" online like they speak...
https://commonwealthtimes.org/2021/02/18/aave-is-not-your-internet-slang-it-is-black-culture/
If people followed rules about language, yeah, vernacular would just be spoken speech. But that's not how it works. The rules are made to reflect what people are doing. The rules don't control what people do.
So yes, while the word vernacular commonly meant only spoken words, there ain't nothing stopping nobody from typing like they speak.
And people been doing it for a long time
That's a common misconception.
While your written and spoken varieties do interact a fair bit, no, people don't "write like they speak". Not even online.
And that is not simply an "ackshyually". A lot of AAVE features simply don't transpose into writing - like prosody, non-rhoticity, /ɪ/-breaking, /äɪ/-monophthongisation... at most you can consciously approximate them into writing, but they won't be there.
That is not about people following/not following "rules", it's about nomenclature - it's exactly the reason why "AAE" and "AAVE" are necessary as separate terms.
....
A lot of the difficulty older white people have with it is that it's spelled phonetically to maintain those things.
I gave you a link, lots of people have talked about this, it's not just some idea I came up with.
You're still talking like language has to follow the rules.
That's backwards. The rules change to follow the language
Ain't you old enough to have heard "ain't ain't a word because it ain't in the dictionary"?
Well, now it is.
And now the dictionary lists "figuratively" as one of the definitions for "literally".
Insist on following rules, and the dictionary wouldn't update.
I don't know how to put it any more plainly. I'm sorry if you still don't understand.
That is clearly false. Refer to what I said in the very comment that you're replying to: "That is not about people following/not following “rules”, it’s about nomenclature"
Please stop misrepresenting what I said.
You're implying that I claimed that you came up with this. I did not.
The link does not contradict what I said. It's simply using a different nomenclature, applying the acronym "AAVE" to the whole instead of strictly the vernacular varieties.
The informative content there (i.e. beyond definitions) is mostly accurate, but contrary to what you're implying, I am not contradicting it.
Emphasis mine. Drop the passive-aggressiveness; the one here not understanding shit is you, as shown by the fact that you're consistently distorting what I said.
I'm not bothering further with you. Go put words in someone else's mouth.
More and more people are using speech-to-text. And it does show how differently people speak than write (apparently I never say the "be" in "because", for example).
But it also means that LLMs aren't only being fed text, but also speech converted into text.
For me it's like "holy fuck... do I eat so fucking many vowels???" It reached the point where I eventually gave up using speech-to-text with Portuguese on my cell phone; I go straight for Italian because at least then it gets me right.
That might be part of the issue causing the bias shown in the article.