AI companies are learning an ironic lesson as the people they pay to improve their chatbots are just feeding AI slop into them

[-] OctopusNemeses@lemmy.world 9 points 4 hours ago* (last edited 4 hours ago)

Humans are error prone. That goes for both sides of these jobs. I mean the engineers who run these projects.

It's important to not get carried away with the allure of the tech industry. Especially the LLM hype. The people making these LLM models are human too. They're not unicorn tech wizards.

I've seen projects where the examples they gave us on what to submit were AI slop. They did not notice. By far the most common error with them is unclear and constantly changing guidelines. I've seen projects where their training videos were made by someone whispering nervously into the microphone. We had to crank the volume to hear them stumble over their words while trying to explain the project.

Ultimately most of these jobs exist to harvest data for projects that aren't that important. Forget about AGI or whatever. Think more along the lines of your weekend project. There's investor money right now so they have to use it.

They won't be paying people (read: impoverished third world countries) more than a few dollars an hour to grind out mountains of training data to feed into models. They're not chasing unicorns here. It's just slop generating LLMs. There's investor money so they have to use it. Why would they split more of the loot with clickworker tier peons.

Here's a bonus anecdote. One of the projects showed us literal shit in their training materials. A wet turd. I think it must have been a disgruntled employee. After hearing about how much Facebook employees hate working in the AI division, I think it must have been.

I really doubt the typical tech worker has that much conviction in LLMs themselves. It's just what's in style right now and it's what's getting them richer.

[-] iconic_admin@lemmy.world 71 points 21 hours ago

They’re just using AI to do their job more efficiently, what’s the problem?

[-] yboutros@infosec.pub 12 points 13 hours ago

At least with where AI is now, it's basically an incredible data compressor. My local copy of gemm4:31b takes up less than 50GB iirc, and I can retrieve information from the entire Internet with it without an internet connection.

Using an AI to train an AI is like taking a jpg of a jpg. You're going to lose information eventually. Hallucinations will become worse, like in a game of telephone.
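To make the jpg-of-a-jpg point concrete, here is a toy sketch, not a simulation of actual model training. It assumes a "lossy copy" is just a 3-point blur, and it uses variance as a crude stand-in for how much detail is left. All names (`lossy_copy`, `variance`, `generations`) are made up for illustration.

```python
def lossy_copy(signal):
    """Return a blurred copy: each value becomes the mean of itself and its neighbours."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]


def variance(signal):
    """Population variance, used here as a rough proxy for remaining detail."""
    mean = sum(signal) / len(signal)
    return sum((x - mean) ** 2 for x in signal) / len(signal)


def generations(original, count):
    """Yield successive copies, each made only from the previous copy."""
    current = list(original)
    for _ in range(count):
        current = lossy_copy(current)
        yield current


# Hypothetical "source material": a signal with lots of fine detail.
original = [0, 9, 1, 8, 2, 7, 3, 6, 4, 5]
history = [variance(original)] + [variance(g) for g in generations(original, 5)]

for gen, v in enumerate(history):
    print(f"generation {gen}: variance {v:.3f}")
```

Each generation only ever sees the previous generation's output, so whatever detail one pass throws away is not available to the next one.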

[-] emeralddawn45@lemmy.dbzer0.com 22 points 12 hours ago

I mean sure, you can 'retrieve information', with no way of knowing where the information came from, whether the source was accurate, or whether what you've retrieved is even remotely faithful to the source material. So basically you can't actually retrieve anything, because it's just mashing words together in a way that happens to sound correct most of the time.

[-] isVeryLoud@lemmy.ca 4 points 6 hours ago

That's the cost of compression. You lose the source material, so it may hallucinate a bit.

[-] MathiasTCK@lemmy.world 6 points 10 hours ago

You can ask it to provide a source.

Sometimes it will.

Sometimes it will make one up.

[-] nulluser@lemmy.world 2 points 4 hours ago* (last edited 4 hours ago)

It will always make one up. Sometimes it may get lucky and the made up one exists and is relevant, but it still just made it up.

[-] zbyte64@awful.systems 12 points 13 hours ago

Lossy compression with no internal mechanism for detecting information corruption.
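For contrast, conventional storage can carry such a mechanism along with the data. A minimal sketch, assuming a SHA-256 digest is kept next to the stored bytes; `store` and `is_intact` are hypothetical helper names, and nothing comparable is claimed to exist inside an LLM:

```python
import hashlib


def store(text):
    """Return the encoded bytes together with a digest computed at write time."""
    data = text.encode("utf-8")
    return data, hashlib.sha256(data).hexdigest()


def is_intact(data, digest):
    """Recompute the digest and compare it with the one saved at write time."""
    return hashlib.sha256(data).hexdigest() == digest


data, digest = store("original source text")
corrupted = data.replace(b"source", b"sauce")

print(is_intact(data, digest))       # True: bytes are unchanged
print(is_intact(corrupted, digest))  # False: the change is detected
```

The check only says that something changed, not what the original was, but even that much is missing when the "storage" is a set of model weights.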

[-] cecinestpasunecommunication@lemmy.dbzer0.com 3 points 7 hours ago

We did it! Progress! Let's go back and show Claude Shannon!

[-] Flower@sh.itjust.works 119 points 1 day ago

"If these companies want quality data, then they should offer quality contracts," Alice continued. "Instead they're low-balling struggling people, employing them for the barest possible amount of time and tossing them aside as projects are finished with no warning."

Pay peanuts, get monkeys. Or minimum viable product for that price range, if you want to put it more fancy.

[-] LemmyFeed@lemmy.dbzer0.com 18 points 14 hours ago

You get what you pay for.

[-] m0darn@lemmy.ca 10 points 10 hours ago

My dad always corrects me to: you never get more than you pay for.

[-] viral.vegabond@piefed.social 144 points 1 day ago

Good. Poison the well, fuck this toxic industry. Burst the bubble.

[-] Lucidlethargy@sh.itjust.works 8 points 7 hours ago

Hear fucking hear.

Fuck AI.

[-] peripheralneuropathy@lemmy.world 3 points 7 hours ago

As a user that agreed to AI training data off my convos, I am doing my part.

[-] explodicle@sh.itjust.works 1 points 24 minutes ago

That's fine, so long as your part involves poisoning the data purple monkey dishwasher.

[-] forkDestroyer@infosec.pub 2 points 13 hours ago

Be the change and sign up for one of those services to help teach AI

[-] schmorpel@slrpnk.net 72 points 1 day ago

This is very important. I'm in a situation where the work I used to do is supposed to be taken over by the shitrobot. On a job platform now 15% of the job offers are the original work, 85% is AI slop fixing in one way or another. This led me on a 2-year odyssey of trying manual work (too weak), being really poor (getting better at that), and finally deciding that if I'm forced to serve the shitrobot to avoid starving I'll serve it badly. Btw so far I've managed to avoid these jobs, may it remain so.

That said, if you are in need of a real human translator for tech or creative EN-DE projects do contact me, I'd be glad to keep doing work that makes sense!

[-] DupaCycki@lemmy.world 7 points 7 hours ago

For some time I worked for Microsoft on their Copilot AI (outsourced; I did not know I was signing up to work for Microsoft).

I cannot tell you how many good translators Microsoft hires just to rate Copilot's responses in various languages. They often had interesting insights to share, but whenever I asked my managers where to put it, they said we don't care. It's just rated 1 to 5, and that's it. Nobody even cares about language-specific nuances.

Technically none of the translators are ever 'hired', let alone 'employed'. All the work is based on shit contracts through several proxies. Some people weren't even sure that they were doing work for Microsoft. Hell, as a junior manager even I never had contact with anybody from Microsoft. Only our seniors.

It's an overall incredibly depressing environment. Very knowledgeable and passionate people still try to do their jobs as well as possible and provide insightful feedback, despite the fact it's supposed to completely replace them. Only to be ignored and ghosted when a given language is deemed not worth the cost by microslop.

I was literally the only person who ever responded to translators labeled as no longer useful. Even though they still had their contracts active. And I know that, because several of them told me that. Nobody else bothered to even respond as people asked for any available jobs when struggling to make a living.

[-] ChickenLadyLovesLife@lemmy.world 14 points 16 hours ago

I went from mobile apps programmer to school bus driver. 100X happier even if I make 1/6 what I used to.

[-] Franconian_Nomad@feddit.org 10 points 17 hours ago

I saw a talk recently of somebody translating manuals for new medical devices. She said translating software is not helpful because it’s a very specific field and the devices are brand new. Maybe give it a try.

[-] halfapage@lemmy.world 35 points 1 day ago* (last edited 20 hours ago)

I absolutely despise morons who smugly pronounce language learning and translation work "solved", while at the same time not bothering to learn any language beside their native one. And most often not bothering to use that one well, as well. You can tell so easily they have no idea what they are missing out on.

I hope it's all going to end in the style of a Tower of Babel event. I know that it won't, but hey.

Wish you the best for your field of work.

[-] schmorpel@slrpnk.net 14 points 1 day ago

Language is in a peculiar decline these days - there's the process of English becoming the most badly spoken and written language ever, because all of us non-natives use it online and often also at work. Together with the inescapable avalanche of slop being churned out.

Also, language used to carry authority and this is getting lost for more and more people. We have been bombarded with advertising, propaganda, lies for many generations now and it's becoming stale. Longer texts used to carry more authority, now a topic can be communicated very precisely through a meme, and why not? For a translator, I am getting awfully distrustful of words, I'm afraid. I believe we are already standing right under the crumbling tower and will have to learn to communicate through shrugs and grunts. And again, why not?

[-] ramble81@lemmy.zip 10 points 23 hours ago

language used to carry authority

That’s an interesting view, because one way I always looked at it was it became a gating function (in a negative way). Just like the rich raise the barrier to entry, I always thought that there were people who were dismissive of others because you couldn’t speak their language perfectly.

Coupled with the hundreds of unique languages (let alone dialects) it created artificial pockets and barriers of understanding and power.

I do understand some of the cultural nuances of specific languages, but overall having a single common language understood and used by everyone can help unite us globally, rather than keeping us siloed.

[-] schmorpel@slrpnk.net 2 points 5 hours ago

That’s an interesting view, because one way I always looked at it was it became a gating function (in a negative way). Just like the rich raise the barrier to entry, I always thought that there were people who were dismissive of others because you couldn’t speak their language perfectly.

I think that's more or less the same idea as language carrying an authority. You can only use language for gating an ivory tower if the plebs believe your expensive terminology describes real and relevant facts. I think an insider language that doesn't carry this authority gets called other names, slang maybe? Also used for gating, just not as a barrier towards rising towards a higher status position in academia or rich circles.

English and the internet have this potential of bringing people together, it's quite powerful. You suddenly find out how your situation relates to people on other continents. I remember that before there was a much stronger feeling of 'other' towards people from other countries and cultures, and often the only information you could get about these others would be through the eyes of someone else. To be honest, even if the powers that be fuck up the internet beyond recognition now, that's a kind of devil difficult to stuff back into the box.

[-] kolmaskommentoija@sopuli.xyz 5 points 22 hours ago

I absolutely despise morons who smugly pronounce language learning and translation work “solved”

Juu sammoo miäkkii uonny uatelna. Kaekkee hienoenta tämmöttiissä uonku eip nuo alakoritmit ja semmottet oekkeest uo mittee ees ratkassukkaa. Miä eilispäevänä justiinsa opinni etteep tekoviksut ossoo ees kunnolla murutehia kientöö, vaek kyl miä nii luulinni juu. Tämmöttii kup vähäsennii huastelloopi ni eip hyö siihe oekkee mittee osannukkaa virikata! Kuukkels tuo etteenki se suols aenaki iha pelekköö paskoo, ol ihap hauskoo kyl lukkoo ja naaraa.

[-] CovertOperative@piefed.zip 10 points 18 hours ago* (last edited 18 hours ago)

Is this supposed to be a meta comment to show that machine translation can't help us read it?

Edit: If it is, well DeepL manages a pretty coherent translation:

Yeah, I feel the same way. All the fancy stuff in this field—those sub-rhythms and whatnot—aren’t really a solution at all. Just yesterday I learned that even the pros can’t always get the grammar right, even though I thought they could. I messed around with this a little, but I couldn’t really figure out how to make it work! That part at the beginning sounds like total crap, but it’s actually pretty funny to listen to and watch.

I'm guessing "sub-rhythm" should be "algorithm" and "pros" probably means software and not people. The last sentences could use some more context. But otherwise this sounds kinda logical.

Now Google Translate…

Yeah, I'm so sorry. All the fine things in this world, but those little things and the like, don't solve the problem. Yesterday, I just didn't study properly, I thought so. That's a little bit of a huastelloopi, and it's not good for you, but it's not a big deal! Kuukkels, that's why it's always fun to play with the balls, it's just fun to play with the balls and the girls.

…yeah.

[-] kolmaskommentoija@sopuli.xyz 4 points 9 hours ago* (last edited 8 hours ago)

Yes.

Yeah I've been thinking the same. The greatest thing about these is that those algorithms and whatnot haven't really even solved anything. Just yesterday I learned that fake/artificial-smarts can't even translate dialects correctly, even though I thought they could. If you talk something like this for a bit they weren't really able to answer that! Google especially was giving out complete shit, but it was pretty funny to read and laugh.

If it was unclear, the point is: pick a random Finn from the street and they can translate that pretty much word for word, even if they are from a completely different dialect-speaking area, whereas even at best AI could give you only something close to it. I can only use obscure things like this as an example, as I do not speak that many other languages, but if a language does not have much written record online, it is not going to be properly translatable. We are still surprisingly far from not needing human translators.

//And yes, Google was hilariously shit. Without even trying, I managed to make a couple of normal sentences that it just gave up on completely and did not translate at all, only removing some random letters.

[-] kolmaskommentoija@sopuli.xyz 3 points 3 hours ago

Actually, let's break it down so it is clearer what the accuracy was. I will not talk about the mistranslations, though.

I have come to the same conclusion as the previous poster. DeepL identifies correctly I agree with them, but fails to pick up the nuance of it. Pretty good.
I am indirectly taking part in mocking techbros for thinking AI has solved language learning. This is referencing the previous post, so DeepL could not know that without context. It somewhat picks up I am saying AI-stuff has not solved anything.
It correctly picks up I learned something about language yesterday and that someone fails at it, but it fails to identify I am talking specifically about dialects and fails to clearly convey it is AI that fails. It translates correctly I mistakenly thought the previous thing was true.
I am saying AI could not properly answer to talking in dialect, referencing indirectly I am talking in dialect in the message. DeepL picks up that I am saying something is failing, but does not convey anything else correctly.
I'm saying Google was the worst at translating, and that I found that hilarious. DeepL fully fails to translate the meaning, but translates the word "shit" acceptably, and conveys correctly that something is funny.

So what was lost in translation?
Talking in dialect, and AI failing to translate dialects properly - Core part of the message, so it is really bad that the point about dialects was not conveyed.
I am laughing at Google's translation abilities being the worst - Fails to convey this completely. Not a core part of the message, but still relatively important information.
Nuance about thinking before agreeing - Leaving that out does not matter in casual conversation. If this was translation for a more "proper" thing, this could be bad though.
Mocking techbros - This required context that wasn't offered.

[-] CovertOperative@piefed.zip 1 points 1 hour ago

Okay, so it was in dialect. I honestly would not have expected translation programs to be able to do that at all. Or are Finnish dialects actually written language? In German they aren't, but there are books written in a phonetical way in dialect, so there may be something for the AI to reference.

[-] kolmaskommentoija@sopuli.xyz 2 points 1 hour ago* (last edited 35 minutes ago)

No, they are not usually written, except by people sometimes doing it casually on social media, but because our writing system is almost fully phonetic, it is very easy to write and read them if you just speak Finnish. Also, if you are a native speaker, you kind of learn a certain fluidity at the core of the language, so you can pretty much understand the words even if they vary a lot (except people from Rauma, nobody understands them).

I really thought it would have been cracked by AI, because it can translate Finnish pretty accurately (not always...) and if you can do that, dialects aren't hard at all, but I was surprised to find that it still cannot! I am assuming it really is just because there are not enough written sources to learn from.

//Oh, and as a summary, my main points were that AI most definitely has not "solved" learning and translating languages, as it cannot yet even translate a lot of things, and I guess also that you cannot trust AI translations if the text being translated is in some obscure language you do not know. They can sound convincing and form coherent sentences, but the meaning can be fully incorrect.

[-] LeSparrow@piefed.social 2 points 3 hours ago

I was just at a networking/research technology conference in Helsinki (TNC26) where the topic of Nordic languages, especially minority ones, being under-represented by current automated transcription/translation tools came up in one of the side talks I attended. There's some effort by various European NRENs and universities to train models on these languages so those tools can be more widely available to students, academics, and the public. The talk was about "Scribe" by SUNET (Swedish Research Network) hosting Whisper models for this purpose.

That said, I do believe that learning a language by studying, immersion in the culture, and actually having conversations with people who speak it natively is the only way to really experience another language. There's always something lost in translation if you can't internalize a language by living it. In some ways language is one of the parts of the human experience that's unique and irreproducible by LLMs (despite the name). Language is more than rote communication of information; it conveys ideas, emotions, the weight of memory and history.

Also, Finnish is fucking hard lol. I can usually pick up a bit of language wherever I travel, basic phrases usually. But DAMN trying to nail the epiglottal sounds of even "Hyvää yötä" threw me!

I only got to see Helsinki, but it was a beautiful city. The Finnish people I met were lovely with a great dry sense of humor, and I would love to visit again someday.

Kippis

[-] kolmaskommentoija@sopuli.xyz 1 points 1 hour ago

I was just at a networking/research technology conference in Helsinki (TNC26) where the topic of Nordic languages, especially minority ones, being under-represented by current automated transcription/translation tools came up in one of the side talks I attended. There’s some effort by various European NRENs and universities to train models on these languages so those tools can be more widely available to students, academics, and the public. The talk was about “Scribe” by SUNET (Swedish Research Network) hosting Whisper models for this purpose.

That should be especially good for things like the multiple Sámi languages! At least in Finnish you can already write in the proper "book language" and get pretty accurate translations, even though the dialects still escape that.

Also, Finnish is fucking hard lol. I can usually pick up a bit of language wherever I travel, basic phrases usually. But DAMN trying to nail the epiglottal sounds of even “Hyvää yötä” threw me!

It is usually especially hard to learn for Indo-European speakers, so it is not just you struggling! Haha :)

[-] Gloomy@mander.xyz 2 points 13 hours ago

Well, it is always fun to play with the balls and the girls.

[-] TheBlackLounge@lemmy.zip 9 points 1 day ago

Our company decided to build our own AI translation system because the human translators we've been hiring started using AI... Quality dropped immensely, trust is lost. CEOs don't feel like shopping around. So sad.

[-] schmorpel@slrpnk.net 10 points 1 day ago

Here's the thing: human translators have been using 'AI' for over a decade. It used to be called machine translation, and for anything but the dumbest stuff it's a dumb idea, first nail in the coffin of translation. Translation agencies loved the shite, of course, because they could now pay a translator 0.05€ per word instead of 0.10€, arguing that now the same work took less time (it did, and also a lower quality translation was produced with a lot of costly bullshit software in the middle). The translators, as is to be expected, hated it, but were forced to accept it or starve. We are now very slowly reaching the point where we are hired back as esteemed professionals after AI-caused communication mishaps and business fuckups keep piling up ...

[-] explodicle@sh.itjust.works 1 points 16 minutes ago

I've got a friend who used to translate for the U.N.

Sometimes you do not want to mess up that translation and it's worth any price.

[-] TheBlackLounge@lemmy.zip 5 points 23 hours ago

Imho nothing wrong with AI use by professionals, as long as it's verified. That obviously wasn't the case.

[-] Shameless@lemmy.world 25 points 1 day ago

I've heard of people leaving Perplexity because the CTO is strongly encouraging devs to use AI vibe coding and not waste their time manually reviewing the code themselves. Sounds like a shit show.

[-] NM_Gringo@lemmy.world 8 points 1 day ago

How did they not see that coming? AI could have been a handy tool as a wingman handling small, repetitive tasks. Instead we get a giant mess that's expensive and not terribly useful. To me it's like EVs, which would have been great second, around-town cars until the infrastructure could catch up.

[-] kescusay@lemmy.world 19 points 1 day ago

Would have been? EVs are exactly that. They're great. Ones with range can easily replace cars with internal combustion engines for most use-cases. Usually costs me about $5 a month to keep mine charged.

Fully agree on LLMs being expensive messes that aren't very useful, though.

[-] Jason2357@lemmy.ca 5 points 1 day ago

Not the above poster, but I would say the cost. Modern EVs are designed to replace cars, and so cost the same or more, while being not quite as convenient for long trips.

We could have all had lightweight, city-speed but cheap, short-range EVs for a decade or two already if that was the approach taken. The battery requirements for 60kph and maybe 100km of range are super minimal, even before you go lighter. Like an order of magnitude smaller.

Might have worked if the street infra and laws allowed it. Would have been super tough to pull off at the start, and a lot of people lack the parking for two different vehicles. I do remember some companies trying these, but there's nowhere appropriate to drive them.

[-] btsax@reddthat.com 4 points 16 hours ago

I feel like you are describing the Nissan Leaf. Bought one used in like 2017 for $12k and it could only do like 80 miles on a charge but that's more than enough to get to work and back. Cost about $1 of electricity per 100 miles (apologies for freedom units)
