[-] cl4p_tp@lemmy.dbzer0.com 103 points 4 days ago

So basically it's just a Reddit search engine, where most of the facts are based on "trust me bro".

[-] shplane@lemmy.world 3 points 4 days ago

Personally, I’m disappointed Truth Social isn’t on the list

[-] MystikIncarnate@lemmy.ca 19 points 4 days ago

I keep having to argue with people that the crap ChatGPT told them doesn't exist.

I asked AI to explain how to set a completely fictional setting in an admin control panel and it told me exactly where to go and what non-existent buttons to press.

I actually had someone send me a screenshot of instructions on how to do exactly what they wanted, and I sent back screenshots of me following the directions to a tee and pointing out that the option didn't exist.

And it keeps happening.

"AI" gets big uppies energy from telling you that something can be done and how to do it. It does not get big uppies energy from telling you that something isn't possible. So it's basically going to lie to you about whatever you want to hear so it gets the good good.

No, seriously, there's a weighting system for responses. When something isn't possible, saying so tends to be a less favorable response than hallucinating a way for it to work.
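Think of it like this (a totally made-up toy with invented scores, just to show the shape of the problem):

```python
# Totally made-up toy: two candidate answers with invented "preference" scores.
# The point is just the shape of it: the confident, helpful-sounding answer
# outranks the honest "that doesn't exist" one, so that's what you get back.
candidates = {
    "Sure! Go to Settings > Display > Defogger and toggle it off.": 0.92,
    "That setting doesn't exist in this product.": 0.41,
}

best_answer = max(candidates, key=candidates.get)
print(best_answer)  # prints the helpful-sounding (and wrong) answer
```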

I am quickly growing to hate this so-called "AI". I've been on the Internet long enough that I can probably guess what the AI will reply to just about any query.

It's just... inaccurate, stupid, and not useful. Unless you're repeating something that's already been said a hundred different ways by a hundred different people and you just want to say the same thing... then it's great.

Hey, ChatGPT, write me a cover letter for this job posting. Cover letters suck and are generally a waste of fucking time, so who gives a shit?

[-] Bluegrass_Addict@lemmy.ca 6 points 3 days ago

To be fair, you could train an LLM on only Microsoft documentation with 100% accuracy and it would still give you broken instructions, because Microsoft has 12 guides for how to do a thing and none of them work: they keep changing the layout, moving shit around, or renaming crap, and they don't update their documentation.

[-] MystikIncarnate@lemmy.ca 2 points 2 days ago

The worst is that they replace products and give them the same name.

Teams was replaced with "new" Teams, which then got renamed to Teams again.

Outlook is now known as Outlook (classic) and the new version of Outlook is just called Outlook.

Both are basically just webapps.

I could go on.

[-] PieMePlenty@lemmy.world 5 points 4 days ago* (last edited 4 days ago)

I asked AI to explain how to set a completely fictional setting in an admin control panel and it told me exactly where to go and what non-existent buttons to press.

This makes sense if you consider that it works by predicting the most likely next word in a sentence. Ask it where you can turn off the screen defogger in Windows and it will associate "screen" with "monitor" or "display". "Turn off" -> must be a toggle... yeah, go to Settings -> Display -> Defogger toggle.

It's not AI, it's not smart, it's text prediction with a few extra tricks.
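Strip away the tricks and the core loop is a cartoon like this (a hard-coded lookup table standing in for the actual model, which scores every possible next word):

```python
# Cartoon version of "predict the most likely next word": a hard-coded table
# stands in for the model. Nothing checks whether the defogger setting exists;
# it just emits whatever word usually follows the last one.
NEXT_WORD = {
    "turn": "off",
    "off": "the",
    "the": "defogger",
    "defogger": "toggle",
    "toggle": "<end>",
}

def generate(prompt: str, max_tokens: int = 10) -> str:
    tokens = prompt.lower().split()
    for _ in range(max_tokens):
        nxt = NEXT_WORD.get(tokens[-1], "<end>")  # always take the single most likely word
        if nxt == "<end>":
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("how do I turn"))  # "how do i turn off the defogger toggle"
```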

[-] MystikIncarnate@lemmy.ca 1 points 2 days ago

I describe it as unchecked autocorrect that just accepts the most likely next word without user input, trained on the entire Internet.

So the response reflects the average of every response on the public Internet.

Great for broad, common queries, but not great for specialized, specific and nuanced questions.

[-] trolololol@lemmy.world 3 points 4 days ago* (last edited 4 days ago)

It just copies corporate Kool-Aid yes-man culture. If it didn't, marketing would say it's not ready for release.

Think about it: how annoyed do corpo bosses and marketing get, and how quickly do they label you "difficult", if they come to you with a stupid idea and you call it BS? Now make the AI so that it pleases that kind of people.

[-] Semi_Hemi_Demigod@lemmy.world 47 points 4 days ago

So according to AI, spez is a greedy little pig boy

[-] Darkard@lemmy.world 34 points 4 days ago

"Google.com"

Holy recursive lookups batman

[-] Goodeye8@piefed.social 29 points 4 days ago

It's far worse than that. AI can cite something AI-generated as a source, which is itself using something generated by AI as a source. So you can get an AI summary that uses an AI-generated video as a source, which itself used an AI-generated article as a source, and that article was itself an AI hallucination. We're essentially polluting the internet, making it an unreliable source of information.

[-] riskable@programming.dev 7 points 4 days ago

"It's AI all the way down!"

"What about stuff before AI?"

"That was analog intelligence which is still AI!"

[-] dejected_warp_core@lemmy.world 5 points 4 days ago

Presenting the new Ouroboros AI™ model.

[-] MalReynolds@piefed.social 34 points 4 days ago

Garbage in, garbage out...

[-] Xylight@lemdro.id 14 points 4 days ago

"Cited". This does not represent where the training data comes from, it represents the most common result when the LLM calls a tool like web_search.

[-] null@piefed.nullspace.lol 23 points 4 days ago

"Everythere" is a radical new word.

[-] MeatPilot@lemmy.world 19 points 4 days ago* (last edited 4 days ago)

I did like the early days when it would pop up crazy shit from Reddit, because they tossed it in unfiltered.

There were some crazy examples floating around where someone asked "Can you fall if you run off a cliff?" and the Google search AI gave some classic Reddit response like "if you don't look down, you won't fall."

Dumb shit probably still pops up.

[-] ininewcrow@lemmy.ca 20 points 4 days ago

So aside from Wikipedia, which is a public, user-maintained service that has become pretty reputable, the majority of the "facts" that LLMs collect (about 75%) come from privately controlled websites with curated content that is managed and maintained by corporations. And most of that content is also manipulated and controlled to make people angry, frightened, sad, or anxious.

They're training the next AI on our negative impulses, greatest fears, and worst anxieties.

What could go wrong?

[-] riskable@programming.dev 5 points 4 days ago

Yes. Better if they collect it from personal blogs running on people's PCs 👍

[-] ininewcrow@lemmy.ca 9 points 4 days ago

That would be a more honest representation of human culture than the curated content that is constantly manipulated and controlled by private corporations.

[-] Grabthar@lemmy.world 12 points 4 days ago

Was this guide AI generated as well? Looks like it credits over 100% of its information gathering to the first four sites on the list.

[-] ngdev@lemmy.zip 5 points 4 days ago

Another comment explains that a single response can cite multiple sources, hence >100%.

[-] Grabthar@lemmy.world 7 points 4 days ago

Ah, so what you're saying is it doesn't get 40% of its facts from reddit, but rather 40% of its replies contain a fact cited from reddit? That would explain totals over 100%, but I'm still not sure why they wouldn't just say that of the x thousand facts AI cited, y percent came from this site. To me, that would have been more representative of what their graph title purports to offer.

[-] ngdev@lemmy.zip 2 points 3 days ago

I'm literally just regurgitating something I saw another person comment. But yeah, if that was the case, why wouldn't they elucidate that lol

[-] AeonFelis@lemmy.world 4 points 3 days ago

They should create a model that's only trained on the content of .tex files.

[-] HootinNHollerin@lemmy.dbzer0.com 16 points 4 days ago

This graphic is missing the enormous amount of pirated media

[-] tidderuuf@lemmy.world 14 points 4 days ago

Over the nearly ten years I spent on that platform, I averaged 1 post and 5 comments a day. I had a habit of bullshitting a lot of stuff to stir people's emotions and pointing out a lot of hypocrisy.

So if your AI is full of shit, you can thank me by telling it to go fuck itself.

[-] Aceticon@lemmy.dbzer0.com 6 points 4 days ago

Thank you for your service!

[-] jaybone@lemmy.zip 5 points 4 days ago

Half of the comments on Reddit and Lemmy are just stupid jokes. I don't see how AI training is able to make the distinction, given that actual humans seem to have problems grasping the concept. Like people who lecture you about adding /s at the end of your comment.

[-] eigenraum@discuss.tchncs.de 16 points 4 days ago

Facebook? 😂

[-] wewbull@feddit.uk 12 points 4 days ago

Walmart, Home Depot and Target.

Learned institutions.

[-] slykethephoxenix@lemmy.ca 12 points 4 days ago

That's not how AI learns "facts", that's how AI learns tokens.
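For example, with OpenAI's open-source tiktoken tokenizer, all the model ever sees is integer IDs:

```python
import tiktoken  # pip install tiktoken (OpenAI's open-source tokenizer)

# The model never ingests "facts", only integer token IDs like these.
enc = tiktoken.get_encoding("cl100k_base")
ids = enc.encode("spez is a greedy little pig boy")
print(ids)               # a short list of integers, roughly one ID per word-ish chunk
print(enc.decode(ids))   # round-trips back to the original text
```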

[-] ABetterTomorrow@sh.itjust.works 12 points 4 days ago

Wikipedia is like the only decent source.

[-] InvalidName2@lemmy.zip 7 points 4 days ago

Regardless, in all my years on Reddit and now on Lemmy, my posting approach might've helped deep-fry those LLM results and you can thank me later.

Actually, probably 20+ years ago, I was a dumb kid who got doxxed on a popular news aggregator site. Ever since that experience, I obfuscate facts in pretty much any personal anecdotes I share. I also tend to make whimsical and nonsensical statements all the time, things which sound perfectly reasonable at first glance but which, in retrospect, would really put a damper on any LLM-style learning tool. Plus, I can't help but pretend to be some 80-year-old, tech-illiterate grampa posting on the Facebooks from time to time, so that probably really makes my shit online LLM poison.

Granted, all those years of these techniques weren't meant to deter or detract from LLMs; in the end, that's just another positive side effect of trying to stay a step ahead of crazy-ass online stalkers, Jeremy.

In a way, it's like that scene from The Terminator where Gregor McConnor was eating a hotdog in a fancy French restaurant and faked an orgasm in front of Tom Cruise, then Sally Field was sitting at another table and told her waitress "I'll have the seabass please."

[-] Sergio@piefed.social 12 points 4 days ago

No wonder it keeps telling me about Hell in a Cell and an announcer's table.

[-] Kyrgizion@lemmy.world 11 points 4 days ago

Guess we're lucky Yahoo Answers didn't live long enough to make it to the top of that list.

Then again, I would love to see an LLM go "how is babby formed" when asked reproductive questions.

[-] rirus@feddit.org 2 points 3 days ago* (last edited 3 days ago)

The most popular use cases seemed to be:

General questions

Trip Planning

Buying stuff

[-] rirus@feddit.org 2 points 3 days ago* (last edited 3 days ago)

What's on Google? Why is it so near the top? Maybe Maps? There are already 3 other map providers there, and also Yelp and TripAdvisor for ratings.

[-] whotookkarl@lemmy.dbzer0.com 7 points 4 days ago

Don't forget all the books, movies, music, etc. they train on from pirated sources that aren't included in the graph.

[-] JeSuisUnHombre@lemmy.zip 8 points 4 days ago

Yeah, that's not how you're supposed to use Reddit in your search. But why are there so many stores on this list of "fact" sources?

[-] Perspectivist@feddit.uk 7 points 4 days ago

So the same places as everyone else then?

[-] Feathercrown@lemmy.world 4 points 4 days ago

I'm amazed its brain isn't completely paralyzed with this dataset lmao

[-] Grandwolf319@sh.itjust.works 5 points 4 days ago

Uhhh, these don’t add up to 100%, is it that an answer can have multiple sources?

[-] brendansimms@lemmy.world 5 points 4 days ago

For example, 40% of the queries they tested received an LLM response that used Reddit as a source, 26% used Wikipedia as a source, etc. Multiple sources can be used in each response. They tested Google AI Mode, AI Overview, ChatGPT, and Perplexity.
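Toy numbers to show why the percentages can sum past 100% (made-up data, just the arithmetic):

```python
# Made-up data, real arithmetic: each response can cite several sites, so each
# site's number is "share of responses that cite it", not a slice of one pie.
responses = [
    {"reddit.com", "wikipedia.org"},
    {"reddit.com", "youtube.com"},
    {"wikipedia.org"},
    {"reddit.com"},
]

for site in ("reddit.com", "wikipedia.org", "youtube.com"):
    share = sum(site in cited for cited in responses) / len(responses)
    print(f"{site}: {share:.0%}")

# reddit.com: 75%, wikipedia.org: 50%, youtube.com: 25% -> "totals" 150%
```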
