It's precisely ChatGPT's inability to link the sources that is the concern.
also, the fact that Google is so bad about this stuff should give us some pause about directly handing even more incredibly powerful, not-well-curated tools to a constituency as broad as "everyone with an internet connection". it's actually insanely bad that this is a problem with Google, we've just normalized it there and it sucks!
At the heart of the issue I think is the fact that GPT can trick enough people into believing that there's organized thought behind what it says. So people have started trusting and using AI in spaces that it doesn't belong. Some fields have been resistant, but when there are places that operate under the incentive of cheapest labor wins (lowest bid contracts, for example), AI as a whole has been infiltrating under the guise of capitalism in places it shouldn't currently (or perhaps ever) exist.
The hype cycle around AI right now is misleading. It isn't revolutionary because of these niche one-off use-cases, it's revolutionary because it's one AI that can do anything. Problem with that is what it's most useful for is boring for non-technical people.
Take the library I wrote to create "semantic functions" from natural language tasks - one of the examples I keep going to in order to demonstrate the usefulness is
@semantic
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''
8 months ago, this would've been literally impossible. I could approximate it with thousands of lines of code using SpaCy and other NLP libraries to do NER, maybe a dictionary of known names with fuzzy matching, some heuristics to rule out city names or more advanced sentence structure parsing for false positives, but the result would be guaranteed to be worse for significantly more effort. Here, I just tell the AI to do it and it... does. Just like that. But you can't hype up an algorithm that does boring stuff like NLP, so people focus on the danger of AI (which is real, but laymen and news focus on the wrong things), how it's going to take everyone's jobs (it will, but that's a problem with our system which equates having a job to being allowed to live), how it's super-intelligent, etc. It's all the business logic and doing things that are hard to program but easy to describe that will really show off its power.
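For anyone wondering how a decorator like that could work under the hood, here is a minimal sketch. It is not the library's actual implementation: `make_semantic`, `complete` and `fake_complete` are hypothetical names made up for illustration, and `complete` stands in for whatever prompt-in, text-out model call you have available. The idea is just that the docstring becomes the task and the arguments get serialized into the prompt.

```python
import functools
import inspect
import json
from typing import Callable


def make_semantic(complete: Callable[[str], str]):
    """Build a decorator around `complete`, a hypothetical prompt -> text function."""

    def semantic(func):
        sig = inspect.signature(func)
        # The docstring is the natural-language task description.
        task = inspect.getdoc(func) or func.__name__

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            prompt = (
                f"Task: {task}\n"
                f"Arguments (JSON): {json.dumps(bound.arguments)}\n"
                "Reply with a JSON value only."
            )
            # Model output is untrusted text: json.loads raises on anything
            # that is not valid JSON instead of passing it through.
            return json.loads(complete(prompt))

        return wrapper

    return semantic


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call so the sketch runs offline.
    return '["Ada Lovelace", "Alan Turing"]'


semantic = make_semantic(fake_complete)


@semantic
def list_people(text) -> list[str]:
    '''List the people mentioned in the given text.'''


people = list_people("Ada Lovelace and Alan Turing never met.")
```

Swapping `fake_complete` for a real model call is the only change needed to try it for real. A real version would probably also want to check the parsed result against the return annotation, which this sketch skips.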
I'm both really excited and worried about the part where AI takes over so many jobs that enough people will be without work. I wonder how society will deal with that, will everyone get a proper base "salary" for existing or will there be huge refugee-like camps for the poor jobless people?
Under capitalism, I fear automation will mean people losing the jobs that machines could do, only to be left with worse, often dangerous options, meanwhile entertainment and the like will be flooded by even shittier quality AI-made crap. I can only pray it will mean everyone's basic needs being covered, but that requires a huge shift
Yeah, I just don't see that happening. The whole "western" world is taking a hard turn to the right, and that's not going to get better any time soon.
It's not so much the inability to link sources but the active laundering of those sources that bugs me. We've been lucky that shady information has largely had a vibe that's pretty easy to spot. ChatGPT presents everything with the same level of professionalism.
Worse, while we might collectively start discounting direct chatbot output because LLMs are dirty liars, scammers can now cheaply rewrite their typo-ridden weird ass screeds into something resembling professionally produced copy.
Oftentimes scammers put a few typos and whatnot into initial contact to weed out smarter people. Mainly if the scam is going to involve phone calls or something. Scams just trying to get passwords or infect your computer might try harder to look legit.
I like to think of it from a different standpoint. Propaganda and fake news have existed for hundreds of years, if not millennia. It's just that in the past it was mostly created by wealthy folks, and now anyone can create their own.
You could even say, propaganda and fake news were the original form.
One enables the other, or rather the snake is constantly eating itself. SEO content and clickbait were already plagiarizing and consuming human communication, polluting the web by crowding out actual information -- ChatGPT and LLMs calcify and turbo-charge this. Tech companies are reacting by piling their own LLMs on top -- ingesting garbage and generating yet more garbage. Soon enough, appending " reddit" to our search terms will not be enough to quickly and freely get human information from the web.
Meanwhile -- laymen are being told that ChatGPT is an oracle, an intelligence, by companies and enthusiasts trying to build a crypto-style hype train. And the laymen are reacting accordingly. They are being told that ChatGPT knows everything. It doesn't even know what a pineapple is.
Yeah, the direction commercial AI took is truly disheartening... Like, AI is a useful tool, but it's been businessized, where everyone puts AI in places where it shouldn't be. Mostly because people don't understand what they are doing so surely an AI model will...
The other day a dude wanted to dev an app with me about some random shit with an AI, except it could all be done with standard algorithms, and would probably perform much better too.
I looked at him and almost facepalmed on the spot...
Yeah, like that one in this thread who uses an LLM in Python to perform trivial tasks. They write a function with an LLM prompt as the body and then it gets executed by the LLM at runtime. Python was apparently not inefficient enough.
I don't know, if I need a trivial function I just code it. Then I know it works and performs in a decent time.
I mean, that use case is definitely cool, but there is no way this should ever be used in production code.
It's kinda like using dynamite as a novel heating system.
I think it's a combination of the inability to link to sources (as you have stated), the confidence with which it may provide incorrect information, and a lack of proper understanding from many people as to how LLMs work and exactly how incorrect they can be at times.
Sure, people can lie on the internet, but a chat bot talking to me and lying? Shouldn't computers not be able to do that? (/s of course)
It's not the inability to link sources, it's the wholesale manufacture of them. It's a language model, not a search engine. It doesn't get its information from somewhere. It generates it probabilistically based on the structure of the sentence it's forming.
It'll include sources if the sentence structure suggests they should be there, but they'll also just be built by probabilistic insertion of words.
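To make "probabilistic insertion of words" a bit more concrete, here is a toy sketch. A bigram chain like this is vastly simpler than an LLM, so treat it as an analogy only. The three "citations" it trains on are invented placeholders, not real papers. It learns which word tends to follow which, then produces citation-shaped text that looks right but was never looked up anywhere.

```python
import random
from collections import defaultdict

# Invented placeholder citations, used only as training text.
CITATIONS = [
    "Smith (2019). Neural networks in medicine. Journal of Examples.",
    "Jones (2021). Neural methods in practice. Journal of Placeholders.",
    "Smith (2021). Language models in medicine. Review of Examples.",
]


def train_bigrams(lines):
    """Record, for every word, the words that followed it in the training text."""
    follows = defaultdict(list)
    for line in lines:
        words = ["<s>"] + line.split() + ["</s>"]
        for current, nxt in zip(words, words[1:]):
            follows[current].append(nxt)
    return follows


def generate(follows, rng, max_words=30):
    """Walk the chain from the start marker, picking each next word at random."""
    word, out = "<s>", []
    while len(out) < max_words:
        word = rng.choice(follows[word])
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)


follows = train_bigrams(CITATIONS)
fake_citation = generate(follows, random.Random(0))
```

Every word in `fake_citation` comes from the training text and the overall shape looks like a citation, but nothing checks that the combination refers to anything real.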
I've seen attempts by people to train an LLM on information with sources. The end result was a model that would still hallucinate false information, and follow it up with a convincing-looking source that doesn't actually exist or a link that just leads to a 404 page. The way current LLMs work makes it impossible for them to mention accurate sources by default, as they don't remember full sentences or even any actual information, but just pick up some underlying patterns.
Currently the best you can do is let an LLM come up with search engine queries to find relevant and up-to-date information for a certain question, and then make it formulate an answer based on what it found, including links to the page(s) it used. The main problem here is that LLMs are not great yet at verifying whether a source is accurate, and most people will just take anything that mentions a source as a hard fact without even looking at what the source is.
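Roughly, that search-then-answer flow could look like the sketch below. Everything here is an assumption for illustration: `complete` is a hypothetical prompt-in, text-out model call, `search` is a hypothetical function returning fetched pages, and the fakes only exist so the example runs offline. The one design point worth noting is that the returned links come from the search results and never from the model's output, so at least the URLs themselves can't be hallucinated.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Page:
    url: str
    text: str


def answer_with_sources(
    question: str,
    complete: Callable[[str], str],
    search: Callable[[str], List[Page]],
    max_pages: int = 3,
) -> Tuple[str, List[str]]:
    # Step 1: let the model turn the question into a search query.
    query = complete(f"Write one search engine query for: {question}").strip()
    # Step 2: fetch real pages for that query.
    pages = search(query)[:max_pages]
    context = "\n\n".join(
        f"[{i}] {page.url}\n{page.text}" for i, page in enumerate(pages, 1)
    )
    # Step 3: ask for an answer grounded in the fetched text only.
    answer = complete(
        "Answer using only the numbered sources below and cite them like [1].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    # The URLs are taken from the search results, not from the model.
    return answer, [page.url for page in pages]


def fake_complete(prompt: str) -> str:
    # Stand-in for a real model call.
    if prompt.startswith("Write one search"):
        return "pineapple botany"
    return "A pineapple is a tropical fruit [1]."


def fake_search(query: str) -> List[Page]:
    # Stand-in for a real search API; example.com is a placeholder domain.
    return [Page("https://example.com/pineapple", "Pineapples are tropical fruit.")]


answer, sources = answer_with_sources("What is a pineapple?", fake_complete, fake_search)
```

This still does nothing about whether the pages are accurate, or whether the answer actually matches them, so that part remains a manual check.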
The issue is that LLMs are fundamentally not able to not know something. Non-LLM filters that are strapped in front of an LLM can catch stuff like that ("As an LLM I am not able to..."), but if the request makes it through the filter, the LLM is not able to say "Sorry, I don't know that", because the data set doesn't contain that.
For example, there aren't a lot of API documentations that contain a "Sorry, I don't know how this endpoint works".
Strongly agreed. I view this as the biggest issue with LLMs. They will hallucinate a confidently incorrect answer for those cases. It makes them misinformation machines.
The problem with chatGPT is that it allows for automation of content creation.
Imagine a single guy using chatGPT to control thousands of social media bots, who answer in a human-like way and are able to follow conversations and context, but who all defend the same point of view.
Or imagine a single guy controlling thousands of "local news blogs" that have a constant stream of fresh AI-generated content (both articles and comments), once again all pushing the same narrative.
That is the main problem with things like chatGPT, if not controlled - they allow anyone to create their own "troll farm".
I just wish humans could be not awful for once in our history. You know what I've done with ChatGPT? I had it help me convert a big python function into a one-line lambda, and got it to write a short horror story about a man eating a can of beans in a theater, amongst other silly things. But everyone suffers, and things get harder to make use of because of power-mad scumbags who see everything as a means to gain control of others.
Hell is other people
I had it write a rap about the poop emoji
I mostly use it for silly things or to give me ideas, and I might try it if we get back into running a game in foundry vtt because the macro I had it test generate messing around looked like it should work fine and I'm lazy
But I can't imagine using it for serious stuff ever. Maybe in a few years.
I think the difference is Google is just linking you to this content. On the other hand ChatGPT is pretty confidently telling people all these things. Add the fact that a lot of people consider ChatGPT to be some sort of all-knowing entity and it’s a recipe for disaster.
There's something that worries me about GPT-like technologies, and I see very few people talking about it: GPT-based social media bots.
It can enable people and groups to create much more advanced mass manipulation strategies. Imagine a lot of GPT accounts on all sites creating comments advocating for or against something, every time it's mentioned, in very natural language that can fool most people.
It worries me a lot, and I'm sure it will be done at some point. If recent elections around the world were a mess due to a lot of social media manipulation and fake news campaigns, now imagine that powered by gpt.
I was gonna reply to this in the style of ChatGPT, but I somehow feel like that'd be the same as joking about having a bomb at airport security. But yeah, this is my main concern as well. Not only social media, but even blogs and reputable-looking websites which can act as "sources". And what about Wikipedia bots?
I'm not worried about the loss of jobs or the sentience of computers, but rather the incapability to discern what's real and what's not. Could online human certificates be a thing? Multi-factor authentication (that is somehow still anonymous)?
I don't know. Social media bots have been doing exactly that quite well for a long time. Turns out, you don't actually have to write a comment, you just need to find another one that talks about the same key words and copy it in.
You still get great natural language (since it is natural language) and it fools most people as well.
Political talking points aren't that varied. There are a handful of different takes on each topic and people repeat them already, so just copying them doesn't make much of a difference.
It's not the same. GPT-based bots add much more to the situation.
Current bots are easily identifiable, and can just be banned when spotted, but GPT bots can interact in a way that makes it more difficult to spot them. They can be programmed to present different personalities and tastes, commenting in several places, and even chit-chatting here and there. Then, they will do their propaganda, considering the context, arguing and replying to counterarguments.
It's a much more complex structure, and much harder to identify. Today, gpt produces text following some patterns, but that's something that can be improved.
While the inability to source is a huge problem, you also have to keep in mind that complaining about AI has other objectives beyond the obvious "AI bad".
- it's marketing: "Our thing is so powerful it could irreparably change someone's life" is still advertising even if that irreparable change is bad. Saying "AI so powerful it's dangerous" just sounds less advertis-y than "AI so powerful you cannot not invest in it" despite both leading to similar conclusions (you can look back at the "fearvertising" done during the original AI boom: same paint, different color)
- it's begging for regulatory zeals to be put into place: Everyone with a couple of million can build an LLM from scratch. That might sound like a lot, but it's only getting cheaper and it doesn't need highly intricate systems to replicate. Specifically, the ability to finetune a large model with few datapoints allows even open-source non-profits like OpenAssistant to compete against the likes of Google and OpenAI: Google has made that very explicit in their leaked "We have no moat" memo. This is why you see people like Sam Altman talking to Congress about the dangers of AI: He has no serious competitive advantage and hopes that with sufficient fear-mongering he can get the government to give him one.
Complaining about AI is as much about the AI as it is about the economic incentives behind AI.
A friend of mine uses it to re-type emails to sound more professional. He even got a couple of others to start doing it at his workplace. A few people have started to notice one particular employee has suddenly completely changed how he talks in emails. It's very amusing, but it works extremely well for my friend.
He even pays the $20usd/m for the "premium" or whatever version. He's a C-Suite at the company so it's nothing to him to pay for the service. Other than instances like that, or simple coding (hey I need a quick bs landing page, or I need this added to whatever) it's pretty overblown for how people seem to think it works.
I have to copy a lot of text from a pdf but the returns are inserted in weird places.
I used to do this whole workaround in word.
Now I’m just like, “chatgpt, can you remove all the extra returns from the text I send you?” “Sure no problem”
It takes me like 5 minutes instead of 20 per document.
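For comparison, if the stray returns follow a simple pattern, a few lines of plain Python can do the same cleanup without a chatbot. This is a minimal sketch that assumes real paragraph breaks show up as blank lines in the copied text and every other line break is an unwanted wrap. `unwrap` is just an illustrative name, and PDFs that don't follow that pattern will need something smarter.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines, keeping blank lines as paragraph breaks."""
    text = text.replace("\r\n", "\n")
    # Split on blank lines, which are assumed to be the real paragraph breaks.
    paragraphs = re.split(r"\n\s*\n", text)
    # Within each paragraph, collapse all whitespace runs to single spaces.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs if p.strip())


sample = "The quick\nbrown fox\n\njumps over\nthe lazy dog.\n"
cleaned = unwrap(sample)
```

Words that were hyphenated across a line break are left alone here, since blindly removing hyphens would also mangle words that are hyphenated on purpose.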
Each step drastically lowers the barriers to get to that end, and also distorts the monetary incentives for people operating that technology to deceive you.
It used to be you have to go to a bookstore/library to read some crack theory on the ailment of your choice. In order to get your crack theories published, you had gatekeepers, publishers, bookstore owners, and then librarians, who would choose what books to stock. Pretty hard to abuse your powers to deceive people.
Then it became easier because you can do it on a desktop with google. Then it became easier because you can just ask google on your phone. Now if you can get a solid SEO page, you're not gatekept by anyone.
Now it's easier since you have an authoritative AI that tells you exactly how to do it, and in theory, you can freely develop these AIs to give answers that match your "version" of reality in order to get the most engagement and money out of you. Imagine you go to a website and some "doctor" chats you up that's actually just a conversational AI, and it just persuades you via pseudo-scientific language that is targeted towards your personal preferences just to get you to buy their snake oil.
Previously, making misinformation, propaganda, spam etc., even with Google, was still a manual activity bound by human limitations. Now you can have a fully autonomous scam bot that will scale relatively cheaply to infinity.
Scam call centers right now are hugely successful, unfortunately, and they're limited by the human beings manning the phones. Imagine a fleet of GPT agents scamming old ladies out of their life savings at record efficiency.
I feel like it's the occasional unpredictability that people are scared of. Whether it's people being unable to tell if something was created by ChatGPT, if it's pulling false sources, or people finding ways around set limitations and filters.
I used it to help a friend with a cover letter for a job. I pasted in what my friend had written and asked if it could make it sound better. It literally just made up stuff to make it sound better.
You should try perplexity.ai as it provides sources & even has filters too. I prefer the academic filter, plus I like checking the provided sources to ensure the AI connected the various info together correctly.
Edit: Perplexity is also updated throughout the day while ChatGPT only has info from 2021 & prior.
I dunno.
Every history book I've ever read seems to indicate that our species tends toward using new technology for the worst purposes.
Sure, but many purposes that people are afraid that ChatGPT could be used for are already 1:1 possible with Google.
Reminds me of the 3D printed gun discussion. Sure, you can use a 3D printer to print a super crappy gun that can't build up enough pressure for the projectile to gain some actual lethal speed without exploding.
Or you could do what people have been doing for centuries and just get a pipe and some other hardware from the hardware store and build a gun manually. That's literally how guns were made for centuries. (Except that you have to replace the big box hardware store with the correct equivalent of the time.)
eBay has rolled out some kind of language model plugin for their listing app. It will generate a short product description for you. I've been using ChatGPT to do this for a while and also asking it to optimize titles within the 80 character limit.
I can't always think of related words that other people might search for so this helps me a lot.
I get chatgpt prompts in every search on bing and specifically for a troubleshoot in Linux. It is garbage, and I skip it completely.
It's only good for stringing together nonsense, say for meeting minutes to pretend the meeting actually did something useful.
That's the thing: LLMs are great for any content where quality and accuracy don't matter.
- Stock photos
- Product descriptions
- Meeting notes
- Summaries of stuff that's so boring and useless that you can't be bothered to read the long form
Then again, if the content doesn't matter, why even create it?
brave's summarizer AI is pretty good.
It’s the “unknown unknowns” that are really scary, beyond the short-term abilities that are easy to understand now. All of that ability extrapolated, layered and multiplied.
I like using it for quick simple codes like having it make me something that can rename files.
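As an idea of what such a renaming script tends to look like, here is a minimal sketch that adds a prefix to every file in a folder. The function name, the prefix and the dry-run default are my own assumptions, not something ChatGPT produced. The dry run is there so you can look at the plan before anything on disk changes.

```python
from pathlib import Path


def rename_with_prefix(folder, prefix, dry_run=True):
    """Plan (and optionally perform) adding `prefix` to each file name in `folder`."""
    planned = []
    # sorted() takes a snapshot, so renaming never disturbs the iteration.
    for path in sorted(Path(folder).iterdir()):
        # Skip directories and files that already carry the prefix.
        if not path.is_file() or path.name.startswith(prefix):
            continue
        target = path.with_name(prefix + path.name)
        if target.exists():
            # Refuse to overwrite an existing file.
            raise FileExistsError(target)
        planned.append((path, target))
    if not dry_run:
        for path, target in planned:
            path.rename(target)
    return planned
```

Calling `rename_with_prefix("some_folder", "2023_")` only returns the planned (old, new) pairs. Pass `dry_run=False` once the plan looks right.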