submitted 10 months ago by L4s@lemmy.world to c/technology@lemmy.world

‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says::Pressure grows on artificial intelligence firms over the content used to train their products

[-] hperrin@lemmy.world 15 points 10 months ago

It comes from OpenAI and is given to OpenAI’s users, so they are publishing it.

[-] linearchaos@lemmy.world 6 points 10 months ago

It's being mishmashed with a billion other documents just like it to make a derivative work. It's not like OpenAI is giving you a copy of Hitchhiker's Guide to the Galaxy.

[-] hperrin@lemmy.world 3 points 10 months ago

New York Times was able to have it return a complete NYT article, verbatim. That’s not derivative.

[-] Fraubush@lemm.ee 5 points 10 months ago

I thought the same thing until I read another perspective on it from Mike Masnick. From what he writes, it seems pretty clear they manipulated ChatGPT with some very specific prompts that someone who doesn't already pay NYT for access would not be able to use. For example, feeding it 3 verbatim paragraphs from an article and asking it to generate the rest. If you understand how these LLMs work, it's really not surprising that you can indeed force it to do things like that, but it's an extreme case, and I'm with Masnick and the user you're responding to on this one myself.

I also watched most of today's subcommittee hearing on AI and journalism. A lot of the arguments are that this will destroy local journalism. Look, strong local journalism is some of the most important work that is dying right now. But the grave was dug by these large media companies and hedge funds that bought up and gutted those local news orgs, and not many people outside of the industry batted an eye while that was happening. This is a bit of a tangent, but I don't exactly trust the giant hedge funds who gutted these local news journalists over the past decade to all of a sudden care at all about how important they are.

Sorry for the tangent, but here's the article I mentioned that's more on topic - http://mediagazer.com/231228/p11#a231228p11

[-] hperrin@lemmy.world 3 points 10 months ago

So they gave it the 3 paragraphs that are available publicly, said continue, and it spat out the rest of the article that’s behind a paywall. That sure sounds like copyright infringement.

[-] linearchaos@lemmy.world 2 points 10 months ago

And that's not the intent of the service, it's a bug and they'll fix it.

this post was submitted on 09 Jan 2024
528 points (100.0% liked)