Amazon- and Google-backed AI firm Anthropic says “general-purpose AI tools simply could not exist” if AI companies had to pay licences for the training material (www.computerweekly.com)

submitted 2 years ago by 0x815@feddit.de to c/technology@beehaw.org

Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”.

Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.

In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.

[-] SuiXi3D@kbin.social 64 points 2 years ago

…then maybe they shouldn’t exist. If you can’t pay the copyright holders what they’re owed for the license to use their materials for commercial use, then you can’t use ‘em that way without repercussions. Ask any YouTuber.

[-] Even_Adder@lemmy.dbzer0.com 11 points 2 years ago* (last edited 2 years ago)

You might want to read this article by Kit Walsh, a senior staff attorney at the EFF, and this one by Katherine Klosek, the director of information policy and federal relations at the Association of Research Libraries. YouTube's one-sided strike-happy system isn't the real world.

Headlines like these let people assume that it’s illegal, rather than educate them on their rights.

[-] SnotFlickerman 24 points 2 years ago

When Annas-Archive or Sci-Hub get treated the same as these giant corporations, I'll start giving a shit about the "fair use" argument.

When people pirate to better the world by increasing access to information, the whole world gets together to try to kick them off the internet.

When giant companies with enough money to make Solomon blush pirate to make more oodles of money and not improve access to information, it's "fAiR uSe."

Literally everyone knew from the start that books3 was all pirated, built from ebooks with the DRM circumvented and removed. It was noted when it was created that it was basically the entirety of the private torrent tracker Bibliotik.

[-] Even_Adder@lemmy.dbzer0.com 10 points 2 years ago* (last edited 2 years ago)

AI training should not be a privilege of the mega-corporations. We already have the ability to train open source models, and organizations like Mozilla and LAION are working to make AI accessible to everyone. We can't allow the ultra-wealthy to monopolize a public technology by creating barriers that make it prohibitively expensive for regular people to keep up. Mega corporations already have a leg up with their own datasets and predatory terms of service that exploit our data. Don't do their dirty work for them.

By denying regular people access to a competitive, corporate-independent tool for creativity, education, entertainment, and social mobility, we condemn them to a far worse future, with fewer rights than we started with.

[-] SnotFlickerman 16 points 2 years ago* (last edited 2 years ago)

How am I doing their dirty work for them? I literally will stop thinking that they're getting away with piracy for profit when we stop haranguing people who are committing piracy for the benefit of mankind.

I'm not saying Meta should be stopped, I'm saying the prosecution of Sci-Hub and Annas-Archive needs to be stopped under the same pretenses.

If it's okay to pirate for the purpose of making money (what we put The Pirate Bay admins in jail for), then it's okay to pirate to benefit mankind.

There is literally no way in hell someone can convince me what Meta and others are doing is not pirating to use the data contained within to make money. What's good for the goose is good for the gander, as they say.

I reiterate, they knew it was pirated and had DRM circumvented when they downloaded it. There was zero question of the source of this data. They knew from the beginning they intended to profit from the use of this data. How is that different than what we accused The Pirate Bay admins of?

It really feels like "Well these corporations have money to steal more prolifically than little people, so since their stealing is so big, we have to ignore it. They have lots of money and lawyers to fight us, The Pirate Bay didn't, nor do Sci-Hub or Annas-Archive, so let's just not try against those with money to fight us."

[-] Even_Adder@lemmy.dbzer0.com 5 points 2 years ago

Then I misunderstood what you were saying. Carry on.

[-] RandoCalrandian@kbin.social 2 points 2 years ago

Scraping Reddit for comments is not piracy, and that’s what most of these disputes are about.

It’s pretty disingenuous to claim otherwise, or that these ai tools are using the content differently than in the past.

This is all fearmongering as a negotiation tactic.

Whatever price creators decide they “deserve” will be entirely between organizations with a large enough lawyer pool to back it up, such as Reddit which didn’t make a damn piece of the content it’s currently trying to sell and claiming ownership of.

[-] VoterFrog@kbin.social 3 points 2 years ago

You don't see the difference between distributing someone else's content against their will and using their content for statistical analysis? There's a pretty clear difference between the two, especially as fair use is concerned.

[-] RandoCalrandian@kbin.social 1 points 2 years ago

That fair use argument also protects all of the small, independent developers, often working for free, who make FOSS models.

These arguments about retroactively applying copyright differently are a large public negotiation between massive moneymakers on what the cost of keeping the little guy out is, not something that will benefit any actual content creator.

[-] Zaktor@sopuli.xyz 3 points 2 years ago

By and large copyright infringement is illegal. That some things aren't infringement doesn't change that a general stance of "if I don't have permission, I can't copy it" is correct. The first argument in the EFF article is effectively the title: "it can't be copyright, because otherwise massive AI models would be impossible to build". That doesn't make it fair use, they just want it to become so.

[-] helenslunch@feddit.nl 4 points 2 years ago

I love seeing Lemmy users trip over themselves to declare that copyrights don't or shouldn't exist when it comes to pirating, right up until it comes to AI. Then Copyrights are enshrined by The Constitution and all the corporations NEED to pay for them, even when they're not actually copying anything.

[-] zaphod@lemmy.ca 10 points 2 years ago

You do realize that there may in fact be different, distinct groups of Lemmy users with differing, potentially non-overlapping beliefs, yeah?

[-] helenslunch@feddit.nl 3 points 2 years ago

Sure but Lemmy also operates as a sort of hivemind. This is the top-voted post in the last 24 hours and piracy content usually makes up at least 25% of content here.

[-] zaphod@lemmy.ca 8 points 2 years ago

Oh, well, you've clearly done the kind of deep and thoughtful analysis that would allow you to determine the general opinions of all Lemmy users. My mistake. Carry on.

[-] helenslunch@feddit.nl 3 points 2 years ago

Just simple observation

[-] SuiXi3D@kbin.social 7 points 2 years ago

Using copyrighted material for something you aren't gonna make any money off of? Cool, go hog wild. If you're gonna use some music or art that you didn't make in something that will make you money, the folks that made whatever you used should get a cut. Not the whole cut, but a cut.

[-] Moira_Mayhem@beehaw.org 2 points 2 years ago

If an artist falls in love with drawing and learns to draw from Jack Kirby's work and at the beginning even imitates his style, does he owe Jack Kirby royalties for every drawing he does as he 'learned' on Jack's copyrighted art?

[-] SuiXi3D@kbin.social 3 points 2 years ago

I think in that case, no. ‘Style’ is one thing, directly using someone’s art in your own work is something else entirely. However, we’re talking about a person here, not a program developed by a company for the express purpose of making as much money as possible in the shortest amount of time. Until AI can truly demonstrate that it is truly thinking and not simply executing commands given, I don’t think the lines are blurred nearly enough to suggest that someone learning to paint and an AI trained on hundreds of thousands of pieces of art for the purpose of making money for the company that built it are remotely the same.

[-] helenslunch@feddit.nl 2 points 2 years ago

Ah, moving the goal posts, I see.

[-] SuiXi3D@kbin.social 6 points 2 years ago

In what way? I rephrased my original comment.

[-] sneezycat@sopuli.xyz 4 points 2 years ago

And corporations want people to pay for it but they don't want to pay for it themselves. It's almost as if no one likes copyright, but it benefits some ppl more than others.

[-] Lowbird@beehaw.org 1 points 2 years ago

You do realize that lemmy contains very many users, many of whom disagree on any number of things. You are randomly assigning the opinions of lemmy's pirate users to a random commenter without evidence that they actually hold those opinions, because it'd be convenient for you if they're contradicting themself in any way (though the degree to which that would be a contradiction is also arguable). It's just a way of constructing a strawman instead of engaging with your interlocutor's actual words.

Also, part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim, or they'll do something that's the equivalent of a plagiarist who swaps a few words around in a sad attempt to not get caught. It becomes especially likely depending on how specific the search is, like if you look for a niche topic hardly anyone has written extensively on or for the solution to an esoteric problem that maybe just one person on a forum somewhere found an answer to. They also typically do not even give credit or link to their sources.

Plus, copyright law, if it exists, must apply to everyone, including major corporations. That's a separate issue than whether or not copyright law needs reform (it obviously does). If you wanna abolish copyright, fine, ok, get it abolished through the government. But while copyright law is still the law, I'm not ok with giving megacorps a pass to break it legally, especially when they're more than happy to sue random, harmless individuals for violating their own copyrights. They want the law not to apply to them because they're rich.

The argument they're making is just ridiculous on its face when you compare it to other crimes. If AI should be allowed to violate copyright because otherwise it can't exist as it is, then anyone should be able to violate copyright because otherwise their cool projects won't be able to exist. And I should be able to rob a bank because otherwise I won't have all that money. You should be able to commit murder because otherwise your annoying coworker will keep bugging you. She should be able to walk out of a store with an iPhone without paying for it because otherwise she won't have an iPhone. Etc. It's an argument that says the criminal's motivations are legal justification for the crime. "You should let me legally do the thing because otherwise I can't do the thing" is just not a convincing argument in my book.

[-] helenslunch@feddit.nl 1 points 2 years ago

You do realize that lemmy contains very many users

Already addressed in another comment.

part of the problem is that these LLMs very often do directly copy and spit out articles and random forum posts and etc word-for-word verbatim

It's a problem they've acknowledged and are actively working on.

Plus, copyright law, if it exists, must apply to everyone, including major corporations.

Well many people here would disagree. That was the entire point of my comment.

this post was submitted on 29 Jan 2024
