submitted 9 months ago by L4s@lemmy.world to c/technology@lemmy.world

Data poisoning: how artists are sabotaging AI to take revenge on image generators::As AI developers indiscriminately suck up online content to train their models, artists are seeking ways to fight back.

[-] gaiussabinus@lemmy.world 72 points 9 months ago

This system runs on the assumption that A) massive generalized scraping is still required, B) you maintain the metadata of the original image, and C) no transformation has occurred to the poisoned picture prior to training (Stable Diffusion trains at 512x512). Nowhere in the linked paper do they say they conditioned the poisoned data to conform to the data set. This appears to be a case of fighting the last war.
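Point C is the load-bearing assumption. Here's a minimal sketch of why, assuming a box-filter resize and a naive alternating-pixel perturbation (real Nightshade perturbations are optimized in feature space and claim more robustness to transforms, so this only illustrates the concern, it doesn't prove it):

```python
# Toy sketch with made-up pixel values: an alternating +/-8 "poison"
# perturbation on a flat patch is averaged away by a simple box-filter
# 2x downscale, the kind of resize done while prepping training images.

def downscale_2x(row):
    """Average adjacent pixel pairs, as a box-filter resize would."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]

clean = [100.0] * 8                                       # flat patch of pixels
poison = [8.0 if i % 2 == 0 else -8.0 for i in range(8)]  # alternating perturbation
poisoned = [c + p for c, p in zip(clean, poison)]

print(downscale_2x(poisoned))   # identical to the clean downscale
print(downscale_2x(clean))
```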

[-] sukhmel@programming.dev 16 points 9 months ago

It is likely a typo, but "last AI war" sounds ominous 😅

[-] Blaster_M@lemmy.world 62 points 9 months ago

Takes image, applies antialiasing and resize

Oh, look at that, defeated by the completely normal process of preparing the image for training

[-] qooqie@lemmy.world 45 points 9 months ago

Unfortunately for them there’s a lot of jobs dedicated to cleaning data so I’m not sure if this would even be effective. Plus there’s an overwhelming amount of data that isn’t “poisoned” so it would just get drowned out if never caught

[-] Potatos_are_not_friends@lemmy.world 31 points 9 months ago

Imagine if writers did the same things by writing gibberish.

At some point, it becomes pretty easy to devalue that content and create other systems to filter it.

[-] scorpious@lemmy.world 9 points 9 months ago

if writers did the same things by writing gibberish.

Aka, “X”

[-] zwaetschgeraeuber@lemmy.world 28 points 9 months ago

Nightshade and Glaze never worked. It's a scam lol

[-] kromem@lemmy.world 17 points 9 months ago

Shhhhh.

Let them keep doing the modern equivalent of "I do not consent for my MySpace profile to be used for anything" disclaimers.

It keeps them busy on meaningless crap that isn't actually doing anything but makes them feel better.

[-] drmoose@lemmy.world 25 points 9 months ago

Just don't put your art out in public if you don't want someone/something to learn from it. The clinging to relevance and this pompous self-importance is so cringe. So replacing blue collar work is ok, but some shitty drawings somehow have higher ethical value?

[-] Catoblepas 33 points 9 months ago

"Just don't make a living with your art if you aren't okay with AI venture capitalists using it to train their plagiarism machines without getting permission from you or compensating you in any way!"

If y'all hate artists so much then only interact with AI content and see how much you enjoy it. 🤷‍♂️

[-] drmoose@lemmy.world 14 points 9 months ago

It has nothing to do with AI venture capitalists. Also, not every profession is entitled to income; some are fine remaining primarily hobbies.

AI art is replacing corporate art, which is not something we should be worried about. Fewer people working on that drivel is a net good for humanity. If we can redirect the billions of hours wasted on designing ads towards real, meaningful contributions, we'd add billions of extra hours to our actual productivity. That is good.

[-] FlyingSquid@lemmy.world 12 points 9 months ago

Also not every profession is entitled to income

Yes it is. Otherwise it is not a profession. People go to school for years to become professional artists. They are absolutely entitled to income.

But hey, you want your murals painted by robots and your wall art printed out, have fun. I'm not interested in your brave new world.

[-] Catoblepas 10 points 9 months ago

The ratio of AI used to replace ad art versus AI used for fraud/plagiarism has to be somewhere around 1:1000.

“Actual productivity” is a nonsense term when it comes to art. Why is this less “meaningful” than this?

Without checking the source, can you even tell which one is art for an ad and which isn’t?

[-] drmoose@lemmy.world 7 points 9 months ago

I'm not sure what your point is here. The majority of art is drivel. Most art is produced for marketing, literally. If that can be automated away, what are we losing? McDonald's logos? Not everything needs to be a career.

[-] TrickDacy@lemmy.world 8 points 9 months ago

What a shitty shitty shitty take

[-] sukhmel@programming.dev 6 points 9 months ago

I would assume the first to be an ad, because most of the depicted people look happy

[-] cm0002@lemmy.world 10 points 9 months ago

using it to train their plagiarism machines

That's simply not how AI works: if you look inside the models after training, you will not see a shred of the original training data. Just a bunch of numbers and weights.
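The "just numbers and weights" claim can be illustrated with a deliberately oversimplified toy, where the "model" is nothing but a per-pixel mean (a stand-in of my own, and note that published research has found real diffusion models can regurgitate some memorized training images, so this illustrates the comment's claim rather than settling it):

```python
# Toy "training" run: three tiny images go in, and all that survives as
# model weights is an aggregate statistic per pixel, not any one image.
images = [[10, 20], [30, 40], [80, 90]]   # three 2-pixel "images"

# The learned weights here are per-pixel means over the training set.
weights = [sum(px) / len(images) for px in zip(*images)]

print(weights)            # aggregate values derived from all inputs
print(weights in images)  # no original image survives verbatim
```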

[-] Catoblepas 6 points 9 months ago

If the individual images are so unimportant then it won't be a problem to only train it on images you have the rights to.

[-] Astarii_Tyler@lemmy.world 7 points 9 months ago

They do have the rights, because this falls under fair use. It doesn't matter if a picture is copyrighted as long as the outcome is transformative.

[-] Catoblepas 6 points 9 months ago

I'm sure you know something the Valve lawyers don't.

[-] Red_October@lemmy.world 15 points 9 months ago

The idea that you would actually object to replacing labor with automation, but think replacing art with automation is fine, is genuinely baffling.

[-] drmoose@lemmy.world 12 points 9 months ago

Except the "art" ai is replacing is labor. This snobby ridiculous bullshit that some corporate drawings are somehow more important than other things is super cringe.

[-] cm0002@lemmy.world 9 points 9 months ago

Right, if you post publicly, expect it to be used publicly

[-] Ilovethebomb@lemm.ee 17 points 9 months ago

Yeah, no. There's a difference between posting your work for someone to enjoy, and posting it to be used in a commercial enterprise with no recompense to you.

[-] drmoose@lemmy.world 6 points 9 months ago

How are you going to stop that lol, it's ridiculous. Would you stop a corporate suit from viewing your painting because they might learn how to make a similar one? It makes absolutely zero sense, and I can't believe delulus online are failing to comprehend such a simple concept as "computers being able to learn".

[-] yuki2501@lemmy.world 9 points 9 months ago

Ah yes, just because lockpickers can enter a house suddenly everyone's allowed to break and enter. 🙄

[-] drmoose@lemmy.world 8 points 9 months ago

What a terrible analogy for learning 🙄

[-] FlyingSquid@lemmy.world 8 points 9 months ago

Are you actually suggesting that if I post a drawing of a dog, Disney should be allowed to use it in a movie and not compensate me?

[-] Delta_V@midwest.social 6 points 9 months ago

Everyone should be assumed to be able to look at it, learn from it, and add your style to their artistic toolbox. That's an intrinsic property of all art. When you put it on display, don't be surprised or outraged when people or AIs look at it.

[-] BURN@lemmy.world 6 points 9 months ago

AI does not learn and transform something like a human does. I have no problem with human artists taking inspiration, I do have a problem with art being reduced to a soulless generation that requires stealing real artists work to create something that isn’t original.

[-] Kushia@lemmy.ml 23 points 9 months ago

Artists and writers should be entitled to compensation when their works are used to train these models, just like any other commercial use would require. But, you know, strict, brutal free-market capitalism for us, not for the megacorps who are using it because "AI".

[-] HejMedDig@feddit.dk 16 points 9 months ago

Let's see how long before someone figures out how to poison images so the model returns NSFW images

[-] daxnx01@lemmy.world 6 points 9 months ago

You can create NSFW ai images already though?

Or did you mean, when poisoned data is used a NSFW image is created instead of the expected image?

[-] HejMedDig@feddit.dk 8 points 9 months ago

Definitely the last one!

[-] kromem@lemmy.world 15 points 9 months ago* (last edited 9 months ago)

This doesn't actually work. Ingestion doesn't even need to do anything special to avoid it.

Let's say you draw cartoon pictures of cats.

And your friend draws pointillist images of cats.

If you and your friend don't coordinate, it's possible you'll bias your cat images to look like dogs in the data but your friend will bias their images to look like horses.

Now each of your biasing efforts become noise and not signal.

Then you need to consider if you are also biasing 'cartoon' and 'pointillism' attributes as well, and need to coordinate with the majority of other people making cartoon or pointillist images.

When you consider the number of different attributes that would need to be biased for a given image, and the compounding number of coordinations that would need to happen at scale to be effective, this is just a nonsense initiative: an interesting research paper under lab conditions, but the equivalent of a mouse-model or in-vitro cancer cure being taken up by naturopaths as if it's going to work in humans.
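The cancellation argument above can be simulated in a few lines (assumptions of mine: a one-dimensional "concept direction" and unit-sized nudges, both stand-ins for the real high-dimensional case):

```python
import random

random.seed(0)  # deterministic toy run

# Each poisoner nudges the "cat" concept one unit toward some other
# concept. Coordinated poisoners all push the same way; uncoordinated
# ones pick directions independently (dog for you, horse for a friend).

def mean_shift(nudges):
    return sum(nudges) / len(nudges)

n = 1000
coordinated = [1.0] * n
uncoordinated = [random.choice([-1.0, 1.0]) for _ in range(n)]

print(abs(mean_shift(coordinated)))    # a clear, learnable bias
print(abs(mean_shift(uncoordinated)))  # near zero: nudges cancel into noise
```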

[-] RagingRobot@lemmy.world 12 points 9 months ago

So it sounds like they're taking the image data and altering it so this works while the image still looks the same, just with different data. So couldn't the AI companies take screenshots of the image to get around this?

[-] TheUncannyObserver@lemmy.dbzer0.com 17 points 9 months ago

Not even that, they can run the training dataset through a bulk image processor to undo it, because the way these things work makes them trivial to reverse. Anybody at home could undo this with GIMP and a second or two.

In other words, this is snake oil.

[-] neurogenesis@lemmy.dbzer0.com 11 points 9 months ago

Fighting the uphill battle to irrelevance ..

[-] uriel238 10 points 9 months ago* (last edited 9 months ago)

The general term for this is adversarial input, and we've seen published reports about it since 2011, when it was considered a threat that CSAM could be overlaid with secondary images so they weren't recognized by Google image filters or CSAM image trackers. If Apple had gone through with their plan to scan private iCloud accounts for CSAM, we might have seen this development.

So far (AFAIK) we've not seen adversarial overlays on CSAM, though in China the technique is used to deter tracking by facial recognition. Images on social media are overlaid by human rights activists / mischief-makers so that social media pics fail to match security footage.

The thing is, like an invisible watermark, these processes are easy to detect (and reverse) once users are aware they're a thing. So if a generative AI project is aware that some images may be poisoned, it's just a matter of adding a detection-and-removal process to the pathway from candidate image to training database.

Similarly, once enough people start poisoning their social media images, the data scrapers will start scanning for and removing overlays even before the data sets are sold to law enforcement and commercial interests.
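That detection-and-removal step might be sketched like this (everything here is hypothetical: the detector heuristic and the cleaner are placeholder stand-ins for illustration, not real Glaze/Nightshade countermeasures):

```python
# Hypothetical ingestion pipeline: screen candidate images with a stub
# poison detector and strip the overlay before anything reaches the
# training set. Both helpers are placeholders for illustration only.

def looks_poisoned(image):
    return image.get("overlay") is not None      # stand-in heuristic

def strip_overlay(image):
    cleaned = dict(image)
    cleaned.pop("overlay", None)                 # e.g. re-encode/denoise in practice
    return cleaned

def ingest(candidates):
    dataset = []
    for img in candidates:
        if looks_poisoned(img):
            img = strip_overlay(img)
        dataset.append(img)
    return dataset

candidates = [{"id": 1, "overlay": "adversarial"}, {"id": 2, "overlay": None}]
result = ingest(candidates)
print(result)   # every image enters the dataset with its overlay removed
```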

[-] Kedly@lemm.ee 9 points 9 months ago

Man, whenever I start getting tired of the amount of tankies on Lemmy, the Linux users and decent AI takes rejuvenate me. The rest of the internet has jumped full-throttle onto the AI hate train

[-] BURN@lemmy.world 14 points 9 months ago

The “AI hate train” is people who dislike being replaced by machines, forcing us further into the capitalist machine rather than enabling anyone to have a better life

[-] fruitycoder@sh.itjust.works 12 points 9 months ago

No disagreement, but it's like hating water because the capitalist machine used to run water mills. It's a tool, what we hate is the system and players working to entrench themselves and it. Should we be concerned about the people affected? Yes, of course, we always should have been, even before it was the "creative class" and white collar workers at risk. We should have been concerned when it was blue collar workers being automated or replaced by workers in areas with repressive regimes. We should have been concerned when it was service workers being increasingly turned into replaceable cogs.

We should do something, but people are tilting at windmills instead of at the systems that oppress people. We should be pushing for these things to be public goods (open source like Stability is aiming for, distributed and small models like Petals.dev and TinyML). We should be pushing for unions to prevent the further separation of workers from the fruits of their labor (look at the Writers Guild's demands during their strike). We should be trying to only deal with worker and community cooperatives so that innovations benefit workers and the community instead of being used against them. And much more! It's a lot, but it's why I get mad about people wasting their time being mad that AI tools exist and raging against them instead of actually doing things to improve the root issues.

[-] neurogenesis@lemmy.dbzer0.com 5 points 9 months ago

The "AI hate train" runs on fear and skips stops for reason, headed for a fictional destination.

[-] Sabin10@lemmy.world 9 points 9 months ago

Data poisoning isn't limited to just AI stuff and you should be doing it at every opportunity.

[-] yuki2501@lemmy.world 7 points 9 months ago

Feeding garbage to the garbage. How fitting.

this post was submitted on 18 Dec 2023
329 points (100.0% liked)
