this post was submitted on 23 Oct 2023

521 points (86.1% liked)

Technology

63010 readers

3534 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related content.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, to ask if your bot can be added please contact us.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

L3s@lemmy.world

enu@lemmy.world

technopagan@lemmy.world

L4s@lemmy.world

521

This new data poisoning tool lets artists fight back against generative AI (www.technologyreview.com)

submitted 1 year ago by ElectroVagrant@lemmy.world to c/technology@lemmy.world

111 comments fedilink hide all child comments

A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.

The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
[...]
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.

top 50 comments

sorted by: hot top controversial new old

[–] MargotRobbie@lemmy.world 302 points 1 year ago (3 children)

It's made by Ben Zhao? You mean the "anti AI plagerism" UChicago professor who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the "front end" while still being in violation of GPL?

The Glaze tool that promised to be invisible to the naked eyes, but contained obvious AI generated artifacts? The same Glaze that reddit defeated in like a day after release?

Don't take anything this grifter says seriously, I'm surprised he hasn't been suspended for academic integrity violation yet.

[–] ElectroVagrant@lemmy.world 48 points 1 year ago (1 children)

Thanks for added background! I haven't been monitoring this area very closely so wasn't aware, but I'd have thought a publication that has been would then be more skeptical and at least mention some of this, particularly highlighting disputes over the efficacy of the Glaze software. Not to mention the others they talked to for the article.

Figures that in a space rife with grifters you'd have ones for each side.

[–] Zeth0s@lemmy.world 28 points 1 year ago* (last edited 1 year ago) (1 children)

Don't worry, it is normal.

People don't understand AI. Probably all articles I have read on it by mainstream media were somehow wrong. It often feels like reading a political journalist discussing about quantum mechanics.

My rule of thumb is: always assume that the articles on AI are wrong. I know it isn't nice, but that's the sad reality. Society is not ready for AI because too few people understand AI. Even AI creators don't fully understand AI (this is why you often hear about "emergent abilities" of models, it means "we really didn't expect it and we don't understand how this happened")

load more comments (1 replies)

[–] p03locke@lemmy.dbzer0.com 27 points 1 year ago (1 children)

who illegally stole GPLv3 code from an open source program called DiffusionBee for his proprietary Glaze software (reddit link), and when pressed, only released the code for the “front end” while still being in violation of GPL?

Oh, how I wish the FSF had more of their act together nowadays and were more like the EFF or ACLU.

[–] MargotRobbie@lemmy.world 26 points 1 year ago (1 children)

You should check out the decompilation they did on Glaze too, apparently it's hard coded to throw out a fake error upon detecting being ran on an A100 as some sort of anti-adversarial training measure.

[–] vidarh@lemmy.stad.social 11 points 1 year ago

That's hilarious, given that if these tools become remotely popular the users of the tools will provide enough adversarial data for the training to overcome them all by itself, so there's little reason to anyone with access to A100's to bother trying - they'll either be a minor nuisance used a by a tiny number of people, or be self-defeating.

[–] Dadifer@lemmy.world 13 points 1 year ago (1 children)

Thank you, Margot Robbie! I'm a big fan!

[–] MargotRobbie@lemmy.world 18 points 1 year ago

You're welcome. Bet you didn't know that I'm pretty good at tech too.

Also, that's Academy Award nominated character actress Margot Robbie to you!

[–] lloram239@feddit.de 91 points 1 year ago

"New snake oil to give artists a false sense of security" - The last of these tools I tried had absolutely zero effect on the AI, which is not exactly surprising given that there are hundreds of different ways to make use of image data as well as lots of completely different models. You'll never cover that all with some pixel twisting.

[–] Blaster_M@lemmy.world 59 points 1 year ago (1 children)

Oh no, another complicated way to jpeg an image that an ai training program will be able to just detect and discard in a week's time.

[–] vidarh@lemmy.stad.social 18 points 1 year ago

They don't even need to detect them - once they are common enough in training datasets the training process will "just" learn that the noise they introduce are not features relevant to the desired output. If there are enough images like that it might eventually generate images with the same features.

[+] MamboGator@lemmy.world 37 points 1 year ago* (last edited 7 months ago) (4 children)

[deleted]

[–] 0xD@infosec.pub 45 points 1 year ago (19 children)

I don't see a problem with it training on all materials, fuck copyright. I see the problem in it infringing on everyone's copyright and then being proprietary, monetized bullshit.

If it trains on an open dataset, it must be completely and fully open. Everything else is peak capitalism.

load more comments (19 replies)

[–] ElectroVagrant@lemmy.world 11 points 1 year ago

Until the law catches up with the technology, people need ways of protecting themselves.

I agree, and I wonder if the law might be kicked into catching up quicker as more companies try to adopt these tools and inadvertently infringe on other companies' copyrighted material. 😅

[–] 9thSun@midwest.social 11 points 1 year ago (8 children)

How is training AI with art on the web different to a person studying art styles? I'd say if the AI is being monetized in some capacity, then sure maybe there should be laws in place. I'm just hard-pressed to believe that anyone can have sole control of anything once it gets on the Internet.

[–] Zeth0s@lemmy.world 7 points 1 year ago* (last edited 1 year ago) (1 children)

I work in AI and I believe it is different. Society is built to distribute wealth, so that everyone can live a decent life. People and AI should be treated differently in front of the law. Also, non-commercial, open source AI should be treated differently than commercial or closed source models

[–] vidarh@lemmy.stad.social 9 points 1 year ago (6 children)

Society is built to distribute wealth, so that everyone can live a decent life.

As a goal, I admire it, but if you intend this as a description of how things are it'd be boundlessly naive.

load more comments (6 replies)

load more comments (7 replies)

[–] regbin_@lemmy.world 9 points 1 year ago* (last edited 1 year ago) (4 children)

Disagree. It's only unethical if you use it to generate the artist's existing pieces and claim it as yours.

load more comments (4 replies)

[–] leaky_shower_thought@feddit.nl 28 points 1 year ago (1 children)

I am sure we already got a budget version of this called the jpeg.

[–] seaQueue@lemmy.world 14 points 1 year ago (1 children)

Speaking of jpeg I miss the "needs more jpeg" bot that used to run on reddit, that shit was hilarious.

[–] gregorum@lemm.ee 10 points 1 year ago (1 children)

Reddit was Reddit for 18 fucking years. Just abandoning it leaves a massive hole. It’s gonna take a long time to fill it.

[–] HappycamperNZ@lemmy.world 7 points 1 year ago (1 children)

It really will.

Saying that, fuck spez

load more comments (1 replies)

[–] Vodik_VDK@lemmy.world 19 points 1 year ago

New CAPCHA just dropped.

[–] Kolanaki@yiffit.net 19 points 1 year ago (1 children)

"I can tell this is toxic by the pixels."

[–] ElectroVagrant@lemmy.world 10 points 1 year ago

"We like to call them poison pixels."

[–] wizardbeard@lemmy.dbzer0.com 18 points 1 year ago (2 children)

This is already a concept in the AI world and is often used while a model is being trained specifically to make it better. I believe it's called adversarial training or something like that.

[–] Mango@lemmy.world 12 points 1 year ago

No, that's something else entirely. Adversarial training is where you put an ai against a detector AI as a kind of competition for results.

[–] driving_crooner@lemmy.eco.br 8 points 1 year ago (1 children)

Its called adversarial attack, this is an old video (5 years) explaining how it works and how you can potentially do it charging just one pixel on the image.

https://youtu.be/SA4YEAWVpbk?si=xObPveXTT2ip5ICG

load more comments (1 replies)

[–] gregorum@lemm.ee 13 points 1 year ago (1 children)

Ooo, this is fascinating. It reminds me of that weird face paint that bugs out facial-recognition in CCTV cameras.

[–] seaQueue@lemmy.world 7 points 1 year ago* (last edited 1 year ago)

Or the patterned vinyl wraps they used on test cars that interferes with camera autofocus.

[–] uriel238@lemmy.blahaj.zone 13 points 1 year ago

I remember in the early 2010s reading an article like this one on openai.com talking about the dangers of using AI for image search engines to moderate against unwanted content. At the time the concern was CSAM salted to prevent its detection (along with other content salted with CSAM to generate false positives).

My guess is since we're still training AI with pools of data-entry people who tag pictures with what they appear to be, so that AI reads more into images than their human trainers (the proverbial man inside the Iron Turk).

This is going to be an interesting technology war.

[–] afraid_of_zombies@lemmy.world 12 points 1 year ago (1 children)

I am waiting for the day that some obsessed person starts finding ways to do like code injection in pictures.

load more comments (1 replies)

[–] guyrocket@kbin.social 10 points 1 year ago (5 children)

Invisible changes to pixels sound like pure BS to me. I'm sure others know more about it than i do but I thought pixels were very simple things.

[–] seaQueue@lemmy.world 25 points 1 year ago* (last edited 1 year ago)

"Invisible changes to pixels" means "a human can't tell the difference with a casual glance" - you can still embed a shit-ton of data in an image that doesn't look visually like it's been changed without careful inspection of the original and the new image.

If this data is added in certain patterns it will cause ML models trained against the image to draw incorrect conclusions. It's a technical hurdle that will slow a casual adversary, someone will post a model trained to remove this sometime soon and then we'll have a good old software arms race and waste a shit ton of greenhouse emissions adding and removing noise and training ever more advanced models to add and remove it.

You can already intentionally poison images so that image recognition draws incorrect conclusions fairly easily, this is the same idea but designed to cripple ML model training.

[–] Unaware7013@kbin.social 9 points 1 year ago (2 children)

I'm sure others know more about it than i do but I thought pixels were very simple things.

You're right, in that pixels are very simple things. However, you and I can't tell one pixel from another in an image, and at the scale of modern digital art (my girlfriend does hers at 300dpi), shifting a handful of pixels isn't going to make much of a visible difference to a person, but a LLM will notice them.

load more comments (2 replies)

load more comments (3 replies)

[+] RVMWSN@lemmy.ml 9 points 1 year ago* (last edited 1 year ago) (3 children)

[deleted]

[–] ElectroVagrant@lemmy.world 5 points 1 year ago

I generally don’t believe in intellectual property, I think it creates artificial scarcity and limits creativity. Of course the real tragedies in this field have to do with medicine and other serious business.

But still, artists claiming ownership of their style of painting is fundamentally no different. Why can’t I paint in your style? Do you really own it? Are you suggesting you didn’t base your idea mostly on the work of others, and no one in turn can take your idea, be inspired by it and do with it as they please? Do my means have to be a pencil, why can’t my means be a computer, why not an algorythm?

Limitations, limitations, limitations. We need to reform our system and make the public domain the standard for ideas (in all their forms). Society doesn’t treat artists properly, I am well aware of that. Generally creative minds are often troubled because they fall outside norms. There are many tragic examples. Also money-wise many artists don’t get enough credit for their contributions to society, but making every idea a restricted area is not the solution.

People should support the artists they like on a voluntary basis. Pirate the album but go to concerts, pirate the artwork but donate to the artist. And if that doesn’t make you enough money, that’s very unfortunate. But make no mistake: that’s how almost all artists live. Only the top 0.something% actually make enough money by selling their work, and that’s is usually the percentile that’s best at marketing their arts, in other words: it’s usually the industry. The others already depend upon donations or other sources of income.

We can surely keep art alive, while still removing all these artificial limitations, copying is, was and will never be in any way similar to stealing. Let freedom rule. Join your local pirate party.

Reformatted for easier readability.

load more comments (2 replies)

[–] ayaya@lemdro.id 6 points 1 year ago (15 children)

Obviously this is using some bug and/or weakness in the existing training process, so couldn't they just patch the mechanism being exploited?

Or at the very least you could take a bunch of images, purposely poison them, and now you have a set of poisoned images and their non-poisoned counterparts allowing you to train another model to undo it.

Sure you've set up a speedbump but this is hardly a solution.

load more comments (15 replies)

load more comments