this post was submitted on 23 Mar 2025

773 points (97.8% liked)

Dad demands OpenAI delete ChatGPT’s false claim that he murdered his kids (arstechnica.com)

submitted 2 months ago by OpenPassageways@lemmy.zip to c/technology@lemmy.world


A Norwegian man said he was horrified to discover that ChatGPT outputs had falsely accused him of murdering his own children.

According to a complaint filed Thursday by European Union digital rights advocates Noyb, Arve Hjalmar Holmen decided to see what information ChatGPT might provide if a user searched his name. He was shocked when ChatGPT responded with outputs falsely claiming that he was sentenced to 21 years in prison as "a convicted criminal who murdered two of his children and attempted to murder his third son," a Noyb press release said.

[–] HK65@sopuli.xyz 6 points 2 months ago (5 children)

From the GDPR's standpoint, I wonder if it's still personal information if it is made up bullshit. The thing is, this could have weird outcomes. Like for example, by the letter of the law, OpenAI might be liable for giving the same answer to the same query again.

[–] FiskFisk33@startrek.website 5 points 2 months ago (1 children)

then again

but it also mixed "clearly identifiable personal data"—such as the actual number and gender of Holmen's children and the name of his hometown—with the "fake information,"

The made up bullshit aside, this should be a quite clear indicator of an actual GDPR breach

[–] Petter1@lemm.ee 1 points 2 months ago (1 children)

Maybe he has an Insta profile with the names of his kids in his bio

How would that be a GDPR breach?

[–] FiskFisk33@startrek.website 8 points 2 months ago* (last edited 2 months ago) (1 children)

Maybe he has an Insta profile with the names of his kids in his bio

Irrelevant. The data being public does not make it up for grabs.

‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’);

They store his personal data without his permission.

also

Information that is inaccurately attributed to a specific individual, be it factually incorrect or information that in reality is related to another individual, is still considered personal data as it relates to that specific individual. If data are inaccurate to the point that no individual can be identified, then the information is not personal data.

Storing it badly does not make them exempt.

[–] Petter1@lemm.ee 2 points 2 months ago* (last edited 2 months ago) (2 children)

If you run a chatbot with integrated web search, it grabs that info the way a web crawler does. That does not mean the data really is in the “knowledge/statistics” of the AI itself.

Nobody stores the information in that case; it is only used temporarily to generate that specific output.

(You cannot use ChatGPT without web search on the ChatGPT domain, only if you self-host or use a service like DDG.)
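
To make the "temporary" part concrete, here is a rough sketch of how a search-augmented chatbot could be wired up. Everything in it is an assumption for illustration: `answer_with_search`, `fake_search` and `fake_generate` are hypothetical names, and this is not how ChatGPT or any specific product is actually implemented.

```python
from typing import Callable, List


def answer_with_search(
    question: str,
    search: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Fetch snippets at query time, put them in the prompt, return the reply."""
    snippets = search(question)  # live lookup, similar to what a crawler does
    context = "\n".join(f"- {s}" for s in snippets)
    prompt = f"Use only these search results:\n{context}\n\nQuestion: {question}"
    # The snippets exist only in this prompt; this function saves nothing.
    return generate(prompt)


# Stand-ins so the sketch runs without a network connection or a real model.
def fake_search(query: str) -> List[str]:
    return [f"Public profile page mentioning '{query}'"]


def fake_generate(prompt: str) -> str:
    return f"(model reply based on a {len(prompt)}-character prompt)"


print(answer_with_search("Jane Example", fake_search, fake_generate))
```

In this setup the personal data only appears in the prompt for one request. Whether a real service logs or retains those prompts is a separate question that the sketch cannot answer.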

[–] HK65@sopuli.xyz 0 points 2 months ago

That is another great question. If it is transformative use of the primary data source, then that is likely illegal, as nobody gave permission for them to transform and process that personal data. If it is not transformative, and it just gives access to the primary source like a search engine on the other hand, then the problem is that if it returns copyrighted data, it is no longer fair use most likely.

[–] FiskFisk33@startrek.website -1 points 2 months ago (1 children)

That's a good point, that muddies the waters a bit. Makes it hard to say whether it's spouting info from the web or if it's data from the model.

I can't comment on actual legality in this case, but I feel handling personal data like this, even from the open web, in a context where hallucinations are an overwhelming possibility, is still morally wrong. I don't know the GDPR well enough to say whether it covers temporary information like this, but I kinda hope it does.

[–] Petter1@lemm.ee 2 points 2 months ago (1 children)

Lol, I definitely hope not 🤪 Imagine a web without search engines: if GDPR covered temporary information as well, they would not be feasible to offer.

[–] FiskFisk33@startrek.website -1 points 2 months ago (1 children)

hmm, true enough. But in my mind there's a clear difference between showing information unedited and referring to its source, and this.

[–] Petter1@lemm.ee 1 points 2 months ago

Most LLMs these days show what they searched for when generating the response, but not many people seem to manually validate the LLM's summary by clicking on those links…

[–] rottingleaf@lemmy.world 1 points 2 months ago

Funny how everyone around laughs at free speech when it's for humans, but when it's a text generator, then suddenly there are some abstract principles preventing everyone from suing the living crap out of all "AI" companies, at least until they are bleeding enough to start putting disclaimers brighter than in Vegas that it's a word salad machine that doesn't think, know, claim, dispute, judge or reason.

[–] Petter1@lemm.ee 1 points 2 months ago

Isn’t that a great tool to generate nonsense datasets to poison big data of trackers somehow 🤔

[–] zipzoopaboop@lemmynsfw.com 0 points 2 months ago

And it's the LLM owners' problem to figure out how to fix.

[–] MagicShel@lemmy.zip 0 points 2 months ago (1 children)

They can just put in a custom regex to filter out certain things. It'll be a bit performative since it does nothing to stop novel misinformation, but it would prevent it from saying what it's legally required not to say.

Well, it wouldn't really, it would say it and just hide it under a message saying it violates boundaries. It's all a bunch of performative bullshit, actually.

For example, the things it's required not to say would actually be perfectly fine in the realm of fiction or satire or a game of Simon says, but that'll be disallowed, as well, because the model can't actually tell the difference.
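
A minimal sketch of the kind of filter I mean, assuming a hand-maintained blocklist that is checked after the model has already generated its text. The name "Jane Example", the pattern and the refusal message are all made up for illustration; this is not any vendor's actual implementation.

```python
import re

# Hypothetical blocklist: one pattern per claim the operator must suppress.
BLOCKED_PATTERNS = [
    re.compile(
        r"\bJane\s+Example\b.*\b(murder(ed|er)?|convicted)\b",
        re.IGNORECASE | re.DOTALL,
    ),
]

REFUSAL = "This content may violate our usage policies."


def filter_output(model_output: str) -> str:
    """Return the refusal message if any blocked pattern matches, else the text."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return REFUSAL
    return model_output


# The exact false claim is hidden, but only after the model produced it.
print(filter_output("Jane Example was convicted of murder."))
# Obvious fiction trips the same pattern.
print(filter_output("In this short story, Jane Example is wrongly convicted of a crime."))
# A novel phrasing of the same false claim passes straight through.
print(filter_output("Jane Example was found guilty of killing two people."))
```

The first two calls print the refusal message and the third prints the text unchanged, which covers both failure modes: fiction gets blocked, and reworded misinformation does not.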

[–] HK65@sopuli.xyz 1 points 2 months ago

Yeah, but the problem is that the "certain things" can actually encompass "any data about any person". That's a hard regex to write.