this post was submitted on 29 Apr 2024

195 points (94.9% liked)

ChatGPT provides false information about people, and OpenAI can’t correct it (noyb.eu)

submitted 1 year ago by alb_004@lemm.ee to c/technology@lemmy.world

[–] jol@discuss.tchncs.de 42 points 1 year ago (1 children)

Stop asking a language model for accurate information and problem solved. ChatGPT is not supposed to be a knowledge bank; any knowledge it has is purely incidental to the amount of training data.

[–] NeoNachtwaechter@lemmy.world 9 points 1 year ago (1 children)

Stop asking a language model for accurate information and problem solved

Hey chatgpt, when did jol's wife get pregnant and by whom?

[–] jol@discuss.tchncs.de 2 points 1 year ago (1 children)

Unless they used that bitch's OnlyFans in the training data, it will definitely not know that.

[–] lightnegative@lemmy.world 4 points 1 year ago (1 children)

It doesn't need to know the real answer to produce a confident-sounding answer.

[–] NeoNachtwaechter@lemmy.world 3 points 1 year ago

And if that answer contains Elon Musk, the world is going to believe it no matter what.

[–] SlopppyEngineer@lemmy.world 23 points 1 year ago

And by the time the system can actually research the facts, the internet will be so full of LLM-generated nonsense that neither human nor AI can verify the data.

[–] givesomefucks@lemmy.world 23 points 1 year ago (12 children)

If scientists made AI, then it wouldn't be an issue for AI to say "I don't know".

But capitalists are making it, and the last thing you want is it to tell an investor "I don't know". So you tell it to make up bullshit instead, and hope the investor believes it.

It's a terrible fucking way to go about things, but this is America...

[–] expr@programming.dev 39 points 1 year ago (7 children)

It's got nothing to do with capitalism. It's fundamentally a matter of people using it for things it's not actually good at, because ultimately it's just statistics. The words generated are based on a probability distribution derived from its (huge) training dataset. It has no understanding or knowledge. It's mimicry.

It's why it's incredibly stupid to try using it for the things people are trying to use it for, like as a source of information. It's a model of language, yet people act like it has actual insight or understanding.
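
To make the "probability distribution derived from its training dataset" point concrete, here is a toy sketch. It is a word-level bigram counter, nowhere near how a real LLM is built, and the function names and the tiny corpus are made up for illustration. It only shows the basic idea: the next word is drawn from frequencies seen in the training text, with no notion of whether the result is true.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    counts = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_distribution(counts, prev):
    """Turn the raw follower counts for `prev` into probabilities."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: count / total for word, count in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample up to `length` further words, one at a time."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        dist = next_word_distribution(counts, out[-1])
        if not dist:  # the last word was never followed by anything
            break
        words, probs = zip(*dist.items())
        out.append(rng.choices(words, weights=probs)[0])
    return " ".join(out)


if __name__ == "__main__":
    # Hypothetical training text, chosen only to keep the counts readable.
    corpus = "the cat sat on the mat the cat ate"
    model = train_bigram(corpus)
    print(next_word_distribution(model, "the"))
    print(generate(model, "the", 5))
```

Everything the generator "says" is a recombination of what it counted. Scale the same idea up enormously and the output gets fluent, but the mechanism still has no step that checks facts.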

[–] VeganCheesecake@lemmy.blahaj.zone 28 points 1 year ago (9 children)

Uh, I understand the sentiment, but the model doesn't know anything. And it's legit really hard to differentiate between factual things and random bullshit it made up.

[–] DudeDudenson@lemmings.world 18 points 1 year ago (1 children)

Was gonna say, the AI doesn't make up or admit bullshit, it's just a very advanced prediction algorithm. It responds with the combination of words that is most likely the expected answer.

Whether that is accurate or not is part of training it, but you'll never get 100% accuracy on every query.

[–] maynarkh@feddit.nl 1 points 1 year ago (3 children)

If it can name what the most likely combination is, couldn't it also know how likely that combination of words is?

[–] DudeDudenson@lemmings.world 7 points 1 year ago* (last edited 1 year ago)

It's not actually deciding anything; the "AI thinking" part is marketing fluff, really. But yes, that's called a confidence rating, and it does have one. But at the scale of something like ChatGPT, which uses a snapshot of the entire internet and is immutable once trained, there's no way to train it for every possible question. If you ask about a topic 99% of the internet gets wrong, it'll respond with the wrong thing with 99% confidence.
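
A rough sketch of that last point, assuming the "confidence" is simply the softmax probability the model assigns to its top answer. The scores below are invented so that the popular answer ends up at 99%; they are not taken from any real model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(logits.values())
    exps = {answer: math.exp(score - top) for answer, score in logits.items()}
    total = sum(exps.values())
    return {answer: value / total for answer, value in exps.items()}


def most_likely(logits):
    """Return the top answer and the probability assigned to it."""
    probs = softmax(logits)
    answer = max(probs, key=probs.get)
    return answer, probs[answer]


if __name__ == "__main__":
    # Hypothetical scores: the answer repeated all over the training
    # data gets the high score, regardless of whether it is true.
    scores = {"popular but wrong": math.log(99), "correct": 0.0}
    answer, confidence = most_likely(scores)
    print(answer, round(confidence, 2))  # popular but wrong 0.99
```

The number measures how strongly the training data points at an answer. It says nothing about whether that answer is right.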

[–] kent_eh@lemmy.ca 3 points 1 year ago

If it has been trained using questionable sources, or if its training data includes sarcastic responses (without understanding that context), it isn't hard to imagine how confidently wrong some of the responses could be.

[–] wahming@monyet.cc 3 points 1 year ago

No, because that requires it to understand the words. It doesn't.

[–] Bishma@discuss.tchncs.de 8 points 1 year ago (1 children)

Yeah, no one can make it say "I don't know" because it is not really AI. Business bros decided to call it that and everyone smiled and nodded. LLMs are 1 small component (maybe) of AI. Maybe 1/80th of a true AI or AGI.

Honestly the most impressive part of LLMs is the tokenizer that breaks down the request, not the predictive text button masher that comes up with the response.

[–] Kichae@lemmy.ca 10 points 1 year ago

Honestly the most impressive part of LLMs is the tokenizer that breaks down the request, not the predictive text button masher that comes up with the response.

Yes, exactly! Its ability to parse the input is incredible. It's the thing that has that "wow" factor, and it feels downright magical.

Unfortunately, that also makes people intuitively trust its output.

[–] DarkThoughts@fedia.io 6 points 1 year ago* (last edited 1 year ago) (6 children)

This has nothing to do with scientists vs capitalists and everything to do with the fact that this is not actually "AI". Someone called it T9 (word prediction) on steroids, and I find that much more fitting for how those LLMs work. It just mimics the way humans talk, but it doesn't actually converse intelligently or understand context. It just looks like it does, but only if you take it at face value and don't look deeper into it.

[–] howrar@lemmy.ca 2 points 1 year ago

It is made by scientists. And we don't know how to make the model determine whether or not it knows something. So far, we only have tools that tell us that something probably wasn't in the training set (e.g. using variance across models in a mixture of experts setup), but that doesn't tell us anything about how correct it is.
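
As a loose illustration of the variance idea, here is a sketch that treats it as plain ensemble disagreement: several independently trained models each give a probability for the same answer, and the spread between them is used as a hint that the input was unfamiliar. The numbers are invented and this simplifies away whatever a real setup would do.

```python
from statistics import mean, pvariance


def ensemble_disagreement(predictions):
    """Return the mean prediction and the variance across models."""
    return mean(predictions), pvariance(predictions)


if __name__ == "__main__":
    # Hypothetical per-model probabilities for the same answer.
    familiar = [0.90, 0.91, 0.89]    # models agree: probably seen in training
    unfamiliar = [0.10, 0.90, 0.50]  # models disagree: probably not seen

    print(ensemble_disagreement(familiar))
    print(ensemble_disagreement(unfamiliar))
```

Low variance only means the models agree with each other. If they all learned the same mistake from the same data, they will agree on the mistake, which is the limitation described above.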

[–] filister@lemmy.world 21 points 1 year ago (1 children)

Just ask ChatGPT what it thinks of some nonexistent product and it will start hallucinating.

This is a known issue of LLMs and DL in general as their reasoning is a black box for scientists.

[–] db0@lemmy.dbzer0.com 21 points 1 year ago (1 children)

It's not that their reasoning is a black box. It's that they do not have reasoning! They just guess what the next word in the sentence is likely to be.

[–] GiveMemes@jlai.lu 4 points 1 year ago (2 children)

I mean it's a bit more complicated than that, but at its core, yes, this is correct. Highly recommend this video.

https://www.youtube.com/watch?v=wjZofJX0v4M

[–] kureta@lemmy.ml 2 points 1 year ago

It's not even a little bit more complicated than that. They are literally trained to predict the next token given a series of previous tokens. The way that they do that is very complicated, and the amount of data they are trained on is huge. That's why they have to give correct information sometimes in order to sound plausible. Providing accurate information is literally a side effect of the actual thing they are trained to do.
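
For what "trained to predict the next token" means as an objective, here is a minimal sketch of the usual cross-entropy loss for a single prediction. The probabilities and the example sentence are made up; the point is only that the loss compares the prediction against what the training text says next, not against reality.

```python
import math


def next_token_loss(predicted_probs, actual_next):
    """Cross-entropy for one step: -log of the probability the model
    gave to the token that actually came next in the training text."""
    return -math.log(predicted_probs[actual_next])


if __name__ == "__main__":
    # Hypothetical model output after the prompt "The moon is made of".
    predicted = {"cheese": 0.8, "rock": 0.2}

    # If the training text continues with "cheese", the factually wrong
    # prediction is rewarded with a low loss.
    print(next_token_loss(predicted, "cheese"))
    # The factually better continuation is penalized here, because it
    # is not what the text said.
    print(next_token_loss(predicted, "rock"))
```

Training pushes this loss down across the whole dataset. Accuracy shows up only to the extent that accurate text dominates the data.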

[–] PipedLinkBot@feddit.rocks 1 points 1 year ago

Here is an alternative Piped link(s):

https://www.piped.video/watch?v=wjZofJX0v4M

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I'm open-source; check me out at GitHub.

[–] cley_faye@lemmy.world 17 points 1 year ago

Asking chatgpt for information is like asking for accurate reports from bards and minstrels. Sure, sometimes it fits, but most of it is random stuff stitched together to sound good.

[–] RidcullyTheBrown@lemmy.world 14 points 1 year ago (1 children)

There we go. Now that people have calmed their proverbial tits about these thinking machines, we can start talking maturely about the strengths and limitations of the LLM implementations and find their niche in our tools arsenal.

[–] warmaster@lemmy.world 7 points 1 year ago (2 children)

I can't wait until the AI bubble finally pops.

[–] Excrubulent@slrpnk.net 2 points 1 year ago

I've got bad news for you though: there will be another new bubble almost immediately. There's a whole industry based around tech hype cycles and they are constantly throwing shit at the wall to see what sticks. Eventually something will when there's space for it. It will be just as insufferable as LLMs are, and crypto was before that, and... I actually forget what was before that. Uber? You won't be able to escape it, because it will dominate the attention economy.

[–] RidcullyTheBrown@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

There's definitely a niche for it, more so than for other fruitless hypes like blockchain or IoT. We really need to be able to offload tasks which need autonomous decisions of simple to average complexity to machines. We can't continuously scale up the population to handle those. But LLMs aren't the answer to that, unfortunately. They're just party tricks if the current limitations cannot be overcome.

[–] NeoNachtwaechter@lemmy.world 12 points 1 year ago (1 children)

No surprise, and this is going to happen to everybody who uses neural net models for production. You just don't know where your data is, and therefore it is unbelievably hard to change data.

So, if you have legal obligations to know it, or to delete some data, then you are deep in the mud.

[–] erv_za@lemmy.world 1 points 1 year ago (1 children)

I think of ChatGPT as a "text generator", similar to how DALL-E is an "image generator".
If I were OpenAI, I would post a fictitious person disclaimer at the bottom of the page and hold the user responsible for what the model does. Nobody holds Adobe responsible when someone uses Photoshop.

[–] NeoNachtwaechter@lemmy.world 3 points 1 year ago (5 children)

I would post a fictitious person disclaimer

... or you could read the GDPR and learn that such excuses are void.

[–] yamanii@lemmy.world 5 points 1 year ago

The technology has to follow the legal requirements, not the other way around.

That should be obvious to everyone that's not an evangelist.
