this post was submitted on 27 May 2024
1101 points (98.0% liked)

You know how Google's new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won't slide off (pssst...please don't do this.)

Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these "hallucinations" are an "inherent feature" of AI large language models (LLMs), which are what drive AI Overviews, and this feature "is still an unsolved problem."

[–] Metype@lemmy.world 25 points 5 months ago (5 children)

So you have a product that you've made into a system for getting answers. And then you couldn't be bothered to sanitize the training data enough to keep your answer system's new headline feature from spreading blatantly incorrect information? If it doesn't work, maybe don't ship it.

[–] xavier666@lemm.ee 5 points 5 months ago (2 children)

I think the problem they are facing is data quantity. Sanitizing possibly terabytes of text data is a humongous task. They have probably used an AI to do the cleanup, but the more subtle errors have passed through the filter.

[–] bignate31@lemmy.world 1 points 5 months ago (1 children)

Yeah, the problem is how to sanitise effectively. You've gotta be able to find a way to automatically strip out "bad" things from your training data (via an "oracle"). But if you already had that oracle, you could just slap it on your final product (e.g. Search) and make all the "bad" things disappear before they hit the user (via some sort of filter).
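
To make the point above concrete, here is a minimal Python sketch. Everything in it is hypothetical: `toy_oracle` is a keyword check standing in for an oracle nobody actually has, and the function names are made up for illustration. It only shows that the same "is this bad?" predicate could be applied either to the training data or to the final output.

```python
from typing import Callable, Iterable, List

# Hypothetical oracle type: returns True when a piece of text is "bad".
Oracle = Callable[[str], bool]


def toy_oracle(text: str) -> bool:
    """Stand-in oracle (assumption): a keyword check, nowhere near a real one."""
    return "glue on pizza" in text.lower()


def sanitise_training_data(docs: Iterable[str], is_bad: Oracle) -> List[str]:
    """Apply the oracle up front: drop bad documents before training."""
    return [doc for doc in docs if not is_bad(doc)]


def filter_output(answer: str, is_bad: Oracle, fallback: str = "[no answer]") -> str:
    """Apply the same oracle at the end: suppress bad answers before the user sees them."""
    return fallback if is_bad(answer) else answer


if __name__ == "__main__":
    docs = ["Bake pizza at a high temperature.", "Put glue on pizza so the cheese sticks."]
    print(sanitise_training_data(docs, toy_oracle))
    print(filter_output("Try glue on pizza!", toy_oracle))
```

Both functions take the oracle as a parameter, which is the whole argument: if you could write a trustworthy `is_bad`, you could put it in either place.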

[–] xavier666@lemm.ee 2 points 5 months ago

I'm pretty sure Google's final solution will be using Mechanical Turks.
