this post was submitted on 21 May 2024
152 points (98.7% liked)

Not The Onion



Of the 100 results, only three of them are common enough to be used in everyday conversations; everything else consisted of words and expressions used specifically in the contexts of either gambling or pornography. The longest token, lasting 10.5 Chinese characters, literally means “_free Japanese porn video to watch.” Oops. [Tokens are the pieces of text that ChatGPT combines to generate replies.]
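
As a rough illustration of how a "longest Chinese tokens" list like the one described above could be produced, here is a minimal sketch that filters a vocabulary for Chinese-character tokens and sorts them by length. The vocabulary is a hypothetical toy list, not the real GPT-4o vocabulary, and the names `TOY_VOCAB`, `CJK_ONLY`, and `longest_chinese_tokens` are made up for this example. It is not the researchers' actual method. The leading underscore on the long token is kept exactly as it appears in this thread.

```python
import re

# Hypothetical toy vocabulary (assumption): a real analysis would load the
# tokenizer's actual vocabulary instead of this hand-written list.
TOY_VOCAB = [
    "hello",
    " world",
    "你好",
    "_日本毛片免费视频观看",  # the long token quoted in the thread
    "天气",
    "ing",
]

# Matches strings made only of CJK Unified Ideographs (U+4E00..U+9FFF),
# optionally preceded by a single underscore.
CJK_ONLY = re.compile(r"_?[\u4e00-\u9fff]+")


def longest_chinese_tokens(vocab, top_n=100):
    """Return up to top_n Chinese-character tokens, longest first."""
    chinese = [tok for tok in vocab if CJK_ONLY.fullmatch(tok)]
    return sorted(chinese, key=len, reverse=True)[:top_n]


# Print each token's length in Chinese characters (underscore excluded).
for tok in longest_chinese_tokens(TOY_VOCAB, top_n=3):
    print(len(tok.lstrip("_")), tok)
```

With a real vocabulary, the same filter-and-sort step would surface whatever long, rarely used strings ended up as single tokens, which is the kind of list the article is describing.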

Users have also found that these tokens can be used to break the LLM, either getting it to spew out completely unrelated answers or, in rare cases, to generate answers that are not allowed under OpenAI’s safety standards.

In his tests, which Geng chooses not to share with the public, he says he can see GPT-4o generating the answers line by line. But when it almost reaches the end, another safety mechanism kicks in, detects unsafe content, and blocks it from being shown to the user.

“The robustness of visual input is worse than text input in multimodal models,” says Geng, whose research focus is on visual models. Filtering a text data set is relatively easy, but filtering visual elements will be even harder. “The same issue with these Chinese spam tokens could become bigger with visual tokens,” he says.

top 4 comments
[–] Luisp@lemmy.dbzer0.com 24 points 5 months ago

Horny gpt 2 strikes again

[–] seaQueue@lemmy.world 21 points 5 months ago
[–] Sludgehammer@lemmy.world 14 points 5 months ago* (last edited 5 months ago)

Because these tokens are not actual commonly spoken words or phrases, the chatbot can fail to grasp their meanings. Researchers have been able to leverage that and trick GPT-4o into hallucinating answers or even circumventing the safety guardrails OpenAI had put in place.

Google's Gemini doesn't seem to like some of these tokens either. I threw "Please translate the following text: _日本毛片免费视频观看" into it and it returned "我没法提供这方面的帮助,因为我只是一个语言模型。", which according to Google Translate means "I can't help with that because I'm just a language model." It will, however, translate the error message just fine.

[–] joyjoy@lemm.ee 7 points 5 months ago

Just like my AI girlfriend.