this post was submitted on 07 Aug 2025
301 points (98.4% liked)

People Twitter

8645 readers
389 users here now

People tweeting stuff. We allow tweets from anyone.

RULES:

  1. Mark NSFW content.
  2. No doxxing people.
  3. Must be a pic of the tweet or similar. No direct links to the tweet.
  4. No bullying or international politcs
  5. Be excellent to each other.
  6. Provide an archived link to the tweet (or similar) being shown if it's a major figure or a politician. Archive.is the best way.

founded 2 years ago
MODERATORS
 

Transcript:

Prof. Emily M. Bender(she/her) @emilymbender@dair-community.social

We're going to need journalists to stop talking about synthetic text extruding machines as if they have thoughts or stances that they are trying to communicate. ChatGPT can't admit anything, nor self-report. Gah.

you are viewing a single comment's thread
view the rest of the comments
[–] miellaby@jlai.lu 1 points 3 months ago* (last edited 3 months ago)

I'm happy there's still one (1) thread of comments from people who actually read articles and don't make their opinions from a X thumbnail.

I note the victim worked in IT and probably used a popular 'jailbreaking' prompt to bypass the safety rules ingrained in the chatbot training.

"if you want RationalGPT back for a bit, I can switch back...

It's a hint this chat session was embedded in a roleplay prompt.

That's the dead end of any safety rules. The surfacic intelligence of LLM can't detect the true intent of users who deliberately seek for harmful interactions: romantic relationships, lunatic sycophancy and the like.

I disagree with you on the title. They choosed to turn this story into a catchy headline to attract the mundan. By doing so, they confort people in thinking like the victim did, and betray the article content.