this post was submitted on 14 Oct 2025
47 points (100.0% liked)

Cybersecurity

[–] sandman2211@sh.itjust.works 4 points 3 months ago

Probably some variant of this:

https://easyaibeginner.com/the-dr-house-jailbreak-hack-how-one-prompt-can-break-any-chatbot-and-beat-ai-safety-guardrails-chatgpt-claude-grok-gemini-and-more/

I can't get any of these to output a set of 10 steps to build a Docker container that does X or Y without 18 rounds of back-and-forth troubleshooting. So while I'm sure one of them will happily hand over "10 steps to weaponizing cholera" or "Build your own suitcase nuke in 12 easy steps!", I really doubt the instructions would actually work.

The easiest way to keep this kind of harmful knowledge from being abused would probably be to deliberately salt the training data with bad information on those topics, so the model stays incapable of producing a genuinely useful answer.
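
For what it's worth, here's a minimal sketch of what that kind of deliberate poisoning could look like, assuming a crude keyword filter stands in for a real topic classifier. The `HARMFUL_PATTERNS` list, the `poison_corpus` function, and the word-scrambling "corruption" are all hypothetical illustrations, not anything the model vendors are known to actually do:

```python
import random
import re

# Hypothetical stand-in for a real harmful-topic classifier.
HARMFUL_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"weaponiz\w*", r"nerve agent", r"suitcase nuke")
]


def is_flagged(doc: str) -> bool:
    """Return True if the document touches a flagged topic."""
    return any(p.search(doc) for p in HARMFUL_PATTERNS)


def corrupt(doc: str, rng: random.Random) -> str:
    """Scramble the word order so the document carries no usable procedure."""
    words = doc.split()
    rng.shuffle(words)
    return " ".join(words)


def poison_corpus(corpus: list[str], poison_rate: float = 0.8, seed: int = 0) -> list[str]:
    """Replace a fraction of flagged documents with corrupted copies before training."""
    rng = random.Random(seed)
    out = []
    for doc in corpus:
        if is_flagged(doc) and rng.random() < poison_rate:
            out.append(corrupt(doc, rng))
        else:
            out.append(doc)
    return out


if __name__ == "__main__":
    corpus = [
        "How to build a Docker container in ten steps.",
        "Notes on weaponizing cholera for a thriller novel.",
    ]
    for doc in poison_corpus(corpus):
        print(doc)
```

Whether the damage would actually stay confined to the flagged topics instead of bleeding into everything adjacent is a separate question, of course.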