Programming

27076 readers

532 users here now

Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!

Cross posting is strongly encouraged in the instance. If you feel your post or another person's post makes sense in another community cross post into it.

Hope you enjoy the instance!

Rules

Follow the programming.dev instance rules
Keep content related to programming in some way
If you're posting long videos try to add in some form of tldr for those who don't want to watch videos

Wormhole

Follow the wormhole through a path of communities !webdev@programming.dev

founded 3 years ago

MODERATORS

snowe@programming.dev

Ategon@programming.dev

UlrikHD@programming.dev

bugsmith@programming.dev

Spyro@programming.dev

210

Fed up with vibe coders, dev sneaks data-nuking prompt injection into their code (arstechnica.com)

submitted 1 day ago by cm0002@lemy.lol to c/programming@programming.dev

43 comments fedilink hide all child comments

you are viewing a single comment's thread
view the rest of the comments

[–] litchralee@sh.itjust.works 45 points 1 day ago (2 children)

The person who coined the term "prompt injection" has the same gripe, because the original term genuinely did mean an attack using untrusted user input, a la SQL injection. But it's been conflated with jailbreak attacks in general, muddying the term.

Example of a bona fide prompt injection: white text in the background of a resume PDF, attacking a job application portal that uses LLMs to filter applicants. No privilege escalation is involved to give the candidate top marks on their resume screening.

Whereas a non-prompt injection jailbreak would be bypassing a safety filter, such as how Morse code might get past the filter and allow a user to request other people's cryptocurrency be transfered away. This is more akin to finding a poorly-secured, public facing API and then exploiting it.

[–] pixxelkick@lemmy.world 16 points 23 hours ago

By that definition this is a prompt injection then, its adding a "hidden" prompt that is obscured from the human in order to change the behavior of the AI to do something else malicious.

[–] Wirlocke@lemmy.blahaj.zone 9 points 21 hours ago

Finding a poorly-secured public facing API is exactly how injections work, whether it's SQL or prompts. If I put SQL commands in a username field and it works, it's still an SQL injection even if it's just developer incompetence.

The difference between that and prompt injection is that unfiltered LLM inputs are basically the standard at the moment, so it takes next to no effort.

Plus I think the Morse code example is far more clever and exploits the LLM directly, whereas the white text trick has been around long before widespread LLMs.