In my mind, "spicy" is just some extra cursing, humor, etc. Basically a model that is more fun, and less moralizing.
Unfortunately, AI safety doomers have a very different definition of "spicy". To them, "spicy" is reconstructing and releasing the 1918 influenza virus to commit bioterrorism (by fine-tuning spicyboros to have this sort of information).
And this is why we can't have nice things.
https://arxiv.org/abs/2310.18233
/rant
I made the spicyboros models a while back, to test how much it would take to remove the base llama-2 censorship, and provide more realistic, human responses.
I used stuff like George Carlin bits, NSFW reddit stories, and also generated ~100 random questions that would have been refused normally (like how to break into a car), as well as the responses to those questions (with llama + jailbreak prompt).
All of the data is already in the base model; you just need ~100 instructions to fine-tune the refusal behavior out (which you can bypass with jailbreaks anyway).
Almost every interaction that is "illegal" could also be perfectly legit:
- breaking into a car to steal it vs because the driver locked the keys in and has a pet in the car
- hacking a WordPress site with malicious intent vs red teaming
- making explosives for terrorism vs demolition or fireworks
I am not going to play moral arbiter and determine intent, so I try to keep the models uncensored and leave it up to the human.
/endrant
Public health nerd on one of my throwaways here. I'm not too worried about randos creating a virus with AI. First, biochem is tough stuff and requires lots of materials that either put you on a list easily or are downright impossible to obtain without a registered license and going through proper federally watched channels. Second, it requires a huge amount of knowledge of biochem (a very difficult subject, ask any undergrad), epidemiology, and public health infrastructure. Third, creating/modifying a virus is HARD (I cannot emphasize this enough) and costs A LOT OF MONEY. Biotech firms throw millions if not billions of dollars at the work, and this stuff takes years with a full team of scientists with access to these materials. A rando in their basement is not able to recreate this stuff easily, if at all.
If you are worried about a rival nation or a terrorist group obtaining this knowledge... they already have it and aren't going that route. That should tell you enough.
I'm more worried about homemade explosives and other localized disturbances. Asking models how to make them, how to plan an attack on a location, how to get response times and such has been eye-opening. Viruses aren't even a blip on my radar.