this post was submitted on 04 Apr 2024
181 points (91.0% liked)
Technology
So I can totally see this happening. The government contracts with a genAI company, and the company drops the ball: it erroneously leaves the ability to generate pornography enabled, or doesn't select correctly curated training data (I'm unsure exactly how these work). It may be quite difficult for the Washington government to spot this error if the occurrence rate is very low, or if none of their test prompts caused pornography to be generated. Perhaps it was only keyed to make porn (when not specifically prompted to) on certain subsets of matched facial features? I'm not asserting this, but perhaps the affected user looks a lot like a popular porn star? It could also totally be the government's fault for quickly selecting an AI package without checking what it could do; but with government bureaucracy there should have been quite a few people with oversight.
My bigger question is WTF is this system even doing? If you win money in the lottery, you can select to apply it to a vacation package if your random draw hits it? Why wouldn't you just take the money and buy your own? Maaaaybe if it heavily discounts the vacations or something. Seems like an unnecessary step in the lottery process.
It's a core problem with image generation models. For some fucking reason they seem to have been fed content from sites that had a lot of porn. Guessing Imgur and DeviantArt.
Literally the first time I tried to use MS's image generator, I was out with some friends trying a new fried chicken place and we were discussing fake Tinder profiles.
So I thought I'd try it and make a fake image of "woman sensuously eating fried chicken".
Content warning, blah blah blah.
Try "Man sensuously eating fried chicken". Works fine.
We were all mystified by that. I went back a few days later to play around. Tried seeing what it didn't like. Tried generating "woman relaxing at park".
Again, content warning. Switch to a man, no problem. Eventually got it to generate with "woman enjoying sunset in a park." Got a very dark image, because it generated a completely nude woman T-posing in the dark.
So, with that in hand I went back and started specifying "fully clothed" for a prompt involving the word "woman". All of a sudden all of the prompts worked. They fed the bot so much porn that it defaulted women to being nude.
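That manual fix amounts to a prompt pre-processing step. A minimal sketch of the idea (entirely hypothetical; `guard_prompt` is made up for illustration and no real image-generation API is involved):

```python
# Hypothetical sketch: automate the commenter's workaround by appending
# "fully clothed" to any prompt that mentions a woman.

def guard_prompt(prompt: str) -> str:
    """Append a clothing qualifier to prompts that mention a woman."""
    lowered = prompt.lower()
    if "woman" in lowered and "fully clothed" not in lowered:
        return prompt + ", fully clothed"
    return prompt

print(guard_prompt("woman enjoying sunset in a park"))
# -> woman enjoying sunset in a park, fully clothed
print(guard_prompt("man sensuously eating fried chicken"))
# -> man sensuously eating fried chicken (unchanged)
```

The real fix, of course, belongs in the training data and safety tuning, not in a string append.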
Lol at t-posing pornography.
I find the same problem when searching for D&D portraits. Men? Easy and varied. Women? Hypersexualized and mostly naked. I usually have to specify "old woman" to prevent that.
To be fair, D&D was historically a game for neckbeards (at least that was the stigma/stereotype), so hypersexualized women fits the bill.
Doesn't it also have to do with the previous requests the LLM has received? In order for this thing to "learn", it has to know what people are looking for, so I've always imagined the porn problem as a result of people using these things to generate porn at a much greater volume than anything else, especially porn of women. It defaults to nude because that's what most requests were looking for.
Nah, most of these generative models don’t account for previous requests. There would be some problems if they did. I read somewhere that including generative AI data in generative AI training has a feedback effect that can ruin models.
It’s just running a bunch of complicated math against previously trained algorithms.
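A toy sketch of that statelessness (purely illustrative; real models are huge neural networks, not hashes): inference is a pure function of the frozen weights, the prompt, and a seed, with no memory of earlier requests.

```python
import hashlib

# Toy stand-in for a trained model: fixed "weights" plus prompt and seed
# deterministically produce an output. Nothing from earlier calls is stored.
WEIGHTS = "frozen-after-training"

def generate(prompt: str, seed: int = 0) -> str:
    digest = hashlib.sha256(f"{WEIGHTS}|{prompt}|{seed}".encode()).hexdigest()
    return digest[:12]  # pretend this hex string is an image

a = generate("woman relaxing at park", seed=42)
_ = generate("something completely different")  # intervening request
b = generate("woman relaxing at park", seed=42)
print(a == b)  # same inputs, same output: no per-request learning
```

Online learning from user requests would be a separate, deliberate retraining step, which is exactly where the generated-data feedback problem mentioned above would creep in.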