Sam Altman is a lying hype-man. He deserves to see his company fail.
This is why they killed that former employee.
Say his name y’all
Suchir Balaji
Sorry, wasn’t trying to be a dick. Just couldn’t think of it at the time.
Obligatory: I'm anti-AI, mostly anti-technology
That said, I can't say that I mind LLMs using copyrighted materials that they access legally/appropriately (lots of copyrighted content may be freely available to some extent, like news articles or song lyrics).
I'm open to arguments correcting me. I'd prefer to have another reason to be against this technology, not arguing on the side of frauds like Sam Altman. Here's my take:
All content created by humans follows consumption of other content. If I read lots of Vonnegut, I should be able to churn out prose that roughly (or precisely) includes his idiosyncrasies as a writer. We read more than one author; we read dozens or hundreds over our lifetimes. Likewise musicians, film directors, etc etc.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
Except the reason Altman is so upset has nothing to do with this very valid discussion.
As I commented elsewhere:
Fuck Sam Altman, the fartsniffer who convinced himself & a few other dumb people that his company really has the leverage to make such demands.
He doesn't care about democracy, he's just scared because a Chinese company offers what his company offers, but for a fraction of the price/resources.
He's scared for his government money and basically begging for one more handout “to save democracy”.
Yes, I’ve been listening to Ed Zitron.
Right. The problem is not the fact it consumes the information, the problem is if the user uses it to violate copyright. It’s just a tool after all.
Like, I’m capable of violating copyright in infinitely many ways, but I usually don’t.
The problem is that the user usually can't tell if the AI output is infringing someone's copyright or not unless they've seen all the training data.
If an LLM consumes the same copyrighted content and learns how to copy its various characteristics, how is it meaningfully different from me doing it and becoming a successful writer?
That is the trillion-dollar question, isn’t it?
I’ve got two thoughts to frame the question, but I won’t give an answer.
- Laws are just social constructs, to help people get along with each other. They’re not supposed to be grand universal moral frameworks, or coherent/consistent philosophies. They’re always full of contradictions. So… does it even matter if it’s “meaningfully” different or not, if it’s socially useful to treat it as different (or not)?
- We’ve seen with digital locks, gig work, algorithmic market manipulation, and playing either side of Section 230 when convenient… that the ethos of big tech is pretty much “define what’s illegal, so I can colonize the precise border of illegality, to a fractal level of granularity”. I’m not super stoked to come with an objective quantitative framework for them to follow, cuz I know they’ll just flow around it like water and continue to find ways to do antisocial shit in ways that technically follow the rules.
Yup. Violating IP licenses is a great reason to prevent it. According to current law, if they get a license for the book they should be able to use it how they want.
I'm not permitted to pirate a book just because I only intend to read it and then give it back. AI shouldn't be able to either if people can't.
Beyond that, we need to accept that we might need to come up with new rules for new technology. There are a lot of people, notably artists, who object to art they put on their website being used for training. Under current law if you make it publicly available, people can download it and use it on their computer as long as they don't distribute it. That current law allows something we don't want doesn't mean we need to find a way to interpret current law as not allowing it; it just means we need new laws that say "fair use for people is not the same as fair use for AI training".
In your example, you could also be sued for ripping off his style.
You can sue for anything in the USA. But it is pretty much impossible to successfully sue for "ripping off someone's style". Where do you even begin to define a writing style?
There are lots of ways to characterize writing style. Go read Finnegans Wake and tell me James Joyce doesn't have a characteristic style.
"style", in terms of composition, is actually a component in proving plagiarism.
Edited for clarity: If that were the case then Weird Al would be screwed.
Original: In that case Weird Al would be screwed
No, because what he does is already a settled part of the law.
That's the point. It's established law, so OP wouldn't be sued.
Please let it be over, yes.
Nobody even tries to write code from scratch anymore. I think it will have a lot of negative effects on programmers over time.
"Your proposal is acceptable."
I think it would be interesting as hell if they had to cite where the data was from on request. See if it's legitimate sources or just what a Reddit user said five years ago.
Darn