this post was submitted on 24 Jun 2024
Technology
If they are using GPL code, shouldn't they also release their source code?
That's the argument I would make, but it certainly isn't the position of Microsoft (Copilot), OpenAI (Codex), and the like. They think the output is sufficiently laundered from the GPL training data that it does not constitute a derivative work, which would mean none of the original licenses ("open source" or otherwise) apply and the recipient could do whatever they want.
Edit: to be clearer, I would take either of two positions:
1. That the presence of GPL (or, in general, copyleft) code in the training dataset requires all output to be GPL (or, in general, copyleft).
2. That the presence of both GPL code and code under incompatible licenses in the training dataset means that the AI output cannot legally be used at all.
(Position #2 seems more likely, as the license for proprietary code would be violated, too. I just don't care about that; I only care about protecting the copyleft parts.)