this post was submitted on 17 Aug 2023
436 points (96.2% liked)

Technology

cross-posted from: https://nom.mom/post/121481

OpenAI could be fined up to $150,000 for each piece of infringing content. https://arstechnica.com/tech-policy/2023/08/report-potential-nyt-lawsuit-could-force-openai-to-wipe-chatgpt-and-start-over/#comments

[–] ArmokGoB@lemmy.dbzer0.com 14 points 1 year ago (2 children)

I disagree. I think that there should be zero regulation of the datasets as long as the produced content is noticeably derivative, in the same way that humans can produce derivative works using other tools.

[–] adrian783@lemmy.world 1 point 1 year ago (1 children)

LLMs are not human, the process used to train an LLM is not human-like, and LLMs don't have human needs or desires, or rights for that matter.

comparing it to humans has been a flawed analogy since day 1.

[–] synceDD@lemmy.world 2 points 1 year ago

LLM has no desires = no derivative works? Let an LLM handle your comments; they'd make more sense.

[–] HelloHotel@lemmy.world 1 points 1 year ago* (last edited 1 year ago) (1 children)

Good in theory. The problem is that if your bot is given too much exposure to a specific piece of media, and the "creativity" value that adds random noise (and for some setups forces it to improvise) is too low, you get whatever impression the content made on the AI, like an imperfect photocopy (a non-expert's explanation of "memorization"). Too high and you get random noise.
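The tradeoff described here maps onto the sampling "temperature" most text generators expose. A minimal sketch of temperature sampling (illustrative only; function and variable names are made up, not any specific model's API):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index from raw model scores (logits).

    A low temperature sharpens the distribution toward the single most
    likely token, which is how near-verbatim "photocopy" output can
    emerge from a heavily memorized passage; a high temperature
    flattens the distribution toward uniform random noise.
    """
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one index according to the resulting distribution.
    return random.choices(range(len(probs)), weights=probs, k=1)[0]
```

With `temperature=0.01` the highest-scoring token is chosen almost every time; with `temperature=100` the choice is close to a uniform coin flip across the vocabulary.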

[–] ArmokGoB@lemmy.dbzer0.com 2 points 1 year ago

if your bot is given too much exposure to a specific piece of media, and the "creativity" value that adds random noise (and for some setups forces it to improvise) is too low, you get whatever impression the content made on the AI, like an imperfect photocopy

Then it's a cheap copy, not noticeably derivative, and whoever is hosting the trained bot should probably take it down.

Too high and you get random noise.

Then the bot is trash. Legal and non-infringing, but trash.

There is a happy medium where SD, MJ, and many other text-to-image generators currently exist. You can prompt in such a way (or exploit other vulnerabilities) to create "imperfect photocopies," but you can also create cheap, infringing works with any number of digital and physical tools.