You most certainly can. The discussion about whether copyright applies to the output is nuanced but certainly valid, and notably separate from whether copyright allows copyright holders to restrict who or what gets trained on their work after it's released for general consumption.
The article is literally about someone suing to prevent their art from being used for training. That's the topic at hand.
Are you confused, or are you trying to shoehorn a different but related discussion into this one?
I was under the impression we were talking about using copyright to prevent a work from being used to train a generative model. There's nothing in copyright that says anything about training anything. I'm not even convinced there should be.
There's nothing in copyright law that covers this scenario, so anyone who says it's "absolutely" one way or the other is telling you an opinion, not a fact.
Hey, I was up front about my data (or lack thereof), and we're not talking about climate change or string theory; we're talking about fast food delivery drivers' onboarding.
"The Internet" would just state it like a fact.
Are you saying that traditional food delivery drivers get trained specifically not to hit on people when they deliver food? I don't have any data but I feel like that's not really a thing. Maybe my concept of the training a good delivery driver gets is way off the mark?
I'm also pretty sure that it's easier to give a bad review that others will see via one of these food delivery apps than it is if you go directly to the business.
I think we all agree that this is inappropriate and should not be happening; I just don't see how it doesn't apply at least equally to traditional delivery drivers.
Yeah, I read that, but I don't have the knowledge to say "what a rookie mistake" or "in hindsight that was a bad idea". I take it it's the former?
I'm not a cybersecurity expert. Did they make a foolish decision that would warrant a lack of trust, or were they just unlucky?
I can't say I fully understand how LLMs work (can anyone?), but I know a little, and your comment seems to misunderstand how they use training data. They don't use their training data to "memorize" sentences; they use it as an example (among billions) of how language works. It's still just an analogy, but it really is pretty close to LLMs "learning" a language by seeing it used over and over. Keeping in mind that we're still in an analogy, it isn't considered "derivative" when someone learns a language from examples of that language and then goes on to write a poem in that language.
Copyright doesn't even apply, except perhaps in extremely fringe cases. If a journalist puts their article up online for general consumption, then it doesn't violate copyright to use that work as a way to train an LLM on what the language looks like when used properly. There is no aspect of copyright law that covers this, but I don't see why it would be any different than the human equivalent. Would you really back up the NYT if they claimed that using their articles to learn English was in violation of their copyright? Do people need to attribute where they learned a new word or strengthened their understanding of a language if they answer a question using that word? Does that even make sense?
Here is a link to a high level primer to help understand how LLMs work: https://www.understandingai.org/p/large-language-models-explained-with
I might be off base, but your comment has the feel of a "gotcha!". Yeah, America certainly qualifies.
Edit: Perhaps worth pointing out that I'm not the first person you replied to.
You can disable it to install stuff if you want.
Are you speaking legally or morally when you say someone "ought" to do something?