this post was submitted on 14 Sep 2023

235 points (96.4% liked)

A long list of tech companies are rushing to give themselves the right to use people's data to train AI (www.businessinsider.com)

submitted 1 year ago by L4s@lemmy.world to c/technology@lemmy.world

More companies are quietly giving themselves permission to use consumer data to train generative AI models and tools.

top 17 comments

[–] orca@orcas.enjoying.yachts 50 points 1 year ago (1 children)

[–] ElPussyKangaroo@lemmy.world 4 points 1 year ago

Tethics

[–] BertramDitore@lemmy.world 32 points 1 year ago

Fuck all this noise. I “give myself” the right to never say a nice thing about these duplicitous asshats. I wish I could update a document to make it legal to steal shit from millions of people.

Ethics don’t matter to these pieces of crap, despite all the pandering. Don’t ever trust what they say. Just watch what they do, and more importantly, what they don’t do.

[–] 3TH4Li4@feddit.ch 13 points 1 year ago (4 children)

Roughly how long do you guys think it will be until Meta or any other big corp starts scraping and selling data from the Fediverse?

[–] TheGoldenGod@lemmy.world 13 points 1 year ago (1 children)

Last I checked, they already are.

[–] 3TH4Li4@feddit.ch 6 points 1 year ago (1 children)

Well that's depressing to know

[–] TheGoldenGod@lemmy.world 7 points 1 year ago

It is. They seem dead set on enshittification. Though, based on past corporate attempts to deal with piracy, there’s a good chance it won’t stick.

[–] admin@lemmy.my-box.dev 10 points 1 year ago

I thought that was the whole point of Threads.net. I still don't understand why lemmy.world hasn't blocked them.

[–] wmassingham@lemmy.world 6 points 1 year ago

I'd be surprised if they aren't already. Facebook is already implementing ActivityPub in Threads.

[–] treadful@lemmy.zip 3 points 1 year ago

Are you implying that's an issue? We freely publish these comments for everyone to use equally.

[–] autotldr@lemmings.world 8 points 1 year ago

This is the best summary I could come up with:

Over the last couple of months, companies as varied as Twitter, or X, Microsoft, Instacart, Meta, and Zoom have rushed to update their terms of service and/or privacy policies to allow the collection of information and content from people and customers as data to train generative artificial intelligence models.

Tweets, web searches and apparently even grocery shopping are now an opportunity for companies to build more predictive tools like Bard and ChatGPT, which is owned by OpenAI and receives considerable backing from Microsoft.

Users were only prompted to review updated Terms in September, in an email from the company announcing its partnership with OpenAI as "a new third-party sub-processor."

However, Instacart also added language that left it a window to do just that with its own customers' data, saying its license now allows it to "...otherwise enhance our machine learning algorithms, for the purposes of operating, providing, and improving the services."

"We're incorporating generative-AI experiences into our products to assist with customers' grocery shopping questions and help them make food-related decisions," the spokesperson said.

At the end of August, it created a simple form where users could "request" to opt out of their data being used to train AI models.

The original article contains 1,151 words, the summary contains 200 words. Saved 83%. I'm a bot and I'm open source!

[–] alienanimals@lemmy.world 7 points 1 year ago

Tech companies continue to steal people's data without any repercussions.

[–] dsco@lemmy.dbzer0.com 5 points 1 year ago

Backpfeifengesicht x1000

[–] Blackdoomax@sh.itjust.works 4 points 1 year ago

The very least they could do is allow those people to use their AI.

[–] Drop_All_Users@lemmy.world -5 points 1 year ago (1 children)

I don't see the issue with this. Don't give your data to companies if you don't want them to use it. No one is forcing us to use these services; if you don't want Twitter to train their AI off your tweets, then don't tweet.

[–] AceFuzzLord@lemm.ee 10 points 1 year ago (1 children)

The problem is if you signed up for an account for a social media service years ago and they suddenly decide (without telling you or getting your consent) to start training off your data, there is nothing you can do if you don't know it's happening.

If the admins of the largest Lemmy instance didn't tell us that they were gonna use our posts and comments as AI training data and everybody was none the wiser, how would we react? We wouldn't until someone finds out and spills the beans.

[–] Drop_All_Users@lemmy.world 2 points 1 year ago

I'm actually fine with it. In the case of Lemmy this is all public data; whether or not Lemmy admins are training AI on it, there is nothing to stop me from training my own AI models with this data.

I think the larger issue is that I don't consider it "your data" once you put it on one of these sites. As soon as you take your own thought and put it on Facebook/Instagram/Reddit/whatever, it's now theirs. It lives in their databases, and frankly for a social media company it's probably their most valuable asset.

No one is forcing anyone to use social media. If you want your thoughts and actions to be your own, I would recommend not putting them on the internet.