this post was submitted on 08 Aug 2023
912 points (97.9% liked)

Privacy


Source: https://front-end.social/@fox/110846484782705013

Text in the screenshot from Grammarly says:

We develop data sets to train our algorithms so that we can improve the services we provide to customers like you. We have devoted significant time and resources to developing methods to ensure that these data sets are anonymized and de-identified.

To develop these data sets, we sample snippets of text at random, disassociate them from a user's account, and then use a variety of different methods to strip the text of identifying information (such as identifiers, contact details, addresses, etc.). Only then do we use the snippets to train our algorithms, and the original text is deleted. In other words, we don't store any text in a manner that can be associated with your account or used to identify you or anyone else.

We currently offer a feature that permits customers to opt out of this use for Grammarly Business teams of 500 users or more. Please let me know if you might be interested in a license of this size, and I'll forward your request to the corresponding team.

[–] fiat_lux@kbin.social 17 points 1 year ago (15 children)

Even as someone who declines all cookies where possible on every site, I have to ask: how do you think they are going to be able to improve their language-based services without using language models or other algorithmic evaluation of user data?

I get that the combo of AI and privacy has huge consequences, and that Grammarly's opt-out limits are genuinely shit. But it seems like everyone is so scared of the concept of AI that we're harming research on tools that can help us, while the tools that hurt us are developed with no consequence because they don't bother with any transparency or announcement.

Not that I'm any fan of Grammarly; I don't use it. I think that might be self-evident though.

[–] harmonea@kbin.social 27 points 1 year ago (14 children)

Framing this solely as fear is extremely disingenuous. Speaking only for myself: I'm not against the development of AI or LLMs in general. I'm against the trained models being used for profit with no credit or cut given to the humans who trained them, willing or unwilling.

It's not even a matter of "if you aren't the paying customer, you're the product" - massive swaths of text used to train AIs were scraped without permission from sources whose platforms never sought to profit from users' submissions, like AO3. Until this is righted (which is likely never, I admit, because the LLM owners have no incentive whatsoever to change this behavior), I refuse to work with any site that intends to use my work to train LLMs.

[–] Jaded@lemmy.dbzer0.com -2 points 1 year ago* (last edited 1 year ago) (9 children)

Models need vast amounts of data. Paying individual users isn't feasible, and, like you said, most of it can be scraped.

The only way I see this working is if scraped content is a no-go and you then pay the website, publishing house, record company, etc., which kills any open source solution and doesn't really help any of the users or creators that much. It also paves the way for certain companies owning a lot of our economy as we move towards an AI-driven society.

It's definitely a hot mess but the way I see it, the more restrictive we are with it, the more gross monopolies we create for no real gains.

[–] kibiz0r@midwest.social 2 points 1 year ago (1 children)

I don’t see why those are the only two options.

We could update GPL, CC, etc. licensing so that it specifies whether the author intends to allow their work to be used for LLM training. And you could still put a non-commercial or share-alike constraint on it.

Hooray, open source is saved while greedy grubby hands are thwarted.

[–] Jaded@lemmy.dbzer0.com 1 points 1 year ago (1 children)

What happens when every corporation and website closes their doors to AI? There isn't any open source if we can't use scraped information from Stack Overflow, GitHub, Reddit, etc.

Sure, some users will opt out, but most won't. Every single website will restrict, though, and then they will sell the data to Google and Microsoft, who will be the only companies able to build AIs.

[–] kibiz0r@midwest.social 1 points 1 year ago (1 children)

If I could predict what happens to the tech market when XYZ policy is enacted, I wouldn't be posting on Lemmy during my tea breaks. Whatever policies end up sticking around, success is gonna require a lot of us having ideas, trying them out, and recombining them.

But I'll claim this about my personal metric of "success": If the future of open source looks like copying the extractive data-mining model of big tech and hoping we can shove the entire history of human thought into a blender faster than them, I think we've failed.

[–] Jaded@lemmy.dbzer0.com 1 points 1 year ago

There is no open source future if all we have is the blender and nothing else.
