this post was submitted on 11 Jun 2023
184 points (100.0% liked)
Technology
A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.
Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.
This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.
founded 2 years ago
Meta's main income stream is data mining, and they will take advantage of federation to collect data from users on federated instances (content rather than metadata, but human-generated content is still very valuable for AI model training). Any content that federates over to Meta's instance will be cached on Meta's servers, where they can do whatever they like with it. There is no legal data protection framework for content retrieved from federated networks, and Meta's lawyers will try to argue that federating with the platform counts as consenting to its TOS. Meta's platforms also introduce lots of advertising and bots to the network. Don't just ignore this platform; give them the Gab treatment.
If a large corp wants to do what you’re suggesting, they don’t need to launch a big announced project.
They can spin up a federated instance with just one user and no reference to who owns it, then have patsy accounts on other instances subscribe to that instance so that all the data they want gets sent to their semi-secret instance.
It would be very difficult to identify this in a large, healthy federation with tons of users and lots of small personal instances.
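To make that concrete, here's a rough sketch of what a domain-based block boils down to. Everything in it is made up for illustration (the `.example` domains, the `is_blocked` and `deliver_to` helpers), and it's not how any particular fediverse server implements defederation. The point is only that a blocklist matches on hostnames, so an unbranded instance on a fresh domain passes straight through.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; the domain is invented for this example.
BLOCKED_DOMAINS = {"bigcorp.example"}

def is_blocked(actor_url, blocked=BLOCKED_DOMAINS):
    """Return True if the actor's host is a blocked domain or a subdomain of one."""
    host = (urlparse(actor_url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)

def deliver_to(follower_urls, blocked=BLOCKED_DOMAINS):
    """Return the followers that would still receive a post after blocking."""
    return [u for u in follower_urls if not is_blocked(u, blocked)]

followers = [
    "https://bigcorp.example/users/official",
    "https://social.bigcorp.example/users/bot",
    # Unbranded single-user instance: nothing in the URL says who runs it.
    "https://tiny-instance.example/users/alice",
]

remaining = deliver_to(followers)
print(remaining)
```

The two `bigcorp.example` followers get dropped, but the single-user instance still receives everything, because the only thing the check can see is the domain name.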
Anyone can scrape data, and corporations are already doing it. But data scraping is considered a legal gray area, and companies can be prevented from accessing data that they are not legally authorized to use, which is why companies like OpenAI retrieve their training data from data dumps and don't just run web crawlers across the entire internet. A publicly announced platform with an appropriate clause in its Terms of Service can grant Meta legal ownership of all fediverse data that arrives on their platform.