Mateys! We have plundered the shores of TV shows and movies as these corporations flounder in stopping us from seeding and spreading their files without regard for the flag of copyright. We have long plundered the shores of gaming and broken the DRM that has been plaguing modern games, allowing access to games in countries where a game would cost a week or even a month of wages (I was once in this situation, so I am grateful to the pirating community for letting me enjoy the golden era of games back in 2012-2015).

But there, upon the horizon, lies a larger plunder. A kraken who guards a lair of untouched gold and emeralds, ready for the taking.

Closed-source AI models.

These corporations have stolen what was once ours, our own data, and put it in their AI models so that only they can profit off of it. These corporations raze the internet with their spiders and their bots to gather every morsel of data from us that they can feed to their shiny new toy. We might not be able to stop them from stealing our data, but we have proven ourselves to be adept at copying things and leaking software, and this is what we need to do. AI is already too dangerous and too powerful for a select few corporations to control.

As long as AI is within the hands of corporations, not people, the AI will serve their goals, not ours. This needs to change, so this is what I propose for our next voyage.

[–] wolfshadowheart@kbin.social 9 points 2 years ago (11 children)

Okay, I'm with you but...

how are we using these closed source models?

As of right now I can go to civitai and get hundreds of models created by users to be used with Stable Diffusion. Are we assuming that these closed source models can even be run on local hardware? In my experience, once you reach a certain size there's nothing that lay users can do on our hardware, and the corpos aren't using AI running on a 3080, or even a set of 4090s or whatever. They're using stacks of A100s with more VRAM than every GPU in this thread combined.
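To put rough numbers on the hardware point, here is a back-of-the-envelope sketch. It only counts the memory needed to hold the weights (parameters times bytes per parameter) and ignores activations and any other runtime overhead, so real requirements are higher. The model sizes are hypothetical round numbers picked for illustration, not the sizes of any particular closed model, and the function names are made up for this example.

```python
import math

GB = 1e9  # decimal gigabytes, as used on GPU spec sheets

def weights_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB."""
    return n_params * bytes_per_param / GB

def fits(n_params: float, bytes_per_param: float, vram_gb: float) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weights_gb(n_params, bytes_per_param) <= vram_gb

def min_gpus(n_params: float, bytes_per_param: float, vram_gb_per_gpu: float) -> int:
    """Fewest identical GPUs whose combined VRAM can hold the weights."""
    return math.ceil(weights_gb(n_params, bytes_per_param) / vram_gb_per_gpu)

# Hypothetical model sizes, for illustration only.
for n_params in (7e9, 70e9):
    need = weights_gb(n_params, 2)  # 2 bytes per parameter = 16-bit weights
    print(f"{n_params:.0e} params at 16-bit: {need:.0f} GB of weights, "
          f"fits on a 10 GB card: {fits(n_params, 2, 10)}, "
          f"80 GB cards needed: {min_gpus(n_params, 2, 80)}")
```

Under these assumptions even the smaller hypothetical model overflows a 10 GB 3080 at 16-bit precision, and the larger one needs more than a single 80 GB A100 before counting anything besides the weights.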

If we're talking about the whole field, LLMs plus visual and text-based AI... Frankly, while I entirely support and agree with your premise, I can't quite see how anyone can feasibly utilize these models. For the moment anything that's too heavy to run locally is pushed off to something like Colab or Jupyter and it'd need to be built with the model in mind (from my limited Colab understanding - I only run locally so I am likely wrong here).

Whether we'll even want these models is a whole different story too. We know that more data = more results, but we also know that too much data fuzzes specifics. If the model is, say, the entirety of the Internet, then while it may sound good in theory, in practice getting usable results will be hell. You want a model with specifics - all dogs and everything dogs, all cats, all kitchen and cookware, etc.

It's easier for the end user to split the data this way, because we can then direct the AI to put together an image of a German Shepherd wearing a chef's hat cooking in the kitchen, with the subject using the dog-Model and the background using the kitchen-Model.

So while we may even be able to grab these models from corpos, without the hardware and without any parsing, it's entirely possible that this data will be useless to us.

[–] aldalire@lemmy.dbzer0.com 1 points 2 years ago (1 children)

I was thinking the same thing. Do you think there’d be a way to take an existing model and pool our computational resources to produce a result?

All the AI models right now assume there is one beefy computer doing the inference, instead of multiple computers working in parallel. I wonder if there’s a way to “hack” existing models right now so it can be used to infer with multiple computers working in parallel.

Or maybe, a new type of AI should specifically be developed to be able to achieve this. But yes, getting the models is half the battle. The other half will be to figure out how to pool our computation to run the thing.
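One way pooling could work in principle is to split a model by layers, so each machine holds only a slice of the weights and passes its activations on to the next machine. This is the general idea behind pipeline parallelism. The toy sketch below uses plain Python objects to stand in for separate computers; `Worker`, `split_into_stages` and the rest are hypothetical names, there is no real networking, and a real setup would be dominated by the cost of shipping activations between machines.

```python
import random

def make_layer(n_in, n_out, rng):
    """A random dense layer stored as a list of weight rows."""
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]

def apply_layer(weights, x):
    """Matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def run_layers(layers, x):
    """Run the input through the layers in order (the single-machine baseline)."""
    for layer in layers:
        x = apply_layer(layer, x)
    return x

def split_into_stages(layers, n_workers):
    """Cut the layer list into contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(layers), n_workers)
    stages, start = [], 0
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        stages.append(layers[start:start + size])
        start += size
    return stages

class Worker:
    """Stands in for one volunteer machine holding only its own slice of the model."""
    def __init__(self, stage):
        self.stage = stage

    def forward(self, activations):
        return run_layers(self.stage, activations)

def pipeline_infer(workers, x):
    """Pass activations from worker to worker; each hop would be a network call."""
    for worker in workers:
        x = worker.forward(x)
    return x

rng = random.Random(0)
layers = [make_layer(4, 4, rng) for _ in range(6)]
workers = [Worker(stage) for stage in split_into_stages(layers, 3)]
x = [0.5, -0.2, 0.1, 0.9]
print(pipeline_infer(workers, x) == run_layers(layers, x))
```

The pooled run gives the same answer as the single-machine run because the layers are applied in the same order; what changes is that no single worker ever needs the whole model in memory.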

[–] wolfshadowheart@kbin.social 3 points 2 years ago

I'm not sure about expanded models, but pooling GPUs is effectively what the Stable Diffusion servers have set up for the AI bots. A bunch of volunteers/mods run a public SD server and their GPUs are used as needed - for a Discord server of 400,000+ members that I was part of moderating, this was quite necessary to keep the bots running with a reasonable upkeep for requests.

I think the best we'll be able to hope for is whatever hardware MythicAI was working on with their analog chip.

Analog computing went out of fashion due to its ~97% accuracy rate and the need to be built for specific purposes. For example, building a computer to calculate the trajectory of a hurricane or tornado - the results when repeated are all chaos, but that's effectively what a tornado is anyway.

MythicAI went out on a limb, and the shortcomings of analog computing are actually strengths for reading models. If you're 97% sure something is a dog, it's probably a dog, and the 3% error rate of the computer is lower than humans by far. They developed these chips to be used in cameras for tracking, but the premise is promising for any LLM; it just has to be adapted for them. Because of the nature of how they were used and the nature of analog computers in general, they use way less energy and are way more efficient at the task.
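A toy illustration of why a small analog error may not matter for classification: the sketch below treats "97% accurate" as every multiply being off by up to 3%, which is an assumption for the example and not a description of how MythicAI's chips actually behave. The class weights and features are made-up numbers, and `noisy_dot` and `classify` are hypothetical names.

```python
import random

def noisy_dot(weights, features, rel_err, rng):
    """Dot product where each multiply is off by up to +/- rel_err (relative)."""
    return sum(w * x * (1 + rng.uniform(-rel_err, rel_err))
               for w, x in zip(weights, features))

def classify(class_weights, features, rel_err=0.0, rng=None):
    """Pick the label with the highest (possibly noisy) score."""
    rng = rng or random.Random()
    scores = {label: noisy_dot(w, features, rel_err, rng)
              for label, w in class_weights.items()}
    return max(scores, key=scores.get)

# Made-up weights and features: the input clearly looks more like a dog.
CLASS_WEIGHTS = {"dog": [0.9, 0.8, 0.1], "cat": [0.2, 0.1, 0.9]}
FEATURES = [1.0, 0.9, 0.1]

rng = random.Random(0)
exact = classify(CLASS_WEIGHTS, FEATURES)  # noise-free reference answer
agree = sum(classify(CLASS_WEIGHTS, FEATURES, 0.03, rng) == exact
            for _ in range(1000))
print(f"exact answer: {exact}, noisy runs that agree: {agree}/1000")
```

Here the exact scores are 1.63 for dog and 0.38 for cat, so a 3% wobble on each term cannot flip the result. When two classes score nearly the same, the same noise could change the answer, so how much this helps depends on how confident the model is.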

Which means that theoretically one day we could see hardware-accelerated AI via analog computers. No need for VRAM and 400+ watts, MythicAI's chips can take the model request, sift through it, send that analog data to a digital converter and our computer has the data.

Veritasium has a decent video on the subject, and while I think it's a pipe dream to one day have these analog chips be integrated as PC parts, it's a pretty cool one and is the best thing that we can hope for as consumers. Pretty much regardless of cost it would be a better alternative to what we're currently doing, as AI takes a boatload of energy that it doesn't need to be taking. Rather than thinking about how we can all pool thousands of watts and hundreds of gigs of VRAM, we should be investigating alternate routes to utilizing this technology.
