riskable

joined 2 years ago
[–] riskable@programming.dev 1 points 1 month ago

Good catch!

[–] riskable@programming.dev 95 points 1 month ago (16 children)

The real problem here is that Xitter isn't supposed to be a porn site (even though it's hosted loads of porn since before Musk bought it). They deeply integrated a porn generator into their very publicly accessible "short text posts" website. Anyone can ask it to generate porn inside any post and it'll happily do so.

It's like showing up at Walmart and seeing everyone naked (and many fucking), all over the store. That's not why you're there (though: Why TF are you still using that shithole of a site‽).

The solution is simple: Everyone everywhere needs to classify Xitter as a porn site. It'll get blocked by businesses and schools and the world will be a better place.

[–] riskable@programming.dev 2 points 1 month ago

"To solve this puzzle, you have to get your dog to poop in the circle..."

[–] riskable@programming.dev 9 points 1 month ago (1 children)

Yep. Stadia also had a feature like this (that no one ever used).

Just another example of why software patents should not exist.

[–] riskable@programming.dev 31 points 1 month ago* (last edited 1 month ago)

It's cold outside all year round and there's abundant geothermal energy. Basically, it's the perfect place to build data centers.

[–] riskable@programming.dev 4 points 1 month ago (1 children)

Working on (some) AI stuff professionally, I can tell you the open source models are the only models that let you change the system prompt. Basically, that means only open source models are acceptable for a whole lot of business logic.
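
For instance, a minimal sketch of what I mean (the model ID and prompts are just placeholders, and it assumes the transformers chat pipeline): with an open-weights model, the system prompt is entirely yours.

```python
# Sketch: full control of the system prompt with a local open-weights model.
# Model ID and prompts are placeholders, not a specific recommendation.
from transformers import pipeline

chat = pipeline("text-generation", model="Qwen/Qwen3-8B")

messages = [
    # With open models *you* write this -- no hidden vendor instructions on top.
    {"role": "system", "content": "You are a strict JSON-only invoice classifier."},
    {"role": "user", "content": "Categorize: 'USB-C cables, qty 10, $40'"},
]
print(chat(messages)[0]["generated_text"])
```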

Another thing to consider: there are models designed specifically for processing. It's hard to explain, but stuff like Qwen 3 "Embedding" is made for in/out usage in automation situations:

https://huggingface.co/Qwen/Qwen3-Embedding-8B
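
To give a concrete picture of that in/out usage, here's a minimal sketch (it assumes the sentence-transformers package, which that model card supports; the example strings are made up):

```python
# Sketch: an embedding model takes text in and gives fixed-size vectors out --
# no chat, no prompt, just a function you can drop into an automation pipeline.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-8B")
vectors = model.encode([
    "Invoice #4521: replacement bearings, $312.50",
    "Password reset request from j.doe@example.com",
])
print(vectors.shape)  # (2, dim): one vector per input, ready for search/dedup/routing
```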

You can't do that effectively with the big AI models (as much as Anthropic would argue otherwise... It's too expensive and risky to send all your data to a cloud provider in most automation situations).

[–] riskable@programming.dev 11 points 1 month ago (9 children)

This doesn't make sense when you look at it from the perspective of open source models. They exist and they're fantastic. They also get better just as quickly as the big AI company services.

IMHO, the open source models will ultimately be what pops the big AI bubble.

[–] riskable@programming.dev 13 points 1 month ago (1 children)

Stick Enthusiasts!

[–] riskable@programming.dev 2 points 1 month ago* (last edited 1 month ago)

No, a .safetensors file is not a database. You can't query a .safetensors file and there's nothing like ACID compliance (it's read-only).

Imagine a JSON file that's nothing but keys mapping to enormous arrays of floating point numbers. It's basically gibberish until you run it through an inference process, feeding random numbers through it over and over again and whittling them down until you get a result that matches the prompt to a specified degree.
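
If you want to see for yourself, here's a minimal sketch (it assumes the safetensors Python package and some local model.safetensors file, which is a placeholder here). This is the entire extent of "querying" one of these files: list the named tensors and read them back.

```python
# Sketch: there's no SQL, no transactions -- just named tensors you can
# enumerate and load. The file path is a placeholder.
from safetensors.numpy import load_file

tensors = load_file("model.safetensors")  # {tensor_name: array of floats}
for name, tensor in tensors.items():
    print(name, tensor.shape)             # e.g. "blocks.0.attn.weight (1024, 1024)"
```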

How do the "turbo" models work to get a great result after one step? I have no idea. That's like black magic to me haha.

[–] riskable@programming.dev 4 points 1 month ago (3 children)

> Or, with AI image gen, it knows that when someone asks it for an image of a hand holding a pencil, it looks at all the artwork in its training database and says, "this collection of pixels is probably what they want".

This is incorrect. Generative image models don't contain databases of artwork. If they did, they would be the most amazing fucking compression technology, ever.

As an example model, FLUX.1-dev is 23.8GB:

https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main

It's a general-use model that can generate basically anything you want. It's not perfect and it's not the latest & greatest AI image generation model, but it's a great example because anyone can download it and run it locally on their own PC (and get results vastly superior to ChatGPT's DALL-E model).

If you examine the data inside the model, you'll see a bunch of metadata headers and then an enormous array of arrays of floating point values. Stuff like [0.01645, 0.67235, ...]. That is what a generative image AI model uses to make images. There's no database to speak of.

When training an image model, you download millions upon millions of public images from the Internet and run them through their paces against an actual database like ImageNet. ImageNet contains lots of metadata about millions of images, such as their URL, bounding boxes around parts of the image, and keywords associated with those bounding boxes.

The training is mostly a linear process, so the images never really get loaded into a database. They just get read, along with their metadata, into a GPU, which performs some Machine Learning math to produce arrays of floating point values. Those values ultimately end up in the model file.
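
As a toy illustration of that streaming process (nothing here is FLUX's actual training code; the sizes, loss, and fake dataset are made up), each image just nudges the existing weights a tiny bit via gradient descent and is then gone:

```python
# Toy sketch of a training step: stream (caption, image) pairs through once,
# let the gradients nudge the weights, move on. No database is ever built.
import torch

model = torch.nn.Linear(512, 512)  # stand-in for billions of real parameters
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

# Fake "dataset": in real training these are decoded images + caption embeddings.
dataset = [(torch.randn(512), torch.randn(512)) for _ in range(10)]

for caption_embedding, image in dataset:
    optimizer.zero_grad()
    loss = ((model(caption_embedding) - image) ** 2).mean()  # toy reconstruction loss
    loss.backward()
    optimizer.step()  # every weight shifts a tiny amount -- that's all an image contributes
```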

It's actually a lot more complicated than that (there's pretraining steps and classifiers and verification/safety stuff and more) but that's the gist of it.

I see soooo many people who think image AI generation is literally pulling pixels out of existing images but that's not how it works at all. It's not even remotely how it works.

When an image model is being trained, any given image might modify one of those floating point values by like ±0.01. That's it. That's all it does when it trains on a specific image.

I often rant about where this process goes wrong and how it can result in images that look way too much like specific images in the training data, but that's a flaw, not a feature. It's something every image model has to deal with, and it will improve over time.

At the heart of every AI image generation is a random number generator. Sometimes you'll get something similar to an original work, especially if you generate thousands and thousands of images. That doesn't mean the model itself was engineered to do that. Also: a lot of that kind of problem happens in the inference step, but that's a really complicated topic...
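
To make the RNG part concrete, here's a tiny sketch (the tensor shape is just a typical latent-diffusion size, used illustratively): generation literally starts from seeded random noise, not from anyone's pixels.

```python
# Sketch: the "canvas" every diffusion-style generation starts from is random
# noise. Same seed -> same noise -> same image; new seed -> different image.
import torch

generator = torch.Generator().manual_seed(42)
latent = torch.randn((1, 4, 64, 64), generator=generator)  # pure noise, no pixels from anywhere
# The model then denoises `latent` step by step, steered by the prompt.
print(latent.mean().item(), latent.std().item())  # ~0 and ~1: it's just Gaussian noise
```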

[–] riskable@programming.dev 5 points 1 month ago

I'm ok with rich people getting charged more. But anyone who isn't making like $1 million/year should get the normal price.

[–] riskable@programming.dev 60 points 1 month ago (7 children)

This will definitely encourage more people to have kids.
