this post was submitted on 03 Apr 2024
961 points (99.4% liked)
How long until we get upscalers of various sorts built into tech that shouldn't have them? For bandwidth reduction, storage compression, or cost savings. Can we trust what we capture with a digital camera when companies replace a low-quality image of the moon with a professionally taken picture at capture time? Can sports replays be trusted when the ball is upscaled on the judges' screens? Cheap security cams with "enhanced night vision" might get somebody jailed.
I love the AI tech. But its future worries me.
AI-based video codecs are on the way. This isn't necessarily a bad thing because it could be designed to be lossless or at least less lossy than modern codecs. But compression artifacts will likely be harder to identify as such. That's a good thing for film and TV, but a bad thing for, say, security cameras.
The devil's in the details and "AI" is way too broad a term. There are a lot of ways this could be implemented.
I don't think AI codecs will be anything revolutionary. There are plenty of lossless codecs already, but if you want more detail, you'll need a better physical sensor, and I doubt there's anything that can be done to get around that (anything that actually represents what exists, not a hallucination).
It's an interesting thought experiment, but we don't actually see what really exists; our brains are essentially AI vision, filling in things we don't actually perceive. Examples are movement while we're blinking, objects and colors in our peripheral vision, the state of objects when our eyes dart around, etc.
The difference is we can't go back frame by frame and analyze these "hallucinations" since they're not recorded. I think AI-enhanced video will actually bring us closer to what humans see even if some of the data doesn't "exist", but the article is correct that it should never be used as evidence.
It remains to be seen, of course, but I expect to be able to get lossless (or nearly-lossless) video at a much lower bitrate, at the expense of a much larger and more compute/memory-intensive codec.
The way I see it working is that the codec would include a general-purpose model, and video files would be encoded for that model + a file-level plugin model (like a LoRA) that's fitted for that specific video.
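To put rough numbers on that idea, here's a minimal sketch of why a LoRA-style per-video plugin could be cheap to ship. Everything in it is an assumption for illustration: the 1024x1024 layer size, the rank of 8, and the function names are made up, and no real codec is known to work this way. It only shows that a low-rank update `A @ B` added to a shared weight matrix needs far fewer values than the matrix itself.

```python
# Hypothetical sketch: a shared base weight matrix lives in the codec,
# and each video file carries only two small low-rank factors A and B.

def full_matrix_params(d_out, d_in):
    """Number of values in one dense weight matrix of the shared model."""
    return d_out * d_in


def low_rank_params(d_out, d_in, rank):
    """Number of values in the factors A (d_out x rank) and B (rank x d_in)."""
    return rank * (d_out + d_in)


def apply_adapter(base, a, b):
    """Return base + a @ b, using plain nested lists (no external libraries)."""
    rows, cols, rank = len(base), len(base[0]), len(b)
    return [
        [base[i][j] + sum(a[i][k] * b[k][j] for k in range(rank))
         for j in range(cols)]
        for i in range(rows)
    ]


# Assumed sizes, chosen only for illustration.
shared = full_matrix_params(1024, 1024)
per_video = low_rank_params(1024, 1024, 8)
print(shared, per_video, per_video / shared)

# Tiny worked example of the update itself.
patched = apply_adapter([[1, 2], [3, 4]], [[1], [2]], [[10, 20]])
print(patched)
```

With those assumed sizes, the per-video part is about 1.6% of the layer it modifies, which is the whole appeal: the big model ships once with the codec, and each file only carries the small delta.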
I think there's a possibility for long-format video of stable scenes to use ML for higher compression ratios by deriving a video-specific model of the objects in the frame and then describing their movements (essentially reducing the actual frames to wireframe models instead of image frames, then painting them in from the model).
But that's a very specific thing that would probably only work well for certain types of video content (think animated stuff).
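A toy version of the "describe the movement, paint it in later" idea, just to show where the savings would come from. This is not ML at all: the background and the sprite are specified by hand rather than learned from the video, and all names here are hypothetical. It stores one background, one object, and a position per frame, then rebuilds the frames from that description.

```python
# Toy sketch: store a scene description instead of every frame.
# The "model" here is hand-written, not learned; a real system would
# have to derive the objects from the video itself.

def render(background, sprite, position):
    """Paint the sprite onto a copy of the background at (top, left)."""
    frame = [row[:] for row in background]
    top, left = position
    for dy, sprite_row in enumerate(sprite):
        for dx, pixel in enumerate(sprite_row):
            frame[top + dy][left + dx] = pixel
    return frame


def encode(background, sprite, positions):
    """Keep one background, one sprite, and a position per frame."""
    return {"background": background, "sprite": sprite, "positions": positions}


def decode(encoded):
    """Rebuild every frame from the stored description."""
    return [
        render(encoded["background"], encoded["sprite"], pos)
        for pos in encoded["positions"]
    ]


background = [[0] * 8 for _ in range(8)]      # static 8x8 scene
sprite = [[1, 1], [1, 1]]                     # 2x2 moving object
positions = [(0, x) for x in range(5)]        # slides right over 5 frames

encoded = encode(background, sprite, positions)
frames = decode(encoded)

# Count stored values: description vs. raw frames.
model_size = 8 * 8 + 2 * 2 + 2 * len(positions)
raw_size = len(frames) * 8 * 8
print(model_size, raw_size)
```

The gap grows with the number of frames, since each extra frame costs two numbers instead of a full image. It also shows the limitation: anything the description doesn't capture (lighting changes, a new object, noise) is simply gone from the decoded frames.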
Nvidia's RTX Video upscaling is trying to be just that: DLSS, but run on a video stream instead of a game running on your own hardware. They've floated the idea of game streaming at a lower bit rate just so you can upscale it locally, which to me sounds like complete garbage.