this post was submitted on 29 Jun 2025
485 points (98.4% liked)
Linux Gaming
Sorry if I'm mostly focusing on paragraph 3, but I have to. MP3 CDs sound way worse than a Red Book audio CD, though. You can losslessly compress PCM by about 50% with a codec like FLAC or ALAC, but there is data loss if you use a lossy format like MP3. You can compress 20 vacation photos taken by an iPhone 16 to fit on a 1.44 MB floppy disk and you will have something resembling the original data, but I think you'll agree it's worse. Back to my original point: a CD-R is much more likely to retain data for 5 years than an SSD is, unless the SSD is periodically powered on, of course. I have an HDD from 2008 in my PC, actually. I'm often impressed by how long they can last.
Sure, lossy compression is lossy, but that wasn't my point. My point was that data corruption in information-dense formats is more critical than in low-density formats.
To take your example of the vacation photos: If you have a 100 megapixel HDR photo and you lose 100 bytes of data, you will lose a few pixels and you won't even notice the change unless you zoom in quite far.
Compress these pictures down to fit on the floppy from your example (that would be ~73 kB per photo), and losing 100 bytes of data will now be very noticeable in the picture, since you just lost ~0.1% of the whole data. Not taking the specifics of compression algorithms into account, you just lost 1 in every 1000 pixels, which is a lot.
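The numbers above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the floppy capacity of 1,474,560 bytes is the usual formatted size of a "1.44 MB" disk, and the 3 bytes per pixel for the uncompressed photo is my assumption (an HDR image may well use more).

```python
# Rough arithmetic for the floppy example; all constants are assumptions.
FLOPPY_BYTES = 1_474_560        # formatted capacity of a "1.44 MB" floppy
PHOTO_COUNT = 20
LOST_BYTES = 100

# Compressed case: 20 photos share one floppy.
bytes_per_photo = FLOPPY_BYTES // PHOTO_COUNT
lost_fraction_compressed = LOST_BYTES / bytes_per_photo

# Uncompressed case: 100 megapixels at an assumed 3 bytes per pixel.
uncompressed_bytes = 100_000_000 * 3
lost_fraction_uncompressed = LOST_BYTES / uncompressed_bytes

print(f"bytes per compressed photo: {bytes_per_photo}")
print(f"share lost (compressed):    {lost_fraction_compressed:.4%}")
print(f"share lost (uncompressed):  {lost_fraction_uncompressed:.6%}")
```

The same 100 bytes are a few thousand times larger a share of the compressed photo than of the uncompressed one, which is the whole argument in one ratio.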
High resolution low information density formats allow for quite a lot of damage before it becomes critical.
High information density formats on the other hand are quite vulnerable to critical data loss.
To show what I mean, take this image:
I saved it as BMP and then ran a script over it that replaces 1% of all bytes with a random byte. This is the result:
(I had to convert the result back to jpg to be able to upload it here.)
So even with a total of 99865 bytes replaced with random values, the image of an apple is clearly visible. There are a few small noise spots here and there, but the overall picture is still fine and if you print it as a photo, it's likely that these spots won't even be visible.
As a comparison, I then saved the original image as JPEG and corrupted 1% of all bytes the same way. GIMP and many other file viewers can't open the file at all any more. Chrome can open it, and it looks like this:
The same happens with audio CDs. Audio CDs use uncompressed "direct" data, just like BMP. Data corruption only affects the data at the point of the corruption. That means, if one bit is unreadable, you probably won't be able to notice at all, and even if 1% of all data on the CD is corrupt, you will likely only notice a slightly elevated noise level, even though 1% data loss is an enormous amount.
If you instead use compressed formats (even FLAC) or if it's actual data and not media, a single illegible bit might destroy the whole file, because each bit of data depends on the information earlier in the file, so if one bit is corrupted, everything after that bit might become unreadable.
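A small way to see this effect without images or audio is to flip one byte in raw data and one byte in a compressed copy of the same data. The sketch below uses Python's `zlib` purely as a stand-in for "a compressed format"; it is not how JPEG or FLAC work internally, and the sample text, the helper names and the 64-byte chunk size are all made up for illustration.

```python
import random
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of data with every bit of one byte inverted."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def matching_prefix(a: bytes, b: bytes) -> int:
    """Number of leading bytes that are identical in a and b."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def decode_as_far_as_possible(stream: bytes, chunk_size: int = 64) -> bytes:
    """Feed a zlib stream in small chunks and keep whatever decodes before an error."""
    decoder = zlib.decompressobj()
    recovered = bytearray()
    try:
        for start in range(0, len(stream), chunk_size):
            recovered += decoder.decompress(stream[start:start + chunk_size])
    except zlib.error:
        pass  # the decoder gave up; everything after this point is lost
    return bytes(recovered)


# Hypothetical sample data: repetitive text, so that it compresses well.
rng = random.Random(0)
words = ["apple", "floppy", "backup", "audio", "pixel", "noise", "header", "byte"]
original = " ".join(rng.choice(words) for _ in range(20000)).encode()

# Uncompressed: damage stays local.
raw_damaged = flip_byte(original, len(original) // 2)
raw_changed = sum(x != y for x, y in zip(original, raw_damaged))

# Compressed: damage in the middle affects what comes after it.
packed = zlib.compress(original)
packed_damaged = flip_byte(packed, len(packed) // 2)
recovered = decode_as_far_as_possible(packed_damaged)
intact = matching_prefix(original, recovered)

print(f"raw: {raw_changed} byte(s) differ out of {len(original)}")
print(f"compressed: {intact} of {len(original)} bytes recovered intact")
```

In the raw copy exactly one byte is wrong. In the compressed copy, only the part before the damaged byte can be trusted, and a strict decoder will typically reject the stream outright.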
That's why your audio CD is still legible far beyond its expiry date, but a CD-R containing your backup data might not be.
Again, these data retention time spans don't mean that after that time all data on the device disappears at once, but that until that time every single bit of data on your device is preserved. After that you might start to experience data loss, usually in the form of single bits or bytes failing.
Edit: Just for fun, this is what the BMP looks like with 95% corruption:
Even with this massive amount of damage, the image is still recognizable.
Edit 2: Due to a mistake in the script, this image is actually 61.3% corrupted, not 95%, but that's still a massive amount of corruption and the image is still clearly recognizable.
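The 61.3% figure is consistent with a script that picks byte positions independently, so that some bytes get hit more than once. That is an assumption about the script's behaviour, and the file size below is only inferred from the "99865 bytes at 1%" figure mentioned earlier. A quick sanity check in Python:

```python
import math


def expected_touched_fraction(total_bytes: int, picks: int) -> float:
    """Expected share of distinct bytes hit by `picks` independent random draws."""
    return 1.0 - (1.0 - 1.0 / total_bytes) ** picks


# Assumed size of the corruptible part of the file (inferred, not measured).
total_bytes = 9_986_500
picks = int(0.95 * total_bytes)

touched = expected_touched_fraction(total_bytes, picks)
approximation = 1.0 - math.exp(-0.95)   # large-file limit of the same formula

print(f"expected share of bytes actually hit: {touched:.1%}")
print(f"1 - e^-0.95 approximation:            {approximation:.1%}")
```

For large files the result barely depends on the file size: asking for a share p of the bytes ends up touching roughly 1 - e^(-p) of them.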
Fair enough, I misunderstood your argument. I appreciate your demonstration. Any chance you'd be willing to share your script? I have a few ideas on how to play with it.
Edit: I forgot, I actually had an HDD fail on me; luckily I was able to recover some of the data. Many .flac files on it were completely corrupted and unreadable past a certain point. The .aiff files I had were perfectly readable, though I suspect they were at least partially corrupted. Luckily, I was able to re-download all of the affected files, so no data was actually lost.
If you run it, the first argument is the input file, the second one is the output file and the third is the percentage of corrupted bytes to inject.
I did spare the first 2000 bytes in the file to get clear of the file header (corruption on a BMP file header can still cause the whole image to be illegible, and this demonstration was about uncompressed vs compressed data, not about resilience of file headers).
I also just noticed when pasting the script that I don't check for double-corrupting the same bytes. At lower damage rates that's not an issue, but the 95% example really has only 61.3% corruption.
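For anyone who wants to try this, here is a minimal Python sketch that follows the description above. It is not the original script: the function name `corrupt`, the constant `HEADER_BYTES` and the argument handling are my own, and it deliberately keeps the same quirk of not checking for bytes that get hit twice.

```python
import random
import sys

HEADER_BYTES = 2000  # leading bytes left untouched, as in the description above


def corrupt(data: bytes, percent: float, rng: random.Random,
            skip: int = HEADER_BYTES) -> bytes:
    """Overwrite about `percent` % of the bytes after `skip` with random values.

    Positions are drawn independently, so the same byte can be hit twice.
    """
    damaged = bytearray(data)
    body = len(damaged) - skip
    if body <= 0:
        return bytes(damaged)          # nothing beyond the spared header
    picks = int(body * percent / 100)
    for _ in range(picks):
        position = rng.randrange(skip, len(damaged))
        damaged[position] = rng.randrange(256)
    return bytes(damaged)


def main(in_path: str, out_path: str, percent: float) -> None:
    with open(in_path, "rb") as source:
        data = source.read()
    with open(out_path, "wb") as target:
        target.write(corrupt(data, percent, random.Random()))


# Runs only when called as: script.py INPUT OUTPUT PERCENT
if __name__ == "__main__" and len(sys.argv) == 4:
    main(sys.argv[1], sys.argv[2], float(sys.argv[3]))
```

To avoid the double-hit issue, one option would be to draw the positions with `rng.sample(range(skip, len(damaged)), picks)`, which never repeats a position.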
Thanks, I'll make good use of it. I gotta learn to write scripts like this.
I am not OP, but thanks a lot for a great educational post! Incredible how you can lose 95% of pixels from BMP and it still somewhat works.