this post was submitted on 26 Feb 2026
125 points (97.7% liked)

Selfhosted


I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I'm always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn't turn up anything.)

[–] Shadow@lemmy.ca 86 points 1 day ago (4 children)

I don't. Of my 120tb, I only care about the 4tb of personal data and I push that to a cloud backup. The rest can just be downloaded again.

[–] NekoKoneko@lemmy.world 13 points 1 day ago (7 children)

Do you have logs or software that keeps track of what you'd need to redownload? A big stress for me with that method is remembering or keeping track of what was lost when neither I nor any software can even see the filesystem anymore.

[–] Sibbo@sopuli.xyz 30 points 1 day ago (2 children)

If you can't remember what you lost, did you really need it to begin with?

Unless it's personal memories of course.

[–] Onomatopoeia@lemmy.cafe 14 points 1 day ago* (last edited 1 day ago) (5 children)

I can't remember the name of an Excel spreadsheet I created years ago, one that has continually matured through lots of changes. I often have to search for it among the many I have for different purposes.

Trusting your memory is a naive, amateur approach.

If the spreadsheet is important, it sounds like it would be part of the 4 TB that was backed up.

[–] ExcessShiv@lemmy.dbzer0.com 7 points 1 day ago

The key here being that you actually remember the file exists, because it's important. Some other random spreadsheet you don't even remember exists, because you haven't needed it since forever, is probably not all that important to back up.

If you lose something without ever realizing you lost it, it was not important, so there would be no reason to make a backup.

[–] three@lemmy.zip 2 points 20 hours ago

Psst, you missed the point and need to re-read the thread.

[–] cenzorrll@piefed.ca 2 points 1 day ago

You put that with everything else similar into a folder, which is backed up. Mine is called "Files". If there's something in there that I don't need backed up, it still gets backed up. If there's something very large in there that I don't need backed up, it gets removed in one of my "oh shit these backups are huge" purges.

[–] frongt@lemmy.zip 3 points 1 day ago

So you do remember that you have several frequently-used spreadsheets.

[–] NekoKoneko@lemmy.world 4 points 1 day ago

For me, I have a bad memory. I might remember a childhood movie (a nickname I give to special Linux ISOs) that I haven't thought of in 10 years and track down a copy, sometimes excavating obscure sources; that can be hours of one-off inspiration and work, repeated many times over. Having a complete list is a good helper, but a full backup is of course best.

[–] tal@lemmy.today 15 points 1 day ago* (last edited 1 day ago) (1 children)

I don't know of a pre-wrapped utility to do that, but assuming that this is a Linux system, here's a simple bash script that'd do it.

#!/bin/bash

# Set this.  Path to a directory that will retain a copy of a list of
# your files.  You probably don't actually want this in /tmp, or
# it'll be wiped on reboot.

file_list_location=/tmp/storage-history

# Set this.  Path to location with files that you want to monitor.

path_to_monitor=path-to-monitor

# If the file list location doesn't yet exist, create it and initialize
# a git repo with a predictable branch name.
if [[ ! -d "$file_list_location" ]]; then
    mkdir -p "$file_list_location"
    git -C "$file_list_location" init -b master
fi

# In case someone's checked out an older commit, return to the branch tip.
# (The redirect silences the harmless error on the very first run, before
# any commit exists yet.)
git -C "$file_list_location" checkout master 2>/dev/null

find "$path_to_monitor" | sort > "$file_list_location/files.txt"
git -C "$file_list_location" add files.txt
git -C "$file_list_location" commit -m "Updated file list for $(date)"

That'll drop a text file at /tmp/storage-history/files.txt with a list of the files at that location, and create a git repo at /tmp/storage-history that will contain a history of that file.

When your drive array kerplodes or something, your files.txt file will probably become empty if the mount goes away, but you'll have a git repository containing a full history of your list of files, so you can go back to a list of the files there as they existed at any historical date.

Run that script nightly out of your crontab or something ($ crontab -e to edit your crontab).

As the script says, you need to choose a file_list_location (not /tmp, since that'll be wiped on reboot), and set path_to_monitor to wherever the tree of files is that you want to keep track of (like, /mnt/file_array or whatever).
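For example, a crontab entry like the following would run it at 03:00 every night (the script path here is a hypothetical example; point it at wherever you saved the script):

```shell
# m h dom mon dow  command
0 3 * * * /usr/local/bin/update-file-list.sh
```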

You could save a bit of space by adding a line at the end to remove the current files.txt after generating the git commit, if you want. The next run will just regenerate files.txt anyway, and you can use git to regenerate a copy of the file for any historical day you want. If you're not familiar with git: $ git log to find the hashref for a given day, then $ git checkout <hashref> to move to where things were on that day.

EDIT: Moved the git checkout up.

[–] NekoKoneko@lemmy.world 3 points 1 day ago (1 children)

That's incredibly helpful and informative, a great read. Thanks so much!

[–] zorflieg@lemmy.world 1 points 22 hours ago

Abefinder/NeoFinder is great for cataloging, but it costs money. If you do a limited backup, it's good to know what you had. I use tape formatted as LTFS and catalog both the source and the finished tape with NeoFinder.

[–] kurotora@lemmy.world 17 points 1 day ago

In my case, for Linux ISOs, I only need to log in to my usual private trackers and re-download my leeched torrents. For more niche content, like old-school TV shows in my local language, I would rely on the community. For even more niche content, like tankoubons only available at the time on DD services, I have a specific job, but it also relies on the same backup provider that I'm using for personal data.

Also, an important reminder for everyone: you must encrypt your backup no matter where you store it.
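One minimal sketch of doing that with stock tools, assuming a passphrase kept in a root-only file (all paths here are hypothetical examples, not anyone's actual setup):

```shell
#!/bin/bash
# Tar a directory and symmetrically encrypt it with gpg before it
# leaves the machine. All paths are hypothetical examples.
encrypt_backup() {
    local src_dir=$1          # directory to back up, e.g. /mnt/user/personal
    local passphrase_file=$2  # root-only file holding the passphrase
    local out_file=$3         # encrypted archive to produce
    tar -czf - -C "$(dirname "$src_dir")" "$(basename "$src_dir")" \
        | gpg --batch --yes --symmetric --cipher-algo AES256 \
              --pinentry-mode loopback --passphrase-file "$passphrase_file" \
              -o "$out_file"
}

# Restore later with:
# gpg --batch --pinentry-mode loopback --passphrase-file <file> -d <archive> | tar -xzf -
```

Dedicated backup tools like restic or borg do the encryption (plus deduplication) for you, but the point stands either way: nothing unencrypted should leave the box.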

[–] BakedCatboy@lemmy.ml 11 points 1 day ago (1 children)

My *arrstack DBs are part of my backed up portion, so they'll remember what I have downloaded in my non-backed up portion.

[–] NekoKoneko@lemmy.world 1 points 1 day ago

That's a great point.

[–] i_stole_ur_taco@lemmy.ca 3 points 1 day ago

Set up a job to write the file names of everything in your file system to a text file and make sure that text file gets backed up. I did that on my Unraid server for years in lieu of fully backing up the whole array.

Servarr* and Jellyfin are managing my movies and TV shows.
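A minimal sketch of such a job, with hypothetical paths (on Unraid, user shares typically live under /mnt/user):

```shell
#!/bin/bash
# Write a dated list of every file under the array into a directory
# that is itself part of the backup set. Both paths passed in are
# hypothetical examples.
snapshot_file_list() {
    local array_root=$1   # e.g. /mnt/user
    local list_dir=$2     # e.g. /mnt/user/backups/file-lists
    mkdir -p "$list_dir"
    find "$array_root" -type f | sort > "$list_dir/files-$(date +%F).txt"
}

# Example invocation:
# snapshot_file_list /mnt/user /mnt/user/backups/file-lists
```

Because the list lands inside the backed-up share, every cloud sync carries the latest inventory of the non-backed-up bulk along with it.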

[–] ShortN0te@lemmy.ml 1 points 1 day ago

That should be part of the backup configuration. You select in your backup tool of choice what you back up. When you lose your array, you then download that stuff again?

[–] givesomefucks@lemmy.world 3 points 1 day ago (1 children)

I only care about the 4tb of personal data and I push that to a cloud backup

I have doubles of the data. Some of 'em. That way I know I have a pristine one in backup. Then I can use it, it gets corrupted, I don't care.

Actually, I have triples of the W2s. I have triples, right? If I don't, the other stuff's not true.

See, the W2s the one I have triples of. Oh, no, actually, I also have triples of the kids photos, too. But just those two. And your dad and I are the same age, and I'm rich and I have triples of the W2s and the kids photos.

Triples makes it safe.

Triples is best.

https://www.youtube.com/watch?v=8Inf1Yz_fgk

[–] NekoKoneko@lemmy.world 2 points 1 day ago

Bob Odenkirk has never steered us wrong, thanks. I downloaded three copies of this from YouTube in case I forget.

[–] hendrik@palaver.p3x.de 3 points 1 day ago* (last edited 1 day ago)

I follow a similar strategy. I back up my important stuff. And I'm gonna have to re-rip my DVD collection and redownload the Linux ISOs in the unlikely case the RAID falls apart. That massively cuts down on the amount of storage needed.

[–] BakedCatboy@lemmy.ml 2 points 1 day ago

Same here, ~30TB currently but my personal artifacts portion is only like 2TB, which is very affordable with rsync.net, which conveniently has an alerts setting if more than X kb hasn't changed in Y days. (I have my Synology set up to spit out daily security reports to meet that amount, so even if I don't change anything myself I won't get bugged)