this post was submitted on 27 Nov 2023

Data Hoarder


Hello everyone,

I would love to hear how you deal with the organisation of your files as well as the backup.

Do you use software to organise files and to remove duplicates?

How do you store and backup your data? Do you use the 3-2-1 rule?

What is your process, software and hardware?

I am still struggling to find a secure and effective way to do this, so I would love to hear your experience and advice!

top 2 comments
[–] Far_Marsupial6303@alien.top 1 points 11 months ago

I'm on Windows, so I use Everything for search and VVV (Virtual Volumes View) as an offline searchable database. I also keep copies of TV show episode and appearance lists from Wikipedia.

With my drive organization, which is a variation on animal, vegetable, mineral*, I can find anything within seconds as long as I know something about it. More below.

*Animal, vegetable, mineral is based on the idea that everything can be primarily categorized into one of those three categories. So in my case, if I know a movie is by a certain director, stars a certain actress or was a certain type of show, I can go directly to that drive or folder.

I don't use RAID because I like to keep my drives separate and just do a 1:1 swapout when one fails without any rebuild time. This does leave a lot of unused slack space and I do have to upgrade my drive size every so often, but by that time I'm ready to retire my active drives to backup anyway.

I have two backups: an exact set of mirror drives, and a second backup spread over 3 and 4TB drives. Unfortunately I don't have anywhere to store them physically offsite, and cloud is too expensive for my 200TB raw hoard. Everything is verified after copying, and every few years I re-verify the integrity of the files with ViceVersa to ensure they're bit-for-bit accurate. Unfortunately, I didn't save the hashes the first time around, but I am now saving them during each re-verify and initial copy.
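
For anyone who wants to save hashes without a dedicated tool, here is a minimal sketch in Python. It is not how ViceVersa works internally; the function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large videos never need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose hash changed."""
    problems = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems
```

The idea would be to call `build_manifest` once on the source drive, store the result somewhere safe (for example as JSON), and run `verify` against each backup drive during a re-verify pass.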

Note: Always copy, never move your files, and always verify! Odd things can happen if you move! There are those who say, correctly, that a move on the same drive just rewrites the location in the File Allocation Table, but I still never move unless the file is completely unimportant.
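
A rough Python sketch of the copy-then-verify habit, for illustration only. `copy_and_verify` and `file_digest` are hypothetical helper names, and SHA-256 is an assumed choice of hash. The source file is never deleted.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination: Path) -> str:
    """Copy source to destination, then re-read both and compare hashes.

    Returns the digest on success and raises OSError on a mismatch.
    The source is left in place either way.
    """
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    source_hash = file_digest(source)
    destination_hash = file_digest(destination)
    if source_hash != destination_hash:
        raise OSError(f"verification failed for {destination}")
    return destination_hash
```

One caveat: a read-back immediately after the copy may be served from the operating system's cache rather than from the disk itself, so a later re-verify pass is still worthwhile.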

For finding duplicates, Czkawka is highly recommended here for all file types. I've been using Video Duplicator for years and will continue to use it since I have the Pro version.
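
To show the basic idea behind exact-duplicate detection, here is a small Python sketch. It is not how Czkawka or Video Duplicator work, and it only finds byte-identical files, not similar videos or images. The function names are made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def _digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group byte-identical files under root.

    Files are bucketed by size first, so only files that share a size
    with at least one other file are ever hashed.
    """
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    by_hash: dict[tuple[int, str], list[Path]] = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have an exact duplicate
        for path in paths:
            by_hash[(size, _digest(path))].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The sketch only reports groups. Deciding which copy to keep is left to the person running it.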

My drive organization is 20 dedicated drives ranging from 8 to 14TB. Each drive or set of drives is for:

Directors - Alpha by name
Actresses - Alpha by name
Music - Sub categorized into groups/soloists and type of show (Reality, Variety, Special)
Variety Shows
Reality Shows
Specials/Documentaries/Shorts/Collections
Movies with sequels - Regardless of director

[–] Is-Not-El@alien.top 1 points 11 months ago

Do you use software to organise files and remove duplicates?

Yes and no. I use my two hands + one eye (the other doesn’t work) + find, du, rsync (in comparison mode) and diff. Everything I have is on a FreeBSD VM (docs, pictures) + Ubuntu VM (movies) so native tools work great for me.

How do you store and backup your data?

I have multiple Proxmox hosts that are not in a cluster. All of them and their VMs are backed up every day to a physically separate local Proxmox Backup Server (PBS). In addition, that system runs rsnapshot backups of all the VMs and hosts I care about. So it has 2 different storage backends (disk pools): 1 for rsnapshot and 1 for Proxmox Backup Server. Both are 20TB in size; one uses XFS and the other uses ext4. Every day the PBS backups are synced to a remote PBS instance at Hetzner in Germany (my systems are in Bulgaria). The Hetzner system has 40TB in raidz2 (ZFS) and keeps all of the backups for 2 years. My storage VMs are also configured to take snapshots every 15 minutes and keep them for 24h. Yearly I also download my most important data, write it to BD discs, and store those in my summer house.

So in short I have the following: Local snapshots every 15 minutes, 2 daily backups on an independent system using different software and storage backends, 1 daily remote backup on a system in another country, 1 yearly offline backup. It’s not perfect but it’s affordable and somewhat secure.
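
Since the original question mentions the 3-2-1 rule, here is a small Python sketch that tallies the setup above against it. The `Copy` class and the medium labels are assumptions made for this example. The 15-minute snapshots are left out of the count on the assumption that they live on the same storage as the live data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # label chosen for this sketch, e.g. "disk" or "optical"
    offsite: bool


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """3 copies in total, on at least 2 kinds of media, at least 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


copies = [
    Copy("live data on storage VMs", "disk", offsite=False),
    Copy("local PBS pool", "disk", offsite=False),
    Copy("local rsnapshot pool", "disk", offsite=False),
    Copy("remote PBS at Hetzner", "disk", offsite=True),
    Copy("yearly BD discs at summer house", "optical", offsite=True),
]

# Snapshots every 15 minutes, kept for 24 hours
snapshots_retained = (24 * 60) // 15

print(satisfies_3_2_1(copies))   # True
print(snapshots_retained)        # 96
```

By this tally the setup passes, though the optical copy only covers the most important data and is refreshed once a year.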