this post was submitted on 21 Dec 2023
35 points (94.9% liked)

Linux


So I have a nearly full 4 TB hard drive in my server that I want to make an offline backup of. However, the only spare hard drives I have are a few 500 GB and 1 TB ones, so the entire contents will not fit all at once, but I do have enough total space for it. I also only have one USB hard drive dock so I can only plug in one hard drive at a time, and in any case I don't want to do any sort of RAID 0 or striping because the hard drives are old and I don't want a single one of them failing to make the entire backup unrecoverable.

I could just play digital Tetris and just manually copy over individual directories to each smaller drive until they fill up while mentally keeping track of which directories still need to be copied when I change drives, but I'm hoping for a more automatic and less error prone way. Ideally, I'd want something that can automatically begin copying the entire contents of a given drive or directory to a drive that isn't big enough to fit everything, automatically round down to the last file that will fit in its entirety (I don't want to split files between drives), and then wait for me to unplug the first drive and plug in another drive and specify a new mount point before continuing to copy the remaining files, using as many drives as necessary to copy everything.

Does anyone know of something that can accomplish all of this on a Linux system?
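In other words, I want something roughly like this sketch (mount points are placeholders, and a real tool would need far more bookkeeping and safety checks than this):

```shell
# copy_until_full SRC DST -- copy files from SRC into whatever drive is
# mounted at DST, pausing for a drive swap whenever DST runs out of room.
copy_until_full() {
    src=$1 dst=$2
    find "$src" -type f | sort | while IFS= read -r f; do
        need=$(stat -c %s "$f")
        # wait (and re-check) until the mounted drive has room for this file
        while [ "$need" -gt "$(df -B1 --output=avail "$dst" | tail -n1 | tr -d ' ')" ]; do
            echo "No room for $f -- swap drives, remount at $dst, press Enter" >&2
            read -r _ </dev/tty
        done
        # recreate the relative path on the destination, then copy
        rel=${f#"$src"/}
        mkdir -p "$dst/$(dirname "$rel")"
        cp -p "$f" "$dst/$rel"
    done
}
```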

all 15 comments
[–] iwasgodonce@lemmy.world 12 points 9 months ago
[–] Molecule5076@lemmy.world 10 points 9 months ago (2 children)

Something like mergerfs? I think this is what Unraid uses if I remember right.

https://github.com/trapexit/mergerfs

[–] rambos@lemmy.world 7 points 9 months ago* (last edited 9 months ago) (1 children)

If OP can't use more than one disk at once, how can they benefit from mergerfs?

[–] Molecule5076@lemmy.world 4 points 9 months ago

Yeah, you're right. Scratch that, then.

[–] HiddenLayer5@lemmy.ml 1 points 9 months ago
[–] AbidanYre@lemmy.world 6 points 9 months ago* (last edited 9 months ago)

Git annex can do that and keep track of which drive the files are on.

https://git-annex.branchable.com/
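For anyone curious, the flow would look roughly like this (paths and repo descriptions are made up; the walkthrough in the docs covers the details):

```shell
# On the server: turn the data directory into a git-annex repo
cd /data && git init && git annex init "server"
git annex add .        # checksums files and tracks their content
git commit -m "add files"

# On each backup drive: clone the repo and pull content until the drive fills
git clone /data /mnt/drive1/data
cd /mnt/drive1/data && git annex init "drive1"
git annex get .        # fetches file content from the server repo

# Back on the server, git-annex remembers which drive holds each file
git annex whereis somefile
```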

[–] FigMcLargeHuge@sh.itjust.works 4 points 9 months ago

It's going to take a little work here, but I have a large drive on my Plex server and a couple of smaller drives that I back everything up to. On the large drive, get a list of the main folders. You can run "du -h --max-depth=1 | sort -hk1" on the root folder to get an idea of how you should split them up. Once you have an idea, make two files, each with their own list of folders (e.g. folders1.out and folders2.out) that you want to go to each separate drive.

If you have both of the smaller drives mounted, just execute the rsync commands; otherwise, run each rsync command with the corresponding drive mounted. Here's an example of my rsync commands. Keep in mind I am going from an ext4 filesystem to a couple of NTFS drives, which is why I use --size-only. Make sure to do a dry run or two, and you may or may not want the '--delete' option in there. Since I don't want to keep files I have deleted from my Plex library, I have rsync delete them on the target drive too.

sudo rsync -rhi --delete --size-only --progress --stats --files-from=/home/plex/src/folders1.out /media/plex/maindrive /media/plex/4tbbackup

sudo rsync -rhi --delete --size-only --progress --stats --files-from=/home/plex/src/folders2.out /media/plex/maindrive /media/plex/other4tbdrive

[–] restlessyet@discuss.tchncs.de 2 points 9 months ago

I ran into the same problem some months ago when my cloud backups stopped being financially viable and I decided to recycle my old drives. For offline backups mergerfs will not work, as far as I understand. Creating tar archives of 130 TB+ also doesn't sound like a good option. Some of the tape backup solutions looked to be possible options, but they are often complex and use special archive formats...

I ended up writing my own solution in Python using JSON state files. It's complete enough to run the backup, but otherwise very work-in-progress with no restore at all, so I don't want to publish it.

If you find a suitable solution I am also very interested 😅

[–] captcha@hexbear.net 2 points 9 months ago (1 children)

I'm going to say that doesn't exist, and restoring from it would be a nightmare. You could cobble together a shell or Python script that does that, though.

You're better off just getting a drive bay, plugging all the drives in at once, and combining them with LVM.

You could also do the opposite: split the 4 TB drive into logical volumes, each the same size as one of the backup drives.
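That split might look something like this (device and volume names are invented, and pvcreate wipes whatever is on the device):

```shell
# Turn the 4 TB drive into a physical volume and carve it into
# per-backup-drive-sized logical volumes
sudo pvcreate /dev/sdb
sudo vgcreate backupvg /dev/sdb
sudo lvcreate -L 465G -n part-a backupvg   # matches a 500 GB drive (~465 GiB)
sudo lvcreate -L 931G -n part-b backupvg   # matches a 1 TB drive (~931 GiB)
sudo mkfs.ext4 /dev/backupvg/part-a
```

Each logical volume can then be synced to its matching drive one at a time.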

[–] lemmyvore@feddit.nl 1 points 9 months ago

It wouldn't be so complicated to restore as long as they keep full paths and don't split up subdirectories. But yeah, sounds like they'd need a custom tool to examine their dirs and solve a series of knapsack problems.
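A greedy first-fit pass over that packing step could be as simple as this (takes "size path" lines on stdin and drive capacities in bytes as arguments; purely illustrative, and it breaks on paths with spaces):

```shell
# plan_bins CAP1 CAP2 ... -- assign each "size path" line on stdin to the
# first drive with room, largest files first; prints "drive# path" per file,
# with drive 0 meaning the file fits on no drive.
plan_bins() {
    sort -rn | awk -v caps="$*" '
    BEGIN { n = split(caps, free, " ") }
    {
        for (i = 1; i <= n; i++)
            if ($1 <= free[i]) { free[i] -= $1; print i, $2; next }
        print 0, $2
    }'
}
```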

[–] Deckweiss@lemmy.world 2 points 9 months ago* (last edited 9 months ago)

If you are lucky enough, borgbackup could deduplicate and compress the data enough to fit on a 1 TB drive. It depends on the content of course, but its deduplication & compression are insanely efficient for certain cases. (I have 3 devices with ~900 GB each, so just shy of 3 TB in total, which all gets stored in a ~400 GB borg repository.)
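For reference, a minimal borg run looks roughly like this (the paths, archive name, and compression choice are just examples):

```shell
# Initialize an encrypted repository on the mounted backup drive
borg init --encryption=repokey /mnt/backup/borgrepo

# Create a deduplicated, compressed archive of the data
borg create --stats --compression zstd \
    /mnt/backup/borgrepo::server-2023-12-21 /data

# Check how much space dedup + compression actually saved
borg info /mnt/backup/borgrepo
```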

[–] Squid@leminal.space 2 points 9 months ago* (last edited 9 months ago)

You'll have to ask yourself how important this data is. Before you start, run a drive diagnostic tool to see if all the drives are functioning as expected. I'd suggest moving whole directories as opposed to chopping anything up, to maintain some form of redundancy if a drive were to fail. It'll be a long process. Hope it goes well.

Rsync is a handy tool.

[–] Sina@beehaw.org 1 points 9 months ago

This really is not a good idea for a backup.

[–] retrieval4558@mander.xyz 1 points 9 months ago

Probably not the answer you're looking for, but I'd probably build a dedicated NAS.