NetBSD-Users archive


Re: LTO support



Pouya Tafti <pouya+lists.netbsd%nohup.io@localhost> writes:

> I'm looking for a low cost offsite backup solution for my teeny local
> NAS (couple of TiB of redundant ZFS RAIDZ2 on /amd64 9.2_STABLE) for
> disaster recovery.  Seemingly affordable LTO-5 drives (~EUR 250; sans
> libraries) pop up on eBay from time to time, and I thought I might
> start mailing tape backups to friends and family.  Being rather
> clueless about tape, I was wondering:

I used to be a tape fan, because in large volumes it was cheaper, and
because people who said disk was better than tape (20 years ago) did
not have good answers for "do you send a copy of your bits offsite
every other week at least?", and some did not have good answers for
"do you have copies of your bits that are locally offline?".

All that said, I don't think tape is a bad idea even now.  But I think
it's the path less traveled.

<from memory, take with grain of salt>

I used to run amanda, which runs various backup programs, typically
dump on BSD and tar on Linux (where the culture is that dump doesn't
work, and that might be correct).  amanda then splits the dumps into
chunks of a configured size, typically 1 GB back then, and puts all
the chunks onto tapes, perhaps spanning multiple tapes.

In amanda culture, one basically does "dump | gzip" on the host being
backed up, and then that stream gets an amanda header and is split.
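
The shape of that, roughly (filesystem, flags, and paths here are
illustrative, not amanda's exact invocation):

  # level-0 dump of one filesystem, compressed, written to a holding
  # area for later header-wrapping and splitting
  dump -0 -u -f - /usr | gzip -c > /amanda/holding/usr.0.gz

Restores go the other way: reassemble the chunks, decompress, and
feed restore(8).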

I started out with DDS2, then DDS3, then LTO-1 and I think LTO-2
(200 GB native, without compression?).  This was for backing up maybe
20 machines.  There were two sets of tapes: a daily set that got
fulls and incrementals as amanda chose (to fit in a tape and meet
frequency goals for fulls while getting at least an incremental every
run), and an archive set that had runs once a week, fulls only.  For
daily, tapes were in the drive, and for archive, dumps were manually
flushed (insert, run amflush, not that manual) on Mondays after
weekend dumps.  Archive tapes were sent offsite, and the next one
that would need to be written was retrieved.

This setup gave us the ability to immediately restore recent data
from onsite tapes in the event of an rm screwup or a disk failure.
And it gave us the ability to not lose the bits in the event of a
total building loss.

Probably amanda can use "zfs send" as a dump program, or could be
taught to do so pretty easily.
</>
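
If you wanted to experiment with that, the "dump program" could be
about this simple (dataset and snapshot names are invented):

  #!/bin/sh
  # hypothetical dump-program for a chunking backup system: snapshot
  # a dataset and emit the replication stream on stdout
  snap="tank/data@backup-$(date +%Y%m%d)"
  zfs snapshot "$snap"
  exec zfs send "$snap"

Incrementals would be "zfs send -i previous current" against the
last snapshot that made it to tape.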

These days, disks are pretty cheap, but they probably cost more to mail,
and mailing them often isn't really good for them.  Think about the cost
of a 4T external USB drive vs tapes that hold 4T.

Also consider why you are doing backups.  Most people should be doing
backups for multiple reasons:

  - accidental rm
  - scrambled fs due to OS bug, perhaps provoked by power failure
  - scrambled fs due to hardware issues (I've seen bad RAM lead to
    corrupted files)
  - disk failure
  - whole computer failure due to lightning, power surge, etc.

  - cyber attack (ransomware or just generic compromise)

  - loss of entire building, or more precisely loss of primary disk and
    local backups at the same time


Generally backups should be offline, in that once made they are powered
off and disconnected.  This helps with the first and second groups.  And
some should be someplace else, to address the third.

Another strategy is:

  use disks (multiple) and write to them periodically.  store locally,
  unplugged.

  use another set of disks and write to them periodically but take them
  offsite (and hence unplugged)

  have disks at remote sites, e.g. plugged into an RPI or apu2, and
  additionally do backups to those disks over the net (sketched
  below)

This gives various protections at varying time scales; it's entirely
feasible to push backups over the net once a week and take physical
disks offsite somewhat less often.
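
For the over-the-net leg, plain rsync over ssh is enough to start
(host and paths are invented):

  # mirror the local backup tree onto a disk hanging off a remote RPI
  rsync -a --delete /backup/ backup@rpi.example.net:/mnt/usbdisk/backup/

Note that a mirror like this protects against losing the primary
copy, not against deleting the wrong file; keeping history is what
the deduplicating tools below are for.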

Once you start thinking like this, you probably want to look at backup
systems that are able to do deduplication so that you can do backups in
a way where

  - every backup is logically a full backup
  - every backup only writes data that is not already in the archive

This helps with disk space, with being able to get state from long ago
(when you realize something went wrong long after the fault), and also
greatly reduces the data transfer requirements (once you have done the
first backup).
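
As a toy illustration of the content-addressed idea (real tools use
content-defined chunking rather than fixed-size blocks; all names
here are made up):

  # store each 1 MiB chunk of an image under its hash; a chunk that
  # is already in the store is not written a second time
  mkdir -p store
  split -b 1048576 backup.img chunk.
  for c in chunk.*; do
    h=$(openssl sha256 "$c" | awk '{print $NF}')
    [ -e "store/$h" ] || mv "$c" "store/$h"
    printf '%s\n' "$h" >> manifest
    rm -f "$c"
  done

The manifest is then the "full backup": just the list of hashes
needed to reassemble the image.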

I am using bup for this, and others use borgbackup.  Surely there are
others.
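
With bup, the day-to-day shape is roughly this (repository location
and paths are examples; the repository defaults to ~/.bup):

  bup init                # create the repository
  bup index /home         # scan for new and changed files
  bup save -n home /home  # write only chunks not already present

Each save then reads back as a complete snapshot, even though only
the new data was written.  bup can also talk to a repository on a
remote machine over ssh (the -r option, if memory serves).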

Greg
