Incremental backups with Btrfs snapshots

Posted by Alessio Ciregia on September 14, 2020

Snapshots are an interesting feature of Btrfs. A snapshot is a copy of a subvolume, and taking one is immediate. Unlike running rsync or cp, however, taking a snapshot copies no data, so a new snapshot occupies almost no space when it is created.

Editor’s note: From the BTRFS Wiki – A snapshot is simply a subvolume that shares its data (and metadata) with some other subvolume, using Btrfs’s COW capabilities.

Occupied space grows as data changes in the original subvolume or, if the snapshot is writable, in the snapshot itself. Files that are later modified or deleted in the subvolume still exist in the snapshot as they were when it was taken. This makes snapshots a convenient way to perform backups.

Using snapshots for backups

A snapshot resides on the same disk as the subvolume it was taken from. You can browse it like a regular directory and recover a copy of a file as it was when the snapshot was taken. However, keeping snapshots only on the same disk as the snapshotted subvolume is not an ideal backup strategy: if the hard disk breaks, the snapshots are lost as well. An interesting feature of snapshots is the ability to send them to another location. A snapshot can be sent to an external hard drive or to a remote system via SSH (the destination filesystem needs to be formatted as Btrfs as well). To do this, use the commands btrfs send and btrfs receive.
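Since a snapshot can be browsed like a regular directory, recovering a single file is an ordinary copy. The following is only a minimal sketch: restore_file is a hypothetical helper (not a btrfs command), and it assumes the file's parent directory still exists in the live subvolume.

```shell
#!/usr/bin/env bash
# restore_file SNAPSHOT RELATIVE_PATH LIVE_ROOT
# Copy one file out of a snapshot next to its current version.
# Hypothetical helper for illustration; not part of btrfs-progs.
restore_file() {
    local snapshot="$1" relpath="$2" live_root="$3"
    local src="$snapshot/$relpath"
    if [ ! -e "$src" ]; then
        echo "not found in snapshot: $src" >&2
        return 1
    fi
    # -a preserves ownership, permissions and timestamps. The copy gets a
    # .restored suffix so the current file is not overwritten.
    cp -a -- "$src" "$live_root/$relpath.restored"
}
```

For example, with a snapshot named /.snapshots/home-day1 (the name used later in this article), restore_file /.snapshots/home-day1 user/notes.txt /home would create /home/user/notes.txt.restored. The user/notes.txt path is made up for the example.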

Taking a snapshot

To use the send and receive commands, the snapshot must be created read-only, because snapshots are writable by default.

The following command takes a snapshot of the /home subvolume. Note the -r flag for read-only.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day1

Instead of day1, the snapshot name can be the current date, like home-$(date +%Y%m%d). Snapshots look like regular subdirectories. You can place them wherever you like. The directory /.snapshots could be a good choice to keep them neat and to avoid confusion.
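The date-based naming can be wrapped in a small function. This is a sketch under assumptions: take_snapshot and run are hypothetical helpers, and DRY_RUN is an illustrative variable that makes the script print the command instead of executing it, so it can be reviewed first.

```shell
#!/usr/bin/env bash
# run CMD...: execute the command, or only print it when DRY_RUN=1.
# (Hypothetical helper for illustration.)
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# take_snapshot SUBVOLUME SNAPSHOT_DIR
# Create a read-only snapshot named <subvolume>-<YYYYMMDD>.
take_snapshot() {
    local src="$1" dest_dir="$2"
    local name
    name="$(basename "$src")-$(date +%Y%m%d)"
    # -r makes the snapshot read-only, which btrfs send requires.
    run sudo btrfs subvolume snapshot -r "$src" "$dest_dir/$name"
}
```

Calling DRY_RUN=1 take_snapshot /home /.snapshots prints the snapshot command with today's date in the name. Without DRY_RUN it runs the command.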

Editor’s note: Snapshots are not recursive. If you create a snapshot of a subvolume, every subvolume or snapshot that the subvolume contains is mapped to an empty directory of the same name inside the snapshot.

Backup using btrfs send

In this example the destination Btrfs volume on the USB drive is mounted at /run/media/user/mydisk/bk. The command to send the snapshot to the destination is:

sudo btrfs send /.snapshots/home-day1 | sudo btrfs receive /run/media/user/mydisk/bk

This is called initial bootstrapping, and it corresponds to a full backup. This task will take some time, depending on the size of the /home directory. Subsequent incremental sends take much less time.
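The same full send can go to a remote Btrfs filesystem over SSH by piping the stream into ssh. The sketch below makes several assumptions: send_full_ssh is a hypothetical helper, user@backup.example.com and /mnt/backup are placeholder values, the remote user may run btrfs through sudo without a password prompt, and DRY_RUN=1 only prints the pipeline.

```shell
#!/usr/bin/env bash
# send_full_ssh SNAPSHOT REMOTE REMOTE_DIR
# Send a read-only snapshot to a Btrfs filesystem on another machine.
# Hypothetical helper; REMOTE_DIR must not contain single quotes.
send_full_ssh() {
    local snapshot="$1" remote="$2" remote_dir="$3"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'sudo btrfs send %s | ssh %s "sudo btrfs receive %s"\n' \
            "$snapshot" "$remote" "$remote_dir"
        return 0
    fi
    # btrfs send writes the stream to stdout; ssh forwards it to
    # btrfs receive running on the remote side.
    sudo btrfs send "$snapshot" | ssh "$remote" "sudo btrfs receive '$remote_dir'"
}
```

For example: DRY_RUN=1 send_full_ssh /.snapshots/home-day1 user@backup.example.com /mnt/backup.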

Incremental backup

Another useful feature of snapshots is the ability to perform the send task in an incremental way. Let’s take another snapshot.

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day2

To perform the send task incrementally, you need to specify the previous snapshot as a base, and this snapshot has to exist in both the source and the destination. Note the -p option.

sudo btrfs send -p /.snapshots/home-day1 /.snapshots/home-day2 | sudo btrfs receive /run/media/user/mydisk/bk

And again (the day after):

sudo btrfs subvolume snapshot -r /home /.snapshots/home-day3

sudo btrfs send -p /.snapshots/home-day2 /.snapshots/home-day3 | sudo btrfs receive /run/media/user/mydisk/bk
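The daily routine can be scripted. The following is a sketch, not a finished backup tool: latest_snapshot and send_incremental are hypothetical helpers, snapshot names are assumed to sort chronologically (for example home-20200914), and DRY_RUN=1 only prints the pipeline. Call latest_snapshot before taking the new snapshot to find the parent.

```shell
#!/usr/bin/env bash
# latest_snapshot DIR PREFIX
# Print the newest snapshot named PREFIX-* in DIR. Globs expand in sorted
# order, so the last match is the newest when names end in YYYYMMDD.
latest_snapshot() {
    local dir="$1" prefix="$2" path latest=""
    for path in "$dir/$prefix"-*; do
        if [ -d "$path" ]; then
            latest="$path"
        fi
    done
    if [ -z "$latest" ]; then
        return 1
    fi
    printf '%s\n' "$latest"
}

# send_incremental PARENT CURRENT DEST
# Send only the difference between PARENT and CURRENT to DEST.
send_incremental() {
    local parent="$1" current="$2" dest="$3"
    # The parent must already exist at the destination.
    if [ ! -d "$dest/$(basename "$parent")" ]; then
        echo "parent snapshot missing in $dest" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'sudo btrfs send -p %s %s | sudo btrfs receive %s\n' \
            "$parent" "$current" "$dest"
        return 0
    fi
    sudo btrfs send -p "$parent" "$current" | sudo btrfs receive "$dest"
}
```

The existence check reflects the requirement that the base snapshot be present on both sides. It only tests for a directory of the same name and does not verify that the two snapshots actually match.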

Cleanup

Once the operation is complete, you can keep the snapshot. But if you perform these operations daily, you could end up with a lot of them. This could lead to confusion and potentially a lot of used space on your disks. So it is good advice to delete snapshots you no longer need.

Keep in mind that in order to perform an incremental send you need at least the last snapshot. This snapshot must be present in the source and in the destination.

sudo btrfs subvolume delete /.snapshots/home-day1

sudo btrfs subvolume delete /.snapshots/home-day2

sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day1

sudo btrfs subvolume delete /run/media/user/mydisk/bk/home-day2

Note: the day 3 snapshot was preserved in the source and in the destination. In this way, tomorrow (day 4), you can perform a new incremental btrfs send.

As some final advice, if the USB drive has plenty of space, you could consider keeping multiple snapshots in the destination, while keeping only the last one on the source disk.
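Cleanup can be scripted in the same spirit. This is a minimal sketch: prune_snapshots and run are hypothetical helpers, names are assumed to sort chronologically (for example home-20200914), and DRY_RUN=1 prints the delete commands instead of running them. Keep at least the most recent snapshot that exists in both source and destination, otherwise the next incremental send has no base.

```shell
#!/usr/bin/env bash
# run CMD...: execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# prune_snapshots DIR PREFIX KEEP
# Delete the oldest PREFIX-* snapshots in DIR, keeping the KEEP newest.
prune_snapshots() {
    local dir="$1" prefix="$2" keep="$3"
    local path total=0 to_delete
    # Count the existing snapshots.
    for path in "$dir/$prefix"-*; do
        if [ -d "$path" ]; then
            total=$((total + 1))
        fi
    done
    to_delete=$((total - keep))
    # Globs expand in sorted order, so the oldest names come first.
    for path in "$dir/$prefix"-*; do
        if [ "$to_delete" -le 0 ]; then
            break
        fi
        if [ -d "$path" ]; then
            run sudo btrfs subvolume delete "$path"
            to_delete=$((to_delete - 1))
        fi
    done
}
```

For example, DRY_RUN=1 prune_snapshots /.snapshots home 1 lists what would be deleted on the source, and a larger KEEP value can be used for the destination directory.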

Alessio Ciregia

Alessio is an unpretentious sysadmin. Linux and FOSS are not his hobby... they are a job.

37 Comments

Brad Smith

Thanks for the information. Very useful and much appreciated. I am a long time Fedora user but not really a Linux sysadmin. I have been using ext3 and ext4 along with LVM for quite a while. It would be nice if there were an article with links to information on safely migrating from ext4/LVM (and an LVM RAID for /home) to btrfs for Fedora users.

thanks!

September 14, 2020
- Sebastiaan Franken
  
  As far as I know the only “safe” way of migrating from ext4 + LVM to BTRFS is to reinstall your OS, since changing your underlying filesystem (and its layout) is quite a big deal, plumbing wise. There is no safe way of doing that with a running system, as far as I know.
  
  September 14, 2020
  - Horniger Glücksbärchi
    
    Actually you would be surprised.
    It doesn’t really matter which filesystem the files are on as long as you don’t change the mount tree.
    
    Exceptions:
    If you forget to copy extended attributes (SELinux labels) then you need to relabel.
    If you forget to copy with preserve then you lose file permissions and dates…
    But this, while it uses filesystem features, doesn’t depend on a particular filesystem.
    So any fs that has the feature can contain your system files + metadata.
    
    Notes:
    Bring-up of logical devices is managed by the kernel cmdline. To port plain to LVM you need to bring up LVM there if it doesn’t autodetect your VGs.
    Porting to LUKS encryption, same deal.
    All that stuff is autogenerated by a bootloader rebuild if you have fstab and crypttab populated.
    
    Btrfs snapshot trees aren’t directly portable to other schemes (nothing is integrated into any preinstalled copying solution).
    So copy a btrfs tree to a plain ext4 volume -> you get the same layout as in btrfs, without subvolumes.
    Conversion of a btrfs tree to linked LVM volumes is possible, but I don’t know a tool that does it.
    
    September 27, 2020
- pctux982
  
  A full backup of the entire system is possible using rsync, taking care to preserve extended attributes.
  Once you have performed a backup, create a btrfs partition and optionally subvolumes within it, mount it, and rsync the backup onto the new system root.
  Once that has finished, chroot into the new root, regenerate the initramfs, then reconfigure GRUB, and you’re done.
  Check the Arch wiki for information about full system backup.
  
  September 15, 2020
- Andrew
  
  I’ve used these steps to migrate systems:
  1: Research what changes you’ll need to make to the fstab file, kernel command line, boot loader, and/or initramfs.
  2: Install any tools needed for the new file system. Btrfs tools in this case.
  3: Boot from an SD card.
  4: Dd the whole hard drive to a file on an external drive.
  5: If the drive is encrypted, decrypt it and dd the decrypted hard drive to another file on the external drive.
  6: Reformat the drive.
  7: Mount your decrypted dd image via a loopback device.
  8: Rsync the dd filesystem to the drive.
  9: Apply the changes you identified in step 1.
  
  If you get stuck, you can dd the image from step 4 back to the hard drive.
  
  September 16, 2020
Nick Avem

Will the new F33 LVM layout be (preferably out of the box) compatible with snapper?

September 14, 2020
- Kyle
  
  Second snapper.io to help manage the snapshots. Very handy.
  
  September 15, 2020
Vernon Van Steenkist

rdiff-backup has had this capability for over 20 years and works great on many different file systems and even remote machines. What is the advantage of using btrfs?

September 14, 2020
- Chris Murphy
  
  Perhaps the most significant difference is how difference is computed. Whether cp, rsync, or rdiff-backup, both the source and the destination need to be scanned and compared to know how they differ and what to copy.
  
  With Btrfs snapshots, the difference is a function of copy-on-write, and what changes have happened between two “generations”. Each snapshot has a unique generation. Deep traversal isn’t required. For example, I have a 1TB subvolume, and 1M of difference between two of its subvolume snapshots. The incremental send takes only a few seconds. Also, the difference is a function of changed blocks, and btrfs send will only send changed blocks (depending on how the owning application updates files).
  
  Send+receive is well suited for backup, but it’s primarily a replication scheme. So you don’t see options for filtering (excludes or includes), or whether to preserve dates, times, permissions, or labels. These things are always preserved. A related feature is the possibility of creating a file out of the send stream, whether it’s a full or incremental send. This can come in handy for replicating containers or even a full file system root to many machines, physical or virtual.
  
  Like anything, there are tradeoffs. But no matter what, backups are better than no backups!
  
  September 15, 2020
  - Vernon Van Steenkist
    
    “Whether cp, rsync, or rdiff-backup, both the source and the destination need to be scanned and compared to know how they differ and what to copy.”
    
    No. rdiff-backup is not like cp or rsync. Unlike cp and rsync, rdiff-backup has its own meta-data directory where it stores sha1 checksums on every file. rdiff-backup only needs to compare these checksums to see if an incremental delta backup needs to occur which makes rdiff-backup extremely fast. More information is below
    
    https://current.workingdirectory.net/posts/2018/rsyncvsrdiff/
    
    In addition, unlike cp and rsync, rdiff-backup creates delta snapshots each time it runs where only the deltas are stored like a version manager ala CVS.
    
    “Send+receive is well suited for backup, but it’s primarily a replication scheme. So you don’t see options for filtering (excludes or includes), or whether to preserve dates, times, permissions, or labels.”
    
    No, rdiff-backup does have filtering and does preserve dates, times, permissions etc.
    
    More information is below:
    
    https://linux.die.net/man/1/rdiff-backup
    
    September 15, 2020
    - satai
      
      It still needs to go through files, count SHAs and so on.
      
      Snapshots don’t do it, their existence is just a side effect of CoW.
      
      September 15, 2020
      - Vernon Van Steenkist
        
        “It still needs to go through files, count SHAs and so on. Snapshots don’t do it, their existence is just a side effect of CoW.”
        
        Yes. Which is why it is non-trivial to restore a deleted file with btrfs if you don’t know when (how many snapshots ago) it was last on the disk. You can think of rdiff-backup as a version control system for your files whereas btrfs snapshots are like dd with a delta feature.
        
        I certainly agree that btrfs can create snapshots quicker than rdiff-backup can create snapshots. However, the trade-offs are backup and restore inflexibility (doesn’t natively support different filesystems, network backups, file backup history), higher cpu loads, slower filesystem performance and filesystem instability (as opposed to ext4) making this article prescient for btrfs users who should take backup snapshots often 🙂
        
        September 16, 2020
    - Chris Murphy
      
      rdiff-backup doesn’t have to read every file on the target, but it does have to read every file on the source. It computes a new sha1sum for each file, comparing to its own metadata directory, to know what’s changed.
      
      Btrfs send doesn’t need to read a single file, source or target, to know what’s changed, and what to send. It doesn’t even have to read all of the metadata for the snapshots being sent, because the nodes contain a generation value for each referenced leaf. If a leaf is too old or new (compared to the generations of the two snapshots) it doesn’t need to be read.
      
      September 15, 2020
- Sergey
  
  I join in the question. If you’re looking at the future of SilverBlue with its update system, what’s the point of the subject then?
  
  September 15, 2020
- Robert Redziak
  
  The main difference is that you can, for example, quiesce/freeze your database or application and sync it to disk, take a snapshot, bring your app back to normal work, and then back up your data from the snapshot in a consistent state. That is not possible with tools that work at the file level.
  
  September 15, 2020
Duncan Ball

Nice article, but if you are going to write something describing a backup mechanism, it would be helpful to provide at least one example of how to recover from that backup. Is there an efficient way to apply just the delta from the snapshot copy to roll back the filesystem to that point in time?

September 14, 2020
- Thomas Klein
  
  Definitely – I second the request for a brief “how-to” regarding restoring the backed-up data.
  If possible, could you elaborate on two cases please: “restoring a running system to a specific point in time / snapshot” and “restoring everything in a disaster recovery scenario”?
  
  While the above may sound a bit demanding, be assured that your article is MUCH appreciated! Thanks a ton, this adds a new twist to the btrfs discussions 😉
  
  September 16, 2020
laolux

Thanks for the article, will use that once f33 is out.
Now it would be great if you could elaborate on the ssh option a bit. How would I do that? And do I need to be root on the receiving machine?

September 15, 2020
- Juan Orti
  
  Yes, you need root in both the source and target machine.
  
  September 16, 2020
  - Alex Corf
    
    How about the backup size? Is the file compressed on the destination or left as is?
    
    September 17, 2020
    - Juan Orti
      
      If the source is using compression, the sender decompresses the data and sends it to the receiver, which will re-compress it or not depending on the target’s compression options.
      
      There’s ongoing work to optimize this process and allow sending the compressed stream directly, but that’s still in development.
      
      September 17, 2020
- Chris Murphy
  
  I’m using ssh pki, so only keys. On the remote, I create /etc/sudoers.d/1chris containing:
  chris ALL = NOPASSWD: /usr/sbin/btrfs
  
  Then from the local computer:
  sudo btrfs send -p gits.20200830 gits.20200905 | ssh chris@alarmpi.local "sudo btrfs receive ."
  
  September 18, 2020
svsv sarma

I think backup and snapshot are entirely different. While the former refers to documents, the latter refers to the entire system, including viruses if any. I prefer the former to the latter, on a different device. But I only do manual backups, as it is easy for me to install Fedora any time and restore the documents from the backup. Perhaps this btrfs article is for developers and software engineers. Thanks for a very nice article prompting a debate.

September 15, 2020
bbrot

Are there any cloud storage providers that support btrfs send/receive via SSH for backups? That would be super helpful!

September 15, 2020
Lucas

Nice article. Snapshots in btrfs have already saved my workday.

September 15, 2020
Shy

Blivet 2.2.0-1 supports creating Timeshift-compatible btrfs volume labels (@ for the rootfs subvol and @home for the /home subvol). That version of blivet was submitted in bodhi for testing on fc33 Feb 12, 2020. See BZ 1859963 for the bug report with the bodhi link.

I’m not sure if it will make it onto the distribution media for Fedora 33 or not. I don’t think that those btrfs volume labels can be easily applied after installation.

Timeshift is a GUI that can take btrfs snapshots.

September 15, 2020
Nik

I’ve been using btrbk on Arch for quite a while; it automates this in a really nice way. When switching to Fedora later this year I think I’ll stay with btrbk (or snapper).

September 15, 2020
Nik

btrbk – https://github.com/digint/btrbk

September 15, 2020
Juan Orti

I personally use the btrbk utility, which is very handy for managing all my snapshots and send-receive backups. I recommend it.

September 16, 2020
Lockheed

Thanks for this article. I never had much to do with file systems beyond the real basics needed to get along as a simple user. I really love the examples and how this is introduced and, as I am finally going to have a NAS soon rather than an external drive, I look forward to using it. I will need a bit of practice, but I hope I’ll do fine once I have a real backup solution. Thanks again for warming users up to btrfs! Very welcome!

September 18, 2020
Folkert M.

It may not be the right place, but the right time to post “Farewell Fedora and take care of yourself”. Our friendship ends here after 11 years. I’m still using 86ed hardware and I’m not convinced by the current x64 hardware offerings, either too expensive, as in the case of Apple, or the hardware feels useless even when playing. I am now friends with Debian.

September 19, 2020
Michael

Great article, can’t wait to try it out.

September 19, 2020
leslie Satenstein

In btrfs terms, what is a volume and what is a sub-volume?
In my system, I have separate partitions for boot/efi,
/boot, /, /var /home and swap.
Leaving out swap, and /boot/efi, do I create a single partition for /, /var, /boot, /home and on boot of the new system, they become sub-volumes?
Do I just change type=ext4 to type=btrfs, when declaring the partitions for the /, /var, /boot and /home?

If I think about volumes and sub-volumes as an encyclopaedia,
then the individual books are sub-volumes, and they are individually bound items. Drawing from encyclopaedia to Linux partitioning, am I wrong to think of a sub-volume as a partition?

September 20, 2020
- Andrew Holden
  
  Btrfs subvolumes are somewhere between directories and partitions. Many btrfs systems would have 2 or 3 partitions:
  1: /boot, probably fat32.
  2: swap (optional, or you can make it a file in btrfs in many circumstances–read the docs if you want to try).
  3: The btrfs volume. More about mounting it below.
  
  If you have multiple drives, then you can make a single partition on each of the remaining drives and list them all along with partition #3 above when you run mkfs.btrfs. This will group all the partitions into a single volume. By default, btrfs will mirror metadata and stripe data.
  
  Now, it’s time to plan subvolumes. There are several schemes to choose from:
  * Ignore subvolumes and just mount the volume as /.
  * Mount the volume as /, but make a subvolume for any folders that you want to snapshot independently. Just name the subvolumes after their path, and they’ll automatically mount. For example, a subvolume named “home” would contain the home folder, or a subvolume named “var/log” would contain the log folder.
  * Full manual. Make subvolumes for root, home, and whatever else you want and list them all in fstab.
  
  Don’t try to format the subvolumes. They’re not block devices.
  
  September 23, 2020
Magnus Asbjørn

I found an old comment linking to btrfs’ bugtracker; compared to other filesystems it seems to have way more open issues. Understandable since they have so many features. But some of the reports, such as ones about corruption, are years old with no reply.
Any comments about that?

September 22, 2020
- Per F
  
  I am also concerned about this. Maybe it has gotten better and more stable over the years, but btrfs still has that bad reputation of losing entire file systems to corruption.
  What I don’t get is why Fedora is now making btrfs the default file system when Red Hat officially dumped all future support of btrfs back in 2017?
  Well for me it’s ok to have btrfs as an available option (for the adventurous ones), but as the default fs?? I don’t think so!
  
  September 29, 2020
Mike G

Honestly, what is the point of btrfs? I’ve been using zfs for years on mac, linux, and BSD. Aside from getting around some kind of nebulous legal problem, is anything actually better about btr? If the intellectual property laws force people to spend years reinventing the wheel, maybe things were better before we had all of this software licensing nonsense.

October 7, 2020