Nikolaus Rath's Website

BTRFS Reliability - a datapoint

A little while ago I blogged about SSD caching under Linux and promised to report back should I encounter any problems with the (rather complex) stack of btrfs on dm-crypt on lvm on bcache. I have now run this setup for several months and indeed encountered a few issues.

The first issue is that attempting to read from a freshly created file sometimes results in I/O errors that persist until the system is rebooted. In hindsight, I expect that re-mounting the filesystem would also have helped, but that did not occur to me at the time, so I never tried it. The same problem has been reported by several other people using a variety of configurations (thread on linux-btrfs), so I believe it is not specific to my storage stack but happens even when running btrfs directly on a disk (i.e., with no lvm, dm-crypt and bcache in between).
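For anyone who wants to watch for this symptom on their own system, here is a minimal sketch of a check that writes a fresh file and immediately reads it back. The function name `check_fresh_file` and its parameters are my own invention for illustration, and this is not a reliable reproducer: I do not know what triggers the bug, and the read may well be served from the page cache rather than from disk.

```python
import errno
import os
import tempfile


def check_fresh_file(directory, size=4096):
    """Write a new file in `directory`, read it back, return (ok, detail).

    Hypothetical helper, not taken from any btrfs tooling.
    """
    payload = os.urandom(size)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Write the data and ask the kernel to flush it to the device.
        with os.fdopen(fd, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())
        # Re-open the file and read it back, watching for EIO.
        try:
            with open(path, "rb") as fh:
                data = fh.read()
        except OSError as exc:
            if exc.errno == errno.EIO:
                return False, "I/O error on read"
            raise
        if data != payload:
            return False, "content mismatch"
        return True, "ok"
    finally:
        # Always clean up the temporary file.
        os.unlink(path)
```

Calling `check_fresh_file("/some/btrfs/mountpoint")` periodically (the path is a placeholder) would at least record when reads of new files start failing.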

The second problem I have encountered only twice in about 6 months. In this case, the kernel crashes with a BUG message from deep within the btrfs code. I am not sufficiently familiar with the Linux kernel to say with certainty which component is at fault, nor can I narrow down the circumstances that trigger the problem. However, based on just the file and function names in the kernel stack trace, I do not think that the problem is caused by any of the lower layers here either - it's

I am running the latest kernel from jessie-backports (which typically trails the most recent mainline version by at most a few weeks) so I'm pretty sure that (as of today) neither of these bugs has been fixed.

I have concluded that btrfs is not yet stable enough for my purposes, and have therefore migrated my system to use ext4 instead of btrfs for most mountpoints. The only exceptions are sbuild chroots (which can easily be re-generated). My backup disks are still formatted with btrfs, but I intend to migrate them to ZFS shortly. The reason for choosing ZFS in one case and ext4 in the other is that ZFS offers snapshots, de-duplication, checksumming and compression (which are especially nice to have for backups), but it is also an out-of-tree kernel module that I have never used before (so I don't want my production system to rely on it).
