Btrfs: Modern Copy-on-Write Filesystem

Learn Btrfs with built-in snapshots, RAID, and compression. Explore copy-on-write, subvolumes, and self-healing on Linux.


Btrfs: Where Your Data Gets Superpowers

Imagine a filesystem that could travel back in time. One that never loses data, even when you accidentally delete something. A filesystem that can detect and fix corruption before you even know it's there. Welcome to Btrfs—where science fiction meets your storage!

Btrfs (B-tree filesystem, pronounced "Butter FS" or "Better FS") isn't just another filesystem—it's a complete rethinking of how we store data. Born at Oracle in 2007 and now community-driven, Btrfs brings enterprise-grade features to everyone.

Think of Btrfs as Linux's Swiss Army knife for storage. While ext4 is your reliable daily driver, Btrfs is the transformer that can morph into whatever you need: a snapshot machine, a RAID array, a compression engine, or all of the above simultaneously!

Copy-on-Write: The Magic Behind Everything

The Revolution: Traditional filesystems are like writing with a pen—once you overwrite something, it's gone forever. Btrfs is like having an infinite stack of transparent sheets. Every change creates a new layer, and you can always peek back at previous versions.

How CoW Actually Works

When you modify a file on a traditional filesystem (ext4, NTFS), the system overwrites the existing data blocks directly. If power fails mid-write, you get corruption. Btrfs takes a fundamentally different approach:

  1. Never overwrite existing data — modifications go to new, free blocks
  2. Update pointers atomically — the metadata tree points to new blocks only after the write completes
  3. Old blocks remain intact — they're either freed or kept for snapshots

Consider a file with 4 blocks (A, B, C, D). When B is modified, Btrfs doesn't touch the original B—it writes the modified version to a completely new location (B'). The current file now points to A, B', C, D while snapshots still reference the original B. This is why snapshots are instant: no data copying, just pointer manipulation.

Why This Changes Everything

Copy-on-Write isn't just a feature—it's the foundation that enables:

  • Instant snapshots — Creating a snapshot just copies the metadata tree (a few KB), not the actual data (potentially TB). The snapshot and live filesystem share all unchanged blocks.
  • Atomic transactions — Either the entire write succeeds, or nothing changes. Power loss mid-write? The old data is still there, untouched.
  • Data integrity — Checksums are stored separately from data. Btrfs can detect (and with RAID, repair) silent data corruption.
  • Time travel — Since old blocks are preserved in snapshots, you can access any previous version instantly—no restore needed.
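CoW block sharing is also exposed to userspace through reflink copies. The sketch below (assuming GNU coreutils) shows the idea: on Btrfs the clone shares all extents with the original, while `--reflink=auto` silently falls back to a normal copy on filesystems without reflink support.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create a 10 MB file in a scratch directory, then clone it with a reflink copy.
dir=$(mktemp -d)
head -c $((10 * 1024 * 1024)) /dev/urandom > "$dir/original"

# --reflink=auto shares extents where the filesystem supports it (Btrfs, XFS).
# On Btrfs the clone is instant and consumes no extra data blocks until one
# side is modified—exactly the CoW sharing that makes snapshots cheap.
cp --reflink=auto "$dir/original" "$dir/clone"

# Contents are identical either way; on Btrfs, compsize would show shared extents.
cmp -s "$dir/original" "$dir/clone" && echo "clone matches original"
```

On a Btrfs mount, `sudo compsize "$dir"` would report the two names sharing one set of extents.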

Subvolumes: Filesystems Within a Filesystem

Subvolumes are Btrfs's killer organizational feature. Think of them as independent directories that can be snapshotted, mounted, and managed separately—without the overhead of creating actual partitions.

Subvolume Hierarchy

Subvolumes are independent filesystem trees within Btrfs. Each can be mounted separately with different options—perfect for separating system, home, and snapshots.

/dev/sda1 (Btrfs)
├── @            (ID 5)    → mounted at /
├── @home        (ID 256)  → mounted at /home
├── @snapshots   (ID 257)  → mounted at /.snapshots
└── @var         (ID 258)  → mounted at /var (NoCoW)

For example, the root subvolume @ (ID 5) is mounted at / with the options compress=zstd,noatime:

mount -o subvol=@,compress=zstd,noatime /dev/sda1 /
Why Subvolumes Matter

Subvolumes let you snapshot @ (root) without including @home—so system rollbacks don't affect your personal files. This layout is used by openSUSE, Fedora Silverblue, and many NixOS setups.

Unlike partitions (which require repartitioning to resize), subvolumes:

  • Share the same storage pool — No wasted space from over-provisioned partitions
  • Can be snapshotted independently — Snapshot /home without /var/log
  • Support different mount options — Compress home directories but not databases
  • Enable atomic system rollbacks — Distros like openSUSE and Fedora use this
# Create subvolume
sudo btrfs subvolume create /mnt/data/projects

# List subvolumes
sudo btrfs subvolume list /mnt

# Subvolumes can be mounted independently
sudo mount -o subvol=projects /dev/sda1 /mnt/projects

# Each subvolume can have different mount options
sudo mount -o subvol=databases,nodatasum /dev/sda1 /mnt/db

For a typical desktop or server, consider this layout:

Subvolume     Mount Point        Purpose
@             /                  Root filesystem
@home         /home              User data
@snapshots    /.snapshots        Snapshot storage
@var_log      /var/log           Logs (exclude from snapshots)
@docker       /var/lib/docker    Container storage (nodatacow)
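As a sketch, the layout above might translate into /etc/fstab entries like the following (the UUID placeholder and the nodatacow option on @docker are illustrative assumptions; adjust subvolume names to your distro's conventions):

```
# /etc/fstab — one line per subvolume, all sharing the same Btrfs pool
UUID=xxxx-xxxx  /                btrfs  subvol=@,compress=zstd,noatime           0 0
UUID=xxxx-xxxx  /home            btrfs  subvol=@home,compress=zstd,noatime       0 0
UUID=xxxx-xxxx  /.snapshots      btrfs  subvol=@snapshots,compress=zstd,noatime  0 0
UUID=xxxx-xxxx  /var/log         btrfs  subvol=@var_log,compress=zstd,noatime    0 0
UUID=xxxx-xxxx  /var/lib/docker  btrfs  subvol=@docker,nodatacow,noatime         0 0
```

Find the real UUID with `sudo blkid /dev/sda1`.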

Snapshots: Time Travel for Your Data

Snapshots are instant, space-efficient copies of subvolumes. Because of Copy-on-Write, creating a snapshot is nearly instantaneous—it just copies the metadata pointers, not the actual data.

Snapshot Timeline

Btrfs snapshots share unchanged blocks through Copy-on-Write. Snapshots are instant because they only copy metadata pointers—not actual data. Immediately after snapshotting a live filesystem with six data blocks (A–F), total disk usage is still six blocks: the snapshot and the live filesystem share every block. Only when a block is modified does a second, unshared copy appear.
Space Efficiency Through Block Sharing

After creating a snapshot, no new disk space is used until you modify files. Each modification only costs the space of the changed blocks—not the entire file. A 100GB filesystem with 10 snapshots might only use 110GB if only 10% of data changed.

Snapshot Magic Explained

When you create a snapshot:

  1. Btrfs copies only the metadata tree (pointers to blocks)
  2. Both snapshot and original share all data blocks
  3. As either changes, only modified blocks are duplicated
  4. Old blocks are preserved until all snapshots referencing them are deleted
# Create snapshot
sudo btrfs subvolume snapshot /mnt/data /mnt/snapshots/data-$(date +%Y%m%d)

# Create read-only snapshot (for backups)
sudo btrfs subvolume snapshot -r /mnt/data /mnt/snapshots/data-backup

# List snapshots
sudo btrfs subvolume list -s /mnt

# Rollback to snapshot
sudo btrfs subvolume delete /mnt/data
sudo btrfs subvolume snapshot /mnt/snapshots/data-backup /mnt/data

Automated Snapshots with Snapper

For production use, automate snapshots with Snapper or btrbk:

# Install snapper
sudo apt install snapper   # Debian/Ubuntu
sudo dnf install snapper   # Fedora

# Configure for root
sudo snapper -c root create-config /

# List snapshots
sudo snapper list

# Compare snapshots
sudo snapper diff 1..2

# Revert the changes made since snapshot 1
sudo snapper undochange 1..0
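Tools like Snapper and btrbk implement retention policies for you; as an illustration of the idea, here is a minimal hand-rolled pruner. The data-YYYYMMDD naming and the use of sed output in place of a real `btrfs subvolume delete` are assumptions of this sketch:

```shell
#!/usr/bin/env bash
set -euo pipefail

# prune_candidates DIR KEEP — print all but the newest KEEP snapshots
# matching data-* under DIR. A real pruner would run
# "sudo btrfs subvolume delete" on each; this sketch only prints them.
prune_candidates() {
  local dir="$1" keep="$2"
  # Names embed the date as YYYYMMDD, so a lexicographic sort is
  # chronological; "head -n -KEEP" drops the newest KEEP entries.
  find "$dir" -mindepth 1 -maxdepth 1 -name 'data-*' -printf '%f\n' \
    | sort | head -n -"$keep" \
    | sed "s|^|would delete: $dir/|"
}

# Example: keep the 7 newest daily snapshots under /mnt/snapshots
# prune_candidates /mnt/snapshots 7
```

Wired to `btrfs subvolume delete` and a systemd timer, this becomes a basic retention policy—though Snapper's built-in cleanup algorithms are more robust.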

Built-in RAID: Redundancy Without mdadm

Btrfs includes native RAID support, meaning you can span multiple drives without external tools. It handles data and metadata redundancy separately—you can have RAID 1 for metadata (safety) but RAID 0 for data (speed).

Btrfs RAID Configurations

Btrfs has built-in RAID support—no mdadm needed. Take RAID 1 (mirroring), which duplicates data on all devices:

  • Minimum disks: 2
  • Performance: 2x read, 1x write
  • Fault tolerance: 1 disk
  • Usable space: 50%

Every block (A1, A2, B1, …) is written identically to both disks, so if either disk fails the other still holds a complete copy. Create a RAID 1 filesystem:

mkfs.btrfs -d raid1 -m raid1 /dev/sda1 /dev/sdb1
RAID 5/6 Warning

Btrfs RAID 5 and RAID 6 have a "write hole" bug and are not production-ready. For parity RAID, use ZFS or mdadm + Btrfs on top. Stick to RAID 1/10 for Btrfs-native redundancy.

Creating Multi-Device Filesystems

# Create RAID 1 (mirrored) filesystem
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc

# Create RAID 10 (striped mirrors)
sudo mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Add device to existing filesystem
sudo btrfs device add /dev/sdd /mnt

# Convert single device to RAID 1
sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

# Remove failed device
sudo btrfs device remove /dev/sdc /mnt

# Replace failed device
sudo btrfs replace start /dev/sdc /dev/sdd /mnt

Transparent Compression: More Space, Often Faster

Btrfs supports transparent compression—files are compressed on write and decompressed on read, completely invisible to applications. Surprisingly, compression often improves performance by reducing disk I/O.

Compression Comparison

Btrfs supports transparent compression. Compare algorithms to find the best balance of speed and space savings for your workload.

For a sample of compressible data (source code, logs, configs), typical results look like:

Algorithm   Compressed Size        Space Saved   Relative Speed
None        1000 KB → 1000 KB      0%            100%
LZO         1000 KB → 450 KB       55%           95%
Zlib        1000 KB → 350 KB       65%           40%
Zstd:1      1000 KB → 380 KB       62%           90%
Zstd:3      1000 KB → 320 KB       68%           75%
Zstd:9      1000 KB → 280 KB       72%           35%
Zstd:15     1000 KB → 260 KB       74%           15%

Zstd:3, the default, saves 68% of space at 75% relative speed, a 3.1x compression ratio. Mount with:

mount -o compress=zstd:3 /dev/sda1 /mnt
Recommendation: zstd:3

Zstd level 3 is the sweet spot for most workloads—fast enough that you won't notice it, with compression ratios rivaling zlib. For SSDs, compression can actually improve performance by reducing write amplification. Skip compression for already-compressed files (images, videos, archives).

Compression Algorithms

Algorithm        Speed     Ratio    Best For
zstd (default)   Fast      Good     General use, recommended
lzo              Fastest   Lower    Real-time workloads
zlib             Slow      Best     Archival, cold storage
# Mount with compression
sudo mount -o compress=zstd:3 /dev/sdb1 /mnt

# Force compression (even for incompressible files)
sudo mount -o compress-force=zstd /dev/sdb1 /mnt

# Check compression ratio
sudo compsize /mnt

# Compress existing data in place (defragment rewrites the files)
sudo btrfs filesystem defragment -czstd -r /mnt/data

When NOT to Compress

Disable compression for already-compressed or random-access data:

# Disable CoW and compression for VMs/databases
# (chattr +C affects newly created files only)
sudo chattr +C /var/lib/libvirt/images/
sudo chattr +C /var/lib/mysql/

Data Integrity: Checksums and Self-Healing

Unlike ext4 or XFS, Btrfs checksums every block of data and metadata. This means it can detect "bit rot"—silent corruption from hardware errors that other filesystems miss entirely.

Btrfs Scrub & Data Integrity

Btrfs checksums every data block. The btrfs scrub command verifies all checksums, detects silent corruption, and repairs data from RAID mirrors if available.

A scrub walks every allocated block, recomputes its checksum, and compares it against the stored CRC, tallying blocks verified, errors found, and errors repaired. A block whose stored and computed checksums disagree—say, a silently corrupted data.db or notes.md—is flagged as an error and, if a RAID mirror holds a good copy, rewritten from it. Without redundancy, the read fails rather than returning bad data.
Run a scrub operation:
# Start scrub on mounted filesystem
sudo btrfs scrub start /mnt/data

# Check scrub status
sudo btrfs scrub status /mnt/data

# View detailed stats
sudo btrfs device stats /mnt/data
Silent Corruption Protection

Unlike ext4/XFS, Btrfs uses CRC32C checksums for both data and metadata. This detects "bit rot"—silent corruption from hardware errors that traditional filesystems miss. Schedule monthly scrubs with systemd timers for proactive data protection.

The Power of Scrubbing

The btrfs scrub command reads all data, verifies checksums, and (with RAID) repairs corruption automatically:

# Start scrub (runs in background)
sudo btrfs scrub start /mnt

# Check scrub status
sudo btrfs scrub status /mnt

# View device error statistics
sudo btrfs device stats /mnt

# Schedule monthly scrubs with systemd (for the root filesystem)
sudo systemctl enable btrfs-scrub@-.timer
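If your distro doesn't ship scrub timers, a minimal pair of units you could write yourself might look like this (the unit names, mount point, and monthly schedule are assumptions of this sketch; packages like btrfsmaintenance provide polished equivalents):

```
# /etc/systemd/system/btrfs-scrub-mnt.service
[Unit]
Description=Btrfs scrub on /mnt

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs scrub start -B /mnt

# /etc/systemd/system/btrfs-scrub-mnt.timer
[Unit]
Description=Monthly Btrfs scrub on /mnt

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `sudo systemctl enable --now btrfs-scrub-mnt.timer`; the -B flag makes the service wait for the scrub to finish so failures show up in the unit status.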

Checksum Protection

Btrfs uses CRC32C checksums (hardware-accelerated on modern CPUs):

  • Data blocks: Detect silent corruption
  • Metadata: Protect directory structures
  • Parent pointers: Verify tree integrity

With RAID, Btrfs can automatically repair corruption by copying good data from mirrors.
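The detect step can be mimicked in userspace. This is an illustration only—Btrfs uses CRC32C per block and verifies transparently on every read; the sha256sum file here just stands in for the stored checksum:

```shell
#!/usr/bin/env bash
set -euo pipefail
dir=$(mktemp -d)

# "Write" a 4 KiB block and store its checksum, as Btrfs does at write time
head -c 4096 /dev/zero > "$dir/block"
sha256sum "$dir/block" > "$dir/block.sum"

# Simulate bit rot: silently overwrite one byte in place
printf 'X' | dd of="$dir/block" bs=1 seek=100 conv=notrunc status=none

# On read, the recomputed checksum no longer matches the stored one
if sha256sum --status -c "$dir/block.sum"; then
    echo "block verified"
else
    echo "corruption detected"   # with RAID, Btrfs would now repair from a mirror
fi
```

Note that an ext4 or XFS read of the same block would return the flipped byte without complaint—there is no stored checksum to compare against.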

Creating and Managing Btrfs

Creating a Btrfs Filesystem

# Single device
sudo mkfs.btrfs /dev/sdb1

# With label
sudo mkfs.btrfs -L "DataDrive" /dev/sdb1

# Multiple devices (RAID 1)
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb1 /dev/sdc1

# SSDs are detected automatically; enable TRIM at mount time with discard=async
sudo mkfs.btrfs /dev/nvme0n1p1

Essential Mount Options

# Recommended for SSD
sudo mount -o ssd,discard=async,compress=zstd,noatime,space_cache=v2 /dev/sdb1 /mnt

# Recommended for HDD
sudo mount -o compress=zstd,autodefrag,noatime,space_cache=v2 /dev/sdb1 /mnt

# Add to /etc/fstab
UUID=xxx /mnt btrfs defaults,compress=zstd,noatime,space_cache=v2 0 0

Maintenance Commands

# Check space usage (more accurate than df)
sudo btrfs filesystem usage /mnt

# Show device allocation
sudo btrfs device usage /mnt

# Balance (redistribute data across devices)
sudo btrfs balance start /mnt

# Balance with filters (metadata only)
sudo btrfs balance start -musage=50 /mnt

# Defragment (breaks CoW sharing!)
sudo btrfs filesystem defragment -czstd -r /mnt/data

Common Pitfalls and Solutions

1. "No Space Left" with Plenty of Free Space

Problem: Btrfs shows space available but refuses writes.

Cause: Metadata is full while data has space (or vice versa).

Solution:

# Check what's actually full
sudo btrfs filesystem usage /mnt

# Balance metadata (usually the culprit)
sudo btrfs balance start -musage=50 /mnt

# If desperate, add a small device temporarily
sudo btrfs device add /dev/ram0 /mnt
sudo btrfs balance start /mnt
sudo btrfs device remove /dev/ram0 /mnt

2. RAID 5/6 Data Loss

Problem: Using RAID 5 or 6 leads to data corruption.

Cause: Known "write hole" bug remains unfixed.

Solution: Use RAID 1 or RAID 10 with Btrfs. For parity RAID, use mdadm + Btrfs on top, or switch to ZFS.

3. Slow Performance with Databases/VMs

Problem: Database or VM performance is terrible.

Cause: Copy-on-Write causes fragmentation and write amplification for random writes.

Solution:

# Disable CoW for the directory (applies to newly created files only,
# so set it while the directory is still empty!)
sudo mkdir /var/lib/mysql
sudo chattr +C /var/lib/mysql

# Or mount with nodatacow for specific subvolumes
sudo mount -o subvol=@databases,nodatacow /dev/sdb1 /var/lib/mysql

4. Snapshot Space Explosion

Problem: Disk fills up even though files haven't grown.

Cause: Snapshots preserve old data—nothing is truly deleted.

Solution:

# List snapshots and their space usage
sudo btrfs subvolume list -s /mnt

# Delete old snapshots
sudo btrfs subvolume delete /mnt/snapshots/old-snapshot

# Use snapper's cleanup policies
sudo snapper -c root set-config TIMELINE_LIMIT_HOURLY=5

5. Can't Delete Subvolume

Problem: btrfs subvolume delete fails with "directory not empty".

Cause: Nested subvolumes exist inside the target.

Solution:

# List nested subvolumes
sudo btrfs subvolume list -o /mnt/target

# Delete from deepest to shallowest
sudo btrfs subvolume delete /mnt/target/nested/deep
sudo btrfs subvolume delete /mnt/target/nested
sudo btrfs subvolume delete /mnt/target

Performance Tuning

SSD Optimization

# Optimal mount options for NVMe/SSD
defaults,ssd,discard=async,compress=zstd:1,noatime,space_cache=v2

HDD Optimization

# Optimal mount options for spinning disks
defaults,compress=zstd:3,autodefrag,noatime,space_cache=v2

Database/VM Workloads

# Disable CoW for random-write heavy directories
chattr +C /var/lib/postgresql/
chattr +C /var/lib/libvirt/images/

# Consider nodatasum for databases (they have their own checksums)
mount -o subvol=@databases,nodatacow,nodatasum /dev/sdb1 /var/lib/mysql

Btrfs vs ZFS vs ext4

Feature          Btrfs      ZFS            ext4
Copy-on-Write    ✓          ✓              ✗
Snapshots        ✓          ✓              ✗
Compression      ✓          ✓              ✗
Checksums        ✓          ✓              ✗ (metadata only)
Built-in RAID    Partial    Full           ✗
Linux mainline   ✓          ✗ (license)    ✓
RAM usage        Low        High (1GB+)    Very Low
Stability        Good       Excellent      Excellent
Repair tools     Basic      Excellent      Excellent

When to Use Btrfs

✅ Perfect for:

  • Desktop/laptop systems: Snapshots for easy recovery, compression saves space
  • Development environments: Quick rollback of system changes
  • Container hosts: Efficient storage drivers for Docker/Podman
  • Home NAS: Built-in RAID 1/10 with data integrity
  • Backup servers: Send/receive for efficient incremental replication

❌ Avoid for:

  • Databases with heavy writes: Use ext4/XFS with nodatacow, or ZFS
  • RAID 5/6 requirements: Use ZFS or mdadm instead
  • Maximum stability requirements: ext4 remains more battle-tested
  • Very large scale (100+ TB): ZFS handles extreme scale better
