Btrfs: Modern Copy-on-Write Filesystem

Learn Btrfs with built-in snapshots, RAID, and compression. Explore copy-on-write, subvolumes, and self-healing on Linux.


Btrfs: Where Your Data Gets Superpowers

Imagine a filesystem that could travel back in time. One that never loses data, even when you accidentally delete something. A filesystem that can detect and fix corruption before you even know it's there. Welcome to Btrfs—where science fiction meets your storage!

Btrfs (B-tree filesystem, pronounced "Butter FS" or "Better FS") isn't just another filesystem—it's a complete rethinking of how we store data. Born at Oracle in 2007 and now community-driven, Btrfs brings enterprise-grade features to everyone.

Think of Btrfs as Linux's Swiss Army knife for storage. While ext4 is your reliable daily driver, Btrfs is the transformer that can morph into whatever you need: a snapshot machine, a RAID array, a compression engine, or all of the above simultaneously!

Copy-on-Write: The Magic Behind Everything

The Revolution: Traditional filesystems are like writing with a pen—once you overwrite something, it's gone forever. Btrfs is like having an infinite stack of transparent sheets. Every change creates a new layer, and you can always peek back at previous versions.

How CoW Actually Works

When you modify a file on a traditional filesystem (ext4, NTFS), the system overwrites the existing data blocks directly. If power fails mid-write, you get corruption. Btrfs takes a fundamentally different approach:

  1. Never overwrite existing data — modifications go to new, free blocks
  2. Update pointers atomically — the metadata tree points to new blocks only after the write completes
  3. Old blocks remain intact — they're either freed or kept for snapshots

Consider a file with 4 blocks (A, B, C, D). When B is modified, Btrfs doesn't touch the original B—it writes the modified version to a completely new location (B'). The current file now points to A, B', C, D while snapshots still reference the original B. This is why snapshots are instant: no data copying, just pointer manipulation.

Why This Changes Everything

Copy-on-Write isn't just a feature—it's the foundation that enables:

  • Instant snapshots — Creating a snapshot just copies the metadata tree (a few KB), not the actual data (potentially TB). The snapshot and live filesystem share all unchanged blocks.
  • Atomic transactions — Either the entire write succeeds, or nothing changes. Power loss mid-write? The old data is still there, untouched.
  • Data integrity — Checksums are stored separately from data. Btrfs can detect (and with RAID, repair) silent data corruption.
  • Time travel — Since old blocks are preserved in snapshots, you can access any previous version instantly—no restore needed.
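CoW block sharing is also exposed to userspace through reflink copies. The sketch below (assuming GNU coreutils) shows the idea: on Btrfs the clone shares all extents with the original, while `--reflink=auto` silently falls back to a normal copy on filesystems without reflink support.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create a 10 MB file in a scratch directory, then clone it with a reflink copy.
dir=$(mktemp -d)
head -c $((10 * 1024 * 1024)) /dev/urandom > "$dir/original"

# --reflink=auto shares extents where the filesystem supports it (Btrfs, XFS).
# On Btrfs the clone is instant and consumes no extra data blocks until one
# side is modified—exactly the CoW sharing that makes snapshots cheap.
cp --reflink=auto "$dir/original" "$dir/clone"

# Contents are identical either way; on Btrfs, compsize would show shared extents.
cmp -s "$dir/original" "$dir/clone" && echo "clone matches original"
```

On a Btrfs mount, `sudo compsize "$dir"` would report the two names sharing one set of extents.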

Subvolumes: Filesystems Within a Filesystem

Subvolumes are Btrfs's killer organizational feature. Think of them as independent directories that can be snapshotted, mounted, and managed separately—without the overhead of creating actual partitions.

Subvolume Hierarchy

Subvolumes are independent filesystem trees within Btrfs. Each can be mounted separately with different options—perfect for separating system, home, and snapshots.

/dev/sda1 (Btrfs)
├── @            (ID 5)    → mounted at /
├── @home        (ID 256)  → mounted at /home
├── @snapshots   (ID 257)  → mounted at /.snapshots
└── @var         (ID 258)  → mounted at /var (NoCoW)

For example, the root subvolume @ (ID 5) is mounted at / with the options compress=zstd,noatime:

mount -o subvol=@,compress=zstd,noatime /dev/sda1 /
Why Subvolumes Matter

Subvolumes let you snapshot @ (root) without including @home—so system rollbacks don't affect your personal files. This layout is used by openSUSE, Fedora Silverblue, and many NixOS setups.

Unlike partitions (which require repartitioning to resize), subvolumes:

  • Share the same storage pool — No wasted space from over-provisioned partitions
  • Can be snapshotted independently — Snapshot /home without /var/log
  • Support different mount options — Compress home directories but not databases
  • Enable atomic system rollbacks — Distros like openSUSE and Fedora use this
# Create subvolume
sudo btrfs subvolume create /mnt/data/projects

# List subvolumes
sudo btrfs subvolume list /mnt

# Subvolumes can be mounted independently
sudo mount -o subvol=projects /dev/sda1 /mnt/projects

# Each subvolume can have different mount options
sudo mount -o subvol=databases,nodatasum /dev/sda1 /mnt/db

For a typical desktop or server, consider this layout:

Subvolume     Mount Point        Purpose
@             /                  Root filesystem
@home         /home              User data
@snapshots    /.snapshots        Snapshot storage
@var_log      /var/log           Logs (exclude from snapshots)
@docker       /var/lib/docker    Container storage (nodatacow)
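As a sketch, the layout above might translate into /etc/fstab entries like the following (the UUID placeholder and the nodatacow option on @docker are illustrative assumptions; adjust subvolume names to your distro's conventions):

```
# /etc/fstab — one line per subvolume, all sharing the same Btrfs pool
UUID=xxxx-xxxx  /                btrfs  subvol=@,compress=zstd,noatime           0 0
UUID=xxxx-xxxx  /home            btrfs  subvol=@home,compress=zstd,noatime       0 0
UUID=xxxx-xxxx  /.snapshots      btrfs  subvol=@snapshots,compress=zstd,noatime  0 0
UUID=xxxx-xxxx  /var/log         btrfs  subvol=@var_log,compress=zstd,noatime    0 0
UUID=xxxx-xxxx  /var/lib/docker  btrfs  subvol=@docker,nodatacow,noatime         0 0
```

Find the real UUID with `sudo blkid /dev/sda1`.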

Snapshots: Time Travel for Your Data

Snapshots are instant, space-efficient copies of subvolumes. Because of Copy-on-Write, creating a snapshot is nearly instantaneous—it just copies the metadata pointers, not the actual data.

Snapshot Timeline

Btrfs snapshots share unchanged blocks through Copy-on-Write. Snapshots are instant because they only copy metadata pointers—not actual data. Immediately after snapshotting a live filesystem with six data blocks (A–F), total disk usage is still six blocks: the snapshot and the live filesystem share every block. Only when a block is modified does a second, unshared copy appear.
Space Efficiency Through Block Sharing

After creating a snapshot, no new disk space is used until you modify files. Each modification only costs the space of the changed blocks—not the entire file. A 100GB filesystem with 10 snapshots might only use 110GB if only 10% of data changed.

Snapshot Magic Explained

When you create a snapshot:

  1. Btrfs copies only the metadata tree (pointers to blocks)
  2. Both snapshot and original share all data blocks
  3. As either changes, only modified blocks are duplicated
  4. Old blocks are preserved until all snapshots referencing them are deleted
# Create snapshot
sudo btrfs subvolume snapshot /mnt/data /mnt/snapshots/data-$(date +%Y%m%d)

# Create read-only snapshot (for backups)
sudo btrfs subvolume snapshot -r /mnt/data /mnt/snapshots/data-backup

# List snapshots
sudo btrfs subvolume list -s /mnt

# Rollback to snapshot
sudo btrfs subvolume delete /mnt/data
sudo btrfs subvolume snapshot /mnt/snapshots/data-backup /mnt/data

Automated Snapshots with Snapper

For production use, automate snapshots with Snapper or btrbk:

# Install snapper
sudo apt install snapper   # Debian/Ubuntu
sudo dnf install snapper   # Fedora

# Configure for root
sudo snapper -c root create-config /

# List snapshots
sudo snapper list

# Compare snapshots
sudo snapper diff 1..2

# Revert the changes made since snapshot 1
sudo snapper undochange 1..0
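Tools like Snapper and btrbk implement retention policies for you; as an illustration of the idea, here is a minimal hand-rolled pruner. The data-YYYYMMDD naming and the use of sed output in place of a real `btrfs subvolume delete` are assumptions of this sketch:

```shell
#!/usr/bin/env bash
set -euo pipefail

# prune_candidates DIR KEEP — print all but the newest KEEP snapshots
# matching data-* under DIR. A real pruner would run
# "sudo btrfs subvolume delete" on each; this sketch only prints them.
prune_candidates() {
  local dir="$1" keep="$2"
  # Names embed the date as YYYYMMDD, so a lexicographic sort is
  # chronological; "head -n -KEEP" drops the newest KEEP entries.
  find "$dir" -mindepth 1 -maxdepth 1 -name 'data-*' -printf '%f\n' \
    | sort | head -n -"$keep" \
    | sed "s|^|would delete: $dir/|"
}

# Example: keep the 7 newest daily snapshots under /mnt/snapshots
# prune_candidates /mnt/snapshots 7
```

Wired to `btrfs subvolume delete` and a systemd timer, this becomes a basic retention policy—though Snapper's built-in cleanup algorithms are more robust.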

Built-in RAID: Redundancy Without mdadm

Btrfs includes native RAID support, meaning you can span multiple drives without external tools. It handles data and metadata redundancy separately—you can have RAID 1 for metadata (safety) but RAID 0 for data (speed).

Btrfs RAID Configurations

Btrfs has built-in RAID support—no mdadm needed. Take RAID 1 (mirroring), which duplicates data on all devices:

  • Minimum disks: 2
  • Performance: 2x read, 1x write
  • Fault tolerance: 1 disk
  • Usable space: 50%

Every block (A1, A2, B1, …) is written identically to both disks, so if either disk fails the other still holds a complete copy. Create a RAID 1 filesystem:

mkfs.btrfs -d raid1 -m raid1 /dev/sda1 /dev/sdb1
RAID 5/6 Warning

Btrfs RAID 5 and RAID 6 have a "write hole" bug and are not production-ready. For parity RAID, use ZFS or mdadm + Btrfs on top. Stick to RAID 1/10 for Btrfs-native redundancy.

Creating Multi-Device Filesystems

# Create RAID 1 (mirrored) filesystem
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc

# Create RAID 10 (striped mirrors)
sudo mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Add device to existing filesystem
sudo btrfs device add /dev/sdd /mnt

# Convert single device to RAID 1
sudo btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt

# Remove failed device
sudo btrfs device remove /dev/sdc /mnt

# Replace failed device
sudo btrfs replace start /dev/sdc /dev/sdd /mnt

Transparent Compression: More Space, Often Faster

Btrfs supports transparent compression—files are compressed on write and decompressed on read, completely invisible to applications. Surprisingly, compression often improves performance by reducing disk I/O.

Compression Comparison

Btrfs supports transparent compression. Compare algorithms to find the best balance of speed and space savings for your workload.

For a sample of compressible data (source code, logs, configs), typical results look like:

Algorithm   Compressed Size        Space Saved   Relative Speed
None        1000 KB → 1000 KB      0%            100%
LZO         1000 KB → 450 KB       55%           95%
Zlib        1000 KB → 350 KB       65%           40%
Zstd:1      1000 KB → 380 KB       62%           90%
Zstd:3      1000 KB → 320 KB       68%           75%
Zstd:9      1000 KB → 280 KB       72%           35%
Zstd:15     1000 KB → 260 KB       74%           15%

Zstd:3, the default, saves 68% of space at 75% relative speed, a 3.1x compression ratio. Mount with:

mount -o compress=zstd:3 /dev/sda1 /mnt
Recommendation: zstd:3

Zstd level 3 is the sweet spot for most workloads—fast enough that you won't notice it, with compression ratios rivaling zlib. For SSDs, compression can actually improve performance by reducing write amplification. Skip compression for already-compressed files (images, videos, archives).

Compression Algorithms

Algorithm        Speed     Ratio    Best For
zstd (default)   Fast      Good     General use, recommended
lzo              Fastest   Lower    Real-time workloads
zlib             Slow      Best     Archival, cold storage
# Mount with compression
sudo mount -o compress=zstd:3 /dev/sdb1 /mnt

# Force compression (even for incompressible files)
sudo mount -o compress-force=zstd /dev/sdb1 /mnt

# Check compression ratio
sudo compsize /mnt

# Compress existing data in place (defragment rewrites the files)
sudo btrfs filesystem defragment -czstd -r /mnt/data

When NOT to Compress

Disable compression for already-compressed or random-access data:

# Disable CoW and compression for VMs/databases
# (chattr +C affects newly created files only)
sudo chattr +C /var/lib/libvirt/images/
sudo chattr +C /var/lib/mysql/

Data Integrity: Checksums and Self-Healing

Unlike ext4 or XFS, Btrfs checksums every block of data and metadata. This means it can detect "bit rot"—silent corruption from hardware errors that other filesystems miss entirely.

Btrfs Scrub & Data Integrity

Btrfs checksums every data block. The btrfs scrub command verifies all checksums, detects silent corruption, and repairs data from RAID mirrors if available.

A scrub walks every allocated block, recomputes its checksum, and compares it against the stored CRC, tallying blocks verified, errors found, and errors repaired. A block whose stored and computed checksums disagree—say, a silently corrupted data.db or notes.md—is flagged as an error and, if a RAID mirror holds a good copy, rewritten from it. Without redundancy, the read fails rather than returning bad data.
Run a scrub operation:
# Start scrub on mounted filesystem
sudo btrfs scrub start /mnt/data

# Check scrub status
sudo btrfs scrub status /mnt/data

# View detailed stats
sudo btrfs device stats /mnt/data
Silent Corruption Protection

Unlike ext4/XFS, Btrfs uses CRC32C checksums for both data and metadata. This detects "bit rot"—silent corruption from hardware errors that traditional filesystems miss. Schedule monthly scrubs with systemd timers for proactive data protection.

The Power of Scrubbing

The btrfs scrub command reads all data, verifies checksums, and (with RAID) repairs corruption automatically:

# Start scrub (runs in background)
sudo btrfs scrub start /mnt

# Check scrub status
sudo btrfs scrub status /mnt

# View device error statistics
sudo btrfs device stats /mnt

# Schedule monthly scrubs with systemd (for the root filesystem)
sudo systemctl enable btrfs-scrub@-.timer
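If your distro doesn't ship scrub timers, a minimal pair of units you could write yourself might look like this (the unit names, mount point, and monthly schedule are assumptions of this sketch; packages like btrfsmaintenance provide polished equivalents):

```
# /etc/systemd/system/btrfs-scrub-mnt.service
[Unit]
Description=Btrfs scrub on /mnt

[Service]
Type=oneshot
ExecStart=/usr/bin/btrfs scrub start -B /mnt

# /etc/systemd/system/btrfs-scrub-mnt.timer
[Unit]
Description=Monthly Btrfs scrub on /mnt

[Timer]
OnCalendar=monthly
Persistent=true

[Install]
WantedBy=timers.target
```

Enable with `sudo systemctl enable --now btrfs-scrub-mnt.timer`; the -B flag makes the service wait for the scrub to finish so failures show up in the unit status.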

Checksum Protection

Btrfs uses CRC32C checksums (hardware-accelerated on modern CPUs):

  • Data blocks: Detect silent corruption
  • Metadata: Protect directory structures
  • Parent pointers: Verify tree integrity

With RAID, Btrfs can automatically repair corruption by copying good data from mirrors.
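The detect step can be mimicked in userspace. This is an illustration only—Btrfs uses CRC32C per block and verifies transparently on every read; the sha256sum file here just stands in for the stored checksum:

```shell
#!/usr/bin/env bash
set -euo pipefail
dir=$(mktemp -d)

# "Write" a 4 KiB block and store its checksum, as Btrfs does at write time
head -c 4096 /dev/zero > "$dir/block"
sha256sum "$dir/block" > "$dir/block.sum"

# Simulate bit rot: silently overwrite one byte in place
printf 'X' | dd of="$dir/block" bs=1 seek=100 conv=notrunc status=none

# On read, the recomputed checksum no longer matches the stored one
if sha256sum --status -c "$dir/block.sum"; then
    echo "block verified"
else
    echo "corruption detected"   # with RAID, Btrfs would now repair from a mirror
fi
```

Note that an ext4 or XFS read of the same block would return the flipped byte without complaint—there is no stored checksum to compare against.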

Creating and Managing Btrfs

Creating a Btrfs Filesystem

# Single device
sudo mkfs.btrfs /dev/sdb1

# With label
sudo mkfs.btrfs -L "DataDrive" /dev/sdb1

# Multiple devices (RAID 1)
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb1 /dev/sdc1

# SSDs are detected automatically; enable TRIM at mount time with discard=async
sudo mkfs.btrfs /dev/nvme0n1p1

Essential Mount Options

# Recommended for SSD
sudo mount -o ssd,discard=async,compress=zstd,noatime,space_cache=v2 /dev/sdb1 /mnt

# Recommended for HDD
sudo mount -o compress=zstd,autodefrag,noatime,space_cache=v2 /dev/sdb1 /mnt

# Add to /etc/fstab
UUID=xxx /mnt btrfs defaults,compress=zstd,noatime,space_cache=v2 0 0

Maintenance Commands

# Check space usage (more accurate than df)
sudo btrfs filesystem usage /mnt

# Show device allocation
sudo btrfs device usage /mnt

# Balance (redistribute data across devices)
sudo btrfs balance start /mnt

# Balance with filters (metadata only)
sudo btrfs balance start -musage=50 /mnt

# Defragment (breaks CoW sharing!)
sudo btrfs filesystem defragment -czstd -r /mnt/data

Common Pitfalls and Solutions

1. "No Space Left" with Plenty of Free Space

Problem: Btrfs shows space available but refuses writes.

Cause: Metadata is full while data has space (or vice versa).

Solution:

# Check what's actually full
sudo btrfs filesystem usage /mnt

# Balance metadata (usually the culprit)
sudo btrfs balance start -musage=50 /mnt

# If desperate, add a small device temporarily
sudo btrfs device add /dev/ram0 /mnt
sudo btrfs balance start /mnt
sudo btrfs device remove /dev/ram0 /mnt

2. RAID 5/6 Data Loss

Problem: Using RAID 5 or 6 leads to data corruption.

Cause: Known "write hole" bug remains unfixed.

Solution: Use RAID 1 or RAID 10 with Btrfs. For parity RAID, use mdadm + Btrfs on top, or switch to ZFS.

3. Slow Performance with Databases/VMs

Problem: Database or VM performance is terrible.

Cause: Copy-on-Write causes fragmentation and write amplification for random writes.

Solution:

# Disable CoW for the directory (applies to newly created files only,
# so set it while the directory is still empty!)
sudo mkdir /var/lib/mysql
sudo chattr +C /var/lib/mysql

# Or mount with nodatacow for specific subvolumes
sudo mount -o subvol=@databases,nodatacow /dev/sdb1 /var/lib/mysql

4. Snapshot Space Explosion

Problem: Disk fills up even though files haven't grown.

Cause: Snapshots preserve old data—nothing is truly deleted.

Solution:

# List snapshots and their space usage
sudo btrfs subvolume list -s /mnt

# Delete old snapshots
sudo btrfs subvolume delete /mnt/snapshots/old-snapshot

# Use snapper's cleanup policies
sudo snapper -c root set-config TIMELINE_LIMIT_HOURLY=5

5. Can't Delete Subvolume

Problem: btrfs subvolume delete fails with "directory not empty".

Cause: Nested subvolumes exist inside the target.

Solution:

# List nested subvolumes
sudo btrfs subvolume list -o /mnt/target

# Delete from deepest to shallowest
sudo btrfs subvolume delete /mnt/target/nested/deep
sudo btrfs subvolume delete /mnt/target/nested
sudo btrfs subvolume delete /mnt/target

Performance Tuning

SSD Optimization

# Optimal mount options for NVMe/SSD
defaults,ssd,discard=async,compress=zstd:1,noatime,space_cache=v2

HDD Optimization

# Optimal mount options for spinning disks
defaults,compress=zstd:3,autodefrag,noatime,space_cache=v2

Database/VM Workloads

# Disable CoW for random-write heavy directories
chattr +C /var/lib/postgresql/
chattr +C /var/lib/libvirt/images/

# Consider nodatasum for databases (they have their own checksums)
mount -o subvol=@databases,nodatacow,nodatasum /dev/sdb1 /var/lib/mysql

Btrfs vs ZFS vs ext4

Feature          Btrfs      ZFS            ext4
Copy-on-Write    ✓          ✓              ✗
Snapshots        ✓          ✓              ✗
Compression      ✓          ✓              ✗
Checksums        ✓          ✓              ✗ (metadata only)
Built-in RAID    Partial    Full           ✗
Linux mainline   ✓          ✗ (license)    ✓
RAM usage        Low        High (1GB+)    Very Low
Stability        Good       Excellent      Excellent
Repair tools     Basic      Excellent      Excellent

When to Use Btrfs

✅ Perfect for:

  • Desktop/laptop systems: Snapshots for easy recovery, compression saves space
  • Development environments: Quick rollback of system changes
  • Container hosts: Efficient storage drivers for Docker/Podman
  • Home NAS: Built-in RAID 1/10 with data integrity
  • Backup servers: Send/receive for efficient incremental replication

❌ Avoid for:

  • Databases with heavy writes: Use ext4/XFS with nodatacow, or ZFS
  • RAID 5/6 requirements: Use ZFS or mdadm instead
  • Maximum stability requirements: ext4 remains more battle-tested
  • Very large scale (100+ TB): ZFS handles extreme scale better
