Filesystems: The Digital DNA of Data Storage

Summary
Explore Linux filesystems through interactive visuals. Learn VFS, compare ext4 vs Btrfs vs ZFS, and understand file operations.

What is a Filesystem?

A filesystem organizes raw storage into a tree of files and directories. It tracks where each file's bytes live on disk, who can access them, and how to find them quickly. Every read or write — saving a document, loading a program, streaming a video — passes through this layer.

Without a filesystem, a disk is just a flat array of blocks. Programs have no way to ask "give me the file I saved yesterday" — they would have to remember which exact block they wrote it to. The filesystem layer is the bookkeeping that makes durable, named storage possible.

A File Operation, Step by Step

Opening a file is one of the most common operations a program performs, and it involves several layers of the kernel cooperating. The visualization below shows each layer the request passes through.

The Journey of a File Operation

Step 1 of 5: Application Request. Your app calls open("/home/user/file.txt"):

  • System call initiated
  • Switch from user mode to kernel mode
  • File path parsed
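The application side of this request can be sketched in a few lines of shell. This is a minimal illustration, not a trace of the kernel path: the scratch file created with mktemp stands in for /home/user/file.txt, and the variable names are made up for the example.

```shell
# Create a scratch file to stand in for /home/user/file.txt (hypothetical stand-in).
demo_file="$(mktemp)"
printf 'hello from the filesystem\n' > "$demo_file"

# 'exec 3<' asks the kernel to open the file read-only on descriptor 3.
# This is the point where the shell issues the open (or openat) system call.
exec 3< "$demo_file"

# Reading from the descriptor issues read() calls against the open file.
IFS= read -r first_line <&3

# Closing the descriptor releases the open file.
exec 3<&-

echo "$first_line"
rm -f "$demo_file"
```

Every step here (open, read, close) crosses into the kernel and back, which is the journey described above.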

Where the Time Goes

The kernel path costs microseconds; the storage device is the real cost.

Kernel path

  • VFS lookup: ~1 µs
  • Filesystem driver: ~10 µs
  • Block layer: ~5 µs

Storage device

  • SSD access: ~100 µs
  • HDD seek: ~10 ms

The device dominates. An HDD seek (~10 ms) takes roughly 100× as long as an SSD access (~100 µs), and both dwarf the entire kernel path.
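The ratios can be checked with shell arithmetic. This sketch simply plugs in the approximate figures quoted above; they are order-of-magnitude estimates, not measurements, and the variable names are illustrative.

```shell
# Approximate latencies in microseconds, taken from the figures above.
vfs_us=1
driver_us=10
block_us=5
ssd_us=100
hdd_us=10000   # 10 ms expressed in microseconds

# Total time spent in the kernel path.
kernel_us=$((vfs_us + driver_us + block_us))

echo "kernel path:   ${kernel_us} us"
echo "SSD vs kernel: $((ssd_us / kernel_us))x"   # integer division
echo "HDD vs kernel: $((hdd_us / kernel_us))x"
echo "HDD vs SSD:    $((hdd_us / ssd_us))x"
```

With these inputs the kernel path adds up to 16 µs, so even the SSD access is several times more expensive than all kernel layers combined.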

The VFS Layer

The Virtual File System (VFS) is a kernel abstraction layer that gives every filesystem a common interface. Applications call standard syscalls — open, read, write, stat — and VFS dispatches to the right filesystem driver underneath. The application never has to know whether the file lives on ext4, NTFS, NFS, or a virtual filesystem like /proc.
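One way to see this uniform interface is to ask which filesystem backs a few different paths. The sketch below assumes GNU coreutils stat (for the -f and -c options); fs_type is a helper name invented for this example.

```shell
# Print the filesystem type backing a path (assumes GNU coreutils 'stat').
fs_type() {
    stat -f -c '%T' "$1"
}

# The same command works for a disk-backed path and for virtual ones,
# because all of them are reached through the same VFS interface.
for path in / /proc /sys; do
    if [ -e "$path" ]; then
        printf '%-6s %s\n' "$path" "$(fs_type "$path")"
    fi
done
```

The output depends on the machine, but the tool itself needs no knowledge of the individual filesystems: it makes one generic call and VFS routes it to the right driver.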

What VFS Enables

This abstraction enables several capabilities that Linux relies on:

  • Hot-swap filesystems without changing apps
  • Network filesystems appear local
  • Virtual filesystems (/proc, /sys) expose kernel data
  • Stackable filesystems (encryption, compression)

Anatomy of a Filesystem

Filesystems as different as the decades-old FAT and the much newer ZFS do the same basic job, and most Unix-style filesystems share the components below. Understanding these building blocks helps you choose the right filesystem and troubleshoot issues:

The Superblock

Master control record containing: UUID, block count, block size, filesystem features, state (clean/dirty).

sudo dumpe2fs -h /dev/sda1   # ext2/3/4 only; -h prints just the superblock

Inodes

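Inodes store file metadata (permissions, ownership, timestamps, block pointers) but not filenames; names live in directory entries that map to inode numbers. Every file has an inode number that is unique within its filesystem, and hard links are additional names for the same inode.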
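The separation between names and inodes shows up as soon as a hard link is created. The following sketch works in a scratch directory and assumes GNU stat for the -c format option; the file names are arbitrary.

```shell
# Work in a scratch directory so nothing outside it is touched.
workdir="$(mktemp -d)"
echo "payload" > "$workdir/original.txt"

# A hard link adds a second directory entry pointing at the same inode.
ln "$workdir/original.txt" "$workdir/alias.txt"

# %i prints the inode number, %h the hard-link count (GNU stat assumed).
inode_a="$(stat -c '%i' "$workdir/original.txt")"
inode_b="$(stat -c '%i' "$workdir/alias.txt")"
links="$(stat -c '%h' "$workdir/original.txt")"

echo "original.txt: inode $inode_a"
echo "alias.txt:    inode $inode_b"
echo "link count:   $links"

rm -rf "$workdir"
```

Both names report the same inode number and a link count of 2, because the filename is stored in the directory, not in the inode.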

Data Blocks

Actual file content lives in fixed-size blocks (typically 4 KiB). Large files span multiple blocks, and a file's last block is usually only partly filled.
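The block count for a file is a ceiling division of its size by the block size. The helper below is a simplified sketch that assumes a fixed block size (4096 bytes unless stated otherwise) and ignores refinements such as sparse files or inline data; blocks_needed is a name chosen for illustration.

```shell
# Number of blocks needed to hold a file: ceil(size / block_size).
blocks_needed() {
    size_bytes=$1
    block_size=${2:-4096}   # default to 4 KiB blocks
    echo $(( (size_bytes + block_size - 1) / block_size ))
}

blocks_needed 1        # prints 1: even a single byte occupies a whole block
blocks_needed 4096     # prints 1: exactly one full block
blocks_needed 10000    # prints 3: two full blocks plus a partial one
```

This is why a directory full of tiny files can consume far more disk space than the sum of the file sizes suggests.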

The Filesystem Landscape

A handful of filesystems dominate Linux storage today. They differ in copy-on-write support, integrity guarantees, snapshot models, and tooling ecosystems. The comparison below summarizes the trade-offs.

Performance Comparison

Scores shown for ext4, Btrfs, and ZFS:

Workload           ext4   Btrfs   ZFS
Sequential read    95%    80%     75%
Sequential write   95%    75%     70%
Random I/O         90%    70%     65%

Quick Recommendations

For beginners: ext4 (simple and reliable)
For servers: ZFS (data integrity)
For containers: Btrfs (snapshots)
For performance: XFS (large files)

Established Filesystems

ext4

ext4 is the default filesystem on many Linux distributions. It is not flashy, but it is very dependable:

# Create an ext4 filesystem
sudo mkfs.ext4 -L "MyData" /dev/sdb1

# Mount without access-time updates (noatime already implies nodiratime)
sudo mount -o noatime /dev/sdb1 /mnt/data

Why ext4 is the default choice:

  • Fast for general workloads, with no surprises
  • 20+ years of refinement and a wide tool ecosystem
  • Supported by virtually every Linux distribution

Use ext4 for:

  • Root partitions (/)
  • General-purpose storage
  • Virtual machine disks
  • Any workload where "just works" reliability matters

XFS

XFS excels at parallel I/O and massive files:

# Create XFS optimized for large files (more allocation groups = more parallelism)
sudo mkfs.xfs -d agcount=32 /dev/sdb1

# Mount for database workloads
# (write barriers stay enabled: kernels 4.19 and later no longer accept nobarrier on XFS)
sudo mount -o noatime,logbufs=8 /dev/sdb1 /mnt/database

Where XFS excels:

  • Parallel I/O via allocation groups
  • Very large files (up to 8 exabytes)
  • Streaming and sequential workloads
  • Online defragmentation without downtime

Use XFS for:

  • Media and video workstations
  • Scientific computing clusters
  • Database servers
  • Any parallel-I/O-heavy workload

Copy-on-Write Filesystems

Btrfs

Btrfs brings ZFS-like features to Linux with native kernel integration:

# Create a Btrfs filesystem
sudo mkfs.btrfs -L "ModernStorage" /dev/sdb1

# Mount with transparent compression
sudo mount -o compress=zstd:3 /dev/sdb1 /mnt/data

# Take an instant snapshot (the target directory must already exist)
sudo mkdir -p /mnt/data/.snapshots
sudo btrfs subvolume snapshot /mnt/data /mnt/data/.snapshots/backup-$(date +%Y%m%d)

Btrfs features:

  • Copy-on-write snapshots with no upfront space cost
  • Transparent compression (zstd, zlib, lzo), often saving 30–50% of space on compressible data
  • Built-in checksums that detect corruption and, with a redundant profile (DUP or RAID1), repair it automatically
  • Subvolumes that behave like lightweight partitions

Use Btrfs for:

  • Desktops and laptops where easy rollback is valuable
  • Container hosts (Docker, Podman)
  • Development environments
  • Home NAS systems

ZFS

ZFS prioritizes data integrity above all else. On Linux it is provided by the out-of-tree OpenZFS module rather than the mainline kernel:

# Create a mirrored ZFS pool
sudo zpool create tank mirror /dev/sdb /dev/sdc

# Enable compression and deduplication
sudo zfs set compression=lz4 tank
sudo zfs set dedup=on tank   # Warning: needs lots of RAM!

# Create an encrypted dataset
sudo zfs create -o encryption=on -o keyformat=passphrase tank/secure

ZFS features:

  • End-to-end checksums for data integrity
  • Pooled storage spanning multiple devices
  • Snapshots and clones with copy-on-write semantics
  • Native encryption at the dataset level

Use ZFS for:

  • Enterprise storage arrays
  • Backup servers
  • Workloads where data loss is unacceptable
  • FreeBSD and Solaris systems

Cross-Platform Filesystems

NTFS

Linux can read and write NTFS through the FUSE-based NTFS-3G driver (kernels 5.15 and later also include the in-kernel ntfs3 driver):

# Mount a Windows drive on Linux
sudo mount -t ntfs-3g /dev/sdb1 /mnt/windows

# Better write performance with big_writes
sudo mount -t ntfs-3g -o big_writes,noatime /dev/sdb1 /mnt/windows

FAT32 and exFAT

FAT32 and exFAT are supported by nearly every consumer device, from cameras to car stereos:

# Format a USB drive with exFAT (no 4 GB file limit)
# -n sets the volume label in exfat-utils; exfatprogs spells this option -L
sudo mkfs.exfat -n "USB Drive" /dev/sdb1

# Mount so that files are owned by user and group 1000
sudo mount -o uid=1000,gid=1000 /dev/sdb1 /mnt/usb

Mounting

Mounting attaches a filesystem to a point in the directory tree, making its contents accessible at that path:

# Basic mount
sudo mount /dev/sdb1 /mnt/data

# Mount with specific options (compress=zstd applies to Btrfs only)
sudo mount -o rw,noatime,compress=zstd /dev/sdb1 /mnt/data

# Make permanent in /etc/fstab (fsck pass 0, since Btrfs needs no boot-time fsck)
echo "UUID=xxx /mnt/data btrfs defaults,compress=zstd 0 0" | sudo tee -a /etc/fstab

# Mount everything in fstab
sudo mount -a
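A malformed /etc/fstab line can prevent a system from booting cleanly, so it can be worth sanity-checking an entry before appending it. The function below is a minimal sketch, not a full parser: check_fstab_line is a hypothetical helper, it expects the full six-field form (device, mount point, type, options, dump flag, fsck pass), and it does not verify that the device or the options are valid.

```shell
# Check that an fstab entry has six fields and an absolute mount point.
check_fstab_line() {
    # Split the line into its whitespace-separated fields.
    read -r dev mnt fstype opts dump pass extra <<EOF
$1
EOF
    # All six fields must be present, with nothing left over.
    if [ -z "$pass" ] || [ -n "$extra" ]; then
        echo "expected exactly 6 fields" >&2
        return 1
    fi
    # The mount point must be an absolute path ('none' is used for swap).
    case "$mnt" in
        /*|none) ;;
        *) echo "mount point '$mnt' is not absolute" >&2; return 1 ;;
    esac
    echo "device=$dev mountpoint=$mnt type=$fstype options=$opts dump=$dump pass=$pass"
}

# UUID=xxx is a placeholder, exactly as in the example entry above.
check_fstab_line "UUID=xxx /mnt/data ext4 defaults,noatime 0 2"
```

After editing the real file, running sudo mount -a is still the practical test, since it exercises the entry the same way the boot process does.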

The Filesystem Hierarchy

Linux organizes its filesystem like a tree, with / as the root:

/                   # Root of the filesystem tree
├── /home           # User home directories
│   └── /home/you   # Your home
├── /etc            # System-wide configuration
├── /var            # Variable data (logs, spool, caches)
├── /tmp            # Temporary files, typically cleared on reboot
├── /usr            # User-space programs and libraries
├── /opt            # Optional or third-party software
└── /mnt            # Mount points for external storage

Choosing a Filesystem

Quick Decision Guide

Need maximum compatibility?

Use exFAT for removable media, ext4 for Linux.

Handling huge files or parallel workloads?

Choose XFS for maximum throughput.

Want snapshots and modern features?

Pick Btrfs for Linux, ZFS for the strongest data integrity guarantees.

Dual-booting with Windows?

Keep Windows on NTFS, Linux on ext4, share via exFAT.

Performance Tuning Tips

Speed Optimization

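# Disable access-time updates (the gain depends on the workload and is modest where relatime is already the default)
sudo mount -o noatime /dev/sdb1 /mnt/fast

# Increase the ext4 commit interval to 60 s (fewer flushes, but up to 60 s of recent writes can be lost in a crash)
sudo mount -o commit=60 /dev/sdb1 /mnt/ssd

# Keep write barriers enabled (the ext4 default; safer but slower than barrier=0)
sudo mount -o barrier=1 /dev/sdb1 /mnt/safe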
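Before changing access-time settings, it helps to know what a mount already uses; Linux has defaulted to relatime since kernel 2.6.30, which removes most of the cost of access-time updates. The sketch below classifies a mount-options string and then reports the policy for the root filesystem. atime_mode is an illustrative helper, and reading /proc/self/mounts assumes a Linux system.

```shell
# Report which access-time policy a mount-options string selects.
atime_mode() {
    # Wrap in commas so each option can be matched as a whole word.
    case ",$1," in
        *,noatime,*)     echo noatime ;;
        *,relatime,*)    echo relatime ;;
        *,strictatime,*) echo strictatime ;;
        *)               echo default ;;
    esac
}

# /proc/self/mounts lists: device, mount point, type, options, dump, pass.
if [ -r /proc/self/mounts ]; then
    while read -r dev mnt fstype opts _rest; do
        if [ "$mnt" = "/" ]; then
            echo "/ is mounted with: $(atime_mode "$opts")"
        fi
    done < /proc/self/mounts
fi
```

If the root filesystem already reports relatime, switching to noatime is unlikely to produce a dramatic change.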

Space Optimization

# Enable compression on Btrfs (on ZFS, use: zfs set compression=lz4 pool/dataset)
sudo mount -o compress=zstd:3 /dev/sdb1 /mnt/compressed

# Reserve less space for root on ext4: 1% instead of the default 5%
sudo tune2fs -m 1 /dev/sdb1

# Enable deduplication (ZFS); needs lots of RAM
sudo zfs set dedup=on pool/dataset
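The effect of lowering the ext4 root reservation is easy to quantify. This sketch uses a hypothetical 1000 GiB filesystem and integer arithmetic; reserved_gib is a helper name made up for the example.

```shell
# GiB reserved for root at a given percentage of the filesystem size.
reserved_gib() {
    size_gib=$1
    percent=$2
    echo $(( size_gib * percent / 100 ))
}

echo "1000 GiB at 5%: $(reserved_gib 1000 5) GiB reserved"
echo "1000 GiB at 1%: $(reserved_gib 1000 1) GiB reserved"
```

On that example volume, going from 5% to 1% frees 40 GiB for ordinary users, which matters far more on a large data disk than on a small root partition.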

Common Gotchas and Solutions

Running Out of Inodes

# Check inode usage
df -i

# If inodes are exhausted even though free space remains:
# 1. Delete small files
# 2. Reformat with more inodes
# 3. Use a filesystem that allocates inodes dynamically (XFS/Btrfs/ZFS)
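When df -i shows a filesystem running low, the next question is which directory holds all the small files. The helper below is a rough sketch: count_entries_by_dir is a hypothetical name, each entry is assumed to cost about one inode (hard links share one), and file names containing newlines would be miscounted.

```shell
# Count entries under each immediate subdirectory of a starting point,
# staying on one filesystem (-xdev), largest first.
count_entries_by_dir() {
    start=$1
    for dir in "$start"/*/; do
        [ -d "$dir" ] || continue
        # Each line of find output is one entry (the directory itself included).
        count=$(find "$dir" -xdev | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$count" "$dir"
    done | sort -rn
}

# Example usage (may need sudo to read every directory):
#   count_entries_by_dir /var | head -n 5
```

Running it repeatedly, one level deeper each time, usually narrows the problem down to a cache or session directory within a few steps.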

Fragmentation

# Check fragmentation (ext4)
sudo e4defrag -c /dev/sdb1

# Defragment online (ext4/XFS/Btrfs)
sudo e4defrag /dev/sdb1                          # ext4
sudo xfs_fsr /mnt/xfs                            # XFS
sudo btrfs filesystem defragment -r /mnt/btrfs   # Btrfs

Recovery from Corruption

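# Try a read-only mount first and copy off anything important
sudo mount -o ro /dev/sdb1 /mnt/recovery

# Unmount before running a filesystem check or repair
sudo umount /mnt/recovery
sudo fsck.ext4 -f /dev/sdb1           # ext4
sudo xfs_repair /dev/sdb1             # XFS
sudo btrfs check --repair /dev/sdb1   # Btrfs (last resort; can make damage worse)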

Filesystems sit between raw storage and every program that reads or writes data. Choosing the right one is a tradeoff between speed, integrity, feature surface, and ecosystem support. ext4 covers most general-purpose needs; XFS scales further for parallel and large-file workloads; Btrfs and ZFS add snapshots and checksumming when data integrity matters.
