XFS: High-Performance Parallel Filesystem

XFS filesystem internals: allocation groups, extent-based allocation, and delayed allocation for high-performance parallel I/O.

6 min | filesystems · storage · performance

What is XFS?

XFS is a high-performance journaling filesystem created by Silicon Graphics (SGI) in 1993 for their IRIX workstations. Ported to Linux in 2001, it's now the default filesystem for Red Hat Enterprise Linux and excels at handling large files and parallel I/O workloads.

Think of XFS as the Formula 1 car of filesystems: purpose-built for speed when working with multi-terabyte datasets and concurrent operations. It trades simplicity for raw performance.

The Core Problem

How do you achieve maximum disk throughput when multiple processes write simultaneously? Traditional filesystems serialize metadata operations through global locks, creating a bottleneck regardless of how fast your storage hardware is.

Allocation Groups: Divide and Conquer

XFS solves this by dividing the filesystem into Allocation Groups (AGs)—independent regions, each with its own:

  • Free space B+ tree - tracks available blocks
  • Inode B+ tree - manages file metadata
  • Lock - controls concurrent access
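The per-AG layout above can be sketched in a few lines. This is a toy model, not XFS's actual on-disk structures: the field and method names are hypothetical, and plain dicts stand in for the B+ trees.

```python
from dataclasses import dataclass, field
from threading import Lock

@dataclass
class AllocationGroup:
    """Toy model of one XFS allocation group (illustrative names only)."""
    agno: int                                        # allocation group number
    free_space: dict = field(default_factory=dict)   # stands in for the free-space B+ tree
    inodes: dict = field(default_factory=dict)       # stands in for the inode B+ tree
    lock: Lock = field(default_factory=Lock)         # per-AG lock

    def allocate_inode(self, ino: int, meta: dict) -> None:
        # Only this AG's lock is taken; every other AG stays free to proceed.
        with self.lock:
            self.inodes[ino] = meta

# Four independent AGs, each with its own lock and metadata trees.
ags = [AllocationGroup(agno=i) for i in range(4)]
ags[0].allocate_inode(128, {"name": "a.txt"})
```

The point of the sketch: because each AG carries its own lock and its own trees, creating a file in AG 0 never touches the state of AGs 1–3.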

[Interactive demo: four allocation groups, serial vs. parallel. In serial mode, AG 0 is "Writing..." while AGs 1–3 sit "Waiting", so writing four files takes 4x the time of one (sequential).]

Serial: Traditional filesystems serialize metadata operations through a single lock. Four files must be written one after another, leaving most disk bandwidth unused.

A 4TB filesystem might have 16 AGs of 256GB each. With 16 independent locks, 16 threads can perform metadata operations simultaneously without waiting for each other.
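The claim above can be demonstrated with a small concurrency sketch (an illustration in Python's `threading` module, not kernel code). Each thread is pinned to one of 16 hypothetical AGs and only ever takes that AG's lock, so contention is confined within each group; a single global lock would serialize all 16,000 operations.

```python
import threading

NUM_AGS = 16
ag_locks = [threading.Lock() for _ in range(NUM_AGS)]  # one lock per AG
ops_per_ag = [0] * NUM_AGS

def create_files(thread_id: int, count: int) -> None:
    # Each thread works inside "its" AG, taking only that AG's lock.
    ag = thread_id % NUM_AGS
    for _ in range(count):
        with ag_locks[ag]:
            ops_per_ag[ag] += 1

threads = [threading.Thread(target=create_files, args=(i, 1000))
           for i in range(NUM_AGS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 16 threads completed 16,000 metadata operations, but each lock only
# saw the 1,000 operations belonging to its own AG.
```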

Extent-Based Allocation

XFS doesn't track individual blocks. Instead, it uses extents—contiguous ranges of blocks described by just three values: start block, length, and file offset.

Example: A 100MB file (25,600 blocks) stored contiguously needs just one extent record:

extent: start=1000000, length=25600, offset=0

Compare this to block-based filesystems that need 25,600 individual block pointers. Fewer metadata entries mean:

  • Faster file creation
  • Less memory for caching
  • Simpler B+ tree traversal
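To make the (start, length, offset) triple concrete, here is a minimal sketch of extent-based lookup: translating a logical block within a file to a physical block on disk. The `Extent` class and `lookup` function are illustrative, not XFS's real interfaces.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Extent:
    start: int    # first physical block on disk
    length: int   # number of contiguous blocks
    offset: int   # logical block offset within the file

def lookup(extents: list[Extent], logical_block: int) -> int:
    """Translate a logical file block to a physical disk block."""
    for e in extents:
        if e.offset <= logical_block < e.offset + e.length:
            return e.start + (logical_block - e.offset)
    raise ValueError("block is in a hole or beyond EOF")

# The 100MB file from the text: one record instead of 25,600 pointers.
file_extents = [Extent(start=1_000_000, length=25_600, offset=0)]
print(lookup(file_extents, 0))       # 1000000
print(lookup(file_extents, 25_599))  # 1025599
```

A lookup walks a handful of extent records instead of indexing into tens of thousands of block pointers, which is exactly why the B+ tree traversal stays shallow.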

Delayed Allocation

XFS doesn't allocate blocks immediately when you call write(). Instead:

  1. Reserve - claim space in the filesystem's accounting
  2. Cache - accumulate data in memory
  3. Allocate - assign actual blocks when data is flushed

By waiting, XFS can see the full write pattern and allocate a single contiguous extent rather than scattered blocks. This dramatically reduces fragmentation for streaming workloads.
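The reserve → cache → allocate sequence can be modeled in a toy class. The real logic lives in the kernel; this sketch (hypothetical names throughout) only shows why many small writes can end up as one contiguous extent.

```python
class DelayedAllocFile:
    """Toy model of delayed allocation; not real XFS behavior."""

    def __init__(self, fs_free_blocks: int, block_size: int = 4096):
        self.fs_free = fs_free_blocks
        self.block_size = block_size
        self.reserved = 0
        self.buffer = bytearray()
        self.extents = []          # (start_block, length) pairs
        self.next_free_block = 0   # pretend allocator cursor

    def write(self, data: bytes) -> None:
        needed = -(-len(data) // self.block_size)  # ceiling division
        if needed > self.fs_free - self.reserved:
            raise OSError("ENOSPC")
        self.reserved += needed    # 1. reserve: claim space in accounting
        self.buffer += data        # 2. cache: accumulate data in memory

    def flush(self) -> None:
        blocks = -(-len(self.buffer) // self.block_size)
        if blocks:
            # 3. allocate: the entire buffered range becomes ONE extent
            self.extents.append((self.next_free_block, blocks))
            self.next_free_block += blocks
            self.fs_free -= blocks
        self.reserved = 0
        self.buffer.clear()

f = DelayedAllocFile(fs_free_blocks=100_000)
for _ in range(10):
    f.write(b"x" * 4096)   # ten separate write() calls
f.flush()
print(f.extents)           # [(0, 10)] -- a single contiguous extent
```

Had allocation happened inside each `write()`, a concurrent writer could have claimed blocks in between, scattering the file across ten separate extents.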

Key Characteristics

Aspect          XFS Approach
Max file size   8 EiB
Max volume      8 EiB
Journaling      Metadata only (fast)
Shrinking       Not supported
Snapshots       Not supported (use LVM)

When to Use XFS

Ideal for:

  • Media servers streaming large video files
  • Scientific computing with multi-TB datasets
  • Databases with large tablespaces
  • High-performance computing clusters
  • Any workload with concurrent large file operations

Consider alternatives when:

  • You need filesystem snapshots → use Btrfs or ZFS
  • You need to shrink the filesystem → use ext4
  • Small embedded systems → XFS overhead isn't justified
  • Desktop root filesystem → ext4 is simpler

The Trade-off

XFS optimizes for one thing: maximum throughput for large, parallel workloads. It achieves this by:

  • Accepting that you can't shrink the filesystem
  • Not providing built-in snapshots or compression
  • Using more memory for its extensive B+ tree caching

For the right workload—large files, multiple writers, high-end storage—nothing beats XFS. For general-purpose use, ext4's simplicity often wins.
