Filesystem Journaling: Write-Ahead Logging

The Problem: Crashes During Writes

Imagine you're updating a file when the power suddenly fails. Without protection, your filesystem could be left in an inconsistent state:

Half-written metadata: Directory entries point to freed blocks
Orphaned data: Allocated blocks with no file reference
Corrupted structures: Inconsistent inode tables, bitmaps

Non-journaling filesystems such as ext2 required a full-disk consistency check (fsck) after a crash, which could take hours on large drives. Journaling avoids this with write-ahead logging.

The Journaling Solution

Core Idea: Before making any changes, write your intentions to a journal (transaction log). If a crash occurs, replay the journal to finish committed operations and discard the ones that never committed.

Think of it like a chef's prep notes: write down what you're about to cook before you start. If interrupted, check your notes to know what state you're in.
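
To make the core idea concrete, here is a minimal sketch of a write-ahead log. It is a toy model, not how any real filesystem is implemented: the "disk" is a Python dict, the journal is a list, and the names `journaled_write` and `crash_before_apply` are made up for illustration.

```python
# Toy write-ahead log: the "disk" maps block number -> contents and the
# journal is a list of records. All names here are illustrative assumptions.

def journaled_write(journal, disk, txid, updates, crash_before_apply=False):
    """Log the intended updates, commit, then apply them in place."""
    # 1. Record intent: one journal record per block to be changed.
    for block, data in updates.items():
        journal.append(("write", txid, block, data))
    # 2. The commit record marks the transaction as complete in the journal.
    journal.append(("commit", txid))
    if crash_before_apply:
        return  # simulate power loss after commit, before in-place writes
    # 3. Only now touch the real locations.
    for block, data in updates.items():
        disk[block] = data


disk = {100: "old-A", 101: "old-B"}
journal = []
journaled_write(journal, disk, txid=1,
                updates={101: "new-B", 102: "new-C"},
                crash_before_apply=True)

print(disk)         # {100: 'old-A', 101: 'old-B'}
print(journal[-1])  # ('commit', 1)
```

After the simulated crash the disk still holds the old contents, but the journal contains every record needed to finish the update later.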

How Journaling Works

Normal Operation: Transaction Flow

Initial State: Filesystem Consistent

Transaction journal: Empty (or holding only old transactions); last checkpoint at position 0
Filesystem state: doc.txt, size 10KB, blocks 100 and 101, consistent
Pending operation: The user wants to append data to doc.txt

Journal Modes: Safety vs Performance

Different journaling modes offer varying guarantees:

1. Journal Mode (Full Journaling)

What's journaled: Both metadata AND data
Process: Write data to journal → Write metadata to journal → Commit → Write to final location
Safety: Highest - complete consistency
Performance: Slowest - everything written twice
Use case: Critical data (financial systems)

2. Ordered Mode (Default)

What's journaled: Only metadata
Process: Write data to disk → Write metadata to journal → Commit → Write metadata to final location
Safety: High - metadata consistent, data may be old
Performance: Balanced - data written once
Use case: Most general-purpose systems (the ext4 default)

3. Writeback Mode

What's journaled: Only metadata
Process: Write metadata to journal → Commit → Write data and metadata to disk (any order)
Safety: Lower - metadata consistent, data may be garbage
Performance: Fastest - no ordering constraints
Use case: Non-critical data, scratch disks
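
The three "Process" lines above can be compared side by side with a small sketch. The step names and the helper functions `data_writes` and `data_safe_at_commit` are illustrative assumptions; the step orderings themselves are copied from the mode descriptions.

```python
# Write ordering per mode, transcribed from the "Process" lines above.
MODE_STEPS = {
    "journal":   ["data->journal", "metadata->journal", "commit",
                  "data->final", "metadata->final"],
    "ordered":   ["data->final", "metadata->journal", "commit",
                  "metadata->final"],
    "writeback": ["metadata->journal", "commit",
                  "data->final", "metadata->final"],  # last two: any order
}

def data_writes(mode):
    """How many times the file data itself is written in this mode."""
    return sum(step.startswith("data->") for step in MODE_STEPS[mode])

def data_safe_at_commit(mode):
    """True if the data is on disk (journal or final) before the commit."""
    steps = MODE_STEPS[mode]
    before_commit = steps[:steps.index("commit")]
    return any(step.startswith("data->") for step in before_commit)

for mode in MODE_STEPS:
    print(mode, data_writes(mode), data_safe_at_commit(mode))
# journal 2 True
# ordered 1 True
# writeback 1 False
```

The output mirrors the trade-off: full journaling pays for a second data write, while writeback is the only mode in which committed metadata can refer to data that has not been written yet.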

The Transaction Lifecycle

1. Transaction Start
   ↓
2. Write Intent to Journal (WAL)
   ↓
3. Wait for Journal Flush
   ↓
4. Commit Transaction (Commit Record)
   ↓
5. Apply Changes to Filesystem
   ↓
6. Mark Journal Entries as Completed (Checkpoint)
   ↓
7. Reuse Journal Space (Circular Buffer)
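
Steps 6 and 7 are easier to picture with a small model of a circular journal. This is a sketch under simplifying assumptions (one record per slot, a hypothetical `CircularJournal` class), not the on-disk layout of any real filesystem.

```python
class CircularJournal:
    """Fixed-size journal; space is reclaimed only by checkpointing."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.head = 0   # next slot to write
        self.tail = 0   # oldest record not yet checkpointed
        self.used = 0

    def append(self, record):
        if self.used == self.capacity:
            raise RuntimeError("journal full: checkpoint needed")
        self.slots[self.head] = record
        self.head = (self.head + 1) % self.capacity
        self.used += 1

    def checkpoint(self, count):
        """Mark the oldest `count` records as applied, freeing their slots."""
        count = min(count, self.used)
        self.tail = (self.tail + count) % self.capacity
        self.used -= count


j = CircularJournal(capacity=4)
for rec in ["w1", "w2", "commit-1", "w3"]:
    j.append(rec)
# The journal is full. Transaction 1 has been applied, so checkpoint it.
j.checkpoint(3)
j.append("commit-2")   # wraps around and reuses slot 0
print(j.slots, j.head, j.tail)
# ['commit-2', 'w2', 'commit-1', 'w3'] 1 3
```

Old records stay physically in place until they are overwritten; only the `tail` position decides which ones still matter.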

Recovery After Crash

When a filesystem mounts after a crash:

Scan Journal: Read from last checkpoint
Check Commit Records: Find complete vs incomplete transactions
Replay Complete: Apply committed but not-yet-applied changes
Discard Incomplete: Skip transactions that have no commit record
Mount Filesystem: Now in consistent state

Recovery Time: Seconds to minutes (scanning journal only), not hours (scanning entire disk).
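
The recovery steps above can be sketched in a few lines. The record format (`("write", txid, block, data)` and `("commit", txid)`) and the function name `recover` are assumptions chosen for illustration; real journals store binary block images with checksums and sequence numbers.

```python
def recover(journal, disk):
    """Replay committed transactions; skip those without a commit record."""
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    replayed = 0
    for rec in journal:
        if rec[0] == "write" and rec[1] in committed:
            _, _, block, data = rec
            disk[block] = data   # idempotent: replaying twice is harmless
            replayed += 1
    return replayed


journal = [
    ("write", 1, 101, "new-B"),
    ("write", 1, 102, "new-C"),
    ("commit", 1),
    ("write", 2, 100, "half-done"),   # crash hit before transaction 2 committed
]
disk = {100: "old-A", 101: "old-B"}
recover(journal, disk)
print(disk)   # {100: 'old-A', 101: 'new-B', 102: 'new-C'}
```

Transaction 1 is completed from the journal, while the uncommitted write to block 100 is ignored. Because replay only rewrites blocks with their logged contents, a second crash during recovery can be handled by simply running recovery again.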

Journaling in Different Filesystems

ext4

Journal: Dedicated journal inode or external device
Modes: journal, ordered (default), writeback
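Journal size: Configurable; the default depends on the filesystem size (often 128MB or more)
Command: tune2fs -o journal_data /dev/sda1 (makes full data journaling the default mount option)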

XFS

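Journal: Metadata-only log (XFS has no selectable data journaling modes)
Log size: Set at creation time, e.g. mkfs.xfs -l size=128m
External log: Optional separate log device (mkfs.xfs -l logdev=<device>); distinct from the XFS realtime device, which holds file data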
Efficient: Only logs metadata changes

NTFS

$LogFile: Transaction log for metadata
Mode: Metadata journaling only
Recovery: Automatic on mount (chkdsk if needed)
USN Journal: Separate change journal for applications

Btrfs / ZFS

No traditional journal: Use Copy-on-Write instead
Atomic operations: CoW provides transaction semantics
See: Copy-on-Write mechanism

Performance Impact

Journal Placement

# Internal journal (default)
mkfs.ext4 /dev/sda1

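# External journal (can be faster on a separate device)
# The journal device must first be formatted as a journal (same block size)
mke2fs -O journal_dev /dev/sdb1
mkfs.ext4 -J device=/dev/sdb1 /dev/sda1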

External journal benefits:

Reduced seek time (journal on SSD, data on HDD)
Parallel I/O
Better for write-heavy workloads

Journal Size Tuning

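# Larger journal = more buffering, fewer forced checkpoints
# -J only creates a journal: remove the old one first (filesystem unmounted)
tune2fs -O ^has_journal /dev/sda1
tune2fs -J size=400 /dev/sda1  # 400MB journal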

# For databases: larger journal reduces checkpoint frequency
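
As a rough back-of-the-envelope illustration of why size matters, the sketch below estimates how long a journal can absorb writes before it must be checkpointed. The 8 MB/s journal traffic figure is a made-up number, and `seconds_until_full` is a hypothetical helper; real behaviour also depends on the commit interval and on how quickly checkpoints complete.

```python
def seconds_until_full(journal_mb, journal_write_mb_per_s):
    """Time a steady stream of journal writes takes to fill the journal."""
    return journal_mb / journal_write_mb_per_s


RATE = 8  # MB/s of journal traffic; an assumed figure for illustration only
for size_mb in (128, 400):
    print(size_mb, "MB journal fills in", seconds_until_full(size_mb, RATE), "s")
# 128 MB journal fills in 16.0 s
# 400 MB journal fills in 50.0 s
```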

Best Practices

Use ordered mode for most workloads (default)
Enable full journaling only for critical data
External journal on separate SSD for performance
Inspect journal settings: dumpe2fs -h /dev/sda1 | grep -i journal
Disable journaling only for scratch/tmp filesystems

Journaling vs Copy-on-Write

Aspect	Journaling	CoW (Btrfs/ZFS)
Method	Write-ahead log	Never overwrite
Overhead	Journaled blocks written twice (journal + final)	Write once (new location), plus updated tree pointers
Recovery	Replay journal	Always consistent
Snapshots	Not built in (needs LVM or similar)	Free with CoW
Maturity	Very mature	Newer (Btrfs)
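
The "Never overwrite" and "Always consistent" entries in the CoW column can be illustrated with a toy model. This is only a sketch: `cow_update`, the `blocks` dict, and the flat `root` mapping are illustrative assumptions, and real CoW filesystems use trees of blocks rather than a single table.

```python
def cow_update(blocks, root, name, data):
    """Write data to a fresh block, then return a new root pointing at it."""
    new_block = max(blocks, default=0) + 1
    blocks[new_block] = data     # the old block is left untouched
    new_root = dict(root)        # copy the pointer table, never edit in place
    new_root[name] = new_block
    return new_root              # switching to this root is the atomic step


blocks = {1: "v1"}
root = {"doc.txt": 1}
new_root = cow_update(blocks, root, "doc.txt", "v2")

print(blocks[root["doc.txt"]])      # v1  (old root still valid: a snapshot)
print(blocks[new_root["doc.txt"]])  # v2
```

A crash before the root switch leaves the old, complete version in place, which is why no journal replay is needed, and keeping the old root around is what makes snapshots cheap.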

When to choose:

Journaling (ext4, XFS): Maximum maturity, proven reliability
CoW (Btrfs, ZFS): Want snapshots, checksums, modern features
