NTFS Filesystem: The Master File Table

Summary
NTFS internals from the Master File Table outward: 1 KB file records built from typed attributes, resident vs non-resident $DATA, run lists, alternate data streams, the $LogFile journal, and why dual-boot Linux setups prefer ntfs3 over ntfs-3g.

What NTFS is

NTFS (New Technology File System) is Microsoft's primary filesystem, introduced with Windows NT in 1993 and still the default for every Windows installation today. Understanding NTFS matters beyond Windows itself — external drives, dual-boot setups, NAS shares served to Windows clients, and any forensic or recovery work on a Windows volume all involve NTFS internals.

The defining design choice is that everything is a file record in a database-like structure. That's what gives NTFS its modern features — journaling, ACLs, alternate data streams, online resize — that the older FAT family lacks entirely.

The core problem NTFS solves

How do you efficiently organise millions of files on a multi-terabyte drive, with rich metadata per file, and still recover cleanly from a crash? FAT's two-table design becomes unwieldy past a few million entries — the linked-list traversal turns every directory listing into a long scan, and there's no journal to recover from a half-completed write.

NTFS replaces the linked-list model with a single, self-describing table — the Master File Table — and treats every file as a small collection of typed attribute records inside that table.

The Master File Table

The MFT is itself a file. Specifically, it's the first file on the volume: MFT record 0 is $MFT, and its $DATA attribute holds the MFT itself, so the table describes its own location. Every other file and directory on the volume gets a record in the MFT.

Each record is a fixed size — 1 KB by default, configurable at format time — and is laid out as a small header followed by a sequence of typed attributes. The most common attributes:

  • $STANDARD_INFORMATION — timestamps (created / modified / accessed / MFT-modified), DOS attribute bits, security ID.
  • $FILE_NAME — the name and the parent directory's record number. A file can have multiple $FILE_NAME attributes — that's how hard links work.
  • $DATA — the file contents. A file can have multiple $DATA attributes, each with a different name. That's how alternate data streams work.
  • $SECURITY_DESCRIPTOR — the ACL.
  • $INDEX_ROOT and $INDEX_ALLOCATION — used by directories to store their B-tree of children.
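
The header-plus-attributes layout can be walked with a few lines of code. The following is a minimal sketch, not a full parser: it assumes the commonly documented NTFS 3.x field offsets (first-attribute offset at byte 20 of the header; type, length, and non-resident flag at the start of each attribute), it ignores the update sequence fixups that a real on-disk record needs applied first, and `make_demo_record` builds a synthetic record with made-up attribute lengths purely for illustration.

```python
import struct

# Attribute type codes as commonly documented for NTFS 3.x.
ATTR_NAMES = {
    0x10: "$STANDARD_INFORMATION",
    0x20: "$ATTRIBUTE_LIST",
    0x30: "$FILE_NAME",
    0x40: "$OBJECT_ID",
    0x50: "$SECURITY_DESCRIPTOR",
    0x80: "$DATA",
    0x90: "$INDEX_ROOT",
    0xA0: "$INDEX_ALLOCATION",
    0xC0: "$REPARSE_POINT",
}
END_MARKER = 0xFFFFFFFF


def walk_attributes(record: bytes):
    """Yield (type name, is_resident, attribute length) for each attribute."""
    if record[:4] != b"FILE":
        raise ValueError("not an MFT file record")
    # The offset of the first attribute is stored at byte 20 of the header.
    (offset,) = struct.unpack_from("<H", record, 20)
    while offset + 8 <= len(record):
        attr_type, attr_len = struct.unpack_from("<II", record, offset)
        if attr_type == END_MARKER or attr_len == 0:
            break
        non_resident = record[offset + 8]
        yield ATTR_NAMES.get(attr_type, hex(attr_type)), non_resident == 0, attr_len
        offset += attr_len


def make_demo_record() -> bytes:
    """Build a synthetic 1 KB record (hypothetical contents, demo only)."""
    record = bytearray(1024)
    record[0:4] = b"FILE"
    # 56 is typical on real volumes: the 48-byte header plus a small fixup array.
    struct.pack_into("<H", record, 20, 56)
    offset = 56
    for attr_type, non_resident, length in [(0x10, 0, 96), (0x30, 0, 104), (0x80, 0, 64)]:
        struct.pack_into("<IIB", record, offset, attr_type, length, non_resident)
        offset += length
    struct.pack_into("<I", record, offset, END_MARKER)
    return bytes(record)


for name, resident, length in walk_attributes(make_demo_record()):
    print(f"{name:<24} resident={resident} length={length}")
```

Because every attribute starts with its own type and length, a reader can skip attribute types it doesn't understand, which is part of why new attribute types could be added without changing the record format.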

Resident vs. non-resident: the small-file optimisation

Small files don't need to leave the MFT at all. If a file's $DATA attribute fits within the 1 KB record (after subtracting the header and the other attributes), NTFS stores it inline as a resident attribute. Reading the file becomes a single read of the MFT record — metadata and content arrive together.

When the content doesn't fit, NTFS makes the attribute non-resident: it allocates clusters on disk for the bytes and stores a run list in the record. The run list is a sequence of (starting cluster, length) pairs that maps logical file offsets to physical positions on disk.
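
On disk, the run list is stored in a compact variable-length form rather than as fixed-width pairs. The sketch below assumes the commonly documented encoding: each run starts with a header byte whose low nibble gives the byte width of the length field and whose high nibble gives the byte width of the offset field; the offset is a signed delta from the previous run's starting cluster; a zero header byte ends the list. The function names and the sample bytes are illustrative, not taken from any real volume.

```python
from typing import List, Optional, Tuple

# (starting cluster, or None for a sparse run; length in clusters)
Run = Tuple[Optional[int], int]


def decode_run_list(raw: bytes) -> List[Run]:
    """Decode an NTFS-style run list into (start cluster, length) pairs."""
    runs: List[Run] = []
    pos = 0
    prev_lcn = 0
    while pos < len(raw) and raw[pos] != 0:
        header = raw[pos]
        len_size = header & 0x0F   # bytes used by the length field
        off_size = header >> 4     # bytes used by the offset field
        pos += 1
        length = int.from_bytes(raw[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length))  # sparse run: no clusters allocated
        else:
            delta = int.from_bytes(raw[pos:pos + off_size], "little", signed=True)
            prev_lcn += delta            # offsets are relative to the previous run
            runs.append((prev_lcn, length))
        pos += off_size
    return runs


def vcn_to_lcn(runs: List[Run], vcn: int) -> Optional[int]:
    """Map a file-relative cluster number (VCN) to a disk cluster (LCN)."""
    base = 0
    for start, length in runs:
        if vcn < base + length:
            return None if start is None else start + (vcn - base)
        base += length
    raise ValueError("VCN beyond end of run list")


# Two runs: 0x18 clusters at 0x5634, then 0x10 clusters 16 clusters earlier.
sample = bytes.fromhex("21 18 34 56 11 10 F0 00")
runs = decode_run_list(sample)
print(runs)                  # [(22068, 24), (22052, 16)]
print(vcn_to_lcn(runs, 25))  # 22053
```

The delta encoding keeps nearby fragments cheap to describe: a run close to its predecessor needs only one or two offset bytes.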

Figure: layout of a 1024-byte MFT record holding a small resident file.

  • Record header (48 bytes): signature, flags, sequence number.
  • $STANDARD_INFORMATION: timestamps, file attributes, security ID.
  • $FILE_NAME: "note.txt" and the parent directory reference.
  • $DATA (resident): the actual file content, stored in the record (500 bytes).

Resident: small files fit entirely within the MFT record. No separate cluster allocation is needed, and a single disk read retrieves both metadata and content.

The crossover where a $DATA attribute spills out of the record is around 700 bytes, though it varies with how much space the other attributes take. Below that crossover, a small text file, shortcut, or configuration file costs only the MFT read; above it, NTFS pays the cluster-allocation cost like any other filesystem.
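
The crossover can be estimated with simple arithmetic. This sketch uses the article's own round figures (a 1024-byte record, a 48-byte header, roughly 260 bytes for the other attributes) plus two assumptions that are labeled in the code: a 24-byte header on an unnamed resident attribute and 8 bytes for the end-of-attributes marker. Real records vary, so treat the result as a ballpark.

```python
RECORD_SIZE = 1024          # default record size from the text
HEADER_SIZE = 48            # record header, per the record layout
OTHER_ATTRS = 260           # article's rough figure for the other attributes
RESIDENT_ATTR_HEADER = 24   # assumption: header of an unnamed resident attribute
END_MARKER = 8              # assumption: end-of-attributes marker plus padding


def max_resident_payload(other_attrs: int = OTHER_ATTRS) -> int:
    """Bytes left for resident $DATA content under the assumptions above."""
    return RECORD_SIZE - HEADER_SIZE - other_attrs - RESIDENT_ATTR_HEADER - END_MARKER


def fits_resident(data_size: int, other_attrs: int = OTHER_ATTRS) -> bool:
    """Return True if data_size bytes would plausibly stay inside the record."""
    padded = (data_size + 7) // 8 * 8  # attributes are padded to 8-byte boundaries
    return padded <= max_resident_payload(other_attrs)


print(max_resident_payload())  # 684
print(fits_resident(500))      # True
print(fits_resident(4096))     # False
```

A file with long names, several hard links, or extra streams raises `other_attrs` and lowers the crossover accordingly.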

Inside a single MFT record

The neat thing about attribute-based records is that a single 1 KB slot can carry a remarkable variety of file kinds: a tiny resident text file, a large contiguous photo, a fragmented log, or a file with several hard links.

Patterns to notice:

  • The first three attributes are nearly free. $STANDARD_INFORMATION, $FILE_NAME, and $SECURITY_DESCRIPTOR together fit in about 260 bytes — leaving 700 bytes for the data itself or for additional attributes.
  • A run list is cheap until fragmentation gets out of hand. A contiguous 8 MB photo only spends 32 bytes on its run list. A fragmented 40 MB log file across 18 runs eats 320 bytes — still less than 1/3 of the record.
  • Hard links share the record, not just the bytes. Each $FILE_NAME attribute is its own ~96-byte entry; the $DATA is referenced once. Renaming a hard-linked file is a tiny edit to one attribute.

If the attribute count or sizes overflow the 1 KB record, NTFS spills into an attribute list ($ATTRIBUTE_LIST), a special attribute that points to additional MFT records holding the rest. In practice this happens for highly fragmented files and files with many ACL entries.

Alternate data streams

The pluralised-$DATA model is what gives NTFS one of its strangest features. A single file can carry multiple named $DATA streams, accessed with filename:streamname syntax (for example, a hypothetical notes.txt:draft), all on the same MFT record.

The most visible ADS in practice is Zone.Identifier — Windows attaches it to any file downloaded from a browser or extracted from a foreign-origin archive. SmartScreen and Office's Protected View read it on open to decide whether to warn you. Deleting just the ADS (Remove-Item -Stream Zone.Identifier) is how you "unblock" a downloaded file without changing its content.
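
The content of a Zone.Identifier stream is a short INI-style text block, so it can be inspected with standard tooling. The sketch below parses such text and maps the zone number to the usual Windows security zone names; the function name and sample string are illustrative, and real streams may carry extra keys such as ReferrerUrl or HostUrl, which this code simply ignores.

```python
import configparser

# Windows URL security zones.
ZONES = {
    0: "Local machine",
    1: "Local intranet",
    2: "Trusted sites",
    3: "Internet",
    4: "Restricted sites",
}


def parse_zone_identifier(text: str) -> str:
    """Return the zone name recorded in Zone.Identifier stream text."""
    # Interpolation is disabled because URLs in the stream may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    zone_id = parser.getint("ZoneTransfer", "ZoneId")
    return ZONES.get(zone_id, f"Unknown zone {zone_id}")


sample = "[ZoneTransfer]\nZoneId=3\n"
print(parse_zone_identifier(sample))  # Internet
```

On Windows, the stream text itself can typically be obtained by opening the file path with the `:Zone.Identifier` suffix appended; that step is left out here so the example runs on any platform.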

Other real-world ADS uses worth knowing about:

  • macOS over SMB. Finder uses AFP_AfpInfo, AFP_Resource, and similar streams to preserve resource forks and Finder flags on Windows-served shares.
  • Old-school malware hiding. Pre-Vista malware sometimes hid payloads in ADS because dir didn't show them. Modern antivirus scans all streams, but you'll still encounter the technique in CTF challenges and forensic dumps.
  • Application annotation. Some indexing tools, photo managers, and editors stash thumbnails or notes in ADS so they survive moves without showing up in normal listings.

ADS are silently dropped on copy to any filesystem that doesn't support them — FAT32, ext4, network shares without ADS support. The default $DATA survives; everything else evaporates without warning.

System metadata files

The first 16 MFT records are reserved for the filesystem's own bookkeeping. The records that matter most:

Record  Name      Role
0       $MFT      The Master File Table itself (yes, recursive).
1       $MFTMirr  Backup copy of the first four MFT records, written to a separate location.
2       $LogFile  The journal: every metadata change is logged here before being applied to its target record.
3       $Volume   Volume label, NTFS version, dirty flag.
4       $AttrDef  Schema describing every attribute type.
5       .         The root directory.
6       $Bitmap   One bit per cluster: the allocation bitmap.
7       $Boot     Boot sector, BIOS Parameter Block, MFT location.
8       $BadClus  A sparse file whose only purpose is to "own" known-bad clusters and keep the allocator away from them.

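These records are hidden from normal directory listings but visible to forensic tools and fsutil. Corruption of records 0, 2, 6, or 7 is among the most serious damage a volume can suffer: it often leaves the volume unmountable, and recovery then depends on backups such as $MFTMirr or on repair tools.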
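
The one-bit-per-cluster layout of $Bitmap makes allocation checks a matter of bit arithmetic. This is a minimal sketch that assumes the least significant bit of each byte describes the lowest-numbered cluster; the function names and the two-byte sample bitmap are hypothetical.

```python
from typing import Optional


def cluster_in_use(bitmap: bytes, lcn: int) -> bool:
    """Return True if the given cluster is marked allocated."""
    byte_index, bit_index = divmod(lcn, 8)
    # Assumption: bit 0 of each byte is the lowest-numbered cluster.
    return bool((bitmap[byte_index] >> bit_index) & 1)


def first_free_cluster(bitmap: bytes, start: int = 0) -> Optional[int]:
    """Scan forward from start and return the first unallocated cluster."""
    for lcn in range(start, len(bitmap) * 8):
        if not cluster_in_use(bitmap, lcn):
            return lcn
    return None


# Clusters 0, 1, and 2 allocated; everything else free.
sample_bitmap = bytes([0b00000111, 0b00000000])
print(cluster_in_use(sample_bitmap, 2))   # True
print(first_free_cluster(sample_bitmap))  # 3
```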

How $LogFile makes recovery boring

NTFS is a metadata-only journaling filesystem: every modification to an MFT record, a directory index, or the allocation bitmap is first written as a transaction record to $LogFile, then applied to the target structure. Two outcomes after a crash:

  1. Committed but not yet applied — the recovery code rolls the transaction forward, redoing the change.
  2. Started but not committed — the recovery code rolls the transaction back, undoing whatever partial change reached disk.

The mount-time recovery completes in seconds even on multi-TB volumes because it only has to scan the log, not the whole filesystem. No full chkdsk-style scan of the volume is needed after an unclean shutdown.
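
The two crash outcomes can be illustrated with a toy write-ahead log. This is a deliberately simplified model, not the real $LogFile format: the record fields, the key names, and the idea of passing in a set of committed transaction IDs are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class LogRecord:
    txn: int   # transaction this change belongs to
    key: str   # hypothetical name for the metadata field being changed
    old: int   # undo value (state before the change)
    new: int   # redo value (state after the change)


def recover(on_disk: Dict[str, int], log: List[LogRecord], committed: Set[int]) -> Dict[str, int]:
    """Redo committed transactions, then undo uncommitted ones."""
    state = dict(on_disk)
    # Redo pass: replay committed changes in log order.
    for rec in log:
        if rec.txn in committed:
            state[rec.key] = rec.new
    # Undo pass: walk backward, restoring old values for uncommitted changes.
    for rec in reversed(log):
        if rec.txn not in committed:
            state[rec.key] = rec.old
    return state


# Crash state: txn 1 committed but not yet applied; txn 2 partially applied.
on_disk = {"mft[40].size": 0, "bitmap[100]": 1}
log = [
    LogRecord(txn=1, key="mft[40].size", old=0, new=4096),
    LogRecord(txn=2, key="bitmap[100]", old=0, new=1),
]
print(recover(on_disk, log, committed={1}))
# {'mft[40].size': 4096, 'bitmap[100]': 0}
```

The work done is proportional to the length of the log, not the size of the volume, which is the property the paragraph above relies on.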

Data inside $DATA attributes is not journaled; only the metadata that describes where the data lives is. A crashing application can lose recently written file contents, but the filesystem itself stays consistent. This is broadly the same trade-off ext4 makes in its default data=ordered mode, which also journals metadata only.

NTFS features the MFT enables

Several advanced features fall out of the attribute model almost for free:

  • Compression and encryption happen at the $DATA attribute level, not whole-file. LZNT1 compression and EFS encryption can be enabled per-stream.
  • Reparse points ($REPARSE_POINT attribute) are how NTFS implements symlinks, junctions, mount points, and OneDrive's placeholder files.
  • Quotas are tracked with extra metadata that doesn't change the basic record layout.
  • Object IDs ($OBJECT_ID) give files a stable GUID that survives moves, which is what makes Windows' "find this file even after rename" tracking work.

NTFS on Linux

Linux has two paths for NTFS:

  • ntfs3 — the in-kernel driver merged in 5.15 (2021), written by Paragon. Full read/write, fast, recommended for any modern kernel.
  • ntfs-3g — the older FUSE driver. Still works, still maintained, but pays a heavy round-trip cost through user-space.

If you're on a kernel < 5.15, ntfs-3g is fine for occasional access but expect ~3× slower writes than the same volume on Windows. On 5.15+, mount with -t ntfs3.
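
The version rule above can be expressed as a small helper. This is a sketch under stated assumptions: the function names, the device path, and the mount point are hypothetical, and a kernel version of 5.15 or later does not by itself guarantee that a given distribution built the ntfs3 module, so a real script would also need a fallback.

```python
import re
from typing import List


def choose_ntfs_driver(kernel_release: str) -> str:
    """Pick a driver name from a kernel release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", kernel_release)
    if not match:
        raise ValueError(f"unrecognised kernel release: {kernel_release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    return "ntfs3" if version >= (5, 15) else "ntfs-3g"


def mount_command(device: str, mountpoint: str, kernel_release: str) -> List[str]:
    """Build (but do not run) a mount command for the chosen driver."""
    return ["mount", "-t", choose_ntfs_driver(kernel_release), device, mountpoint]


# Hypothetical device and mount point; on a live system the release string
# could come from platform.release().
print(" ".join(mount_command("/dev/sdb1", "/mnt/windows", "6.1.0-18-amd64")))
# mount -t ntfs3 /dev/sdb1 /mnt/windows
```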

chkdsk (Windows) and ntfsfix (Linux) are the two repair tools. ntfsfix is intentionally conservative — it handles minor inconsistencies but defers to Windows' chkdsk for anything serious. For NTFS volumes that hold important data, do the deep repair from Windows.

When NTFS is the right choice

Use NTFS when:

  • You're on Windows — it's the only filesystem the OS will install onto.
  • You need to share large files across Windows machines and care about ACLs.
  • You want compression or encryption applied per-file without managing an extra layer.
  • You're handling forensic or recovery work — the rich metadata makes timeline reconstruction much easier.

Pick something else when:

  • The drive moves between OSes and you need transparent compatibility — exFAT works on every modern OS without driver tricks.
  • You're on Linux and don't need Windows interop — ext4 or Btrfs deliver better performance and richer features.
  • The volume is a simple USB stick storing files under 4 GB — FAT32 still works everywhere.
