Your ML training container needs GPU access. But containers are supposed to be isolated — they have their own filesystem, their own process tree, their own network. How does a containerized process talk to physical GPU hardware?
The answer is surprisingly simple once you understand mount namespaces. GPU access is fundamentally a filesystem problem: applications talk to GPUs through device files, and the container runtime makes those files visible by bind-mounting them into the container’s mount namespace.
How Linux Exposes GPUs
Device Files Are the Interface
GPUs don’t have a special API. They appear to userspace as device files in /dev/, just like disks, terminals, and random number generators. The NVIDIA kernel driver (nvidia.ko) creates these character devices when it loads:
```
$ ls -la /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Mar 14 10:00 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Mar 14 10:00 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Mar 14 10:00 /dev/nvidiactl
crw-rw-rw- 1 root root 510,   0 Mar 14 10:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 510,   1 Mar 14 10:00 /dev/nvidia-uvm-tools
```
When a CUDA program runs, it doesn’t talk to the GPU directly. It opens /dev/nvidia0 (or whichever GPU) and issues ioctl() syscalls through that file descriptor. The kernel routes those calls to the registered NVIDIA kernel driver, which actually communicates with the hardware. So GPU access = file access.
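Because GPU access is file access, you can inspect a device node with ordinary file APIs. A minimal sketch in Python, using `/dev/null` as a stand-in that exists on every Linux machine (on a GPU host, `/dev/nvidia0` would report major number 195, matching the `ls` output above):

```python
import os
import stat

def describe_device(path):
    """Return (is_char_device, major, minor) for a device file."""
    st = os.stat(path)
    return (
        stat.S_ISCHR(st.st_mode),   # character device?
        os.major(st.st_rdev),       # major number: identifies the driver
        os.minor(st.st_rdev),       # minor number: identifies the device instance
    )

# /dev/null is a character device with major 1, minor 3 on Linux.
print(describe_device("/dev/null"))  # (True, 1, 3)
```

The major number is how the kernel routes `ioctl()` calls on that file descriptor to the right driver.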
Figure: GPU device files — the host's /dev/ contains the NVIDIA device nodes, while a bare container's /dev/ has none, so the container cannot see the GPU hardware until those files appear in its mount namespace.
What NVIDIA Container Runtime Does
A bare container has no GPU access. Its /dev/ directory contains only standard devices (null, zero, pts). The NVIDIA container runtime (nvidia-container-runtime) solves this by injecting three things into the container before the application starts:
- Device nodes — bind-mounts `/dev/nvidia0`, `/dev/nvidiactl`, and `/dev/nvidia-uvm` from the host
- Driver libraries — bind-mounts `libcuda.so`, `libnvidia-ml.so`, and other driver-matched libraries
- Device permissions — configures the cgroup device controller to allow access to the NVIDIA device major numbers
All three are required. Device files alone aren’t enough (CUDA needs libcuda.so). Libraries alone aren’t enough (they need the device files to issue ioctl() calls). And even with both, the cgroup device controller must explicitly allow access.
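The first two requirements can be checked mechanically from inside a container. A hedged sketch — the library names and search paths are illustrative defaults, not an official API — that reports which pieces are missing:

```python
import glob
import os

def missing_gpu_pieces(
    dev_glob="/dev/nvidia*",
    libs=("libcuda.so.1", "libnvidia-ml.so.1"),
    lib_dirs=("/usr/lib/x86_64-linux-gnu", "/usr/lib64"),
):
    """Report which GPU prerequisites are absent in this environment.

    Note: cgroup device permissions can't be verified from paths alone;
    the definitive test for the third piece is whether opening
    /dev/nvidia0 actually succeeds.
    """
    missing = []
    if not glob.glob(dev_glob):
        missing.append("device nodes")
    for lib in libs:
        if not any(os.path.exists(os.path.join(d, lib)) for d in lib_dirs):
            missing.append(lib)
    return missing
```

In a correctly configured GPU container this returns an empty list; in a bare container it names every missing piece.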
Figure: the GPU bind-mount process, step by step — starting from a bare container that the OCI runtime creates with only standard devices and no GPU access.
Why Bind Mounts?
Bind mounts are mount namespace operations. They make a file or directory from one location appear at another location — even across namespace boundaries. The NVIDIA runtime bind-mounts host files into the container’s mount namespace, so the container sees them as if they were part of its own filesystem. No copying, no overhead — it’s the same file, just visible from a different mount table.
The Two Things Called “NVIDIA Driver”
This is where most confusion happens. People say “NVIDIA driver” to mean two completely different things.
Kernel Driver (nvidia.ko)
The real driver. It’s a kernel module loaded on the host via modprobe nvidia. It runs in kernel space, talks directly to the GPU over PCIe, manages hardware resources, and handles memory allocation. Only one version can exist per kernel — you can’t run two different versions of nvidia.ko simultaneously.
```
$ lsmod | grep nvidia
nvidia_uvm           1503232  0
nvidia_drm             77824  0
nvidia_modeset       1306624  1 nvidia_drm
nvidia              56446976  2 nvidia_uvm,nvidia_modeset
```
All containers share this kernel driver because all containers share the host kernel. This is a fundamental property of Linux containers.
User-Space Libraries (libcuda.so)
These are not drivers. They’re client libraries that talk to the kernel driver through device files. libcuda.so is the CUDA driver API — it opens /dev/nvidia0 and issues ioctl() calls. libcudart.so is the CUDA runtime API that most applications use. libcublas.so, libcudnn.so, and others are higher-level libraries built on top.
Different containers can ship different versions of these libraries. The full call stack looks like:
```
PyTorch
  ↓
libcudart.so   (CUDA Runtime — lives in container)
  ↓
libcuda.so     (CUDA Driver API — bind-mounted from host)
  ↓
/dev/nvidia0   (device file — bind-mounted from host)
  ↓
ioctl() syscall — crosses user/kernel boundary
  ↓
nvidia.ko      (kernel driver — host kernel)
  ↓
GPU Hardware
```
Figure: kernel driver vs user-space libraries — how the GPU software stack is split between container and host.
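Because the stack is layered, you can probe each layer separately. A hedged sketch that checks whether the driver API library is even loadable — in a container without the NVIDIA runtime's bind mounts, loading fails even if a full CUDA toolkit is installed in the image:

```python
import ctypes

def cuda_driver_available():
    """True if libcuda.so.1 loads and the driver API initializes.

    cuInit(0) returns 0 (CUDA_SUCCESS) only when the kernel driver
    and a GPU are actually reachable through /dev/nvidia*.
    """
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return False  # driver library not bind-mounted into this filesystem
    return libcuda.cuInit(0) == 0

print(cuda_driver_available())
```

This distinguishes "the library is missing" (the `OSError` path) from "the library is present but can't reach the driver" (`cuInit` returning nonzero).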
CUDA Version Compatibility
Because the kernel driver is shared but CUDA libraries can differ per container, version compatibility matters. The rule: a newer host driver supports containers with older CUDA toolkits, but an older driver cannot serve a newer toolkit.
A host running NVIDIA driver 550 can serve containers using CUDA 11.8, 12.0, 12.1, or 12.4. But a host running driver 535 cannot serve a container using CUDA 12.4 — the container’s libcuda.so would try to call kernel driver APIs that don’t exist in the older driver.
CUDA Version Compatibility Matrix
| Host Driver ↓ / Container CUDA → | CUDA 11.8 | CUDA 12.0 | CUDA 12.1 | CUDA 12.4 |
|---|---|---|---|---|
| Driver 535 | ✅ | ✅ | ✅ | ❌ |
| Driver 545 | ✅ | ✅ | ✅ | ❌ |
| Driver 550 | ✅ | ✅ | ✅ | ✅ |
| Driver 555 | ✅ | ✅ | ✅ | ✅ |
Rule of thumb: upgrade the host driver, not the container's CUDA. A newer host driver is backward-compatible with all older CUDA toolkit versions.
The Golden Rule
You cannot run a newer CUDA toolkit than your host driver supports. The container’s CUDA libraries call into the host kernel driver — if the driver is too old, those calls fail with “CUDA driver version is insufficient.” Upgrade the host driver, not the container.
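The golden rule can be encoded as a lookup of minimum required driver versions. A sketch — the minimums below reflect NVIDIA's published Linux requirements per toolkit, but treat them as illustrative and check the CUDA release notes for your exact versions:

```python
# Approximate minimum Linux driver major version per CUDA toolkit
# (illustrative — consult NVIDIA's CUDA release notes for exact values).
MIN_DRIVER = {
    "11.8": 520,
    "12.0": 525,
    "12.1": 530,
    "12.4": 550,
}

def driver_supports(driver_major, cuda_version):
    """Can a host driver of this major version serve a container
    shipping this CUDA toolkit?"""
    return driver_major >= MIN_DRIVER[cuda_version]

print(driver_supports(550, "12.4"))  # True  — newer driver, older-or-equal toolkit
print(driver_supports(535, "12.4"))  # False — container CUDA too new for host driver
```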
GPU Access Methods Compared
There are several ways to give containers GPU access. They differ significantly in security and production-readiness.
GPU Access Methods Compared
Comparing container GPU passthrough approaches across isolation, security, usability, and production readiness.
| Method | Isolation | Security | Ease of Use | Production Ready |
|---|---|---|---|---|
| NVIDIA Container Toolkit (`--gpus all`) | Good — only requested GPUs exposed via resource flags | Good — minimal permissions, cgroup device controller enforced | Good — automatic device + library mounting, just add `--gpus` | Yes |
| `--device /dev/nvidia0` | Moderate — manual device selection, must specify each device | Moderate — device exposed, but libraries must be manually mounted | Poor — must know device paths, manually mount driver libraries | No |
| `--privileged` | None — full access to ALL host devices | None — all capabilities granted, all devices accessible, effectively root on host | Good — everything just works, no configuration needed | No |
| Kubernetes Device Plugin | Good — resource requests allocate specific GPUs per pod | Good — plugin manages device permissions and node scheduling | Good — declarative: `resources.limits."nvidia.com/gpu": 1` | Yes |
Use the NVIDIA Container Toolkit with Docker and the Device Plugin with Kubernetes; both handle device mounting, library injection, and cgroup permissions automatically. Avoid --privileged, which gives the container full host access, and bare --device, which misses library mounting and is fragile across driver updates.
Common Pitfalls
nvidia-smi works but CUDA fails
nvidia-smi uses libnvidia-ml.so to query the management interface. CUDA uses libcuda.so to submit compute work. They’re different code paths. If nvidia-smi shows your GPU but CUDA programs fail, the CUDA libraries are missing or mismatched — check that libcuda.so is properly mounted and its version matches the host kernel driver.
“No CUDA-capable device is detected”
The device files aren’t visible in the container. Either the runtime didn’t inject them (missing --gpus flag) or the cgroup device controller is blocking access. Check ls /dev/nvidia* inside the container.
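This check is easy to script; a minimal sketch, equivalent to running `ls /dev/nvidia*` inside the container:

```python
import glob

def visible_nvidia_devices():
    """Device nodes this mount namespace can actually see.

    An empty list means the runtime never injected them —
    typically a missing --gpus flag.
    """
    return sorted(glob.glob("/dev/nvidia*"))

print(visible_nvidia_devices())  # [] in a container without GPU injection
```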
Driver version mismatch
The container expects a newer CUDA version than the host driver supports. Check nvidia-smi on the host for the driver version, then verify it meets the minimum requirement for the container’s CUDA toolkit.
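Driver version strings like 550.54.14 compare component-wise, not as strings or floats. A small sketch for checking a host driver against a toolkit's minimum (the version numbers shown are illustrative):

```python
def parse_version(v):
    """'550.54.14' -> (550, 54, 14), so comparisons are numeric per component."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(host_driver, required_minimum):
    """True if the host driver is at least the toolkit's minimum version."""
    return parse_version(host_driver) >= parse_version(required_minimum)

print(meets_minimum("550.54.14", "550.54.14"))   # True
print(meets_minimum("535.161.07", "550.54.14"))  # False — driver too old for this toolkit
```

Tuple comparison handles the case where a naive string compare would get it wrong (e.g. "535.161.07" > "550.54.14" lexicographically on the second component).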
Container can’t see GPU after host driver update
After updating the host kernel driver, existing containers still have the old libcuda.so bind-mounted. Restart the container to pick up the new library version.
Key Takeaways
- GPUs are device files — applications access them by opening /dev/nvidia0 and issuing ioctl() calls through the file descriptor.
- GPU in containers = bind mounts — the NVIDIA runtime mounts device files and driver libraries, and sets cgroup permissions.
- Kernel driver is shared — all containers use the same nvidia.ko because they share the host kernel. One version per host.
- User-space libraries can differ — each container can ship its own CUDA toolkit version as long as the host driver supports it.
- Compatibility is one-way — newer drivers support older CUDA, never the reverse. Upgrade the host driver, not the container.
- Three things are needed — device nodes, driver libraries, and cgroup device permissions. Missing any one of them breaks GPU access.
Related Concepts
- Linux Namespaces: Mount namespaces enable the bind-mount mechanism that exposes GPUs to containers
- Containers Under the Hood: How namespaces + cgroups combine to create containers
- Kernel Modules: How nvidia.ko is loaded and managed
- cgroups: Device controller that permits or blocks GPU access
