
How Docker Works with GPUs: Device Files, Bind Mounts, and Driver Stacks

Understand how containerized processes access GPU hardware through device files, bind mounts, and the NVIDIA container runtime. Learn the kernel driver vs user-space library distinction.


Your ML training container needs GPU access. But containers are supposed to be isolated — they have their own filesystem, their own process tree, their own network. How does a containerized process talk to physical GPU hardware?

The answer is surprisingly simple once you understand mount namespaces. GPU access is fundamentally a filesystem problem: applications talk to GPUs through device files, and the container runtime makes those files visible by bind-mounting them into the container’s mount namespace.

How Linux Exposes GPUs

Device Files Are the Interface

GPUs don’t have a special API. They appear to userspace as device files in /dev/, just like disks, terminals, and random number generators. The NVIDIA kernel driver (nvidia.ko) creates these character devices when it loads:

$ ls -la /dev/nvidia*
crw-rw-rw- 1 root root 195,   0 Mar 14 10:00 /dev/nvidia0
crw-rw-rw- 1 root root 195,   1 Mar 14 10:00 /dev/nvidia1
crw-rw-rw- 1 root root 195, 255 Mar 14 10:00 /dev/nvidiactl
crw-rw-rw- 1 root root 510,   0 Mar 14 10:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 510,   1 Mar 14 10:00 /dev/nvidia-uvm-tools

When a CUDA program runs, it doesn’t talk to the GPU directly. It opens /dev/nvidia0 (or whichever GPU) and issues ioctl() syscalls through that file descriptor. The kernel routes those calls to the registered NVIDIA kernel driver, which actually communicates with the hardware. So GPU access = file access.
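The major/minor numbers in the listing above are how the kernel routes file operations to the right driver. A small sketch you can run on any Linux machine — on a GPU host, the same command on /dev/nvidia0 reports major 0xc3, i.e. 195:

```shell
# stat prints a device file's (major, minor) numbers in hex.
# /dev/null is major 1, minor 3 on every Linux system.
stat --format='major=0x%t minor=0x%T' /dev/null
# -> major=0x1 minor=0x3
```

The kernel uses the major number to look up the registered driver; the minor number tells that driver which device instance is meant — hence /dev/nvidia0 and /dev/nvidia1 sharing major 195.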

[Interactive diagram: GPU Device Files — how GPUs appear as device files on the host /dev/ and how containers gain access. A bare container cannot see GPU hardware: no NVIDIA device files exist in the container's mount namespace.]

What NVIDIA Container Runtime Does

A bare container has no GPU access. Its /dev/ directory contains only standard devices (null, zero, pts). The NVIDIA container runtime (nvidia-container-runtime) solves this by injecting three things into the container before the application starts:

  1. Device nodes — bind mounts /dev/nvidia0, /dev/nvidiactl, /dev/nvidia-uvm from host
  2. Driver libraries — bind mounts libcuda.so, libnvidia-ml.so and other driver-matched libraries
  3. Device permissions — configures the cgroup device controller to allow access to NVIDIA device major numbers

All three are required. Device files alone aren’t enough (CUDA needs libcuda.so). Libraries alone aren’t enough (they need the device files to issue ioctl() calls). And even with both, the cgroup device controller must explicitly allow access.
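With the NVIDIA Container Toolkit installed, a single flag triggers all three injections. A minimal sketch — the image tag is only an example; any CUDA base image works:

```shell
# Inject device nodes, driver libraries, and cgroup device permissions,
# then verify GPU visibility from inside the container.
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

# Expose a single GPU instead of all of them:
docker run --rm --gpus '"device=0"' nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
```

Without the --gpus flag, Docker uses the plain OCI runtime and none of the injection happens.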

GPU Bind Mount Process

What nvidia-container-runtime Does

[Interactive diagram: nvidia-container-runtime builds up GPU access in steps — (0) bare container, (1) mount device nodes, (2) mount driver libraries, (3) mount utilities, (4) set device permissions.]

Step 0: Bare Container. The OCI runtime creates the container with standard devices only: /dev contains null, zero, pts, and shm, while /usr/lib and /usr/bin hold no NVIDIA files. No GPU access yet.

Why Bind Mounts?

Bind mounts are mount namespace operations. They make a file or directory from one location appear at another location — even across namespace boundaries. The NVIDIA runtime bind-mounts host files into the container’s mount namespace, so the container sees them as if they were part of its own filesystem. No copying, no overhead — it’s the same file, just visible from a different mount table.
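The mechanism is plain mount(8), independent of containers. A sketch — it needs root (or unprivileged user namespaces via `unshare -mr` for the namespaced variant), and the paths are arbitrary examples:

```shell
# A bind mount makes an existing file or directory visible at a second
# path: same inode, no copy. This is exactly what the NVIDIA runtime
# does with /dev/nvidia* and libcuda.so across the namespace boundary.
mkdir -p /tmp/src /tmp/view
echo "same file" > /tmp/src/data
sudo mount --bind /tmp/src /tmp/view
cat /tmp/view/data            # reads the very same file
sudo umount /tmp/view
```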

The Two Things Called “NVIDIA Driver”

This is where most confusion happens. People say “NVIDIA driver” to mean two completely different things.

Kernel Driver (nvidia.ko)

The real driver. It’s a kernel module loaded on the host via modprobe nvidia. It runs in kernel space, talks directly to the GPU over PCIe, manages hardware resources, and handles memory allocation. Only one version can exist per kernel — you can’t run two different versions of nvidia.ko simultaneously.

$ lsmod | grep nvidia
nvidia_uvm           1503232  0
nvidia_drm             77824  0
nvidia_modeset       1306624  1 nvidia_drm
nvidia              56446976  2 nvidia_uvm,nvidia_modeset

All containers share this kernel driver because all containers share the host kernel. This is a fundamental property of Linux containers.

User-Space Libraries (libcuda.so)

These are not drivers. They’re client libraries that talk to the kernel driver through device files. libcuda.so is the CUDA driver API — it opens /dev/nvidia0 and issues ioctl() calls. libcudart.so is the CUDA runtime API that most applications use. libcublas.so, libcudnn.so, and others are higher-level libraries built on top.

Different containers can ship different versions of these libraries. The full call stack looks like:

PyTorch
  ↓
libcudart.so (CUDA Runtime — lives in container)
  ↓
libcuda.so (CUDA Driver API — bind-mounted from host)
  ↓
/dev/nvidia0 (device file — bind-mounted from host)
  ↓ ioctl() syscall — crosses user/kernel boundary
nvidia.ko (kernel driver — host kernel)
  ↓
GPU Hardware

Kernel Driver vs User-Space Libraries

How the GPU software stack is split between container and host:

  Python / PyTorch (application)            — inside container
  libcudart.so (CUDA Runtime, user-space)   — inside container
  libcuda.so (CUDA Driver API, user-space)  — bind-mounted from host
  /dev/nvidia0 (device file)                — bind-mounted from host
  ——— user/kernel boundary (ioctl syscall) ———
  nvidia.ko (kernel driver, kernel-space)   — on host (shared)
  GPU hardware (A100 / H100)                — on host (shared)
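This split can be observed at runtime. A sketch assuming a GPU host and some CUDA binary — ./my_cuda_app is a placeholder for your own program: strace shows libcuda.so opening the device files and then driving the GPU entirely through ioctl() on those file descriptors.

```shell
# Trace the file-level GPU interface of a CUDA program.
# ./my_cuda_app is a placeholder; substitute any CUDA binary.
strace -f -e trace=openat,ioctl ./my_cuda_app 2>&1 | grep nvidia
```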

CUDA Version Compatibility

Because the kernel driver is shared but CUDA libraries can differ per container, version compatibility matters. The rule: a newer host driver supports older CUDA toolkits, but not the reverse — in NVIDIA's terms, the driver is backward compatible with older toolkits.

A host running NVIDIA driver 550 can serve containers using CUDA 11.8, 12.0, 12.1, or 12.4. But a host running driver 535 cannot serve a container using CUDA 12.4 — the container’s libcuda.so would try to call kernel driver APIs that don’t exist in the older driver.
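That rule is easy to encode. A sketch with illustrative minimum-driver cutoffs — check NVIDIA's CUDA release notes for the authoritative values:

```shell
# Minimum host driver branch needed per CUDA toolkit (illustrative).
min_driver() {
  case "$1" in
    11.8) echo 450 ;;
    12.0) echo 525 ;;
    12.1) echo 530 ;;
    12.4) echo 550 ;;
    *)    echo 999 ;;   # unknown toolkit: assume unsupported
  esac
}

# driver_supports <cuda-version> <driver-branch>
driver_supports() {
  [ "$2" -ge "$(min_driver "$1")" ] && echo compatible || echo incompatible
}

driver_supports 12.4 535   # -> incompatible (driver 535 is too old)
driver_supports 12.4 550   # -> compatible
```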

CUDA Version Compatibility Matrix

Driver / CUDA   CUDA 11.8   CUDA 12.0   CUDA 12.1   CUDA 12.4
Driver 535      ✓           ✓           ✓           ✗
Driver 545      ✓           ✓           ✓           ✗
Driver 550      ✓           ✓           ✓           ✓
Driver 555      ✓           ✓           ✓           ✓

(✓ = compatible, ✗ = incompatible)

Rule of thumb: upgrade the host driver, not the container's CUDA. A newer host driver is backward-compatible with all older CUDA toolkit versions.

The Golden Rule

You cannot run a newer CUDA toolkit than your host driver supports. The container's CUDA libraries call into the host kernel driver — if the driver is too old, those calls fail with "CUDA driver version is insufficient for CUDA runtime version." Upgrade the host driver, not the container.

GPU Access Methods Compared

There are several ways to give containers GPU access. They differ significantly in security and production-readiness.

Comparing container GPU passthrough approaches across isolation, security, usability, and production readiness:

NVIDIA Container Toolkit (--gpus all)
  Isolation: Good — only the requested GPUs are exposed via resource flags
  Security: Good — minimal permissions; cgroup device controller enforced
  Ease of use: Good — automatic device + library mounting; just add --gpus
  Production ready: Yes

--device /dev/nvidia0
  Isolation: Moderate — manual device selection; must specify each device
  Security: Moderate — device exposed, but libraries must be manually mounted
  Ease of use: Poor — must know device paths and manually mount driver libraries
  Production ready: No

--privileged
  Isolation: None — full access to ALL host devices
  Security: None — all capabilities granted, all devices accessible; effectively root on the host
  Ease of use: Good — everything just works, no configuration needed
  Production ready: No

Kubernetes Device Plugin
  Isolation: Good — resource requests allocate specific GPUs per pod
  Security: Good — the plugin manages device permissions and node scheduling
  Ease of use: Good — declarative: resources.limits."nvidia.com/gpu": 1
  Production ready: Yes
Use in production

NVIDIA Container Toolkit for Docker, Kubernetes Device Plugin for K8s. Both handle device mounting, library injection, and cgroup permissions automatically.

Avoid in production

--privileged gives the container full host access. --device alone misses library mounting and is fragile across driver updates.

Common Pitfalls

nvidia-smi works but CUDA fails

nvidia-smi uses libnvidia-ml.so to query the management interface. CUDA uses libcuda.so to submit compute work. They’re different code paths. If nvidia-smi shows your GPU but CUDA programs fail, the CUDA libraries are missing or mismatched — check that libcuda.so is properly mounted and its version matches the host kernel driver.
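A quick way to tell the two code paths apart inside the container — output varies by system, so this is a diagnostic sketch rather than a pass/fail check:

```shell
# nvidia-smi needs libnvidia-ml.so; CUDA programs need libcuda.so.
# Both should appear in the linker cache, with matching driver versions.
ldconfig -p | grep -E 'libcuda\.so|libnvidia-ml\.so'

# The device files must be visible too:
ls -l /dev/nvidia*
```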

“No CUDA-capable device is detected”

The device files aren’t visible in the container. Either the runtime didn’t inject them (missing --gpus flag) or the cgroup device controller is blocking access. Check ls /dev/nvidia* inside the container.

Driver version mismatch

The container expects a newer CUDA version than the host driver supports. Check nvidia-smi on the host for the driver version, then verify it meets the minimum requirement for the container’s CUDA toolkit.

Container can’t see GPU after host driver update

After updating the host kernel driver, existing containers still have the old libcuda.so bind-mounted. Restart the container to pick up the new library version.

Key Takeaways

1. GPUs are device files — applications access them by opening /dev/nvidia0 and issuing ioctl() calls through the file descriptor.

2. GPU in containers = bind mounts — the NVIDIA runtime mounts device files and driver libraries, and sets cgroup permissions.

3. The kernel driver is shared — all containers use the same nvidia.ko because they share the host kernel. One version per host.

4. User-space libraries can differ — each container can ship its own CUDA toolkit version, as long as the host driver supports it.

5. Compatibility is one-way — newer drivers support older CUDA toolkits, never the reverse. Upgrade the host driver, not the container.

6. Three things are needed — device nodes, driver libraries, and cgroup device permissions. Missing any one of them breaks GPU access.
