Why Dynamic Linking
When you compile a C++ program, the linker must resolve every function call to an address. Static linking copies library code into your binary — simple but wasteful. If 50 programs use libc, that’s 50 copies of the same code in memory and on disk.
Dynamic linking solves this by deferring resolution to runtime. The binary contains references to shared libraries (.so files on Linux, .dylib on macOS, .dll on Windows), and the dynamic linker resolves these references when the program starts — or even later, on first use.
The tradeoffs:
| Aspect | Static Linking | Dynamic Linking |
|---|---|---|
| Binary size | Large (includes all library code) | Small (just references) |
| Memory | Each process has its own copy | Shared across all processes |
| Dependencies | None at runtime | Must have correct .so versions |
| Startup time | Faster (no runtime symbol resolution) | Slightly slower (symbol resolution) |
| Updates | Must relink each program to pick up a new library version | Library updates apply to all programs |
| Deployment | Single file | Must ship with dependencies |
How Dynamic Linking Works
The dynamic linker (ld.so on Linux) is itself a shared object. The kernel maps it into the process, using the interpreter path recorded in your binary, and runs it before your program’s entry point. It reads your binary’s dependency list, maps each shared library into memory, and resolves symbol references.
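To see the result of that mapping step from inside a running process, glibc provides `dl_iterate_phdr`, which calls a callback once per loaded object. The sketch below is a minimal illustration that assumes Linux with glibc; the names `LoadedObject`, `collect_object`, and `loaded_objects` are hypothetical helpers, not part of any library.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info (glibc)
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One entry per object the dynamic linker has mapped into this process.
struct LoadedObject {
    std::string name;      // usually empty for the main program
    std::uintptr_t base;   // load address chosen at load time
};

// Callback invoked by dl_iterate_phdr once per loaded object.
static int collect_object(dl_phdr_info* info, std::size_t /*size*/, void* data) {
    auto* out = static_cast<std::vector<LoadedObject>*>(data);
    out->push_back({info->dlpi_name ? info->dlpi_name : "",
                    static_cast<std::uintptr_t>(info->dlpi_addr)});
    return 0;  // returning 0 tells dl_iterate_phdr to keep iterating
}

// Hypothetical helper: snapshot of everything currently mapped.
std::vector<LoadedObject> loaded_objects() {
    std::vector<LoadedObject> objects;
    dl_iterate_phdr(collect_object, &objects);
    return objects;
}
```

Printing the result on two separate runs would typically show different base addresses for the same libraries, which is why the code inside them has to be position-independent.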
Position-Independent Code (PIC)
Shared libraries must work at any memory address because the dynamic linker loads them wherever there’s space (ASLR randomizes this further). Code that works regardless of where it’s loaded is called position-independent code (PIC).
The compiler achieves PIC by accessing global data through the Global Offset Table (GOT) — an array of pointers that the dynamic linker fills in at load time. Instead of hardcoding addresses, PIC code loads from the GOT:
```asm
# Without PIC: hardcoded address (won't work if library loads elsewhere)
mov eax, [0x804a020]          # absolute address of global_var

# With PIC: GOT-relative access (works at any load address)
mov eax, [ebx + GOT_OFFSET]   # ebx = GOT base; the entry holds &global_var, filled at load time
mov eax, [eax]                # load the value through that pointer
```
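The GOT idea can also be modeled in a few lines of C++. This is a toy sketch, not what the compiler or loader actually does: `ToyLibrary` and `fill_got` are hypothetical names, and the "addresses" are plain integers. It only shows that the offsets stay fixed while the table contents depend on the load base.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy "shared library": its globals live at fixed offsets from wherever
// the library happens to be loaded. All names here are hypothetical.
struct ToyLibrary {
    std::vector<std::size_t> global_offsets;  // link-time offsets of globals
};

// Stand-in for the dynamic linker: given the load base, fill the GOT with
// absolute addresses (base + offset). Code then indexes the GOT instead of
// embedding absolute addresses.
std::vector<std::uintptr_t> fill_got(const ToyLibrary& lib, std::uintptr_t load_base) {
    std::vector<std::uintptr_t> got;
    got.reserve(lib.global_offsets.size());
    for (std::size_t offset : lib.global_offsets) {
        got.push_back(load_base + offset);
    }
    return got;
}
```

The same `ToyLibrary` yields a different table for each load base, while the library's own description never changes. That is the property that lets one copy of the code be shared.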
```bash
# Compile with PIC (required for shared libraries)
g++ -fPIC -c mylib.cpp -o mylib.o

# Without -fPIC, this link step typically fails on x86-64 with a relocation error
g++ -shared -o libmylib.so mylib.o
```
Why -fPIC Matters
Without -fPIC, the compiler generates code with hardcoded addresses. On
targets where the linker still accepts such objects (32-bit x86, for
example), they require text relocations at load time. Text relocations
modify the code segment itself, so the patched pages can’t be shared
between processes (defeating the purpose of shared libraries), and they
violate W^X security policies.
GOT/PLT: Symbol Resolution
Function calls through shared libraries use a two-level indirection: the Procedure Linkage Table (PLT) and the Global Offset Table (GOT). This mechanism enables lazy binding — symbols are resolved on first use, not at startup.
Lazy vs Eager Binding
By default, the dynamic linker uses lazy binding: function addresses are resolved the first time they’re called. This speeds up startup because unused functions are never resolved.
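The first-call behavior can be modeled with a small C++ class. This is only a toy sketch: a real PLT is a few machine instructions and the resolver lives in the dynamic linker. `LazySlot`, `twice`, and the `exports` map are hypothetical names introduced for illustration.

```cpp
#include <map>
#include <string>
#include <utility>

// Toy model of one PLT/GOT slot. A real PLT is machine code, not a C++ object.
class LazySlot {
public:
    using Fn = int (*)(int);

    LazySlot(std::string symbol, const std::map<std::string, Fn>& exports)
        : symbol_(std::move(symbol)), exports_(exports) {}

    // Every call goes through the slot, like a call through the PLT.
    int call(int arg) {
        if (target_ == nullptr) {            // first call: slot not yet bound
            target_ = exports_.at(symbol_);  // "resolver" looks the symbol up
            ++resolutions_;
        }
        return target_(arg);                 // later calls jump straight through
    }

    int resolutions() const { return resolutions_; }

private:
    std::string symbol_;
    const std::map<std::string, Fn>& exports_;  // must outlive the slot
    Fn target_ = nullptr;
    int resolutions_ = 0;
};

// Example exported function for the toy symbol table.
inline int twice(int x) { return 2 * x; }
```

In this model a missing symbol only surfaces when `call` is first invoked (here as an exception from `at`), which mirrors why lazy binding can hide missing symbols until late in a program's run.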
Eager binding (LD_BIND_NOW=1 or -Wl,-z,now) resolves all symbols at load time. This is slower to start but:
- Detects missing symbols immediately (before main() runs)
- Required for Full RELRO security hardening
- Avoids unpredictable latency spikes from first-call resolution
```bash
# Lazy binding (default)
./program

# Eager binding: all symbols resolved before main()
LD_BIND_NOW=1 ./program

# Compile with Full RELRO (eager binding + read-only GOT)
g++ -Wl,-z,relro,-z,now -o program main.cpp -lmylib
```
Creating Shared Libraries
The complete workflow from source to installed shared library:
```bash
# 1. Compile with PIC
g++ -fPIC -c mylib.cpp -o mylib.o

# 2. Create shared library with SONAME
g++ -shared -Wl,-soname,libmylib.so.1 -o libmylib.so.1.2.0 mylib.o

# 3. Create symlinks (the linker and loader use different names)
ln -sf libmylib.so.1.2.0 libmylib.so.1   # SONAME link (loader uses this)
ln -sf libmylib.so.1 libmylib.so         # Linker link (g++ -lmylib uses this)

# 4. Install the real file and both symlinks (-P copies symlinks as symlinks)
sudo cp -P libmylib.so.1.2.0 libmylib.so.1 libmylib.so /usr/local/lib/
sudo ldconfig   # updates the loader cache

# 5. Link your program against it
g++ main.cpp -L/usr/local/lib -lmylib -o program
```
Library Dependencies
Every shared library can depend on other shared libraries, forming a dependency tree. ldd shows the full tree:
```bash
# Show direct and transitive dependencies
ldd /usr/bin/python3

# Show only direct dependencies (no recursion)
readelf -d /usr/bin/python3 | grep NEEDED

# Find which package provides a library
dpkg -S libssl.so.3               # Debian/Ubuntu
rpm -qf /usr/lib64/libssl.so.3    # RHEL/Fedora
```
SONAME Versioning
Shared libraries use a three-level naming convention to handle backward compatibility:
```text
libfoo.so.2.1.0
│         │ │ └── Patch version (bug fixes, no API changes)
│         │ └──── Minor version (new features, backward compatible)
│         └────── Major version (SONAME; breaking changes)
└──────────────── Library name
```
The SONAME (libfoo.so.2) is the key: the linker records the SONAME, not the full file name, in each program that links against the library. You can therefore install libfoo.so.2.2.0 in place of libfoo.so.2.1.0; once the libfoo.so.2 symlink points at the newer file (ldconfig updates it), programs that linked against libfoo.so.2 pick up the newer version automatically.
Breaking changes require a new SONAME (libfoo.so.3), and programs must be rebuilt against it before they can use it.
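The naming convention above can be expressed as a small parser and compatibility check. This is a sketch under the assumption that file names follow the exact `name.so.MAJOR.MINOR.PATCH` shape; `LibVersion`, `parse_lib_filename`, and `can_substitute` are hypothetical helpers, and the real loader matches on the SONAME only and never compares minor versions itself.

```cpp
#include <sstream>
#include <string>

struct LibVersion {
    std::string name;  // e.g. "libfoo"
    int major = 0;
    int minor = 0;
    int patch = 0;

    // The SONAME carries only the major version.
    std::string soname() const { return name + ".so." + std::to_string(major); }
};

// Parse "libfoo.so.2.1.0" into its parts; returns false for other shapes.
bool parse_lib_filename(const std::string& filename, LibVersion& out) {
    const std::string marker = ".so.";
    const std::string::size_type pos = filename.find(marker);
    if (pos == std::string::npos || pos == 0) return false;

    LibVersion v;
    v.name = filename.substr(0, pos);
    std::istringstream rest(filename.substr(pos + marker.size()));
    char dot1 = 0, dot2 = 0;
    if (!(rest >> v.major >> dot1 >> v.minor >> dot2 >> v.patch)) return false;
    if (dot1 != '.' || dot2 != '.') return false;
    out = v;
    return true;
}

// By the convention above, a program built against `built` can run with
// `installed` if the SONAME matches and the installed minor version is not older.
bool can_substitute(const LibVersion& built, const LibVersion& installed) {
    return built.soname() == installed.soname() && installed.minor >= built.minor;
}
```

For example, a program built against libfoo.so.2.1.0 passes the check with libfoo.so.2.2.0 installed, but not with libfoo.so.3.0.0, because the SONAME differs.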
RPATH, RUNPATH, and LD_LIBRARY_PATH
The dynamic linker searches for libraries in this order:
- RPATH (embedded in binary, searched before LD_LIBRARY_PATH; ignored if the binary also has a RUNPATH)
- LD_LIBRARY_PATH (environment variable)
- RUNPATH (embedded in binary, searched after LD_LIBRARY_PATH)
- /etc/ld.so.cache (ldconfig cache)
- /lib, /usr/lib (default system paths)
```bash
# Embed a search path in the binary. GNU ld writes it as RPATH or RUNPATH
# depending on its defaults: -Wl,--disable-new-dtags forces RPATH,
# -Wl,--enable-new-dtags forces RUNPATH.
g++ -Wl,-rpath,/opt/mylibs -o program main.cpp -lmylib

# Use $ORIGIN for paths relative to the binary (portable)
g++ -Wl,-rpath,'$ORIGIN/../lib' -o program main.cpp -lmylib

# Check RPATH/RUNPATH in a binary
readelf -d program | grep -E 'RPATH|RUNPATH'

# Override at runtime
LD_LIBRARY_PATH=/custom/lib ./program
```
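The search order listed above can be written down as a function that builds the list of places to look. This is a simplified sketch: `SearchInputs` and `search_order` are hypothetical names, the ld.so.cache step is represented by a marker string because it is a lookup table rather than a directory, and details such as secure-execution mode are left out.

```cpp
#include <string>
#include <vector>

// Inputs the loader consults; all field names are hypothetical.
struct SearchInputs {
    std::vector<std::string> rpath;            // DT_RPATH entries
    std::vector<std::string> runpath;          // DT_RUNPATH entries
    std::vector<std::string> ld_library_path;  // from the environment
};

// Build the search order for one library lookup.
std::vector<std::string> search_order(const SearchInputs& in) {
    std::vector<std::string> order;
    auto append = [&order](const std::vector<std::string>& dirs) {
        order.insert(order.end(), dirs.begin(), dirs.end());
    };
    if (in.runpath.empty()) append(in.rpath);  // RPATH is ignored when RUNPATH exists
    append(in.ld_library_path);
    append(in.runpath);
    order.push_back("<ld.so.cache>");
    order.push_back("/lib");
    order.push_back("/usr/lib");
    return order;
}
```

With only an RPATH set, its directories come first; adding a RUNPATH drops the RPATH entries and moves the embedded path behind LD_LIBRARY_PATH.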
RPATH Security
RPATH is searched before LD_LIBRARY_PATH, which means a binary with RPATH can bypass the user’s library preferences. For setuid programs, the dynamic linker runs in secure-execution mode: it ignores LD_LIBRARY_PATH and restricts LD_PRELOAD to set-user-ID libraries in the standard search directories, to prevent privilege escalation.
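On Linux, a process can check whether it was started in this restricted mode by reading the `AT_SECURE` entry of the auxiliary vector. The sketch below assumes Linux with glibc (or another libc that provides `getauxval`); the function name `in_secure_execution` is hypothetical.

```cpp
#include <sys/auxv.h>  // getauxval, AT_SECURE (Linux)

// True when the kernel flagged this process for secure execution
// (for example, because it is running a set-user-ID binary).
bool in_secure_execution() {
    return getauxval(AT_SECURE) != 0;
}
```

A program that relies on LD_LIBRARY_PATH could log a warning when this returns true, since the variable will have been ignored.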
Runtime Loading: dlopen
The dlopen API loads shared libraries at runtime, enabling plugin architectures where the program discovers and loads functionality dynamically:
Complete Plugin Pattern with Error Handling
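```cpp
#include <dlfcn.h>
#include <cstdio>

// Plugin interface (shared header)
typedef int (*init_fn)(void);
typedef const char* (*name_fn)(void);
typedef void (*process_fn)(const char* data);

struct Plugin {
    void* handle;
    init_fn init;
    name_fn name;
    process_fn process;
};

// Look up one symbol; report the error and return nullptr on failure
static void* load_symbol(void* handle, const char* symbol) {
    dlerror();  // clear any stale error
    void* addr = dlsym(handle, symbol);
    const char* err = dlerror();
    if (err) {
        fprintf(stderr, "dlsym: %s\n", err);
        return nullptr;
    }
    return addr;
}

Plugin load_plugin(const char* path) {
    Plugin p = {};
    p.handle = dlopen(path, RTLD_LAZY);
    if (!p.handle) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return p;
    }

    // Check every lookup: a later successful dlsym() can reset the error state,
    // so a single dlerror() call after all three lookups may miss a failure
    p.init = (init_fn)load_symbol(p.handle, "plugin_init");
    p.name = (name_fn)load_symbol(p.handle, "plugin_name");
    p.process = (process_fn)load_symbol(p.handle, "plugin_process");

    if (!p.init || !p.name || !p.process) {
        dlclose(p.handle);
        p = {};  // handle == nullptr signals failure to the caller
    }
    return p;
}
```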
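In C++ the handle's lifetime can also be tied to a scope, so that `dlclose` runs automatically. The following is a minimal sketch, not the only way to structure this: `DlCloser`, `LibHandle`, `open_library`, `find_function`, and `cosine_via_dlopen` are hypothetical names, and the file name `libm.so.6` assumes a glibc-based Linux system. Older glibc versions may also need `-ldl` at link time.

```cpp
#include <dlfcn.h>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter so the library handle is closed automatically.
struct DlCloser {
    void operator()(void* handle) const {
        if (handle) dlclose(handle);
    }
};
using LibHandle = std::unique_ptr<void, DlCloser>;

// Open a library or throw with the loader's error message.
LibHandle open_library(const char* path) {
    void* raw = dlopen(path, RTLD_NOW);
    if (!raw) throw std::runtime_error(std::string("dlopen: ") + dlerror());
    return LibHandle(raw);
}

// Resolve a symbol to a typed function pointer or throw.
template <typename Fn>
Fn find_function(const LibHandle& lib, const char* symbol) {
    dlerror();  // clear stale error state
    void* addr = dlsym(lib.get(), symbol);
    if (const char* err = dlerror()) {
        throw std::runtime_error(std::string("dlsym: ") + err);
    }
    return reinterpret_cast<Fn>(addr);
}

// Example: call cos() from the math library (file name assumes glibc).
double cosine_via_dlopen(double x) {
    LibHandle libm = open_library("libm.so.6");
    auto cos_fn = find_function<double (*)(double)>(libm, "cos");
    return cos_fn(x);  // libm is closed when this function returns
}
```

Function pointers obtained this way must not be called after the handle is closed, so in a plugin host the handle usually lives as long as the plugin object itself.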
LD_PRELOAD: Function Interposition
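LD_PRELOAD loads a library before all others, allowing you to override functions that the program and its shared libraries look up dynamically. Calls that never go through dynamic symbol lookup, such as calls to static functions or calls inside a statically linked program, are not affected. The dynamic linker uses the first definition it finds in lookup order, so your preloaded version wins: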
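```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

// Override malloc to add tracking
void* malloc(size_t size) {
    static void* (*real_malloc)(size_t) = NULL;
    static __thread int in_hook = 0;  // re-entry guard: fprintf may call malloc

    // Get the real malloc (the next definition in lookup order)
    if (!real_malloc)
        real_malloc = (void* (*)(size_t))dlsym(RTLD_NEXT, "malloc");

    void* ptr = real_malloc(size);

    // Log only for the outermost call to avoid infinite recursion
    if (!in_hook) {
        in_hook = 1;
        fprintf(stderr, "malloc(%zu) = %p\n", size, ptr);
        in_hook = 0;
    }
    return ptr;
}
```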
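```bash
# Build the wrapper as a shared library (the source file name malloc_wrapper.c is assumed)
gcc -shared -fPIC -o malloc_wrapper.so malloc_wrapper.c -ldl

# Run any program with malloc tracking, no recompilation needed
LD_PRELOAD=./malloc_wrapper.so python3 my_script.py
```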
Practical uses: memory leak detectors, performance profiling, testing (mock network calls), hardware abstraction layers.
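The "first definition wins" rule behind interposition can be modeled as a linear search over the objects in lookup order. This is a toy sketch with hypothetical names (`ToyObject`, `resolve`) and pretend addresses; the real lookup also handles symbol versions, visibility, and weak symbols, which are ignored here.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy model of one object in the global lookup scope.
struct ToyObject {
    std::string name;
    std::map<std::string, int> symbols;  // symbol -> pretend address
};

// Return the name of the object whose definition wins for `symbol`:
// the first one in scope order that defines it, or "" if none does.
std::string resolve(const std::vector<ToyObject>& scope, const std::string& symbol) {
    for (const ToyObject& object : scope) {
        if (object.symbols.count(symbol) != 0) return object.name;
    }
    return "";
}
```

Placing a preloaded object between the program and libc in the scope changes who provides `malloc`, while symbols the wrapper does not define, such as `free`, still resolve to libc.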
Debugging Dynamic Linking
LD_DEBUG: The Dynamic Linker’s Verbose Mode
```bash
# Show library search paths
LD_DEBUG=libs ./program

# Show every symbol lookup
LD_DEBUG=symbols ./program

# Show file operations (opens, stats)
LD_DEBUG=files ./program

# Show everything
LD_DEBUG=all ./program 2>debug.log
```
Common Debugging Tools
```bash
# List all dependencies (recursive)
ldd ./program

# Show dynamic section (NEEDED, SONAME, RPATH)
readelf -d ./program

# List exported symbols
nm -D libmylib.so

# List undefined symbols (what it needs)
nm -D --undefined-only ./program

# Trace library calls at runtime
ltrace ./program

# Trace system calls (openat() shows which files are accessed)
strace -e openat ./program 2>&1 | grep '\.so'
```
Security Considerations
ASLR (Address Space Layout Randomization)
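The kernel randomizes where shared libraries are loaded in memory, making exploits harder. PIC is what makes this practical: position-independent libraries run unmodified at any address, whereas non-PIC code would need text relocations each time it is loaded at a different address.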
RELRO (Relocation Read-Only)
Partial RELRO (default): the part of the GOT used by the PLT stays writable so that lazy binding can patch it. An attacker who controls a write primitive can overwrite those entries to redirect function calls.
Full RELRO (-Wl,-z,relro,-z,now): all relocations are resolved at load time and the entire GOT is then marked read-only, which blocks GOT-overwrite attacks.
```bash
# Check RELRO status
readelf -l program | grep GNU_RELRO
checksec --file=program   # if checksec is installed
```
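The "resolve everything, then make the table read-only" idea behind Full RELRO can be imitated on an ordinary page of memory with `mmap` and `mprotect`. This is only an analogy under POSIX assumptions, not how the dynamic linker is implemented; `add_one` and `call_through_sealed_table` are hypothetical names.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

using Fn = int (*)(int);

// Example target for the toy table.
inline int add_one(int x) { return x + 1; }

// Fill a function-pointer table up front (eager binding), then remove write
// permission from the page that holds it. Returns the result of calling
// through the sealed table, or -1 if a system call fails.
int call_through_sealed_table(int arg) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    void* mem = mmap(nullptr, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    Fn* table = static_cast<Fn*>(mem);
    table[0] = add_one;                         // "resolve" while still writable

    if (mprotect(mem, page, PROT_READ) != 0) {  // seal: later writes would fault
        munmap(mem, page);
        return -1;
    }

    const int result = table[0](arg);           // reads and calls still work
    munmap(mem, page);
    return result;
}
```

After the `mprotect` call, any attempt to store a new pointer into `table` would raise SIGSEGV, which is the property Full RELRO gives the GOT.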
Key Takeaways
- Dynamic linking shares code across processes: one copy of libc in memory serves every program. Saves disk and RAM.
- GOT/PLT enables lazy resolution: function addresses are resolved on first call, not at startup. Faster launch but unpredictable first-call latency.
- PIC is mandatory for shared libraries: -fPIC generates position-independent code that works at any load address. Without it, text relocations break sharing.
- SONAME versioning prevents breakage: programs link against libfoo.so.2, not libfoo.so.2.1.0. Minor updates are transparent.
- dlopen enables plugin architectures: load code at runtime, discover symbols dynamically, unload when done. Always check dlerror().
- LD_PRELOAD overrides dynamically resolved functions: interpose malloc, intercept network calls, add instrumentation without recompilation.
- Full RELRO + eager binding for security: -Wl,-z,relro,-z,now makes the GOT read-only after resolution, blocking a common exploit vector.
Related Concepts
- Linking Process: Static linking, symbol resolution, and how the linker builds executables
- Object Files & Symbols: ELF structure, symbol tables, and relocations
- Memory Layout & Loading: How the OS maps executables and libraries into memory
- Compilation Pipeline: From source to object code before linking
Further Reading
- How To Write Shared Libraries - Ulrich Drepper’s definitive guide to shared library design
- ELF Specification - The Executable and Linkable Format standard
- Dynamic Linker Man Page - ld.so/ld-linux.so reference with search path details
- Anatomy of a Program in Memory - Visual guide to process memory layout
