What Is Symbol Resolution?
Every C++ build goes through four stages: preprocess, compile, assemble, and link. The first three stages run independently on each .cpp file, producing an object file (.o) per translation unit. The linker’s job is to take all those .o files and connect them together into a single executable.
The core of that connection is symbol resolution — the linker scans every object file, collects the symbols each one defines and the symbols each one references, and then matches every undefined reference to exactly one definition. If a reference has no definition, you get an "undefined reference" error. If a symbol has conflicting definitions, you get a "multiple definition" error. Getting this right is what makes multi-file C++ programs work.
Symbol Types
Every symbol in an object file has a type that tells the linker how to handle it. You can inspect these types with the nm command.
Strong Symbols
Non-inline function definitions and global variable definitions with external linkage are strong symbols. The linker expects exactly one strong definition of each symbol across all object files.

    // math.cpp
    #include <cstdio>

    int global_count = 42;              // Strong: initialized global (type D)
    extern const int MAX_RETRIES = 3;   // Strong: const data with external linkage (type R);
                                        // a plain namespace-scope const is internal (type r)

    int add(int a, int b) {             // Strong: function definition (type T)
        return a + b;
    }

    void print_result(int val) {        // Strong: function definition (type T)
        std::printf("Result: %d\n", val);
    }
In nm output, strong symbols show up with uppercase letters: T for text (code), D for initialized data, R for read-only data.
Weak Symbols
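Inline functions, implicit template instantiations, and symbols explicitly marked weak are emitted as weak symbols. The linker allows multiple weak definitions of the same symbol: it picks one and discards the rest, and a strong symbol always overrides a weak one. Uninitialized globals are a common source of confusion here. In C, an uninitialized global could historically be a "common" symbol that merged much like a weak one (GCC's default before version 10), but in C++ a namespace-scope int x; is an ordinary strong definition that simply lands in BSS.

    // config.cpp
    int uninitialized_count;                         // Strong in C++: zero-initialized global (type B)
    __attribute__((weak)) int default_timeout = 30;  // Explicitly weak: can be overridden (type V)
    inline int square(int x) { return x * x; }       // Weak (type W), emitted only in TUs that use it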
Weak symbols let libraries provide default implementations that application code can override without linker errors.
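As a rough illustration of the override pattern, the sketch below shows a weak default function as a library might ship it. It assumes GCC or Clang (the weak attribute is a compiler extension), and default_timeout() and effective_timeout() are hypothetical names chosen for this example.

```cpp
// Sketch for GCC/Clang only: __attribute__((weak)) is a compiler extension.
// default_timeout() and effective_timeout() are hypothetical names.

// Weak default, as a library might provide it. If another object file in the
// link contains a normal (strong) definition of default_timeout(), the linker
// uses that one instead and this body is ignored.
__attribute__((weak)) int default_timeout() { return 30; }

// Callers go through the symbol, so they see whichever definition won.
int effective_timeout() { return default_timeout(); }
```

With only this file in the link, effective_timeout() returns 30. An application would override the default by defining int default_timeout() { return 60; } in one of its own .cpp files, with no change to the library.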
Undefined Symbols
Extern declarations and function calls without definitions in the current translation unit produce undefined symbols. The linker must resolve each one to a definition found in another object file or library.
    // main.cpp
    extern int global_count;       // Undefined: declared but not defined here
    int add(int a, int b);         // Undefined: declared but not defined here
    void print_result(int val);    // Undefined: declared but not defined here

    int main() {
        int result = add(3, 4);    // References the undefined symbol
        print_result(result);      // References the undefined symbol
        return global_count;       // References the undefined symbol
    }
In nm output, undefined symbols show up as U:
    $ nm main.o
                     U _Z3addii
                     U _Z12print_resulti
                     U global_count
    0000000000000000 T main
Resolution Rules
The linker follows strict rules when it encounters multiple definitions of the same symbol across object files:
Multiple Strong Symbols → Error
If two object files both contain a strong definition of the same symbol, the linker rejects the program:
    // file_a.cpp
    int config_value = 10;  // Strong definition

    // file_b.cpp
    int config_value = 20;  // Another strong definition: conflict!

    $ g++ file_a.o file_b.o -o program
    /usr/bin/ld: file_b.o: multiple definition of `config_value'; file_a.o: first defined here
    collect2: error: ld returned 1 exit status
One Strong + One or More Weak → Strong Wins
If one object file has a strong definition and others have weak definitions, the linker picks the strong one without complaint:
    // default.cpp
    __attribute__((weak)) int timeout = 30;  // Weak definition (default)

    // app.cpp
    int timeout = 60;                        // Strong definition (override)

    // The linker uses 60
Multiple Weak Symbols → Linker Picks One
If no strong definition exists and multiple weak definitions are present, the linker selects one (typically the first it encounters). Some linkers emit a warning, but many do not. This can lead to subtle bugs if the weak definitions differ.
No Definition → Undefined Reference Error
If a symbol is referenced but never defined anywhere, the linker fails:
    $ g++ main.o -o program
    /usr/bin/ld: main.o: undefined reference to `add(int, int)'
    collect2: error: ld returned 1 exit status
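The four rules above can be sketched as a small toy model. This is not how any real linker is implemented (it ignores sections, relocations, archives, and COMDAT groups), and ObjectFile, LinkResult, and resolve() are hypothetical names invented for the illustration.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only; all type and function names here are hypothetical.
enum class Binding { Strong, Weak };

struct ObjectFile {
    std::string name;
    std::map<std::string, Binding> defined;  // symbols this file defines
    std::set<std::string> undefined;         // symbols this file references
};

struct LinkResult {
    std::map<std::string, std::string> chosen;  // symbol -> file whose definition is used
    std::vector<std::string> errors;
};

LinkResult resolve(const std::vector<ObjectFile>& objects) {
    LinkResult result;
    std::map<std::string, Binding> binding_of;  // binding of the chosen definition

    // Pass 1: choose one definition per symbol.
    for (const ObjectFile& obj : objects) {
        for (const auto& def : obj.defined) {
            const std::string& symbol = def.first;
            const Binding binding = def.second;
            auto it = binding_of.find(symbol);
            if (it == binding_of.end()) {
                binding_of[symbol] = binding;          // first definition seen
                result.chosen[symbol] = obj.name;
            } else if (it->second == Binding::Strong && binding == Binding::Strong) {
                result.errors.push_back("multiple definition of `" + symbol + "'");
            } else if (it->second == Binding::Weak && binding == Binding::Strong) {
                it->second = Binding::Strong;          // strong overrides earlier weak
                result.chosen[symbol] = obj.name;
            }
            // Otherwise a weak definition arrived after an existing one: keep the first.
        }
    }

    // Pass 2: every reference must now have a definition.
    for (const ObjectFile& obj : objects) {
        for (const std::string& symbol : obj.undefined) {
            if (result.chosen.count(symbol) == 0) {
                result.errors.push_back("undefined reference to `" + symbol + "'");
            }
        }
    }
    return result;
}
```

Feeding it a default.o with a weak timeout and an app.o with a strong timeout selects app.o, mirroring the "strong wins" rule, while a reference to a symbol no file defines produces an "undefined reference" entry in errors.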
Name Mangling
C++ supports function overloading — multiple functions can share the same name as long as their parameter types differ. But the linker works with flat symbol names, not C++ type information. To make overloading work at the linking level, the compiler mangles each function name by encoding the parameter types into the symbol.
Itanium ABI Mangling (GCC/Clang)
Most compilers on Linux and macOS follow the Itanium C++ ABI for name mangling. The mangled name starts with _Z, followed by the name length, the function name, and encoded parameter types:
    void func(int)       // _Z4funci   (i = int)
    void func(double)    // _Z4funcd   (d = double)
    void func(int, int)  // _Z4funcii  (ii = int, int)
    int add(int, int)    // _Z3addii
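To make the encoding concrete, here is a deliberately tiny sketch that reproduces the scheme for the simplest case only: free functions at global scope whose parameters are builtin types. mangle_simple() is a hypothetical helper, not part of any compiler or library, and it does not attempt namespaces, classes, templates, pointers, references, or substitutions.

```cpp
#include <string>
#include <vector>

// Illustrative subset of Itanium mangling. mangle_simple is a hypothetical
// helper; real mangling has many more rules than are modelled here.
std::string mangle_simple(const std::string& name,
                          const std::vector<std::string>& param_types) {
    // _Z, then the length of the name, then the name itself.
    std::string out = "_Z" + std::to_string(name.size()) + name;

    // An empty parameter list is encoded as a single 'v' (void).
    if (param_types.empty()) return out + "v";

    for (const std::string& type : param_types) {
        if (type == "int") out += 'i';
        else if (type == "double") out += 'd';
        else if (type == "float") out += 'f';
        else if (type == "char") out += 'c';
        else if (type == "bool") out += 'b';
        else if (type == "long") out += 'l';
        else return "";  // type not covered by this sketch
    }
    return out;
}
```

Note that the return type does not appear in the result: for ordinary (non-template) functions it is not part of the mangled name, which is why int add(int, int) becomes _Z3addii.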
Demangling with c++filt
When you see a mangled symbol in an error message or nm output, use c++filt to decode it:
    $ echo '_Z3addii' | c++filt
    add(int, int)

    $ nm main.o | c++filt
                     U add(int, int)
                     U print_result(int)
                     U global_count
    0000000000000000 T main
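The same decoding is available from inside a program. The sketch below wraps abi::__cxa_demangle, which the GCC and Clang C++ runtimes expose through <cxxabi.h>; the wrapper name demangle() is an assumption made for this example.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang runtimes)
#include <cstdlib>   // std::free
#include <string>

// Returns the demangled form of an Itanium-mangled name, or the input
// unchanged if demangling fails. demangle() is a hypothetical wrapper name.
std::string demangle(const char* mangled) {
    int status = 0;
    // Null buffer and length ask the runtime to malloc the result for us.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the caller owns the malloc'd buffer
    return result;
}
```

This is handy in logging or crash-reporting code that prints symbol names obtained at run time, where piping through c++filt is not an option.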
extern "C" to Disable Mangling
When you need to call a C library from C++ (or expose a C++ function to C callers), use extern "C" to disable mangling:
    // This function will have the symbol name "c_function", not "_Z10c_functionv"
    extern "C" void c_function() {
        // ...
    }

    // For C headers included in C++ code
    extern "C" {
    #include <legacy_c_library.h>
    }

Without extern "C", the C++ compiler would emit the mangled name, and the linker could not match it to the plain, unmangled name that C code defines or references.
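A common way to package this is a header that both C and C++ translation units can include, guarded by the __cplusplus macro. The sketch below shows the pattern in a single file for brevity; legacy_add is a hypothetical function name.

```cpp
// Sketch of a header usable from both C and C++.
// legacy_add is a hypothetical function name.
#ifdef __cplusplus
extern "C" {  // only C++ compilers see this line; C compilers skip it
#endif

int legacy_add(int a, int b);

#ifdef __cplusplus
}  // extern "C"
#endif

// Normally this definition would live in a separate .c or .cpp file.
// Because the declaration above already has C linkage, the definition
// inherits it and the symbol is emitted as "legacy_add", not "_Z10legacy_addii".
int legacy_add(int a, int b) { return a + b; }
```

Putting the guard inside the header means callers never have to remember to wrap the #include themselves.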
The One Definition Rule (ODR)
The One Definition Rule is one of C++’s most important linking constraints: every non-inline function or variable that the program uses must have exactly one definition across all translation units. Violating ODR causes either linker errors (if you’re lucky) or silent undefined behavior (if you’re not).
The Classic ODR Violation
The most common ODR violation is defining a variable in a header file:
    // config.h: ODR VIOLATION
    #ifndef CONFIG_H
    #define CONFIG_H
    int MAX_SIZE = 100;  // This is a DEFINITION, not just a declaration!
    #endif

    // app.cpp
    #include "config.h"  // Defines MAX_SIZE in app.o

    // utils.cpp
    #include "config.h"  // Defines MAX_SIZE again in utils.o
When you link app.o and utils.o, the linker sees two strong definitions of MAX_SIZE and rejects the program.
Fixes for ODR Violations
    // Fix 1: extern declaration in header + definition in one .cpp
    // config.h
    extern int MAX_SIZE;           // Declaration only

    // config.cpp
    int MAX_SIZE = 100;            // Single definition

    // Fix 2: inline variable (C++17)
    // config.h
    inline int MAX_SIZE = 100;     // OK: identical inline definitions are merged into one

    // Fix 3: constexpr constant
    // config.h
    constexpr int MAX_SIZE = 100;  // OK: constexpr implies const, so at namespace scope the
                                   // variable has internal linkage and each TU gets its own
                                   // copy. Write "inline constexpr" (C++17) for one shared object.
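For code bases that cannot use C++17 inline variables, one further option is to wrap the value in an inline accessor function with a local static. This is a sketch of that pattern rather than one of the fixes listed above; max_size() is a hypothetical accessor standing in for MAX_SIZE.

```cpp
// config.h style sketch for pre-C++17 code.
// max_size() is a hypothetical accessor standing in for the MAX_SIZE variable.
inline int& max_size() {
    // A static local inside an inline function with external linkage refers
    // to the same object in every translation unit, so the header can be
    // included anywhere without producing duplicate definitions.
    static int value = 100;
    return value;
}
```

Callers read the value with max_size() and, because the accessor returns a reference, can also assign through it with max_size() = 200.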
ODR-Exempt Entities
Some C++ constructs are allowed to appear in multiple translation units without violating ODR, as long as every definition is identical:
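- inline functions and variables: the linker deduplicates identical copies
- template instantiations: each TU that uses a template gets its own copy; the linker merges them
- constexpr functions and, since C++17, constexpr static data members: both are implicitly inline. A namespace-scope constexpr variable is not implicitly inline; it avoids conflicts through internal linkage instead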
If the definitions are not identical across TUs (for example, different macro expansions change the body of an inline function), the program has undefined behavior — and the linker usually won’t warn you.
Template Instantiation
Templates are defined in header files, but the compiler only generates code for a template when it’s actually used. Each .cpp file that uses a template gets its own instantiation, and the linker deduplicates the identical copies.
Why Templates Must Live in Headers
If you put a template definition in a .cpp file, only that translation unit can instantiate it. Other TUs that include only the declaration won’t have the definition available to generate code:
    // math.h
    template<typename T>
    T square(T x);  // Declaration only, no definition

    // math.cpp
    template<typename T>
    T square(T x) { return x * x; }  // Definition in .cpp

    // main.cpp
    #include "math.h"
    int main() { return square(5); }
    // ERROR: undefined reference to 'int square<int>(int)'
The compiler compiling main.cpp sees the declaration of square but not its body, so it can’t generate the int specialization. The definition in math.cpp was never instantiated for int either, because math.cpp doesn’t call square(5).
Fixes
    // Fix 1: Move the template definition to the header (most common)
    // math.h
    template<typename T>
    T square(T x) { return x * x; }

    // Fix 2: Explicit instantiation in the .cpp file
    // math.cpp
    template<typename T>
    T square(T x) { return x * x; }

    // Explicitly instantiate for the types you need
    template int square<int>(int);
    template double square<double>(double);
Explicit instantiation is useful for large templates where you want to control compile times and know all the types in advance. For most cases, keeping the definition in the header is simpler and more flexible.
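The two approaches can also be combined. Since C++11, a header can keep the definition visible while an extern template declaration tells every including TU not to instantiate a given specialization itself; one .cpp file then provides the explicit instantiation. The sketch below places both parts in one file purely for illustration, reusing the square example from above.

```cpp
// math.h (sketch): the definition is visible, but implicit instantiation of
// square<int> is suppressed in every TU that includes this header.
template <typename T>
T square(T x) { return x * x; }

extern template int square<int>(int);  // explicit instantiation declaration (C++11)

// math.cpp (sketch): the one TU that actually emits code for square<int>.
template int square<int>(int);         // explicit instantiation definition
```

Types without an extern template declaration, such as double here, are still instantiated implicitly wherever they are used, so the header stays flexible while the heavily used specializations are compiled only once.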
Symbol Visibility
By default on ELF platforms such as Linux, every function and global variable with external linkage in a shared library (.so) is exported to any code that links against it. This exposes internal implementation details, inflates the dynamic symbol table, and can slow down library loading.
Controlling Visibility
    // Hide internal symbols from library users
    __attribute__((visibility("hidden")))
    void internal_helper() {
        // Not accessible from outside the shared library
    }

    // Explicitly export public API
    __attribute__((visibility("default")))
    void public_api() {
        internal_helper();  // Can still call hidden symbols internally
    }
Compile-Time Default Visibility
For larger projects, compile the entire library with hidden default visibility and explicitly export only the public API:
    # Hide all symbols by default (-fPIC is needed for shared-library code on most targets)
    g++ -fPIC -fvisibility=hidden -shared -o libfoo.so foo.cpp

    # Only symbols marked with visibility("default") are exported
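In practice the attribute is usually hidden behind a macro so the public header stays readable and still compiles on toolchains without it. The following is a minimal sketch of that convention; FOO_API, FOO_LOCAL, foo_public_api, and foo_internal are hypothetical names, and only the GCC/Clang attribute shown earlier is assumed.

```cpp
// foo.h style sketch. FOO_API, FOO_LOCAL, foo_public_api and foo_internal
// are hypothetical names; the attribute is the GCC/Clang one used above.
#if defined(__GNUC__) || defined(__clang__)
#  define FOO_API   __attribute__((visibility("default")))
#  define FOO_LOCAL __attribute__((visibility("hidden")))
#else
#  define FOO_API    // other compilers: expand to nothing in this sketch
#  define FOO_LOCAL
#endif

// Internal helper: stays out of the dynamic symbol table.
FOO_LOCAL int foo_internal(int x) { return 2 * x; }

// Public entry point: exported even when built with -fvisibility=hidden.
FOO_API int foo_public_api(int x) { return foo_internal(x) + 1; }
```

After building the library, nm -D --defined-only libfoo.so should list the exported function but not the hidden helper.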
Why Visibility Matters
- Smaller symbol tables — fewer symbols to look up at load time
- Faster library loading — less work for the dynamic linker
- Prevents accidental ABI exposure — internal functions can change without breaking users
- Enables better optimization — the compiler knows hidden functions can’t be overridden, so it can inline them more aggressively
Linking Order
The order in which you pass object files and libraries to the linker matters — especially for static libraries.
Static Libraries (.a)
Static libraries are archives of object files. The linker processes them left-to-right and only extracts object files that resolve currently undefined symbols. Once a library is processed, it’s discarded.
    # main.o calls functions in libapp, which calls functions in libutils

    # CORRECT: dependents before dependencies
    g++ main.o -lapp -lutils

    # WRONG: libutils is processed first, but nothing needs it yet, so it is skipped
    g++ main.o -lutils -lapp
    # undefined reference to functions in libutils!
The rule is: put dependents before their dependencies on the command line.
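The left-to-right pass can be modelled in a few lines. The sketch below is a toy simulation under simplifying assumptions (it tracks only symbol names, and it rescans a single archive until nothing more is extracted, as GNU ld is generally understood to do); Member, Archive, and link_in_order() are hypothetical names.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of the single left-to-right pass; all names are hypothetical.
struct Member {                      // one object file inside an archive
    std::set<std::string> defines;
    std::set<std::string> needs;
};
using Archive = std::vector<Member>;

// `undefined` starts as the symbols the plain object files still need.
// Returns the symbols that remain undefined after scanning the archives in order.
std::set<std::string> link_in_order(std::set<std::string> undefined,
                                    const std::vector<Archive>& archives) {
    std::set<std::string> defined;
    for (const Archive& archive : archives) {
        std::vector<bool> used(archive.size(), false);
        bool extracted = true;
        while (extracted) {          // rescan this archive until nothing new is pulled in
            extracted = false;
            for (std::size_t i = 0; i < archive.size(); ++i) {
                if (used[i]) continue;
                bool wanted = false;
                for (const std::string& s : archive[i].defines) {
                    if (undefined.count(s)) wanted = true;
                }
                if (!wanted) continue;   // member resolves nothing yet: skip it
                used[i] = true;
                extracted = true;
                for (const std::string& s : archive[i].defines) {
                    undefined.erase(s);
                    defined.insert(s);
                }
                for (const std::string& s : archive[i].needs) {
                    if (!defined.count(s)) undefined.insert(s);
                }
            }
        }
        // The linker does not come back to this archive later.
    }
    return undefined;
}
```

With libapp needing a symbol from libutils, scanning {libapp, libutils} leaves nothing undefined, while {libutils, libapp} leaves the libutils symbol unresolved because libutils was scanned before anything asked for it.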
Circular Dependencies
If libA depends on libB and libB depends on libA, no ordering works. Use the linker’s grouping flags:
    # GNU ld: process the group repeatedly until no new symbols are resolved
    g++ main.o -Wl,--start-group -lA -lB -Wl,--end-group

    # Or list the library twice
    g++ main.o -lA -lB -lA
Dynamic Libraries (.so)
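Shared libraries are resolved by the dynamic linker at load time and at run time rather than by extracting archive members, so the static-library ordering rule does not apply in the same way. By default, function calls that go through the PLT are bound lazily on first call, while data references are resolved when the library is loaded. Order can still matter in two cases: with -Wl,--as-needed (the default on some distributions), a library listed before the objects that use it may be dropped, and when several loaded libraries define the same symbol, the first one in search order wins.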
Viewing and Debugging Symbols
When linking fails, these command-line tools let you inspect exactly what symbols each object file defines and references.
Essential Commands
    # List all symbols with demangled C++ names
    nm --demangle main.o

    # Show only undefined symbols (what this file needs)
    nm -u main.o

    # Show only defined symbols (what this file provides)
    nm --defined-only main.o

    # Detailed section and symbol info
    objdump -t main.o

    # ELF-specific symbol table (more detail than nm)
    readelf -s main.o

    # Shared library dependencies
    ldd ./program

    # Demangle a single symbol
    echo '_Z3addii' | c++filt   # add(int, int)
Reading nm Output
Each line in nm output has three columns: address, type, and symbol name. The type letter tells you what the symbol is:
| Type | Meaning |
|---|---|
| T / t | Text (code) section, defined function |
| D / d | Initialized data section |
| B / b | BSS (uninitialized data) section |
| R / r | Read-only data section |
| U | Undefined, needs to be resolved |
| W / w | Weak symbol |
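For most types, uppercase means the symbol is global (visible to the linker) and lowercase means it’s local to the object file. Weak symbols are the exception: W (and V, used for weak data objects) typically marks a weak definition, while lowercase w or v marks a weak reference that is still undefined. Undefined symbols have no address, so their first column is blank.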
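When scripting around nm, the column layout is simple enough to split by hand. The sketch below parses one line of default nm output with mangled names (demangled names contain spaces and would need different handling); NmEntry, parse_nm_line(), and is_undefined() are hypothetical names for this example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper types for reading default `nm` output (mangled names).
struct NmEntry {
    std::string address;  // empty for undefined symbols
    char type = '?';
    std::string name;
};

NmEntry parse_nm_line(const std::string& line) {
    std::istringstream in(line);
    std::string first, second, third;
    in >> first >> second >> third;  // whitespace-separated columns

    NmEntry entry;
    if (third.empty()) {
        // Two columns: "U name" (undefined symbols carry no address).
        entry.type = first.empty() ? '?' : first[0];
        entry.name = second;
    } else {
        // Three columns: "address type name".
        entry.address = first;
        entry.type = second.empty() ? '?' : second[0];
        entry.name = third;
    }
    return entry;
}

// True when the entry is a reference the linker still has to resolve.
bool is_undefined(const NmEntry& entry) { return entry.type == 'U'; }
```

Filtering the parsed entries with is_undefined() gives the same list as nm -u, which is a convenient starting point for checking which library is expected to supply each symbol.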
Common Errors and Fixes
Undefined Reference
/usr/bin/ld: main.o: undefined reference to `MyClass::process(int)'
Common causes and fixes:
- Missing implementation: you declared a function but never wrote the body. Write it.
- Missing object file or library: you compiled the implementation but didn’t pass it to the linker. Add -lmylib or myfile.o to the link command.
- C/C++ mismatch: calling a C function from C++ without extern "C". The C++ compiler mangled the name, but the C library has the unmangled version.
- Template in .cpp file: move the template definition to the header or add explicit instantiations.
Multiple Definition
/usr/bin/ld: file_b.o: multiple definition of `config_value'; file_a.o: first defined here
Common causes and fixes:
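- Variable defined in header: use an extern declaration in the header and define the variable in one .cpp file.
- Missing include guards: add #pragma once or #ifndef guards so the header isn’t included twice in the same TU. Guards only prevent repeated inclusion within one TU (normally a compiler redefinition error); they do not stop the same definition from appearing in several TUs.
- Duplicate source files: the same .cpp file is compiled twice in your build system.
- C++17 fix: use inline or constexpr for header-defined variables.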
Symbol Not Found (macOS)
ld: symbol(s) not found for architecture arm64
This is macOS’s equivalent of "undefined reference". The same causes and fixes apply. Check that you’re linking the correct architecture and that the library was built for the same platform.
Related Concepts
- Dynamic Linking: Runtime symbol resolution with GOT/PLT
- Linking Process: The full linking pipeline from .o to executable
- Memory Layout: How the linker arranges sections in the final binary
- Compilation Pipeline: The full build process from .cpp to executable
Further Reading
- Linkers and Loaders - John Levine’s comprehensive book on linking
- How C++ Linkers Work - Practical guide to C++ linking
- Itanium C++ ABI - The name mangling specification used by GCC and Clang
- GCC Symbol Visibility - Guide to controlling exported symbols
