Abstract Syntax Tree (AST)
The AST is a tree representation of your code's syntactic structure. Each node represents a construct in the source code, from functions to expressions.
Parsing Phases
1. Lexical Analysis (Tokenization)
Breaks code into tokens:
- Keywords: int, if, return
- Identifiers: variable and function names
- Literals: 42, "string", 3.14
- Operators: +, ->, ::
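The four token categories above can be illustrated with a minimal tokenizer sketch. It is not how any production compiler lexes C++: the names `TokenKind`, `Token`, and `tokenize` are hypothetical, only a handful of keywords and operators are recognized, string literals may not contain escapes, and punctuators such as `;` and `(` are lumped in with operators for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical token categories mirroring the list above.
enum class TokenKind { Keyword, Identifier, Literal, Operator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Simplified tokenizer: a sketch, not a conforming C++ lexer.
std::vector<Token> tokenize(const std::string& src) {
    static const std::unordered_set<std::string> keywords = {"int", "if", "return"};
    // Longer operators come first so "->" is not split into "-" and ">".
    static const std::vector<std::string> operators = {
        "->", "::", "+", "-", "*", "/", "=", ";", "(", ")", "{", "}"};

    auto is_ident_char = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
    };

    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }

        if (std::isalpha(c) || c == '_') {            // keyword or identifier
            const std::size_t start = i;
            while (i < src.size() && is_ident_char(src[i])) ++i;
            const std::string word = src.substr(start, i - start);
            const TokenKind kind =
                keywords.count(word) ? TokenKind::Keyword : TokenKind::Identifier;
            tokens.push_back({kind, word});
            continue;
        }
        if (std::isdigit(c)) {                        // numeric literal: digits and dots
            const std::size_t start = i;
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.')) ++i;
            tokens.push_back({TokenKind::Literal, src.substr(start, i - start)});
            continue;
        }
        if (c == '"') {                               // string literal, no escapes
            const std::size_t end = src.find('"', i + 1);
            if (end == std::string::npos) throw std::runtime_error("unterminated string literal");
            tokens.push_back({TokenKind::Literal, src.substr(i, end - i + 1)});
            i = end + 1;
            continue;
        }
        bool matched = false;                         // operator or punctuator
        for (const std::string& op : operators) {
            if (src.compare(i, op.size(), op) == 0) {
                tokens.push_back({TokenKind::Operator, op});
                i += op.size();
                matched = true;
                break;
            }
        }
        if (!matched) throw std::runtime_error(std::string("unexpected character: ") + src[i]);
    }
    return tokens;
}
```

Calling `tokenize("int x = 42;")` under these assumptions yields five tokens: a keyword, an identifier, an operator, a literal, and a trailing punctuator.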
2. Syntax Analysis
Builds the AST according to grammar rules:
function_declaration
├── return_type
├── function_name
├── parameter_list
└── compound_statement
    └── statements...
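To show how grammar rules turn into tree nodes, here is a small recursive-descent parser sketch. It works at expression scale rather than on full function declarations, and the grammar, the `Expr` node type, the `Parser` class, and `to_string` are all assumptions made for illustration. Each grammar rule becomes one member function, and operator precedence falls out of which rule calls which.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical node type: a leaf holds a number, an interior node holds an operator.
struct Expr {
    std::string text;                // literal digits or operator symbol
    std::unique_ptr<Expr> lhs, rhs;  // both null for a leaf
};

class Parser {
public:
    explicit Parser(std::string src) : src_(std::move(src)) {}

    // expr := term (('+' | '-') term)*
    std::unique_ptr<Expr> parse_expr() {
        auto node = parse_term();
        while (peek() == '+' || peek() == '-') {
            std::string op(1, next());
            auto rhs = parse_term();
            node = make(std::move(op), std::move(node), std::move(rhs));
        }
        return node;
    }

private:
    // term := number (('*' | '/') number)*
    std::unique_ptr<Expr> parse_term() {
        auto node = parse_number();
        while (peek() == '*' || peek() == '/') {
            std::string op(1, next());
            auto rhs = parse_number();
            node = make(std::move(op), std::move(node), std::move(rhs));
        }
        return node;
    }

    // number := digit+
    std::unique_ptr<Expr> parse_number() {
        skip_spaces();
        const std::size_t start = pos_;
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        if (start == pos_) throw std::runtime_error("expected a number");
        return make(src_.substr(start, pos_ - start), nullptr, nullptr);
    }

    static std::unique_ptr<Expr> make(std::string text,
                                      std::unique_ptr<Expr> lhs,
                                      std::unique_ptr<Expr> rhs) {
        auto node = std::make_unique<Expr>();
        node->text = std::move(text);
        node->lhs = std::move(lhs);
        node->rhs = std::move(rhs);
        return node;
    }

    void skip_spaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    // Returns the next non-space character, or '\0' at end of input.
    char peek() { skip_spaces(); return pos_ < src_.size() ? src_[pos_] : '\0'; }
    char next() { const char c = peek(); ++pos_; return c; }

    std::string src_;
    std::size_t pos_ = 0;
};

// Renders the tree in prefix form so its shape is visible.
std::string to_string(const Expr& e) {
    if (!e.lhs) return e.text;
    return "(" + e.text + " " + to_string(*e.lhs) + " " + to_string(*e.rhs) + ")";
}
```

For example, `to_string(*Parser("1 + 2 * 3").parse_expr())` produces `(+ 1 (* 2 3))`: the multiplication sits deeper in the tree because `parse_term` is called from `parse_expr`. The sketch does not reject trailing input; a real parser would.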
3. Semantic Analysis
- Type checking
- Name resolution
- Template instantiation
- Overload resolution
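Two of these steps, name resolution and type checking, can be sketched over a toy expression tree. Everything here is hypothetical (`Type`, `Expr`, `Scope`, `check`, and the helper constructors), and the typing rules belong to a toy language rather than to C++: `int + double` yields `double`, and strings do not support `+`.

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class Type { Int, Double, String };
enum class Kind { Literal, Name, Add };

// Hypothetical expression node for a toy language.
struct Expr {
    Kind kind = Kind::Literal;
    Type literal_type = Type::Int;   // used when kind == Literal
    std::string name;                // used when kind == Name
    std::shared_ptr<Expr> lhs, rhs;  // used when kind == Add
};

// A single flat scope mapping declared names to their types.
using Scope = std::map<std::string, Type>;

std::shared_ptr<Expr> lit(Type t) {
    auto e = std::make_shared<Expr>();
    e->kind = Kind::Literal;
    e->literal_type = t;
    return e;
}
std::shared_ptr<Expr> name(std::string n) {
    auto e = std::make_shared<Expr>();
    e->kind = Kind::Name;
    e->name = std::move(n);
    return e;
}
std::shared_ptr<Expr> add(std::shared_ptr<Expr> l, std::shared_ptr<Expr> r) {
    auto e = std::make_shared<Expr>();
    e->kind = Kind::Add;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Returns the expression's type, or std::nullopt after appending a diagnostic.
std::optional<Type> check(const Expr& e, const Scope& scope, std::vector<std::string>& errors) {
    switch (e.kind) {
    case Kind::Literal:
        return e.literal_type;
    case Kind::Name: {
        // Name resolution: look the identifier up in the scope.
        const auto it = scope.find(e.name);
        if (it == scope.end()) {
            errors.push_back("use of undeclared identifier '" + e.name + "'");
            return std::nullopt;
        }
        return it->second;
    }
    case Kind::Add: {
        // Type checking: operand types determine the result type.
        const auto l = check(*e.lhs, scope, errors);
        const auto r = check(*e.rhs, scope, errors);
        if (!l || !r) return std::nullopt;  // an operand already failed
        if (*l == Type::String || *r == Type::String) {
            errors.push_back("invalid operands to '+'");
            return std::nullopt;
        }
        return (*l == Type::Double || *r == Type::Double) ? Type::Double : Type::Int;
    }
    }
    return std::nullopt;
}
```

With a scope that declares `x` as `Int` and `y` as `Double`, checking `x + y` returns `Double`, while checking an undeclared `z` returns `std::nullopt` and records one diagnostic. Template instantiation and overload resolution are far more involved and are not attempted here.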
Why AST Matters
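- Enables optimization: Compilers analyze and transform the tree, or lower it to an intermediate representation for further optimization passes
- Powers tooling: IDEs and linters use the AST for refactoring, navigation, and static analysis
- Template processing: The compiler instantiates a template by building new AST nodes from the template's own tree
- Error detection: Semantic errors, such as type mismatches and undeclared names, are found by analyzing the tree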
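As one concrete instance of transforming the tree, the sketch below performs constant folding: any subtree whose operands are both constants is replaced by a single constant node. The `Expr` layout and the helpers `constant`, `variable`, `binary`, `is_constant`, and `fold` are hypothetical, only `+` and `*` are handled, and integer overflow is ignored.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node: a leaf is a constant or a variable,
// an interior node is a binary '+' or '*'.
struct Expr {
    char op = 0;                     // 0 marks a leaf; otherwise '+' or '*'
    int value = 0;                   // constant value (leaf with empty var)
    std::string var;                 // variable name (leaf); empty for constants
    std::unique_ptr<Expr> lhs, rhs;  // operands of a binary node
};

std::unique_ptr<Expr> constant(int v) {
    auto e = std::make_unique<Expr>();
    e->value = v;
    return e;
}
std::unique_ptr<Expr> variable(std::string n) {
    auto e = std::make_unique<Expr>();
    e->var = std::move(n);
    return e;
}
std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

bool is_constant(const Expr& e) { return e.op == 0 && e.var.empty(); }

// Folds bottom-up: children first, then the node itself.
std::unique_ptr<Expr> fold(std::unique_ptr<Expr> e) {
    if (e->op == 0) return e;  // leaves are already as simple as they get
    e->lhs = fold(std::move(e->lhs));
    e->rhs = fold(std::move(e->rhs));
    if (is_constant(*e->lhs) && is_constant(*e->rhs)) {
        if (e->op == '+') return constant(e->lhs->value + e->rhs->value);
        if (e->op == '*') return constant(e->lhs->value * e->rhs->value);
    }
    return e;  // at least one operand is not constant, or the operator is unhandled
}
```

Folding `2 + 3 * 4` collapses the whole tree into the constant 14, while folding `x + 2 * 3` keeps the `+` node and only replaces its right operand with the constant 6.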
Viewing the AST
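# Clang AST dump (-fsyntax-only skips code generation and linking)
clang++ -Xclang -ast-dump -fsyntax-only main.cpp
# GCC tree dump (written to a dump file rather than to stdout)
g++ -fdump-tree-original main.cpp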
Common AST Nodes
- FunctionDecl: Function declarations
- CompoundStmt: Block statements {...}
- IfStmt: Conditional statements
- CallExpr: Function calls
- BinaryOperator: Binary operations (+, -, *, /)
- DeclRefExpr: References to declared entities such as variables and functions
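To connect these node kinds to source code, the small sample below is annotated with the nodes Clang would typically report for each construct. The functions `square` and `capped_square` are made up for illustration, and the exact shape of a dump varies between Clang versions, so treat the comments as a rough guide rather than verbatim output.

```cpp
// FunctionDecl 'square' with one ParmVarDecl 'x'.
int square(int x) {
    // ReturnStmt containing a BinaryOperator '*'; each operand is a
    // DeclRefExpr to 'x' (Clang wraps these in an ImplicitCastExpr
    // for the lvalue-to-rvalue conversion).
    return x * x;
}

// FunctionDecl 'capped_square'; its body is a CompoundStmt.
int capped_square(int v, int limit) {
    // IfStmt whose condition is a BinaryOperator '>' with a CallExpr
    // on the left and a DeclRefExpr to 'limit' on the right.
    if (square(v) > limit) {
        // Nested CompoundStmt holding a ReturnStmt.
        return limit;
    }
    // CallExpr to 'square' with a DeclRefExpr to 'v' as its argument.
    return square(v);
}
```

Saving this as main.cpp and running the Clang dump command from the Viewing the AST section should show these nodes nested in roughly this order.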
Next Steps
- Learn about Compiler Optimization
- Understand Symbol Tables
- Explore the Preprocessor
