Python Bytecode Compilation

Explore CPython bytecode compilation from source to .pyc files. Learn the dis module, PVM stack operations, and Python 3.11+ adaptive specialization.

What is Python Bytecode?

Python bytecode is an intermediate representation of Python source code. When you run a Python program, it's first compiled to bytecode, which is then executed by the Python Virtual Machine (PVM).
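The two phases can be observed separately with the built-in compile() and exec(). The snippet below is a minimal sketch: the source string and the "<example>" filename are illustrative placeholders, not part of any real program.

```python
# Illustrative source text; "<example>" is a placeholder filename that
# would appear in tracebacks.
source = "result = 3 + 5\n"

# Phase 1: compile the source text into a code object, which holds the bytecode.
code = compile(source, "<example>", "exec")
print(type(code).__name__)                      # code
print(len(code.co_code), "bytes of bytecode")   # exact size varies by version

# Phase 2: hand the code object to the interpreter for execution.
namespace = {}
exec(code, namespace)
print(namespace["result"])                      # 8
```

Normally both phases happen implicitly when a script runs or a module is imported; calling them by hand just makes the boundary visible.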

Python Bytecode Visualization

Python Source

def add(a, b):
    return a + b

result = add(3, 5)

Bytecode Instructions (as emitted by CPython 3.10 and earlier)

 0  LOAD_CONST     0  (<code object add>)
 2  LOAD_CONST     1  ('add')
 4  MAKE_FUNCTION  0
 6  STORE_NAME     0  (add)
 8  LOAD_NAME      0  (add)
10  LOAD_CONST     2  (3)
12  LOAD_CONST     3  (5)
14  CALL_FUNCTION  2
16  STORE_NAME     1  (result)
18  LOAD_CONST     4  (None)
20  RETURN_VALUE

How Python Bytecode Works

  • Python compiles source code to bytecode before execution
  • Bytecode is platform-independent but specific to the CPython version, and is cached in .pyc files
  • The Python Virtual Machine (PVM) executes bytecode instructions
  • Most instructions read or modify the value stack; some also read or write the local/global namespaces

The Compilation Process

1. Source Code to Tokens

Python first tokenizes your source code:

# Source code
def add(a, b):
    return a + b

# Tokens
NAME 'def'
NAME 'add'
OP   '('
NAME 'a'
OP   ','
NAME 'b'
OP   ')'
OP   ':'
# ...
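A comparable token stream can be produced with the standard-library tokenize module. This is a sketch for inspection only: tokenize is a Python-level view of tokenization, and its output includes extra bookkeeping tokens (NEWLINE, INDENT, DEDENT, ENDMARKER) that the abbreviated listing above leaves out.

```python
import io
import tokenize

source = "def add(a, b):\n    return a + b\n"

# generate_tokens() expects a readline-style callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is an integer; tok_name maps it to a readable name such as NAME or OP.
    print(f"{tokenize.tok_name[tok.type]:<10} {tok.string!r}")
```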

2. Tokens to AST

The tokens are parsed into an Abstract Syntax Tree:

import ast
import inspect

def add(a, b):
    return a + b

# Get the AST
# (inspect.getsource() reads the source file, so this may fail in an interactive session)
tree = ast.parse(inspect.getsource(add))
print(ast.dump(tree))

3. AST to Bytecode

The AST is compiled to bytecode instructions:

import dis

def add(a, b):
    return a + b

dis.dis(add)

# Output on CPython 3.10 and earlier
# (3.11+ adds RESUME and emits BINARY_OP instead of BINARY_ADD):
#   2           0 LOAD_FAST                0 (a)
#               2 LOAD_FAST                1 (b)
#               4 BINARY_ADD
#               6 RETURN_VALUE
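The AST-to-bytecode step can also be driven by hand, because compile() accepts an AST as well as source text. The following sketch assumes an inline source string and a placeholder filename "<ast>"; it prints whatever opcode names the running interpreter produces rather than assuming a particular version's output.

```python
import ast
import dis

source = "def add(a, b):\n    return a + b\n"

# Stage 2: source text -> AST
tree = ast.parse(source)

# Stage 3: AST -> code object ("<ast>" is a placeholder filename)
module_code = compile(tree, filename="<ast>", mode="exec")

# Executing the module-level code defines `add` in the namespace.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# List the opcode names of the compiled function; names differ across versions.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)
print(add(3, 5))  # 8
```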

Understanding Bytecode Instructions

Common Instructions

Instruction          Description               Stack Effect
LOAD_FAST            Load local variable       Push value
LOAD_GLOBAL          Load global variable      Push value
STORE_FAST           Store to local variable   Pop value
BINARY_ADD           Add two values            Pop 2, push 1
CALL_FUNCTION        Call a function           Pop args+func, push result
RETURN_VALUE         Return from function      Pop value
POP_TOP              Remove top of stack       Pop 1
POP_JUMP_IF_FALSE    Conditional jump          Pop 1, jump if it is false

Opcode names vary between CPython versions: in 3.11+ BINARY_ADD is replaced by BINARY_OP and CALL_FUNCTION by CALL.

The Value Stack

Python's VM is stack-based. Operations manipulate a value stack:

# Expression: a + b * c
# Bytecode execution (opcode names from CPython 3.10 and earlier):
LOAD_FAST 0 (a)     # Stack: [a]
LOAD_FAST 1 (b)     # Stack: [a, b]
LOAD_FAST 2 (c)     # Stack: [a, b, c]
BINARY_MULTIPLY     # Stack: [a, b*c]
BINARY_ADD          # Stack: [a+b*c]
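To make the stack discipline concrete, here is a toy stack machine that evaluates the same expression. It is a teaching sketch, not how CPython is implemented: the operation names LOAD, MUL and ADD and the run() helper are hypothetical stand-ins for the real opcodes.

```python
def run(program, variables):
    """Execute (operation, argument) pairs on a value stack and return the result."""
    stack = []
    for op, arg in program:
        if op == "LOAD":                 # push a variable's value
            stack.append(variables[arg])
        elif op in ("ADD", "MUL"):       # pop two operands, push one result
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == "ADD" else left * right)
        else:
            raise ValueError(f"unknown operation: {op}")
        print(f"{op:<5}{arg or '':<3} stack: {stack}")
    return stack.pop()

# a + b * c: the multiplication runs first because its operands are on top.
program = [("LOAD", "a"), ("LOAD", "b"), ("LOAD", "c"), ("MUL", None), ("ADD", None)]
result = run(program, {"a": 2, "b": 3, "c": 4})
print(result)  # 14
```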

Bytecode Caching

.pyc Files

Python caches compiled bytecode in __pycache__ directories:

mymodule.py
__pycache__/
    mymodule.cpython-311.pyc    # Python 3.11 bytecode
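The cache location for a given source file can be computed with importlib.util.cache_from_source(). In this small sketch, "mymodule.py" is only a path string and the file does not need to exist. It assumes a CPython interpreter, where sys.implementation.cache_tag is set.

```python
import importlib.util
import sys

# Where would the import system cache the bytecode for this source path?
cache_path = importlib.util.cache_from_source("mymodule.py")
print(cache_path)

# The version tag embedded in the file name, e.g. "cpython-311" on CPython 3.11.
print(sys.implementation.cache_tag)
```

The tag in the file name is what lets several interpreter versions share one __pycache__ directory without overwriting each other's files.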

Cache Validation

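Before reusing a cached .pyc file, Python checks whether the source has changed:

  1. By default, compare the source file's modification timestamp and size with the values recorded in the .pyc header
  2. For hash-based .pyc files (opt-in, Python 3.7+), compare a hash of the source instead
  3. Recompile if the check fails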
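The data used for these checks lives in the 16-byte .pyc header. The sketch below assumes Python 3.7 or later and uses a throwaway module written to a temporary directory. It compiles the module once in timestamp mode and once in hash mode, then reads the header fields back. The read_header() helper and the file names are illustrative.

```python
import importlib.util
import os
import py_compile
import tempfile

def read_header(pyc_path):
    """Return (magic, flags, remaining 8 header bytes) from a .pyc file."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    return header[:4], int.from_bytes(header[4:8], "little"), header[8:16]

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "mymodule.py")      # throwaway module
    with open(src, "w") as f:
        f.write("x = 1\n")

    # Timestamp-based .pyc: the header stores the source mtime and size.
    ts_pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "ts.pyc"), doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP)
    magic, ts_flags, rest = read_header(ts_pyc)
    stat = os.stat(src)
    mtime_ok = int.from_bytes(rest[:4], "little") == (int(stat.st_mtime) & 0xFFFFFFFF)
    size_ok = int.from_bytes(rest[4:], "little") == (stat.st_size & 0xFFFFFFFF)

    # Hash-based .pyc: the header stores a hash of the source bytes instead.
    hash_pyc = py_compile.compile(
        src, cfile=os.path.join(tmp, "hash.pyc"), doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.CHECKED_HASH)
    _, hash_flags, stored_hash = read_header(hash_pyc)
    with open(src, "rb") as f:
        hash_ok = stored_hash == importlib.util.source_hash(f.read())

print("magic matches interpreter:", magic == importlib.util.MAGIC_NUMBER)
print("timestamp mode: flags =", ts_flags, "mtime ok:", mtime_ok, "size ok:", size_ok)
print("hash mode:      flags =", hash_flags, "hash ok:", hash_ok)
```

The magic number at the start of the header identifies the bytecode version, so a .pyc written by a different interpreter version is rejected before either check runs.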

Inspecting Bytecode

Using the dis Module

import dis

# Disassemble a function
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

dis.dis(factorial)

Bytecode Objects

# Access bytecode directly
code = factorial.__code__
print(f"Argument count: {code.co_argcount}")
print(f"Local variables: {code.co_nlocals}")
print(f"Stack size: {code.co_stacksize}")
print(f"Constants: {code.co_consts}")
print(f"Variable names: {code.co_varnames}")
print(f"Bytecode: {code.co_code.hex()}")
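The raw bytes in co_code can be decoded by hand, which shows what dis does under the hood. This sketch relies on the two-bytes-per-instruction layout used since Python 3.6 and ignores EXTENDED_ARG handling. On 3.11+ the listing may also contain CACHE entries, which dis.dis() hides by default.

```python
import dis

def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

raw = factorial.__code__.co_code

# Each instruction is one opcode byte followed by one argument byte.
decoded = [(offset, dis.opname[raw[offset]], raw[offset + 1])
           for offset in range(0, len(raw), 2)]

for offset, name, arg in decoded:
    print(f"{offset:>4} {name:<28} {arg}")
```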

Control Flow in Bytecode

Conditional Execution

def check_positive(x):
    if x > 0:
        return "positive"
    return "non-positive"

# Bytecode uses jumps:
# LOAD_FAST          0 (x)
# LOAD_CONST         1 (0)
# COMPARE_OP         4 (>)
# POP_JUMP_IF_FALSE  (to the 'non-positive' branch)
# LOAD_CONST         2 ('positive')
# RETURN_VALUE
# LOAD_CONST         3 ('non-positive')
# RETURN_VALUE
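Jump instructions and their targets can be pulled out programmatically. The sketch below filters on the substring "JUMP" in the opcode name, a simple heuristic that avoids depending on any one version's exact opcode names.

```python
import dis

def check_positive(x):
    if x > 0:
        return "positive"
    return "non-positive"

# Keep only the instructions whose name mentions a jump.
jumps = [ins for ins in dis.get_instructions(check_positive) if "JUMP" in ins.opname]

for ins in jumps:
    # For jump instructions, argval is the resolved target offset.
    print(f"offset {ins.offset}: {ins.opname} -> target {ins.argval}")
```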

Loops

def sum_range(n):
    total = 0
    for i in range(n):
        total += i
    return total

# Loop bytecode uses:
# GET_ITER
# FOR_ITER
# JUMP_ABSOLUTE (back to loop start; JUMP_BACKWARD in 3.11+)
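The loop machinery is easy to confirm on whatever interpreter is at hand. This sketch lists the iterator and jump instructions in sum_range. The name of the backward jump depends on the version, so the code prints it rather than assuming it.

```python
import dis

def sum_range(n):
    total = 0
    for i in range(n):
        total += i
    return total

instructions = list(dis.get_instructions(sum_range))
names = [ins.opname for ins in instructions]

# Show only the instructions that implement the loop itself.
for ins in instructions:
    if ins.opname in ("GET_ITER", "FOR_ITER") or "JUMP" in ins.opname:
        print(f"{ins.offset:>4} {ins.opname:<20} {ins.argrepr}")
```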

Python 3.11+ Improvements

Adaptive Bytecode

Python 3.11 specializes bytecode based on runtime behavior:

def add_numbers(a, b):
    return a + b

# First calls: BINARY_OP (generic)
# After ~8 int additions: BINARY_OP_ADD_INT (specialized)
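On CPython 3.11 or later, specialized instructions can be viewed by passing adaptive=True to the dis functions. This is a hedged sketch: the warm-up count of 1000 calls is an arbitrary choice, and whether and when an instruction is specialized is an implementation detail that differs between releases, so the output is printed rather than asserted.

```python
import dis
import sys

def add_numbers(a, b):
    return a + b

# Warm the function up with int operands so the interpreter has a chance to specialize.
for _ in range(1000):
    add_numbers(1, 2)

if sys.version_info >= (3, 11):
    # adaptive=True shows the specialized instructions, if any were installed.
    instructions = dis.get_instructions(add_numbers, adaptive=True)
else:
    # Older versions have no adaptive interpreter and no such parameter.
    instructions = dis.get_instructions(add_numbers)

names = [ins.opname for ins in instructions]
print(names)
```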

Inline Caching

Frequently accessed attributes are cached inline:

# Before: LOAD_ATTR requires dictionary lookup
# After: LOAD_ATTR_SLOT uses cached offset

Performance Implications

What's Fast

  • Local variable access (LOAD_FAST)
  • Built-in operations
  • Specialized bytecode (3.11+)

What's Slow

  • Global variable access (LOAD_GLOBAL)
  • Attribute lookup (LOAD_ATTR)
  • Function calls (CALL_FUNCTION; CALL in 3.11+)
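The difference between local and global access shows up directly in the disassembly. The sketch below uses two hypothetical functions: one looks up len as a global (builtin), the other receives it as a default argument so that it becomes a local. Treat the default-argument trick as an illustration rather than a recommendation, since the gain is small and specialization in 3.11+ narrows it further.

```python
import dis

def length_global(items):
    return len(items)             # `len` is resolved with LOAD_GLOBAL

def length_local(items, _len=len):
    return _len(items)            # `_len` is a local, loaded by a LOAD_FAST-family opcode

def opnames(func):
    """Return the opcode names of a function's bytecode."""
    return [ins.opname for ins in dis.get_instructions(func)]

print("global version:", opnames(length_global))
print("local version: ", opnames(length_local))
```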

Practical Example

# Original code
def process_list(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item * 2)
    return result

# More efficient (fewer bytecode instructions)
def process_list_optimized(items):
    return [item * 2 for item in items if item > 0]
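A quick way to check the claim on a given interpreter is to confirm that both versions agree and then time them with timeit. The input data and repetition count below are arbitrary choices for this sketch, and the measured gap will vary with Python version and hardware.

```python
import timeit

def process_list(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item * 2)
    return result

def process_list_optimized(items):
    return [item * 2 for item in items if item > 0]

data = list(range(-500, 500))   # arbitrary sample input

# Both implementations must produce the same output before timing means anything.
same_output = process_list(data) == process_list_optimized(data)
print("same output:", same_output)

loop_time = timeit.timeit(lambda: process_list(data), number=2000)
comp_time = timeit.timeit(lambda: process_list_optimized(data), number=2000)
print(f"explicit loop:      {loop_time:.3f}s")
print(f"list comprehension: {comp_time:.3f}s")
```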

Key Takeaways

  1. Python compiles to bytecode before execution
  2. Bytecode is cached in .pyc files for faster imports
  3. The PVM is stack-based - operations manipulate a value stack
  4. Local variables are fastest - they use indexed access
  5. Python 3.11+ adapts bytecode based on runtime behavior

Understanding bytecode helps you write more efficient Python code and debug performance issues at a deeper level.
