Python Bytecode Compilation

What is Python Bytecode?

Python bytecode is an intermediate representation of Python source code. When you run a Python program, it's first compiled to bytecode, which is then executed by the Python Virtual Machine (PVM).

Python Bytecode Visualization

Python Source

def add(a, b):
    return a + b

result = add(3, 5)

Bytecode Instructions

 0  LOAD_CONST      0 (<code object add>)
 2  LOAD_CONST      1 ('add')
 4  MAKE_FUNCTION   0
 6  STORE_NAME      0 (add)
 8  LOAD_NAME       0 (add)
10  LOAD_CONST      2 (3)
12  LOAD_CONST      3 (5)
14  CALL_FUNCTION   2
16  STORE_NAME      1 (result)
18  LOAD_CONST      4 (None)
20  RETURN_VALUE

How Python Bytecode Works

• Python compiles source code to bytecode before execution
• Bytecode is platform-independent but specific to the interpreter version, and is cached in .pyc files
• The Python Virtual Machine (PVM) executes bytecode instructions
• Each instruction manipulates the value stack and local/global namespaces
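
As a rough illustration of the points above, the built-in compile() function turns source text into a code object, and exec() runs it. The file name "<example>" and the variable names are placeholders chosen for this sketch.

```python
# Compile source text to a code object, then execute it.
source = "def add(a, b):\n    return a + b\n\nresult = add(3, 5)\n"

# "<example>" is a placeholder file name used only in tracebacks.
code_obj = compile(source, "<example>", "exec")

namespace = {}
exec(code_obj, namespace)  # the interpreter runs the compiled bytecode

print(type(code_obj).__name__)                        # code
print(namespace["result"])                            # 8
print(len(code_obj.co_code), "bytes of raw bytecode")  # size varies by version
```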

The Compilation Process

1. Source Code to Tokens

Python first tokenizes your source code:

# Source code
def add(a, b):
    return a + b

# Tokens
NAME 'def'
NAME 'add'
OP '('
NAME 'a'
OP ','
NAME 'b'
OP ')'
OP ':'
# ...
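
The token listing above can be reproduced with the standard tokenize module. This is a minimal sketch; the source string is the same add function, held in a variable named source for illustration.

```python
import io
import tokenize

source = "def add(a, b):\n    return a + b\n"

# generate_tokens() takes a readline callable that returns lines of source text.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

for tok in tokens:
    # tok.type is a number; tok_name maps it to a label such as NAME or OP.
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

The full output also includes NEWLINE, INDENT, DEDENT and ENDMARKER tokens, which the abbreviated listing leaves out.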

2. Tokens to AST

The tokens are parsed into an Abstract Syntax Tree:

import ast
import inspect

def add(a, b):
    return a + b

# Get the AST
tree = ast.parse(inspect.getsource(add))
print(ast.dump(tree))
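
inspect.getsource() needs the function to live in a file, so it can fail in an interactive session. A variant that parses a source string directly, and then looks at the node types in the tree, might look like this (the variable names are illustrative):

```python
import ast

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# ast.walk() visits every node in the tree (in no guaranteed order).
node_types = [type(node).__name__ for node in ast.walk(tree)]
print(node_types)

# The function body holds a Return node whose value is a BinOp using Add.
func = tree.body[0]
ret = func.body[0]
print(type(func).__name__, type(ret).__name__, type(ret.value.op).__name__)
```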

3. AST to Bytecode

The AST is compiled to bytecode instructions:

import dis

def add(a, b):
    return a + b

dis.dis(add)
# Output (Python 3.10 and earlier; newer versions differ, e.g. 3.11+ uses BINARY_OP in place of BINARY_ADD):
#   2           0 LOAD_FAST                0 (a)
#               2 LOAD_FAST                1 (b)
#               4 BINARY_ADD
#               6 RETURN_VALUE
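
To connect this step to the previous one, compile() also accepts an AST directly. The sketch below compiles a parsed tree, runs it, and lists the opcode names the current interpreter produced. Since opcode names change between versions, it prints whatever it finds rather than assuming a fixed listing. The file name "<ast-example>" is a placeholder.

```python
import ast
import dis

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)

# compile() accepts an AST as well as source text.
module_code = compile(tree, "<ast-example>", "exec")
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# List the instructions this interpreter generated for add().
opnames = [instr.opname for instr in dis.get_instructions(add)]
print(opnames)
print(add(3, 5))
```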

Understanding Bytecode Instructions

Common Instructions

Instruction	Description	Stack Effect
`LOAD_FAST`	Load local variable	Push value
`LOAD_GLOBAL`	Load global variable	Push value
`STORE_FAST`	Store to local variable	Pop value
`BINARY_ADD`	Add two values	Pop 2, push 1
`CALL_FUNCTION`	Call a function	Pop args+func, push result
`RETURN_VALUE`	Return from function	Pop value
`POP_TOP`	Remove top of stack	Pop 1
`POP_JUMP_IF_FALSE`	Jump if top of stack is false	Pop 1
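
The "Stack Effect" column can be checked against the interpreter with dis.stack_effect(), which reports the net change in stack depth for an opcode. The sketch below uses only POP_TOP and LOAD_FAST, on the assumption that both names exist in the running version (some opcodes in the table were renamed or removed in 3.11+).

```python
import dis

# Net change in stack depth: negative means values are popped.
pop_top_effect = dis.stack_effect(dis.opmap["POP_TOP"])

# Opcodes that take an argument need one passed in; 0 is an arbitrary index.
load_fast_effect = dis.stack_effect(dis.opmap["LOAD_FAST"], 0)

print("POP_TOP:", pop_top_effect)
print("LOAD_FAST:", load_fast_effect)
```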

The Value Stack

Python's VM is stack-based. Operations manipulate a value stack:

# Expression: a + b * c
# Bytecode execution:

LOAD_FAST    0 (a)    # Stack: [a]
LOAD_FAST    1 (b)    # Stack: [a, b]
LOAD_FAST    2 (c)    # Stack: [a, b, c]
BINARY_MULTIPLY       # Stack: [a, b*c]
BINARY_ADD            # Stack: [a+b*c]
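
The trace above can be mimicked with a toy interpreter. This is a teaching sketch, not how CPython is implemented: it supports only three instruction names (borrowed from Python 3.10 and earlier), and the run() function and its tuple-based program format are made up for this example.

```python
import operator

def run(program, local_vars):
    """Execute a tiny subset of stack instructions and return the final stack."""
    stack = []
    binary_ops = {"BINARY_MULTIPLY": operator.mul, "BINARY_ADD": operator.add}
    for opname, arg in program:
        if opname == "LOAD_FAST":
            stack.append(local_vars[arg])   # push a local variable by index
        elif opname in binary_ops:
            right = stack.pop()             # top of stack is the right operand
            left = stack.pop()
            stack.append(binary_ops[opname](left, right))
        else:
            raise ValueError(f"unsupported instruction: {opname}")
        print(f"{opname:<16} stack={stack}")
    return stack

# a + b * c with a=2, b=3, c=4
program = [
    ("LOAD_FAST", 0),
    ("LOAD_FAST", 1),
    ("LOAD_FAST", 2),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
]
final_stack = run(program, [2, 3, 4])
```

Because the multiplication runs while a is still lower on the stack, operator precedence is already encoded in the instruction order.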

Bytecode Caching

.pyc Files

Python caches compiled bytecode in __pycache__ directories:

mymodule.py
__pycache__/
    mymodule.cpython-311.pyc  # Python 3.11 bytecode
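
The cache path for a given source file can be computed with importlib.util.cache_from_source(). The sketch below assumes a default configuration; the exact directory and tag depend on the interpreter and on settings such as a custom pycache prefix.

```python
import importlib.util
import sys

# Computes the .pyc path for a source path without touching the disk.
pyc_path = importlib.util.cache_from_source("mymodule.py")

print(pyc_path)
print(sys.implementation.cache_tag)  # for example "cpython-311" on CPython 3.11
```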

Cache Validation

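Before reusing a cached .pyc file, Python checks that it still matches the source:

• By default, compare the source's modification time and size with the values stored in the .pyc header
• Optionally, compare a hash of the source instead (hash-based .pyc files, Python 3.7+)
• Recompile and rewrite the cache if the check fails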
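
To make the timestamp check concrete, the sketch below compiles a throwaway module and reads the 16-byte header of the resulting .pyc file. It assumes the layout used since Python 3.7 (magic number, flags, source mtime, source size) and forces timestamp mode explicitly. The file names are placeholders.

```python
import importlib.util
import os
import py_compile
import struct
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "mymodule.py")
    with open(src_path, "w") as f:
        f.write("x = 1\n")

    # Force timestamp-based invalidation so the header layout below applies.
    pyc_path = py_compile.compile(
        src_path,
        cfile=os.path.join(tmp, "mymodule.pyc"),
        doraise=True,
        invalidation_mode=py_compile.PycInvalidationMode.TIMESTAMP,
    )
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    src_stat = os.stat(src_path)

magic = header[:4]
flags, mtime, size = struct.unpack("<III", header[4:16])

print("magic matches interpreter:", magic == importlib.util.MAGIC_NUMBER)
print("flags:", flags)  # 0 means timestamp-based validation
print("stored mtime:", mtime, "stored size:", size)
```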

Inspecting Bytecode

Using the dis Module

import dis

# Disassemble a function
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

dis.dis(factorial)
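
dis.dis() prints text. When the instructions need to be processed in code, dis.get_instructions() yields one object per instruction instead. A small sketch that tallies the opcodes in factorial (the Counter-based tally is just one possible use):

```python
import dis
from collections import Counter

def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

# Each Instruction carries the offset, opcode name and a readable argument.
instructions = list(dis.get_instructions(factorial))
for instr in instructions:
    print(instr.offset, instr.opname, instr.argrepr)

opcode_counts = Counter(instr.opname for instr in instructions)
print(opcode_counts.most_common(3))
```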

Bytecode Objects

# Access bytecode directly
code = factorial.__code__

print(f"Argument count: {code.co_argcount}")
print(f"Local variables: {code.co_nlocals}")
print(f"Stack size: {code.co_stacksize}")
print(f"Constants: {code.co_consts}")
print(f"Variable names: {code.co_varnames}")
print(f"Bytecode: {code.co_code.hex()}")
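
A code object is separate from the function that wraps it. As a sketch of that relationship, types.FunctionType can build a second function around an existing code object. The names double and double_clone are made up for this example, and the empty dict works as globals only because the function uses no global names.

```python
import types

def double(x):
    return x * 2

code = double.__code__

# Build a new function that shares the same code object.
clone = types.FunctionType(code, {}, "double_clone")

print(clone(21))
print(clone.__name__, clone.__code__ is code)
print(code.co_argcount, code.co_varnames)
```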

Control Flow in Bytecode

Conditional Execution

def check_positive(x):
    if x > 0:
        return "positive"
    return "non-positive"

# Bytecode uses jumps (Python 3.10 and earlier; byte offsets on the left):
#  0 LOAD_FAST          0 (x)
#  2 LOAD_CONST         1 (0)
#  4 COMPARE_OP         4 (>)
#  6 POP_JUMP_IF_FALSE  to 12
#  8 LOAD_CONST         2 ('positive')
# 10 RETURN_VALUE
# 12 LOAD_CONST         3 ('non-positive')
# 14 RETURN_VALUE
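
Jump opcode names and offsets vary between Python versions, so it can help to ask the running interpreter directly. The sketch below filters the instructions of check_positive for any opcode with JUMP in its name. That filter is a heuristic chosen for this example.

```python
import dis

def check_positive(x):
    if x > 0:
        return "positive"
    return "non-positive"

# Keep only the jump instructions, whatever they are called in this version.
jumps = [
    instr for instr in dis.get_instructions(check_positive)
    if "JUMP" in instr.opname
]
for instr in jumps:
    # For jumps, argval holds the target offset.
    print(instr.offset, instr.opname, "->", instr.argval)
```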

Loops

def sum_range(n):
    total = 0
    for i in range(n):
        total += i
    return total

# Loop bytecode uses:
# GET_ITER
# FOR_ITER
# JUMP_ABSOLUTE (back to loop start; JUMP_BACKWARD in Python 3.11+)
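
The loop-related instructions can be pulled out of the real disassembly in the same way. This sketch keeps the iterator opcodes and anything with JUMP in its name, so the backward jump appears under whichever name the running version uses.

```python
import dis

def sum_range(n):
    total = 0
    for i in range(n):
        total += i
    return total

# Keep the iterator setup, the loop header and any jumps.
loop_ops = [
    instr.opname for instr in dis.get_instructions(sum_range)
    if instr.opname in ("GET_ITER", "FOR_ITER") or "JUMP" in instr.opname
]
print(loop_ops)
print(sum_range(5))
```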

Python 3.11+ Improvements

Adaptive Bytecode

Python 3.11 specializes bytecode based on runtime behavior:

def add_numbers(a, b):
    return a + b

# First calls: BINARY_OP (generic)
# After a few calls with int arguments: BINARY_OP_ADD_INT (specialized)
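
Specialization can be observed with the adaptive=True option that dis gained in Python 3.11. The sketch below warms the function up and then lists the opcode names. Which specialized names appear, and after how many calls, is an implementation detail, so the output is printed rather than assumed.

```python
import dis
import sys

def add_numbers(a, b):
    return a + b

# Warm the function up so the interpreter has a chance to specialize it.
for i in range(1000):
    add_numbers(i, 1)

if sys.version_info >= (3, 11):
    # adaptive=True shows specialized instructions where present.
    names = [i.opname for i in dis.get_instructions(add_numbers, adaptive=True)]
else:
    # Older versions have no specializing interpreter.
    names = [i.opname for i in dis.get_instructions(add_numbers)]

print(names)
```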

Inline Caching

Frequently accessed attributes are cached inline:

# Before: LOAD_ATTR requires dictionary lookup
# After: a specialized form such as LOAD_ATTR_SLOT (for __slots__ attributes) uses a cached offset
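
The same adaptive view can be used to watch an attribute load. The Point class and get_x function below are made up for this sketch, and the specialized opcode name that shows up (if any) depends on the interpreter version.

```python
import dis
import sys

class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

def get_x(p):
    return p.x

p = Point(1, 2)
for _ in range(1000):  # warm-up calls
    get_x(p)

# adaptive=True only exists on Python 3.11+.
kwargs = {"adaptive": True} if sys.version_info >= (3, 11) else {}
attr_ops = [
    instr.opname for instr in dis.get_instructions(get_x, **kwargs)
    if instr.opname.startswith("LOAD_ATTR")
]
print(attr_ops)
```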

Performance Implications

What's Fast

• Local variable access (LOAD_FAST)
• Built-in operations
• Specialized bytecode (3.11+)

What's Slow

• Global variable access (LOAD_GLOBAL)
• Attribute lookup (LOAD_ATTR)
• Function calls (CALL_FUNCTION, or CALL in 3.11+)
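
The difference between local and global access is visible in the disassembly. In the sketch below, use_local binds len to a default argument so that it is read as a local. This is a known micro-optimization, shown here only to illustrate the opcodes, and both function names are made up.

```python
import dis

def use_global(items):
    return len(items)        # len is found through a global/builtin lookup

def use_local(items, _len=len):
    return _len(items)       # _len is a local variable of the function

global_ops = [i.opname for i in dis.get_instructions(use_global)]
local_ops = [i.opname for i in dis.get_instructions(use_local)]

print("use_global:", global_ops)
print("use_local: ", local_ops)
```

In most code the readability cost of this trick outweighs the gain, especially on 3.11+, where global lookups are also specialized.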

Practical Example

# Original code
def process_list(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item * 2)
    return result

# Usually faster: the comprehension appends with a dedicated LIST_APPEND
# instruction instead of looking up and calling result.append each time
def process_list_optimized(items):
    return [item * 2 for item in items if item > 0]
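
Performance claims like this are best checked by measuring. A minimal timeit sketch, assuming a list of 1,000 integers as the test input (the data and repetition count are arbitrary choices):

```python
import timeit

def process_list(items):
    result = []
    for item in items:
        if item > 0:
            result.append(item * 2)
    return result

def process_list_optimized(items):
    return [item * 2 for item in items if item > 0]

data = list(range(-500, 500))

# Both versions must agree before timing means anything.
same_output = process_list(data) == process_list_optimized(data)

loop_time = timeit.timeit(lambda: process_list(data), number=2000)
comp_time = timeit.timeit(lambda: process_list_optimized(data), number=2000)
print(f"same output: {same_output}")
print(f"loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s")
```

Timings depend on the machine and the Python version, so treat the numbers as relative.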

Key Takeaways

• Python compiles to bytecode before execution
• Bytecode is cached in .pyc files for faster imports
• The PVM is stack-based: operations manipulate a value stack
• Local variables are fastest: they use indexed access
• Python 3.11+ adapts bytecode based on runtime behavior

Understanding bytecode helps you write more efficient Python code and debug performance issues at a deeper level.