What is Shared Memory?
multiprocessing.shared_memory is a Python 3.8+ feature that enables multiple processes to access the same block of memory directly. Unlike pipes or queues that serialize data between processes, shared memory provides raw byte-level access—making it extremely fast but also requiring careful synchronization to avoid race conditions and data corruption.
In typical multiprocessing, each process has its own isolated memory space. When processes need to communicate, data must be serialized (pickled), sent through a pipe or queue, and deserialized—creating significant overhead. Shared memory bypasses this by allowing multiple processes to read and write to the same physical memory region.
Creating and Using Shared Memory
The SharedMemory class creates a named block of memory that can be accessed by any process that knows its name:
```python
from multiprocessing import shared_memory

# Process A: Create shared memory block
shm = shared_memory.SharedMemory(create=True, size=1024, name="my_shared_block")
print(f"Created: {shm.name}, Size: {shm.size} bytes")

# Write data to shared memory
data = b"Hello from Process A!"
shm.buf[:len(data)] = data

# Process B: Attach to existing shared memory
shm_b = shared_memory.SharedMemory(name="my_shared_block")
message = bytes(shm_b.buf[:21]).decode('utf-8')
print(f"Read: {message}")  # Output: Hello from Process A!

# IMPORTANT: Cleanup (covered in detail later)
shm_b.close()
shm.close()
shm.unlink()  # Only creator should unlink!
```
Memory Layout of SharedMemory
A SharedMemory object wraps a raw block of bytes allocated in the operating system's shared memory region. The buf attribute provides a memoryview for direct byte manipulation.
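Because buf is a plain memoryview, you can lay out structured binary data in the block with the standard struct module instead of raw byte slices. A minimal sketch (the field layout and values here are illustrative, not part of the SharedMemory API):

```python
import struct
from multiprocessing import shared_memory

# Allocate a small block and pack a little-endian header into it:
# a uint32 count followed by a float64 value ("<Id").
shm = shared_memory.SharedMemory(create=True, size=64)
try:
    struct.pack_into("<Id", shm.buf, 0, 42, 3.14)
    count, value = struct.unpack_from("<Id", shm.buf, 0)
    print(count, value)  # 42 3.14
finally:
    shm.close()
    shm.unlink()
```

Any process that attaches to the block and knows the format string can decode the same fields with struct.unpack_from.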
The Danger: Race Conditions
The biggest danger with shared memory is race conditions: multiple processes reading and writing the same memory location simultaneously, leading to data corruption or undefined behavior. The Global Interpreter Lock (GIL) offers no protection here; it only serializes threads within a single process, not access from other processes.
⚠️ RACE CONDITION occurs when two or more processes access shared data concurrently, and at least one access is a write. The final state depends on the unpredictable timing of process execution—a recipe for bugs that are nearly impossible to reproduce and debug.
Using SharedMemory with NumPy
One of the most powerful uses of shared memory is with NumPy arrays. By creating a NumPy array that uses shared memory as its buffer, multiple processes can operate on the same numerical data without copying.
```python
import numpy as np
from multiprocessing import shared_memory

def create_shared_array(shape, dtype=np.float64):
    """Create a NumPy array backed by shared memory."""
    size = int(np.prod(shape)) * np.dtype(dtype).itemsize
    shm = shared_memory.SharedMemory(create=True, size=size)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    return arr, shm

def attach_shared_array(name, shape, dtype=np.float64):
    """Attach to an existing shared NumPy array."""
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    return arr, shm

# Process A: Create shared array
arr, shm = create_shared_array((1000, 1000))
arr[:] = np.random.rand(1000, 1000)
print(f"Shared memory name: {shm.name}")
print(f"Array sum: {arr.sum():.2f}")
```
Safe Synchronization Methods
To safely use shared memory, you must use synchronization primitives. Python's multiprocessing module provides several options.
Memory Management: close() vs unlink()
Proper cleanup is critical with shared memory. Failing to clean up properly can leave orphaned shared memory blocks in your system, wasting resources until reboot.
⚠️ RESOURCE LEAK occurs when shared memory is not properly cleaned up. On POSIX systems (Linux/macOS), shared memory persists until explicitly unlinked—even after your program exits!
Best Practice: Context Manager Pattern
```python
from contextlib import contextmanager
from multiprocessing import shared_memory

@contextmanager
def managed_shared_memory(name=None, create=False, size=0):
    """Context manager for automatic shared memory cleanup."""
    shm = None
    try:
        if create:
            shm = shared_memory.SharedMemory(create=True, size=size, name=name)
        else:
            shm = shared_memory.SharedMemory(name=name)
        yield shm
    finally:
        if shm is not None:
            shm.close()
            if create:
                try:
                    shm.unlink()
                except FileNotFoundError:
                    pass  # Already unlinked

# Usage - automatic cleanup guaranteed!
with managed_shared_memory(create=True, size=1024) as shm:
    shm.buf[0:5] = b"Hello"
    print(f"Name: {shm.name}")
# ✓ Automatically closed and unlinked here!
```
Common Pitfalls and Solutions
| Pitfall | Problem | Solution |
|---|---|---|
| No Synchronization | Race conditions corrupt data | Always use Lock, RLock, or Condition |
| Forgetting unlink() | Shared memory leaks | Use context managers or try/finally |
| Multiple unlink() | FileNotFoundError | Only creator should unlink |
| Wrong size | Data truncation or segfault | Calculate exact byte size needed |
| Endianness issues | Wrong values on different CPUs | Use explicit byte order ('little' or 'big') |
| Using after close() | ValueError or crash | Don't access buf after close() |
| Name collisions | FileExistsError | Use unique names or let Python generate |
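The endianness row deserves a concrete illustration: the block is raw bytes, so if another machine or language may read it, always pick an explicit byte order rather than relying on the native one. A sketch using int.to_bytes / int.from_bytes (the value 1000 is arbitrary):

```python
from multiprocessing import shared_memory

shm = shared_memory.SharedMemory(create=True, size=8)
try:
    # Write a 32-bit integer with an explicit little-endian byte order.
    shm.buf[0:4] = (1000).to_bytes(4, "little")

    # Reading it back with the same byte order recovers the value...
    n = int.from_bytes(bytes(shm.buf[0:4]), "little")
    print(n)  # 1000

    # ...while the wrong byte order silently yields garbage.
    wrong = int.from_bytes(bytes(shm.buf[0:4]), "big")
    print(wrong)  # 3892510720
finally:
    shm.close()
    shm.unlink()
```

The same principle applies to struct format strings ('<' vs '>') and to NumPy dtypes ('<i4' vs '>i4').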
Complete API Reference
SharedMemory(name=None, create=False, size=0)
Purpose: Create or attach to shared memory block
Parameters:
- `name` - Unique identifier (auto-generated if None and create=True)
- `create` - True to create new, False to attach to existing
- `size` - Bytes to allocate (only when create=True)
Raises: FileExistsError (create existing), FileNotFoundError (attach nonexistent)
shm.buf
Type: memoryview
Purpose: Direct access to shared memory bytes
Usage: shm.buf[0:10] = b"0123456789"
Warning: Do not access after close()!
shm.name
Type: str (read-only)
Purpose: Unique identifier for the shared memory block
Usage: Pass to other processes so they can attach
shm.size
Type: int (read-only)
Purpose: Total bytes allocated
Note: May be larger than requested (page alignment)
shm.close()
Purpose: Release this process's access to shared memory
Required: Every process MUST call this
Effect: buf becomes invalid, block still exists
shm.unlink()
Purpose: Request deletion of shared memory block
Careful: Only creator should call this
Effect: The name is removed immediately; the memory itself is freed once every attached process has closed it. (On Windows, unlink() has no effect; the block is freed when the last process detaches.)
Raises: FileNotFoundError if already unlinked
Real-World Example: Parallel Image Processing
```python
import numpy as np
from multiprocessing import shared_memory, Process, Lock
import time

def process_chunk(shm_name, shape, dtype, start_row, end_row, lock):
    """Process a chunk of the image (e.g., apply grayscale)."""
    # Attach to shared memory
    shm = shared_memory.SharedMemory(name=shm_name)
    img = np.ndarray(shape, dtype=dtype, buffer=shm.buf)

    # Process our assigned rows (no lock needed - non-overlapping regions!)
    for row in range(start_row, end_row):
        for col in range(shape[1]):
            # Convert RGB to grayscale
            r, g, b = img[row, col, 0], img[row, col, 1], img[row, col, 2]
            gray = int(0.299 * r + 0.587 * g + 0.114 * b)
            img[row, col] = [gray, gray, gray]

    # Report progress (needs lock for shared print)
    with lock:
        print(f"Processed rows {start_row}-{end_row}")

    shm.close()

def parallel_grayscale(image, num_workers=4):
    """Convert image to grayscale using multiple processes."""
    height, width, channels = image.shape

    # Create shared memory for the image
    shm = shared_memory.SharedMemory(create=True, size=image.nbytes)
    shared_img = np.ndarray(image.shape, dtype=image.dtype, buffer=shm.buf)
    shared_img[:] = image  # Copy data to shared memory

    lock = Lock()
    processes = []
    rows_per_worker = height // num_workers

    # Spawn worker processes
    for i in range(num_workers):
        start_row = i * rows_per_worker
        end_row = height if i == num_workers - 1 else (i + 1) * rows_per_worker
        p = Process(
            target=process_chunk,
            args=(shm.name, image.shape, image.dtype, start_row, end_row, lock)
        )
        processes.append(p)
        p.start()

    # Wait for all processes to complete
    for p in processes:
        p.join()

    # Copy result back
    result = shared_img.copy()

    # Cleanup
    shm.close()
    shm.unlink()

    return result

# Usage
if __name__ == "__main__":
    # Create a test image (1920x1080 RGB)
    test_image = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

    start = time.time()
    result = parallel_grayscale(test_image, num_workers=4)
    elapsed = time.time() - start
    print(f"Processed {test_image.size:,} pixels in {elapsed:.2f}s")
```
💡 Key Design Insight: This example avoids locks during processing by giving each worker a non-overlapping region. The only lock is for coordinating print statements. This pattern—partitioning work to avoid shared writes—is the most efficient way to use shared memory.
Key Takeaways
1. **Shared memory is fast but dangerous:** Zero-copy access means no serialization overhead, but you're responsible for synchronization.
2. **Always synchronize writes:** Use Lock, RLock, or Condition when multiple processes write to the same memory region.
3. **Partition work when possible:** The fastest approach is giving each process non-overlapping regions, eliminating the need for locks.
4. **Clean up properly:** Every process must call `close()`, and the creator must call `unlink()` to prevent resource leaks.
5. **Use context managers:** Wrap shared memory access in try/finally or custom context managers for automatic cleanup.
6. **NumPy integration is powerful:** Backing NumPy arrays with shared memory enables efficient parallel numerical computing.
7. **Test for race conditions:** Run with many iterations; race conditions are non-deterministic and may not appear in short tests.