What Shared Memory Changes
Normal multiprocessing sends data between isolated process address spaces. A queue or pipe usually pickles the Python object, copies bytes through an IPC channel, and rebuilds an object on the other side.
multiprocessing.shared_memory changes the data path. The operating system owns one named byte region, and each Python process maps that same region into its own address space. The mapping addresses can differ, but reads and writes land on the same underlying bytes.
That trade is powerful and sharp: shared memory removes serialization for the shared payload, but it also removes the safety of private process memory. The shared block is just bytes. You must decide the layout, synchronize writes, and manage the block's lifetime.
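A minimal sketch of the difference between the two data paths. It runs in a single process, with two handles on one block standing in for two processes; the variable names are illustrative only.

```python
import pickle
from multiprocessing import shared_memory

payload = bytearray(b"original")

# Pickling round-trips through bytes and rebuilds an independent object,
# which is roughly what a queue or pipe does on the receiving side.
rebuilt = pickle.loads(pickle.dumps(payload))
rebuilt[0:1] = b"O"  # changes the copy only; payload is untouched

# Two handles on one named block: both map the same underlying bytes.
writer = shared_memory.SharedMemory(create=True, size=len(payload))
reader = shared_memory.SharedMemory(name=writer.name)
try:
    writer.buf[:len(payload)] = payload
    writer.buf[0:1] = b"O"  # write through one mapping...
    seen_by_reader = bytes(reader.buf[:len(payload)])  # ...read through the other
finally:
    reader.close()
    writer.close()
    writer.unlink()

print(payload, rebuilt, seen_by_reader)
```

The pickled copy diverges from its source, while the write through `writer` is visible through `reader` without any message being sent.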
Creating and Using Shared Memory
The SharedMemory class creates a named block of bytes. Let Python generate the name unless you have a strong reason to coordinate a fixed name, then pass shm.name to the process that should attach.
```python
from multiprocessing import shared_memory

message = b"Hello from Process A!"
shm = shared_memory.SharedMemory(create=True, size=len(message))
try:
    shm.buf[:len(message)] = message

    # Pass shm.name to another process. That process can attach by name.
    reader = shared_memory.SharedMemory(name=shm.name)
    try:
        received = bytes(reader.buf[:len(message)])
        print(received.decode("utf-8"))
    finally:
        reader.close()
finally:
    shm.close()
    shm.unlink()
```
Memory Layout of SharedMemory
A SharedMemory object wraps a raw block of bytes allocated by the operating system. The buf attribute is a memoryview over that block, so every process must agree on the byte layout before reading or writing structured data.
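One way to pin down a byte layout is the standard `struct` module. The sketch below assumes a made-up header (a record count, a timestamp, and a fixed-width label); the field choice and the name `HEADER` are illustrative, not part of the SharedMemory API.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout, little-endian, no padding:
#   offset 0:  uint32  record count
#   offset 4:  float64 timestamp
#   offset 12: 16-byte ASCII label, NUL-padded
HEADER = struct.Struct("<Id16s")

shm = shared_memory.SharedMemory(create=True, size=HEADER.size)
try:
    # Writer side: pack the fields directly into the shared bytes.
    HEADER.pack_into(shm.buf, 0, 3, 1.5, b"sensor-a")

    # Reader side: unpack with the same format string.
    count, timestamp, raw_label = HEADER.unpack_from(shm.buf, 0)
    label = raw_label.rstrip(b"\x00").decode("ascii")
    print(count, timestamp, label)
finally:
    shm.close()
    shm.unlink()
```

Any process that attaches needs the same format string, so it is worth keeping the layout definition in a module that every participant imports.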
The Danger: Race Conditions
The biggest danger with shared memory is a race condition: two or more processes access the same shared bytes concurrently, and at least one access is a write. The Global Interpreter Lock does not help here: each process has its own interpreter and its own GIL, so nothing serializes access to shared memory across processes.
The common lost-update bug is read, modify, write. Process A and Process B both read the old value, both compute a new value, and the last writer overwrites the other result.
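The interleaving can be replayed deterministically in one process. This sketch is a simulation, not a real race: the two "processes" are just two pairs of read and write steps executed in the unlucky order, and the int64 counter layout is an assumption made for the example.

```python
import struct
from multiprocessing import shared_memory

COUNTER = struct.Struct("<q")  # assumed layout: one signed 64-bit integer at offset 0

shm = shared_memory.SharedMemory(create=True, size=COUNTER.size)
try:
    COUNTER.pack_into(shm.buf, 0, 0)

    # Both sides read before either one writes.
    (seen_by_a,) = COUNTER.unpack_from(shm.buf, 0)
    (seen_by_b,) = COUNTER.unpack_from(shm.buf, 0)

    # Each writes back its own stale value plus one.
    COUNTER.pack_into(shm.buf, 0, seen_by_a + 1)
    COUNTER.pack_into(shm.buf, 0, seen_by_b + 1)  # overwrites A's update

    (final,) = COUNTER.unpack_from(shm.buf, 0)
    print(f"two increments, final value: {final}")
finally:
    shm.close()
    shm.unlink()
```

Two increments were attempted but only one is recorded. With real processes the same outcome happens intermittently, which is what makes the bug hard to reproduce.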
Using SharedMemory with NumPy
NumPy arrays work well with shared memory because an ndarray can use shm.buf as its storage. The critical rule is byte sizing: allocate array.nbytes or prod(shape) * dtype.itemsize, and pass the shape and dtype alongside the shared memory name.
```python
import numpy as np
from multiprocessing import shared_memory


def create_shared_array(shape, dtype=np.float64):
    dtype = np.dtype(dtype)
    size = int(np.prod(shape)) * dtype.itemsize
    shm = shared_memory.SharedMemory(create=True, size=size)
    try:
        arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        return arr, shm
    except Exception:
        shm.close()
        shm.unlink()
        raise


def attach_shared_array(name, shape, dtype=np.float64):
    shm = shared_memory.SharedMemory(name=name)
    try:
        arr = np.ndarray(shape, dtype=np.dtype(dtype), buffer=shm.buf)
        return arr, shm
    except Exception:
        shm.close()
        raise


arr, shm = create_shared_array((1000, 1000))
try:
    arr[:] = np.random.rand(1000, 1000)
    print(f"Shared memory name: {shm.name}")
    print(f"Array sum: {arr.sum():.2f}")
finally:
    shm.close()
    shm.unlink()
```
Safe Synchronization Methods
Shared memory only shares bytes. It does not make compound operations atomic. Use synchronization primitives when multiple processes can write the same region, or partition the shared block so each worker owns a non-overlapping slice.
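A sketch of the lock-based approach, assuming a single int64 counter at offset 0. The helper names `add_many` and `run_counter` are hypothetical; the lock is an ordinary `multiprocessing.Lock` passed to each worker alongside the block name.

```python
import struct
from multiprocessing import Lock, Process, shared_memory

COUNTER = struct.Struct("<q")  # assumed layout: one int64 at offset 0


def add_many(shm_name, lock, times):
    """Attach by name and increment the shared counter `times` times."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        for _ in range(times):
            # Holding the lock makes read, modify, write one indivisible step.
            with lock:
                (value,) = COUNTER.unpack_from(shm.buf, 0)
                COUNTER.pack_into(shm.buf, 0, value + 1)
    finally:
        shm.close()


def run_counter(workers=4, times=1000):
    """Owner: create the block, start the workers, return the final count."""
    shm = shared_memory.SharedMemory(create=True, size=COUNTER.size)
    try:
        COUNTER.pack_into(shm.buf, 0, 0)
        lock = Lock()
        procs = [
            Process(target=add_many, args=(shm.name, lock, times))
            for _ in range(workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        (total,) = COUNTER.unpack_from(shm.buf, 0)
        return total
    finally:
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    print(run_counter())
```

Taking the lock on every increment is simple but serializes the workers. When each worker can be given its own slice of the block, partitioning avoids that cost entirely.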
Memory Management: close() vs unlink()
Proper cleanup is critical because a shared memory block may outlive the process that created it. Every process should call close() when it no longer needs its mapping. unlink() should be called once per shared memory block when the block is no longer needed. close() and unlink() can be called in either order, but accessing the buffer after close() is invalid, and accessing data after unlink() can fail depending on platform.
Owner: creates the block and is responsible for unlinking it once.
Participant: attaches by name and always closes its own handle.
Manager: use SharedMemoryManager when lifetime ownership should be centralized.
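For the centralized case, a minimal sketch with `SharedMemoryManager` from `multiprocessing.managers` might look like the following. The function name `manager_demo` is hypothetical, and the example keeps everything in one process for brevity.

```python
from multiprocessing.managers import SharedMemoryManager


def manager_demo():
    """Create a block through a manager and let the manager release it."""
    with SharedMemoryManager() as smm:
        shm = smm.SharedMemory(size=64)
        shm.buf[:5] = b"hello"
        # Workers would receive shm.name here and attach by name;
        # each still calls close() on its own handle.
        snapshot = bytes(shm.buf[:5])  # copy out before closing
        shm.close()
    # Leaving the with-block shuts the manager down, which releases the block.
    return snapshot


if __name__ == "__main__":
    print(manager_demo())
```

No explicit `unlink()` appears: blocks created through the manager are released when the `with` block exits, which is the point of centralizing ownership.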
Best Practice: Context Manager Pattern
```python
from contextlib import contextmanager
from multiprocessing import shared_memory


@contextmanager
def managed_shared_memory(name=None, create=False, size=0, *, track=True):
    """Context manager for one process's shared memory handle.

    The track argument requires Python 3.13 or later.
    """
    shm = None
    try:
        if create:
            shm = shared_memory.SharedMemory(
                create=True, size=size, name=name, track=track,
            )
        else:
            shm = shared_memory.SharedMemory(name=name, track=track)
        yield shm
    finally:
        if shm is not None:
            shm.close()
            if create:
                # Only the creating handle unlinks the block.
                try:
                    shm.unlink()
                except FileNotFoundError:
                    pass


with managed_shared_memory(create=True, size=1024) as shm:
    shm.buf[0:5] = b"Hello"
    print(f"Name: {shm.name}")
```
Common Pitfalls and Solutions
| Pitfall | Problem | Fix |
|---|---|---|
| No synchronization | Two processes can read the same old value and overwrite each other. | Use a Lock/RLock/Condition, or partition memory so writers never touch the same bytes. |
| Leaking blocks | A POSIX shared memory block can outlive the creating process. | Call close() in every process and unlink() once when the block is no longer needed. |
| Independent resource trackers | Standalone Python processes can each create a tracker; the first exit may unlink the block. | Use SharedMemoryManager, a single owner process, or track=False when another owner handles cleanup. |
| Wrong byte size | The NumPy shape and dtype may require more bytes than the block contains. | Allocate array.nbytes or prod(shape) * dtype.itemsize and pass shape/dtype with the name. |
| Using the buffer after close | The memoryview becomes invalid for that process. | Copy any result you need before closing the SharedMemory handle. |
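The wrong-byte-size pitfall can be caught at attach time. The sketch below uses only the standard library; `required_bytes` and `attach_checked` are hypothetical helpers, and `itemsize` stands in for what `np.dtype(dtype).itemsize` would supply in NumPy code.

```python
import math
from multiprocessing import shared_memory


def required_bytes(shape, itemsize):
    """Bytes needed for an array of `shape` whose elements take `itemsize` bytes."""
    return math.prod(shape) * itemsize


def attach_checked(name, shape, itemsize):
    """Attach by name, but refuse a block too small for the requested array."""
    needed = required_bytes(shape, itemsize)
    shm = shared_memory.SharedMemory(name=name)
    if shm.size < needed:
        available = shm.size
        shm.close()
        raise ValueError(f"block has {available} bytes, array needs {needed}")
    return shm


owner = shared_memory.SharedMemory(create=True, size=required_bytes((10, 10), 8))
try:
    shm = attach_checked(owner.name, (10, 10), 8)  # float64 has itemsize 8
    # With NumPy: np.ndarray((10, 10), dtype=np.float64, buffer=shm.buf)
    shm.close()
    try:
        attach_checked(owner.name, (1000, 1000), 8)  # far larger than the block
    except ValueError as exc:
        print(exc)
finally:
    owner.close()
    owner.unlink()
```

The check uses `<` rather than `!=` because the operating system may round the block size up, so `shm.size` can legitimately exceed the requested size.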
Complete API Reference
SharedMemory(name=None, create=False, size=0, *, track=True)
track was added in Python 3.13. On Unix-like systems it controls whether the shared memory block is registered with Python's resource tracker. Processes created through multiprocessing share the same tracker, which is usually what you want. Independent Python processes or subprocess trees can each get their own tracker; with track=True, the first process to exit can remove the shared block while another process still expects it. In that case, use one owner process or SharedMemoryManager, or set track=False when another process is already responsible for cleanup.
On Windows, track is ignored because Windows deletes shared memory after all handles are closed.
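Because `track` only exists from Python 3.13, code that must also run on older versions can branch on the interpreter version. This is a sketch under that assumption; `attach_untracked` is a hypothetical helper name.

```python
import sys
from multiprocessing import shared_memory


def attach_untracked(name):
    """Attach to a block that another process owns and will unlink."""
    if sys.version_info >= (3, 13):
        # Skip resource-tracker registration for this handle.
        return shared_memory.SharedMemory(name=name, track=False)
    # Before 3.13 there is no track parameter; on POSIX the handle is tracked.
    return shared_memory.SharedMemory(name=name)


owner = shared_memory.SharedMemory(create=True, size=16)
try:
    owner.buf[:2] = b"ok"
    guest = attach_untracked(owner.name)
    seen = bytes(guest.buf[:2])
    guest.close()
finally:
    owner.close()
    owner.unlink()

print(seen)
```

On versions before 3.13 the fallback branch behaves like a plain attach, so the tracker caveats described above still apply there.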
| API | Purpose | Detail |
|---|---|---|
| SharedMemory(...) | Create or attach to a named shared byte block. | Python 3.13 signature is SharedMemory(name=None, create=False, size=0, *, track=True). |
| shm.buf | memoryview over the shared bytes. | Use explicit slicing and encoding/decoding; do not access after close(). |
| shm.name | Generated or user-provided identifier for the block. | Pass this value to other processes so they can attach to the same bytes. |
| shm.size | Total bytes available through this handle. | May be rounded up by the operating system; still size arrays explicitly. |
| shm.close() | Release this process mapping/handle. | Call it in every process when that process is done with the block. |
| shm.unlink() | Request deletion of the shared memory block. | Call once per block. Has no effect on Windows, where the block is deleted once all handles are closed. |
| SharedMemoryManager | Own shared blocks from a manager process. | Use it when manual lifetime ownership would be fragile. |
Real-World Example: Parallel Image Processing
```python
import time
from multiprocessing import Lock, Process, shared_memory

import numpy as np


def process_chunk(shm_name, shape, dtype, start_row, end_row, lock):
    """Process a non-overlapping chunk of the image."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        img = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        # View onto this worker's rows only; no other worker writes here.
        chunk = img[start_row:end_row]
        # Vectorized luma conversion; astype truncates like int() did per pixel.
        gray = (
            0.299 * chunk[..., 0] + 0.587 * chunk[..., 1] + 0.114 * chunk[..., 2]
        ).astype(img.dtype)
        # Write the gray value back into all three channels.
        chunk[...] = gray[..., np.newaxis]
        with lock:
            print(f"Processed rows {start_row}-{end_row}")
    finally:
        shm.close()


def parallel_grayscale(image, num_workers=4):
    """Convert image to grayscale using multiple processes."""
    height = image.shape[0]
    shm = shared_memory.SharedMemory(create=True, size=image.nbytes)
    try:
        shared_img = np.ndarray(image.shape, dtype=image.dtype, buffer=shm.buf)
        shared_img[:] = image

        lock = Lock()
        processes = []
        rows_per_worker = height // num_workers
        for i in range(num_workers):
            start_row = i * rows_per_worker
            # The last worker also takes any leftover rows.
            end_row = height if i == num_workers - 1 else (i + 1) * rows_per_worker
            p = Process(
                target=process_chunk,
                args=(shm.name, image.shape, image.dtype, start_row, end_row, lock),
            )
            processes.append(p)
            p.start()

        for p in processes:
            p.join()

        failures = [p.exitcode for p in processes if p.exitcode != 0]
        if failures:
            raise RuntimeError(f"worker failure exit codes: {failures}")

        # Copy the result out before the mapping is closed.
        return shared_img.copy()
    finally:
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    test_image = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)
    start = time.perf_counter()
    result = parallel_grayscale(test_image, num_workers=4)
    elapsed = time.perf_counter() - start
    # size counts array elements (3 per pixel), so count pixels from the shape.
    pixels = test_image.shape[0] * test_image.shape[1]
    print(f"Processed {pixels:,} pixels in {elapsed:.2f}s")
```
This example avoids locks during pixel processing by giving each worker a non-overlapping region. The only lock coordinates print output. Partitioning work so writers never touch the same bytes is usually the cheapest way to use shared memory safely, because no worker ever waits on another.
