Python Shared Memory

What Shared Memory Changes

Normal multiprocessing sends data between isolated process address spaces. A queue or pipe usually pickles the Python object, copies bytes through an IPC channel, and rebuilds an object on the other side.
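
To make that copy concrete, here is a minimal sketch of the round trip a queue performs on a payload. It calls pickle directly instead of going through a real Queue, which is a simplification, and the variable names are illustrative.

```python
import pickle

payload = list(range(5))

# Roughly what a Queue does on send: serialize the object to bytes.
wire_bytes = pickle.dumps(payload)

# Roughly what the receiver does: rebuild a brand-new object from those bytes.
rebuilt = pickle.loads(wire_bytes)

# The receiver holds an independent copy, so its changes never reach the sender.
rebuilt[0] = 99
print(payload[0], rebuilt[0])
```

The cost of this path grows with the payload, because every byte is serialized, transferred, and rebuilt.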

multiprocessing.shared_memory changes the data path. The operating system owns one named byte region, and each Python process maps that same region into its own address space. The mapping addresses can differ, but reads and writes land on the same underlying bytes.

That trade is powerful and sharp: shared memory removes serialization for the shared payload, but it also removes the safety of private process memory. The shared block is just bytes. You must decide the layout, synchronize writes, and manage the block's lifetime.

Creating and Using Shared Memory

The SharedMemory class creates a named block of bytes. Let Python generate the name unless you have a strong reason to coordinate a fixed one. Either way, pass shm.name to each process that should attach.

from multiprocessing import shared_memory

message = b"Hello from Process A!"
shm = shared_memory.SharedMemory(create=True, size=len(message))

try:
    shm.buf[:len(message)] = message

    # In real use, pass shm.name to another process so it can attach by name.
    # This demo attaches from the same process to stay short.
    reader = shared_memory.SharedMemory(name=shm.name)
    try:
        received = bytes(reader.buf[:len(message)])
        print(received.decode("utf-8"))
    finally:
        reader.close()
finally:
    shm.close()
    shm.unlink()
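
The example above attaches from the same process for brevity. A two-process version might look like the sketch below. The helper names read_message, child, and main are illustrative, and the sketch assumes the script is run as a file so that the child process can import it.

```python
from multiprocessing import Process, shared_memory


def read_message(name, length):
    """Attach by name, copy the bytes out, and release this process's handle."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length]).decode("utf-8")
    finally:
        shm.close()  # participants close; they never unlink


def child(name, length):
    print("child read:", read_message(name, length))


def main():
    message = b"Hello from Process A!"
    shm = shared_memory.SharedMemory(create=True, size=len(message))
    try:
        shm.buf[:len(message)] = message
        # Only the name and length cross the process boundary, not the payload.
        worker = Process(target=child, args=(shm.name, len(message)))
        worker.start()
        worker.join()
    finally:
        shm.close()
        shm.unlink()  # the creator owns cleanup


if __name__ == "__main__":
    main()
```

Passing the length alongside the name matters because the block may be larger than the message.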

Memory Layout of SharedMemory

A SharedMemory object wraps a raw block of bytes allocated by the operating system. The buf attribute is a memoryview over that block, so every process must agree on the byte layout before reading or writing structured data.
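
One way to pin down a layout is the struct module. The sketch below assumes a made-up record format, a 4-byte little-endian length followed by a UTF-8 payload; the names HEADER, write_record, and read_record are illustrative and not part of the shared_memory API.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: 4-byte little-endian payload length, then the payload.
HEADER = struct.Struct("<I")


def write_record(buf, text):
    data = text.encode("utf-8")
    if HEADER.size + len(data) > len(buf):
        raise ValueError("record does not fit in the block")
    HEADER.pack_into(buf, 0, len(data))
    buf[HEADER.size:HEADER.size + len(data)] = data


def read_record(buf):
    (length,) = HEADER.unpack_from(buf, 0)
    return bytes(buf[HEADER.size:HEADER.size + length]).decode("utf-8")


shm = shared_memory.SharedMemory(create=True, size=64)
try:
    write_record(shm.buf, "hello, layout")
    record = read_record(shm.buf)
    print(record)
finally:
    shm.close()
    shm.unlink()
```

Every process that touches the block needs the same HEADER definition, which is why layout constants usually live in a module that all workers import.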

The Danger: Race Conditions

The biggest danger with shared memory is a race condition: two or more processes access the same shared bytes concurrently, and at least one access is a write. A Global Interpreter Lock, where present, only serializes threads inside one interpreter; nothing coordinates access to the shared bytes across processes.

The common lost-update bug is read, modify, write. Process A and Process B both read the old value, both compute a new value, and the last writer overwrites the other result.
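
The interleaving can be replayed deterministically in a single process. The sketch below plays both roles by hand and assumes a hypothetical layout of one 8-byte counter at offset 0; real races happen with the same ordering, but unpredictably.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout: one 8-byte signed counter at offset 0.
COUNTER = struct.Struct("<q")

shm = shared_memory.SharedMemory(create=True, size=COUNTER.size)
try:
    COUNTER.pack_into(shm.buf, 0, 0)

    # Both "processes" read before either one writes.
    (seen_by_a,) = COUNTER.unpack_from(shm.buf, 0)
    (seen_by_b,) = COUNTER.unpack_from(shm.buf, 0)

    # Each writes back a result computed from its stale read.
    COUNTER.pack_into(shm.buf, 0, seen_by_a + 1)
    COUNTER.pack_into(shm.buf, 0, seen_by_b + 1)

    (final,) = COUNTER.unpack_from(shm.buf, 0)
    print(f"two increments, final value: {final}")  # prints 1, not 2
finally:
    shm.close()
    shm.unlink()
```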

Using SharedMemory with NumPy

NumPy arrays work well with shared memory because an ndarray can use shm.buf as its storage. The critical rule is byte sizing: allocate array.nbytes or prod(shape) * dtype.itemsize, and pass the shape and dtype alongside the shared memory name.

import numpy as np
from multiprocessing import shared_memory

def create_shared_array(shape, dtype=np.float64):
    dtype = np.dtype(dtype)
    size = int(np.prod(shape)) * dtype.itemsize
    shm = shared_memory.SharedMemory(create=True, size=size)
    try:
        arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        return arr, shm
    except Exception:
        shm.close()
        shm.unlink()
        raise

def attach_shared_array(name, shape, dtype=np.float64):
    shm = shared_memory.SharedMemory(name=name)
    try:
        arr = np.ndarray(shape, dtype=np.dtype(dtype), buffer=shm.buf)
        return arr, shm
    except Exception:
        shm.close()
        raise

arr, shm = create_shared_array((1000, 1000))
try:
    arr[:] = np.random.rand(1000, 1000)
    print(f"Shared memory name: {shm.name}")
    print(f"Array sum: {arr.sum():.2f}")
finally:
    shm.close()
    shm.unlink()

Safe Synchronization Methods

Shared memory only shares bytes. It does not make compound operations atomic. Use synchronization primitives when multiple processes can write the same region, or partition the shared block so each worker owns a non-overlapping slice.
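
A minimal sketch of the lock-based option follows. It assumes a hypothetical layout of one 8-byte counter at offset 0, and the names COUNTER, add_many, and run_demo are illustrative. The lock is created by the parent and passed to each worker at process creation.

```python
import struct
from multiprocessing import Lock, Process, shared_memory

# Hypothetical layout: one 8-byte signed counter at offset 0.
COUNTER = struct.Struct("<q")


def add_many(name, lock, n):
    """Increment the shared counter n times, holding the lock for each update."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        for _ in range(n):
            with lock:  # read, modify, and write happen as one protected step
                (value,) = COUNTER.unpack_from(shm.buf, 0)
                COUNTER.pack_into(shm.buf, 0, value + 1)
    finally:
        shm.close()


def run_demo(workers=4, n=1000):
    shm = shared_memory.SharedMemory(create=True, size=COUNTER.size)
    lock = Lock()
    try:
        COUNTER.pack_into(shm.buf, 0, 0)
        procs = [
            Process(target=add_many, args=(shm.name, lock, n))
            for _ in range(workers)
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # With the lock held around every update, no increment is lost.
        return COUNTER.unpack_from(shm.buf, 0)[0]
    finally:
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    print(run_demo())
```

Taking the lock once per increment is deliberately fine-grained to show the idea. A real worker would usually accumulate locally and take the lock once per batch.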

Memory Management: close() vs unlink()

Proper cleanup is critical because a shared memory block may outlive the process that created it. Every process should call close() when it no longer needs its mapping. unlink() should be called once per shared memory block when the block is no longer needed. close() and unlink() can be called in either order, but accessing the buffer after close() is invalid, and accessing data after unlink() can fail depending on platform.
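
The sketch below walks through that lifecycle in one process, with two handles standing in for two processes. The variable names are illustrative, and the final check reflects POSIX behavior, where attaching to an unlinked name raises FileNotFoundError.

```python
from multiprocessing import shared_memory

# Owner: creates the block and will unlink it exactly once.
owner = shared_memory.SharedMemory(create=True, size=16)
owner.buf[:4] = b"data"

# Participant: attaches by name (normally from another process).
participant = shared_memory.SharedMemory(name=owner.name)
result = bytes(participant.buf[:4])  # copy out before closing
participant.close()                  # releases only this handle

owner.close()
owner.unlink()

# The copied bytes survive, but the block can no longer be attached.
try:
    shared_memory.SharedMemory(name=owner.name)
    still_exists = True
except FileNotFoundError:
    still_exists = False

print(result, still_exists)
```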

Owner: Creates the block and is responsible for unlinking it once.
Participant: Attaches by name and always closes its own handle.
Manager: Use SharedMemoryManager when lifetime ownership should be centralized.
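
For the manager role, a minimal sketch using SharedMemoryManager from multiprocessing.managers might look like this. The checksum helper and main function are illustrative names, and the sketch assumes it is run as a script, because the manager starts a server process.

```python
from multiprocessing.managers import SharedMemoryManager


def checksum(buf, length):
    """Sum the first `length` bytes of a shared buffer."""
    return sum(buf[:length])


def main():
    with SharedMemoryManager() as smm:
        # The manager creates the block and remembers it for cleanup.
        shm = smm.SharedMemory(size=16)
        shm.buf[:4] = bytes([1, 2, 3, 4])
        print(shm.name, checksum(shm.buf, 4))
        # Workers would attach with shared_memory.SharedMemory(name=shm.name)
        # and call close() on their own handles.
    # Leaving the with-block shuts the manager down, which releases
    # every block it created.


if __name__ == "__main__":
    main()
```

The benefit is that cleanup no longer depends on each code path remembering to call unlink().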

Best Practice: Context Manager Pattern

from contextlib import contextmanager
from multiprocessing import shared_memory

@contextmanager
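def managed_shared_memory(name=None, create=False, size=0, *, track=True):
    """Context manager for one process's shared memory handle.

    Closes the handle on exit and, if this call created the block, unlinks it.
    The track argument requires Python 3.13 or later.
    """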
    shm = None
    try:
        if create:
            shm = shared_memory.SharedMemory(
                create=True,
                size=size,
                name=name,
                track=track,
            )
        else:
            shm = shared_memory.SharedMemory(name=name, track=track)
        yield shm
    finally:
        if shm is not None:
            shm.close()
            if create:
                try:
                    shm.unlink()
                except FileNotFoundError:
                    pass

with managed_shared_memory(create=True, size=1024) as shm:
    shm.buf[0:5] = b"Hello"
    print(f"Name: {shm.name}")

Common Pitfalls and Solutions

Pitfall	Problem	Fix
No synchronization	Two processes can read the same old value and overwrite each other.	Use a Lock/RLock/Condition, or partition memory so writers never touch the same bytes.
Leaking blocks	A POSIX shared memory block can outlive the creating process.	Call close() in every process and unlink() once when the block is no longer needed.
Independent resource trackers	Standalone Python processes can each create a tracker; the first exit may unlink the block.	Use SharedMemoryManager, a single owner process, or track=False when another owner handles cleanup.
Wrong byte size	The NumPy shape and dtype may require more bytes than the block contains.	Allocate array.nbytes or prod(shape) * dtype.itemsize and pass shape/dtype with the name.
Using the buffer after close	The memoryview becomes invalid for that process.	Copy any result you need before closing the SharedMemory handle.

Complete API Reference

`SharedMemory(name=None, create=False, size=0, *, track=True)`

track was added in Python 3.13. On Unix-like systems it controls whether the shared memory block is registered with Python's resource tracker. Processes created through multiprocessing share the same tracker, which is usually what you want. Independent Python processes or subprocess trees can each get their own tracker; with track=True, the first process to exit can remove the shared block while another process still expects it. In that case, use one owner process or SharedMemoryManager, or set track=False when another process is already responsible for cleanup.

On Windows, track is ignored because Windows deletes shared memory after all handles are closed.
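
A process that attaches to a block owned elsewhere could opt out of tracking along these lines. This is a sketch: attach_untracked is a hypothetical helper, and the version check is an assumption made so the code also runs on interpreters older than 3.13, where the track parameter does not exist.

```python
import sys
from multiprocessing import shared_memory


def attach_untracked(name):
    """Attach to an existing block without registering it for cleanup, where supported."""
    if sys.version_info >= (3, 13):
        # track=False: this process's resource tracker will not unlink the block.
        return shared_memory.SharedMemory(name=name, track=False)
    # Older versions have no track parameter; attaching always registers on POSIX.
    return shared_memory.SharedMemory(name=name)


owner = shared_memory.SharedMemory(create=True, size=8)  # the single owner
try:
    owner.buf[:2] = b"ok"
    guest = attach_untracked(owner.name)
    try:
        seen = bytes(guest.buf[:2])
    finally:
        guest.close()
    print(seen)
finally:
    owner.close()
    owner.unlink()
```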

API	Purpose	Detail
SharedMemory(...)	Create or attach to a named shared byte block.	Python 3.13 signature is SharedMemory(name=None, create=False, size=0, *, track=True).
shm.buf	memoryview over the shared bytes.	Use explicit slicing and encoding/decoding; do not access after close().
shm.name	Generated or user-provided identifier for the block.	Pass this value to other processes so they can attach to the same bytes.
shm.size	Total bytes available through this handle.	May be rounded up by the operating system; still size arrays explicitly.
shm.close()	Release this process mapping/handle.	Call it in every process when that process is done with the block.
shm.unlink()	Request deletion of the shared memory block.	Call once per block. On Windows, deletion happens when all handles close.
SharedMemoryManager	Own shared blocks from a manager process.	Use it when manual lifetime ownership would be fragile.
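
The note on shm.size can be checked with a short sketch. The requested size of 10 bytes is arbitrary, and the reported size depends on the platform.

```python
from multiprocessing import shared_memory

requested = 10
shm = shared_memory.SharedMemory(create=True, size=requested)
try:
    # shm.size is at least the request; some platforms round up to a page boundary.
    actual = shm.size
    print(f"requested={requested} actual={actual}")

    # Stay within the size the layout was designed for, not whatever shm.size reports.
    shm.buf[:requested] = bytes(requested)
finally:
    shm.close()
    shm.unlink()
```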

Real-World Example: Parallel Image Processing

import numpy as np
from multiprocessing import shared_memory, Process, Lock
import time

def process_chunk(shm_name, shape, dtype, start_row, end_row, lock):
    """Process a non-overlapping chunk of the image."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        img = np.ndarray(shape, dtype=dtype, buffer=shm.buf)

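        # Vectorized luma conversion over this worker's rows only.
        # block is a view, so writes land directly in shared memory.
        block = img[start_row:end_row]
        gray = 0.299 * block[..., 0] + 0.587 * block[..., 1] + 0.114 * block[..., 2]
        # astype truncates toward zero, like int().
        block[...] = gray.astype(img.dtype)[..., np.newaxis]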

        with lock:
            print(f"Processed rows {start_row}-{end_row}")
    finally:
        shm.close()

def parallel_grayscale(image, num_workers=4):
    """Convert image to grayscale using multiple processes."""
    height, width, channels = image.shape
    shm = shared_memory.SharedMemory(create=True, size=image.nbytes)

    try:
        shared_img = np.ndarray(image.shape, dtype=image.dtype, buffer=shm.buf)
        shared_img[:] = image

        lock = Lock()
        processes = []
        rows_per_worker = height // num_workers

        for i in range(num_workers):
            start_row = i * rows_per_worker
            end_row = height if i == num_workers - 1 else (i + 1) * rows_per_worker

            p = Process(
                target=process_chunk,
                args=(shm.name, image.shape, image.dtype, start_row, end_row, lock),
            )
            processes.append(p)
            p.start()

        for p in processes:
            p.join()

        failures = [p.exitcode for p in processes if p.exitcode != 0]
        if failures:
            raise RuntimeError(f"worker failure exit codes: {failures}")

        return shared_img.copy()
    finally:
        shm.close()
        shm.unlink()

if __name__ == "__main__":
    test_image = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

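    start = time.perf_counter()
    result = parallel_grayscale(test_image, num_workers=4)
    elapsed = time.perf_counter() - start

    # image.size counts every channel value, so derive the pixel count from the shape.
    pixels = test_image.shape[0] * test_image.shape[1]
    print(f"Processed {pixels:,} pixels in {elapsed:.2f}s")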

This example avoids locks during pixel processing by giving each worker a non-overlapping region. The only lock coordinates print output. Partitioning work so writers never touch the same bytes is the fastest way to use shared memory safely.
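
The partitioning arithmetic used above can be isolated and checked on its own. The sketch below pulls it into a hypothetical helper named row_ranges and verifies that the slices cover every row exactly once.

```python
def row_ranges(height, num_workers):
    """Split range(height) into contiguous, non-overlapping [start, end) slices."""
    rows_per_worker = height // num_workers
    ranges = []
    for i in range(num_workers):
        start = i * rows_per_worker
        # The last worker absorbs any remainder rows.
        end = height if i == num_workers - 1 else (i + 1) * rows_per_worker
        ranges.append((start, end))
    return ranges


ranges = row_ranges(1080, 4)
print(ranges)  # [(0, 270), (270, 540), (540, 810), (810, 1080)]

# Every row is covered exactly once, so no two writers share a byte.
covered = [row for start, end in ranges for row in range(start, end)]
assert covered == list(range(1080))
```

If num_workers exceeds height, the early workers receive empty ranges and the last worker takes every row, so callers may want to cap the worker count.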

Key Takeaways

Shared memory is fast but sharp: Zero-copy access removes serialization overhead, but you are responsible for layout, synchronization, and cleanup.
Always synchronize shared writes: Use Lock, RLock, or Condition when multiple processes can write the same bytes.
Partition work when possible: Giving each process non-overlapping regions avoids lock contention.
Clean up deliberately: Every process closes its own handle, and exactly one owner unlinks the block.
Understand track=True: Python 3.13 resource tracking helps multiprocessing families, but independent process trees need a clear owner or track=False.
NumPy integration is powerful: Backing arrays with shared memory enables efficient parallel numerical computing when shape, dtype, and byte size match.
Test race-prone code under load: Race conditions are non-deterministic and often disappear in short manual runs.