Multiprocessing in Python

Nov 24, 2018 · Dr. Ben Mather · 4 min read

Most of the codes I develop run in parallel with MPI (Message Passing Interface) through the Python wrapper, mpi4py. Highly scalable programs use this approach because each processor handles its own chunk of memory and communicates with other processors only when needed. PETSc, for example, is a behemoth computing framework written entirely in the MPI philosophy. Despite MPI’s efficiency, there are some barriers:

MPICH or OpenMPI must already be compiled on the system
Python needs mpi4py to communicate in parallel
Programming with MPI is sometimes extremely tedious
The problem may not benefit from distributed memory

If any of these are true, then OpenMP-style shared-memory parallelism is a simple alternative. The OpenMP philosophy is to use a “master” process to spawn a bunch of workers that each share the same memory allocation. Clearly, this approach is unsuitable if the computation spans more than one compute node. On the positive side, most programming languages offer this style of parallelism without the need for additional software. We explore one such implementation, the multiprocessing module in Python. One caveat: its workers are separate processes that exchange data by passing messages through queues and pipes, so they follow the master/worker pattern rather than truly sharing memory.

Multiprocessing module

There are 2 main objects in the multiprocessing module, which can be imported as:

from multiprocessing import Pool, Queue

I have found Queue to be the most intuitive. It sets up a queue of tasks to be executed by each processor. When one task completes, the free processor takes another task from the queue until there are no tasks remaining.
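Before the full example, here is a minimal single-process sketch of how a bounded queue behaves, which is what Queue(1) sets up in the code below. The task strings and variable names are illustrative assumptions, not part of the original code.

```python
from multiprocessing import Queue
from queue import Full  # multiprocessing.Queue raises the standard queue.Full

q = Queue(1)       # bounded: at most one item may be waiting
q.put("task 0")    # succeeds immediately

try:
    q.put("task 1", block=False)  # the queue already holds one item
    second_put_succeeded = True
except Full:
    second_put_succeeded = False

first = q.get(timeout=5)   # taking the task frees the slot
q.put("task 1")            # now there is room again
second = q.get(timeout=5)

print(second_put_succeeded, first, second)
```

With a bounded input queue, the process handing out tasks simply waits in put() until a worker has taken the previous one, so tasks are handed out at the pace the workers consume them.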

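from multiprocessing import Queue, Process, cpu_count

# square, worker and values are illustrative stand-ins for your own task and data
def square(x):
    return x**2

def worker(func, q_in, q_out):
    """Evaluate func on queued items until a (None, None) sentinel arrives"""
    while True:
        i, var = q_in.get()
        if i is None:
            break
        q_out.put((i, func(var)))

if __name__ == "__main__":
    values = list(range(10))
    n = len(values)
    q_in = Queue(1)   # holds at most one pending task
    q_out = Queue()   # collects (index, result) pairs

    nprocs = cpu_count()

    # initialise the processes
    processes = []
    for _ in range(nprocs):
        p = Process(target=worker, args=(square, q_in, q_out))
        processes.append(p)

    for p in processes:
        p.daemon = True
        p.start()

    # put items in the queue, then one sentinel per worker
    for i in range(n):
        q_in.put((i, values[i]))
    for _ in range(nprocs):
        q_in.put((None, None))

    # get the results, stored by index to restore the input order
    results = [None] * n
    for _ in range(n):
        i, res = q_out.get()
        results[i] = res

    # wait until each processor has finished
    for p in processes:
        p.join()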

Note

Set nprocs to the number of processes (the number of CPUs) and pass i to the queue to reconstruct the ordering of the results.
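To make the ordering point concrete, here is a small sketch with no processes involved. The helper name reorder and the sample pairs are assumptions for illustration; the pairs stand in for (i, res) tuples coming off the output queue in whatever order the workers finish.

```python
def reorder(pairs, n):
    """Place (index, result) pairs, in any arrival order, into a list of length n."""
    results = [None] * n
    for index, res in pairs:
        results[index] = res
    return results

# results may come off the output queue in any order
arrived = [(2, 4), (0, 0), (3, 9), (1, 1)]
print(reorder(arrived, 4))
```

Because every result carries the index of its input, no sorting step is needed afterwards: each result is written straight into its final slot.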

In the Queue example above, the Queue objects are created in the __main__ namespace. After some tinkering, I got the same approach to work within a Python class:

from multiprocessing import Queue, Process, cpu_count

class UptownFunc:
    def __init__(self):
        pass

    def _func_queue(self, func, q_in, q_out, *args, **kwargs):
        """ Retrieve tasks from the queue and evaluate func on each """
        while True:
            pos, var = q_in.get()
            if pos is None:
                break

            res = func(var, *args, **kwargs)
            q_out.put((pos, res))
        return

    def parallelise_function(self, var, func, *args, **kwargs):
        """ Split evaluations of func across processors """
        n = len(var)

        processes = []
        q_in = Queue(1)
        q_out = Queue()

        nprocs = cpu_count()

        for i in range(nprocs):
            pass_args = [func, q_in, q_out]
            pass_args.extend(args)  # forward extra positional arguments to func

            p = Process(target=self._func_queue,
                        args=tuple(pass_args),
                        kwargs=kwargs)

            processes.append(p)

        for p in processes:
            p.daemon = True
            p.start()

        # put items in the queue, then one (None, None) sentinel per worker
        for i in range(n):
            q_in.put((i, var[i]))
        for _ in range(nprocs):
            q_in.put((None, None))

        # get the results, using the index to restore the input order
        results = [None] * n
        for _ in range(n):
            index, res = q_out.get()
            results[index] = res

        # wait until each processor has finished
        for p in processes:
            p.join()

        return results

And that’s it! Pass any function to parallelise_function and supply a list, var, to evaluate in parallel, along with optional arguments and keywords. As an example, the following squares a list of numbers in parallel and returns them in order:

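def square(a):
    return a**2

# the guard stops worker processes from re-running this block
# on platforms that start them by importing the main module
if __name__ == "__main__":
    nprocs = cpu_count()
    a = list(range(0, nprocs))
    print(a)

    P = UptownFunc()
    results = P.parallelise_function(a, square)
    print(results)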

If this code looks familiar it’s because I’ve taken it directly from PyCurious: a tool to calculate the Curie depth from windows of the magnetic anomaly. The above code snippet parallelises the computation of Curie depth across each window.
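For comparison, Pool, the other object imported at the start, can do the same squaring job with less bookkeeping. This is a minimal sketch and not the approach used in PyCurious; the helper name parallel_squares and the default of two worker processes are assumptions for illustration.

```python
from multiprocessing import Pool

def square(x):
    """Illustrative task; it must be defined at module level so workers can find it."""
    return x**2

def parallel_squares(values, nprocs=2):
    """Map square over values using a pool of nprocs worker processes."""
    with Pool(processes=nprocs) as pool:
        # Pool.map blocks until every result is back and keeps the input order
        return pool.map(square, values)

if __name__ == "__main__":
    print(parallel_squares([0, 1, 2, 3]))
```

Pool.map takes care of distributing the tasks and ordering the results, which the Queue version does by hand with indices and sentinels. The Queue version gives more control over how tasks are fed to the workers.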

Last updated on Mar 4, 2026

