Multiprocessing in Python
Most of the codes I develop run in parallel using MPI (Message Passing Interface) using the python wrapper, mpi4py. There is a reason why highly scalable programs use this approach, and that is because each processor handles its own chunk of memory and communicates with other processors only when it’s needed. PETSc, for example, is a behemoth computing framework entirely written in the MPI computing philosophy. Despite MPI’s efficiency, there are some barriers:
- MPICH or OpenMPI must be already compiled on the system
- Python needs
mpi4pyto communicate in parallel - Programming with MPI is sometimes extremely tedious
- The problem may not benefit from distributed memory
If any of these are true, then OpenMP is a simple alternative. The OpenMP philosophy is to use a “master” process to spawn a bunch of worker processes that each share the same memory allocation. Clearly, this is approach is unsuitable if computation spans more than one computation node. On the positive side, most programming languages already have some implementation of OpenMP without the need for additional software. We explore such an implementation withihn the multiprocessing module in Python.
Multiprocessing module
There are 2 main objects in the multiprocessing module, which can be imported as:
from multiprocessing import Pool, Queue
I have found Queue to be the most intuitive. It sets up a queue of tasks to be executed by each processor. When one task completes, the free processor takes another task from the queue until there are no tasks remaining.
from multiprocessing import Queue, Process, cpu_count
n = n_evaluations
processes = []
q_in = Queue(1)
q_in = Queue()
nprocs = cpu_count()
# initialise the processes
for i in range(nprocs):
p = Process(target=function, args=args, kwargs=kwargs)
processes.append(p)
for p in processes:
p.daemon = True
p.start()
# put items in the queue
sent = [q_in.put((i, var)) for i in range(n)]
# get the results
for i in range(len(sent)):
i, res = q_out.get()
# wait until each processor has finished
[p.join() for p in processes]
nprocs to the number of processes (the number of CPUs) and pass i to the queue to reconstruct the ordering of the results.In the above example, the Queue object can only be defined from the __main__ namespace. However, after some tinkering I got this to work within a Python class:
from multiprocessing import Queue, Process, cpu_count
class UptownFunc:
def __init__(self):
pass
def _func_queue(self, func, q_in, q_out, *args, **kwargs):
""" Retrive processes from the queue """
while True:
pos, var = q_in.get()
if pos is None:
break
res = func(var, *args, **kwargs)
q_out.put((pos, res))
return
def parallelise_function(self, var, func, *args, **kwargs):
""" Split evaluations of func across processors """
n = len(var)
processes = []
q_in = Queue(1)
q_out = Queue()
nprocs = cpu_count()
for i in range(nprocs):
pass_args = [func, q_in, q_out]
# pass_args.extend(args)
p = Process(target=self._func_queue,\
args=tuple(pass_args),\
kwargs=kwargs)
processes.append(p)
for p in processes:
p.daemon = True
p.start()
# put items in the queue
sent = [q_in.put((i, var[i])) for i in range(n)]
[q_in.put((None, None)) for _ in range(nprocs)]
# get the results
results = [[] for i in range(n)]
for i in range(len(sent)):
index, res = q_out.get()
results[index] = res
# wait until each processor has finished
[p.join() for p in processes]
# reorder results
return results
And that’s it! Pass any function to parallelise_function and supply a list of var to evaulate in parallel along with optional arguments and keywords. As an example, run the code with this which squares a list of numbers in parallel and returns them in order:
def square(a):
return a**2
nprocs = cpu_count()
a = list(range(0, nprocs))
print(a)
P = UptownFunc()
results = P.parallelise_function(a, square)
print(results)
If this code looks familiar it’s because I’ve taken it directly from PyCurious: a tool to calculate the Curie depth from windows of the magnetic anomaly. The above code snippet parallelises the computation of Curie depth across each window.

I am an ARC Industry Research Fellow in the School of Geography, Earth and Atmospheric Sciences at The University of Melbourne. I am an expert in fusing Earth evolution models with data to understand how groundwater moves critical minerals through the landscape. Related research interests include the cycling of volatiles within the Earth, probabilistic thermal models of the lithosphere to unravel past tectonic and climatic events, and understanding the how enigmatic volcanoes form.
I am a vocal advocate for the integral role of geoscience in responding to challenges we face in transitioning to the carbon-neutral economy. As an expert in my field, I have been interviewed in national and international print media, TV, and radio on a wide variety of subjects including earthquakes, volcanoes, groundwater, and critical minerals.