Parallel Python
Chris Myers, Andrew Dolgert (original author)
Cornell Center for Advanced Computing
Revisions: 6/2023, 5/2020, 8/2018, 6/2015, 5/2011 (original)
Parallel processing is a powerful approach to improving run time performance, but using it well requires that one (a) identify how best to parallelize a particular application, (b) navigate the tradeoffs between computation and communication, and (c) understand the implications of parallelism for code structure and development. The Python ecosystem provides a number of tools to support parallel, concurrent, and distributed computation, although, as with tools for building hybrid interpreted-compiled programs, the landscape is constantly in flux. Multiprocessing support is available through the standard library. Interfaces to the MPI message passing library are provided by a variety of packages; we address one of them, mpi4py, here. Finally, other third-party libraries supporting parallel computation at various levels of granularity continue to be developed, and some (such as Dask) integrate closely with other libraries in the Python scientific computing ecosystem, such as NumPy, Pandas, and Scikit-learn.
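As a first taste of the standard-library support mentioned above, the following minimal sketch uses multiprocessing.Pool to spread a CPU-bound function over several worker processes. The function count_primes, the list of limits, and the choice of four workers are illustrative assumptions, not part of the original material; treat this as one possible starting point rather than a recommended design.

```python
import math
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical workload: one independent task per limit.
    limits = [20_000, 30_000, 40_000, 50_000]

    # Each task runs in a separate worker process. Arguments and results
    # are pickled and sent between processes, which is the communication
    # cost that must be weighed against the computation saved.
    with Pool(processes=4) as pool:
        counts = pool.map(count_primes, limits)

    print(dict(zip(limits, counts)))
```

The `if __name__ == "__main__":` guard matters here: on platforms where worker processes are started by re-importing the main module, it keeps the workers from re-running the pool setup. Each task in this sketch does substantial work relative to the small integers exchanged, so communication overhead stays modest; with very small tasks, that overhead can outweigh any gain from parallelism.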
Objectives
After completing this topic, you should be able to:
- Discuss general approaches to multiprocessing and parallelism in Python code
- Use the multiprocessing module in the Python Standard Library
- Use the mpi4py package to access the MPI message passing library from within Python
- Identify other tools in the Python ecosystem to assist with multiprocessing and parallelism
Prerequisites
Because this topic focuses on accelerating Python programs for scientific computing, it assumes the reader has some prior experience programming in Python, as well as a working knowledge of general programming concepts. The target audience is scientists and engineers who already program in Python and are interested in using Python tools and packages to improve the run time performance of the programs they are developing. Readers who need additional introductory material about Python can consult Introduction to Python Programming as well as the documentation on the python.org website.