High Performance Python
Python is an interpreted programming language, so it might be expected to have some limitations in computational performance as compared to compiled languages. Nonetheless, Python has long been popular in the scientific computing community for a number of reasons:
- Because it is lightweight and expressive, Python is useful for small scripts that process data, coordinate analyses, and "glue" together different tools.
- Because it is object-oriented, Python is useful for the construction of large software systems that capture the abstractions and logic of complex scientific and algorithmic concepts.
- Because it is readily extensible to support the integration of compiled code modules into the interpreted environment, Python enables both high-level program control and low-level numerical performance.
- Because it is continually expanded upon and improved by an enthusiastic user community, Python has grown into a rich ecosystem of libraries and tools for scientific computing that helps to lower the barriers for individual scientists to develop complex computational workflows and analyses.
All things being equal, everyone would like to have their codes run faster. This is especially the case for scientific computations, which often run for extended periods on high-performance resources. For code written in Python, there are three broad approaches to improving computational performance:
- Integrating compiled functions and libraries into your interpreted Python programs
- Writing faster pure Python
- Running code in parallel
In this section, we will address each of these three approaches. The first topic — integrating compiled code into interpreted Python code — will itself be broken down into two broad approaches:
- Calling on efficient, compiled numerical libraries, such as NumPy and SciPy
- Creating hybrid codes by integrating your own custom compiled code with Python
The first of these involves crafting algorithms around the data structures and functions provided by existing third-party libraries. The second involves interleaving Python code with your own compiled code, either in small pieces or as entire functions, libraries, or programs that you wrap to make them callable from within Python.
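To give a flavor of the first approach, the minimal sketch below contrasts an explicit pure-Python loop with the equivalent computation expressed through NumPy's compiled array operations; the array size is arbitrary and chosen only to make the contrast visible.

```python
import numpy as np

# Pure Python: sum of squares via an explicit loop in the interpreter
total = 0
for v in range(1_000_000):
    total += v * v

# NumPy: the same computation dispatched to compiled, vectorized routines,
# which is typically much faster for large arrays
arr = np.arange(1_000_000)
total_np = int(np.dot(arr, arr))

assert total == total_np
```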
The second main approach — writing faster pure Python — provides tips to ensure that computational resources are used efficiently within the Python interpreter.
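As a small preview of the kinds of tips covered there, the sketch below shows one common pattern: replacing an explicit accumulation loop with a list comprehension, which performs the same work with less per-iteration interpreter overhead.

```python
# Slower: accumulating results with repeated list.append in a loop
squares = []
for i in range(100_000):
    squares.append(i * i)

# Typically faster: a list comprehension builds the same list
# with less interpreter overhead per iteration
squares = [i * i for i in range(100_000)]
```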
The third main approach describes some of the tools available in Python for leveraging a tried-and-true route to enhanced performance, namely the use of multiple computational resources in parallel to solve problems more quickly.
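As a preview of that approach, the sketch below uses the standard-library multiprocessing module to spread independent tasks across worker processes; the function name work and the pool size of 4 are placeholders for illustration.

```python
from multiprocessing import Pool

def work(x):
    """A stand-in for an expensive, independent computation."""
    return x * x

if __name__ == "__main__":
    # Distribute independent tasks across a pool of worker processes
    with Pool(processes=4) as pool:
        results = pool.map(work, range(16))
    print(results)
```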
In addition to the topics mentioned above, we will also address two related issues:
- Using profiling and timing tools to assess performance, compare different implementations, and optimize code hotspots (a brief timing sketch follows this list)
- Using Python effectively on high-performance computational resources such as Frontera at TACC
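For example, the standard-library timeit module offers a simple way to time a small snippet over many repetitions; the statement and repetition count below are chosen only for illustration.

```python
import timeit

# Time 100 repetitions of a short statement with the timeit module
setup = "import numpy as np; arr = np.arange(1_000_000)"
elapsed = timeit.timeit("np.dot(arr, arr)", setup=setup, number=100)
print(f"100 calls took {elapsed:.4f} seconds")
```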
If the rest of this material is already familiar to you, and you just want information about how to use Python and associated libraries on Frontera, you can proceed directly to the section on Python at TACC.