Libraries
Many of the packages in the Python scientific computing ecosystem are pre-installed on Frontera in an associated site-packages directory, and as such, are immediately available for import for each of the installations listed above. This includes numpy, scipy, pandas, sklearn (scikit-learn), matplotlib, cython, mpmath, numexpr, sympy, virtualenv, cffi, and ctypes, among others. (The mpi4py library is stored in an alternate site-packages directory separate from the main collection, which is prepended to the user's PYTHONPATH once a python3 module is loaded. Refer also to the section on mpi4py for additional information on running mpi4py jobs on Frontera.)
The specific version numbers of each package may change over time, as new versions are released and the libraries on Frontera are updated. To get the current version numbers associated with a given installation, you could run Python code such as this (perhaps modified to report on other libraries of interest):
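As one possible sketch of such a version check, the snippet below imports a handful of the packages named above and reports each one's __version__ attribute. The helper name get_versions and the particular package list are illustrative choices, not something prescribed by TACC; packages that cannot be imported are reported rather than raising an error.

```python
import importlib

# Import names (not pip names) of a few of the pre-installed packages.
# This list is only an example; edit it to cover the libraries you care about.
PACKAGES = ["numpy", "scipy", "pandas", "sklearn", "matplotlib"]


def get_versions(names):
    """Map each import name to its __version__ string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        # Most scientific packages expose __version__, but not all modules do.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in get_versions(PACKAGES).items():
    print(f"{name:12s} {version if version is not None else 'not importable'}")
```

Running this under each of the available python3 modules in turn would show how the installed versions differ between installations.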
Popular packages for deep learning, such as TensorFlow and PyTorch, are not pre-installed on Frontera, but can be installed by users through the creation of virtual environments. That process is described in TACC documentation under Machine Learning on Frontera and in our companion roadmap on AI with Deep Learning.
Some Python packages are installed on Frontera, but are not loaded by default through the Lmod system. Inspection of the alternate site-packages directory prepended to the user's PYTHONPATH reveals that, in addition to mpi4py, there is also an installation of the h5py library, which provides read-write access to HDF5 files. To be able to import this library into Python, one must first load an associated module: module load phdf5. SWIG, which is not a library but rather a tool to help you build Python extension modules, can also be accessed on Frontera via module load swig.
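To see what the paragraph above describes from inside Python, something like the following sketch could be used. It lists the directories currently on PYTHONPATH and reports where mpi4py and h5py would be imported from, if they can be found at all. The helper names pythonpath_entries and module_origin are hypothetical, and the output will depend on which modules are loaded in your session.

```python
import importlib.util
import os


def pythonpath_entries(environ=None):
    """Return the non-empty directories listed in PYTHONPATH, in order."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("PYTHONPATH entries:")
for entry in pythonpath_entries():
    print("  ", entry)

for name in ("mpi4py", "h5py"):
    origin = module_origin(name)
    print(f"{name}: {origin if origin else 'not found (is the required module loaded?)'}")
```

Because find_spec only locates a module without importing it, this check does not trigger any of the package's own initialization.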
Installing additional third-party libraries for use with Python
Some Python packages that you might want to use on Frontera are not currently installed, but you can install them in your own user space, as described here. Among the packages discussed within this tutorial, the following packages are not currently pre-installed, and would require a separate installation: numba, dask, networkx, skimage, pillow, sqlalchemy, seaborn, bokeh, netCDF4, ipyparallel, and joblib.
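Since the set of pre-installed packages can change, it may be worth checking which of these are actually missing in your current environment before installing anything. The sketch below does this without importing the packages. The missing_packages helper is illustrative; note that the name used with pip and the name used with import differ for some packages (for example, pillow is imported as PIL, and skimage is distributed as scikit-image).

```python
import importlib.util

# pip (distribution) name -> import name; the two differ for some packages.
CANDIDATES = {
    "numba": "numba",
    "dask": "dask",
    "networkx": "networkx",
    "scikit-image": "skimage",
    "pillow": "PIL",
    "sqlalchemy": "sqlalchemy",
    "seaborn": "seaborn",
    "bokeh": "bokeh",
    "netCDF4": "netCDF4",
    "ipyparallel": "ipyparallel",
    "joblib": "joblib",
}


def missing_packages(candidates):
    """Return the pip names whose import name cannot be located in this environment."""
    return [
        pip_name
        for pip_name, import_name in candidates.items()
        if importlib.util.find_spec(import_name) is None
    ]


print("Not importable here:", ", ".join(missing_packages(CANDIDATES)) or "none")
```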
Python packages can be installed using either pip or a setup.py file, as described in previous sections. When building new packages, pip reuses the compilation arguments that were used to compile the Python interpreter, and so benefits from the optimized configuration that TACC chose when building these tools. These arguments are tuned for general workloads, and use the base SSE2 ISA and interprocedural optimization.
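The compilation arguments recorded when the interpreter was built can be inspected with the standard-library sysconfig module, as in the sketch below. The choice of variables shown (CC, CFLAGS, OPT, LDFLAGS) is an assumption about what is most useful to look at; on some platforms a given variable may be unset and will be reported as None.

```python
import sysconfig


def build_settings(names=("CC", "CFLAGS", "OPT", "LDFLAGS")):
    """Return selected build-time configuration variables of this interpreter."""
    return {name: sysconfig.get_config_var(name) for name in names}


for name, value in build_settings().items():
    print(f"{name:8s} {value}")
```

Comparing this output across the available Python installations is one way to see which compiler and flags extension modules would inherit by default.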
The basics of pip and setup.py installs on Frontera are covered in the Frontera User Guide section on Building Third-Party Software, which is reproduced here:
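As a rough illustration only, and not a reproduction of the User Guide text, a per-user install can be driven from Python by invoking pip as a module of the running interpreter with its --user option. The helper name pip_user_install_command and the example package seaborn are hypothetical choices; the sketch only builds and prints the command, leaving the actual install commented out.

```python
import subprocess
import sys


def pip_user_install_command(package):
    """Build the command that installs a package into the per-user site for this interpreter."""
    # Using sys.executable ties the install to the Python that is currently running.
    return [sys.executable, "-m", "pip", "install", "--user", package]


command = pip_user_install_command("seaborn")
print(" ".join(command))

# Uncomment to actually run the install (requires network access):
# subprocess.run(command, check=True)
```

Running pip through sys.executable rather than a bare pip command helps ensure the package is installed for the same Python installation that is currently loaded.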
The $HOME/.local space in which these packages would be installed is included in your default sys.path in Python.
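One way to confirm this for a given installation is to ask the standard-library site module where the per-user directories are and whether they are active, as sketched below. The function name user_site_status is illustrative. On Linux the user base normally defaults to $HOME/.local, with the packages themselves placed in a version-specific site-packages directory beneath it.

```python
import site
import sys


def user_site_status():
    """Report the per-user install locations and whether Python is using them."""
    user_site = site.getusersitepackages()
    return {
        "user_base": site.getuserbase(),
        "user_site": user_site,
        "enabled": bool(site.ENABLE_USER_SITE),
        "on_sys_path": user_site in sys.path,
    }


for key, value in user_site_status().items():
    print(f"{key:12s} {value}")
```

The user site directory is typically added to sys.path only if it already exists when Python starts, so on_sys_path may be False until the first package has been installed there.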
If additional configuration options are necessary or desirable when building third-party packages, you might want to consult the information provided in Building for Performance on Frontera.
In addition, on Frontera, the $MKLROOT environment variable points to where the Intel compilers and libraries reside; there is additional information about compilers and available options in $MKLROOT/../documentation.