Python for Deep Learning
Many freely available software packages and ecosystems exist to support the use of deep learning methods, and those offerings continue to grow at an impressive rate, providing functionality to address a wide range of problems and datasets. These packages are accessible through a variety of programming languages, but many are built with the Python programming language, offering an expressive high-level application programming interface (API) linked to compiled libraries for efficient numerical performance. Python has long been used for numerical and scientific computing, and a rich ecosystem of Python libraries and tools has been developed to support a wide variety of numerical methods and algorithms. In this tutorial, we will focus on the use of two Python-based deep learning packages: TensorFlow/Keras and PyTorch. We will also address some of the other software components that are used in conjunction with these packages. If you are interested in other Python tools and packages that are broadly useful for data science, please consult the companion material on Python for Data Science.
Several features of the Python language make it desirable as a substrate for deep learning libraries.
- Because Python is object-oriented, it supports the construction and interaction of complex datatypes and abstractions needed to support deep learning, such as neural networks, tensors, loss functions, optimizers, and datasets.
- Because Python is dynamically typed and interpreted, it can be run conveniently in an interactive mode during code development and testing, and can then be run in batch mode once decisions are made about what works best for a given problem.
- And because Python is extensible, compiled extension modules (written in C/C++, Fortran, etc.) can be called from within Python, enabling the co-existence of the high-level expressiveness of Python with the low-level numerical performance of compiled languages. While the user programming interfaces to deep learning libraries such as TensorFlow and PyTorch are written in Python, the number-crunching is carried out in compiled extension modules for efficient performance.
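This division of labor between interpreted Python and compiled extensions can be seen even in NumPy, the foundation of Python's numerical stack. The sketch below (a toy illustration, not taken from TensorFlow or PyTorch internals) compares a dot product computed in a pure-Python loop against the same computation dispatched to NumPy's compiled routines; the two agree numerically, but the compiled version runs the inner loop outside the interpreter.

```python
import time
import numpy as np

# Pure-Python dot product: every iteration passes through the interpreter.
def dot_python(a, b):
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

n = 1_000_000
rng = np.random.default_rng(0)
a = rng.random(n)
b = rng.random(n)

t0 = time.perf_counter()
r_python = dot_python(a, b)
t1 = time.perf_counter()
r_numpy = float(np.dot(a, b))  # dispatches to compiled (BLAS) code
t2 = time.perf_counter()

print(f"python loop: {t1 - t0:.4f}s   numpy: {t2 - t1:.4f}s")
print(f"results agree: {abs(r_python - r_numpy) < 1e-6}")
```

TensorFlow and PyTorch follow the same pattern on a larger scale: tensor operations expressed in Python are executed by compiled kernels, including GPU implementations.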
Although they differ in the details of both their APIs and their internal implementations, both TensorFlow/Keras and PyTorch provide similar sorts of functionality for carrying out deep learning with neural networks. Among other things, they provide support for:
- the construction of models from networks of any desired topology
- the specification of loss functions (including several predefined, commonly used functions)
- the numerical training of parameters from data, using backpropagation and automatic differentiation, in conjunction with a variety of different optimization algorithms, and
- the application of a model to make predictions about unseen data and assess the suitability of those predictions.
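The four kinds of functionality listed above can be sketched in a few lines of PyTorch. The following toy example (hypothetical data, learning y = 2x + 1, chosen only for illustration) constructs a small network, specifies a predefined loss function, trains the parameters via backpropagation with a gradient-descent optimizer, and then makes a prediction on an unseen input; the equivalent Keras code would be similarly compact.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy regression data: noisy samples of y = 2x + 1 (hypothetical example).
x = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 2 * x + 1 + 0.01 * torch.randn_like(x)

# 1. Model construction: a small network of a chosen topology.
model = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))

# 2. Loss specification: a predefined mean-squared-error loss.
loss_fn = nn.MSELoss()

# 3. Training: backpropagation with automatic differentiation,
#    driven here by a stochastic gradient descent optimizer.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # gradients computed by autodiff
    opt.step()

# 4. Prediction on unseen data (the true value at x = 0.5 is 2.0).
with torch.no_grad():
    pred = model(torch.tensor([[0.5]])).item()
print(f"prediction at x = 0.5: {pred:.2f}")
```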
Both TensorFlow and PyTorch offer a wealth of information for new users to get started with those systems, as well as for experienced users to leverage advanced features and optimize computational performance. But that material can be somewhat daunting unless you know precisely what you are looking for. In addition to providing guidance here on how to use such software on high-end computational clusters such as those supported by TACC, we also hope to orient readers to the available online documentation so that they can better take advantage of those resources.
In addition to the core packages, both TensorFlow and PyTorch are part of larger software ecosystems that support the use of those tools, or that provide additional functionality. Some aspects of these ecosystems are described in the following pages.