ML/AI
Machine Learning on Vista
The Machine Learning section of the Vista User Guide offers detailed guides on using PyTorch, as well as Transformers and Accelerate from Hugging Face. In both cases, training can be run on a single Grace Hopper (GH) compute node or scaled across multiple GH nodes.
Of course, you are free to install and run ML packages other than these, according to your own needs and preferences. To take full advantage of Vista, though, you will need to understand how to set up your software so that it can use the GPUs on multiple GH nodes.
Note that when you install Python packages that will be used across multiple compute nodes, you should install them in a virtual environment created under your $SCRATCH directory rather than your $HOME directory. This prevents undue stress on the file system containing $HOME.
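As a minimal sketch of that advice, the following Python script creates a virtual environment under $SCRATCH using the standard-library venv module. The environment name "vista-ml" and the helper functions scratch_venv_path and create_scratch_venv are hypothetical names chosen for illustration; the only thing assumed about the system is that the SCRATCH environment variable is defined.

```python
import os
import venv
from pathlib import Path


def scratch_venv_path(env_name, environ=None):
    """Return the location of a virtual environment under $SCRATCH."""
    environ = os.environ if environ is None else environ
    scratch = environ.get("SCRATCH")
    if not scratch:
        # Refuse to fall back to $HOME, which is what we want to avoid.
        raise RuntimeError("SCRATCH is not set in this environment")
    return Path(scratch) / env_name


def create_scratch_venv(env_name, environ=None, with_pip=True):
    """Create the virtual environment under $SCRATCH and return its path."""
    env_path = scratch_venv_path(env_name, environ)
    venv.create(env_path, with_pip=with_pip)
    return env_path


if __name__ == "__main__":
    # "vista-ml" is a hypothetical environment name.
    path = create_scratch_venv("vista-ml")
    print(f"Activate with: source {path}/bin/activate")
```

After activating the environment, packages installed with pip go into the $SCRATCH location rather than $HOME. Any modules that must be loaded beforehand are system-specific and are not shown here.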
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)