Lab: CNN with Multi-Node Multi-GPU
Instructions:
Tips:
If you are running the notebook and encounter an error with mpirun, double-check that you started a TAP session with 2 Nodes and 2 Tasks and are using the default kernel.
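One way to confirm the session size from inside the notebook is sketched below. It is an assumption that the TAP session runs as a Slurm job and exposes the standard SLURM_NNODES and SLURM_NTASKS environment variables. The helper name check_allocation is hypothetical and is not part of the lab material.

```python
import os
import shutil


def check_allocation(env, expected_nodes=2, expected_tasks=2):
    """Return a list of problems found in the job environment (empty if none).

    Assumes Slurm-style variables; `env` is any mapping such as os.environ.
    """
    problems = []
    checks = (("SLURM_NNODES", expected_nodes), ("SLURM_NTASKS", expected_tasks))
    for var, expected in checks:
        value = env.get(var)
        if value is None:
            problems.append(f"{var} is not set")
        elif int(value) != expected:
            problems.append(f"{var} is {value}, expected {expected}")
    return problems


if __name__ == "__main__":
    # mpirun must be on PATH for the launch commands in the notebook to work.
    if shutil.which("mpirun") is None:
        print("mpirun not found on PATH")
    for problem in check_allocation(os.environ):
        print(problem)
```

If nothing is printed, the environment matches the expected 2 Nodes and 2 Tasks and mpirun is discoverable. Any remaining error probably has a different cause, such as the kernel selection.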
This notebook uses the same hyperparameters as part 1:
- Learning Rate (lr): the step size that scales how much the model parameters change at each update
- Batch Size: the number of data points used to estimate the gradients at each iteration
- Epochs: the number of times the optimization process iterates over the entire dataset
These hyperparameters will be used throughout the notebook.
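To make the three hyperparameters concrete, here is a minimal sketch in plain Python. The values in hyperparams are hypothetical placeholders, not the ones used in part 1. The helpers steps_per_epoch and sgd_step are illustrative names. The world_size argument assumes the dataset is split evenly across processes, which is common in data-parallel training but is not confirmed by this page.

```python
import math

# Hypothetical values for illustration only; use the values from part 1.
hyperparams = {"lr": 0.001, "batch_size": 32, "epochs": 5}


def steps_per_epoch(num_samples, batch_size, world_size=1):
    """Iterations each process runs per epoch.

    Assumes every one of `world_size` processes receives an equal shard of
    the data and draws `batch_size` samples per iteration.
    """
    samples_per_process = math.ceil(num_samples / world_size)
    return math.ceil(samples_per_process / batch_size)


def sgd_step(param, grad, lr):
    """Plain gradient-descent update: the learning rate scales the step."""
    return param - lr * grad


if __name__ == "__main__":
    n = 1000  # hypothetical dataset size
    per_epoch = steps_per_epoch(n, hyperparams["batch_size"], world_size=2)
    total_updates = per_epoch * hyperparams["epochs"]
    print(f"{per_epoch} iterations per epoch, {total_updates} updates in total")
```

With a fixed per-process batch size, adding processes shrinks the number of iterations per epoch. The effective global batch size then grows with the process count, which is worth keeping in mind when reusing a learning rate tuned on a single GPU.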
© Chishiki-AI | Cornell University | Center for Advanced Computing | Copyright Statement | Access Statement
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)