Write Checkpoints periodically during training and only from one device (4, 7c)

Major code modifications are highlighted with comments at the end of the function below. Note, we implement checkpointing a little differently in this script. Below we save the most recent checkpoint only if it reaches a minimum validation accuracy. That way we will always have the most accurate model at the end of training.

 
© Chishiki-AI  |   Cornell University    |   Center for Advanced Computing    |   Copyright Statement    |   Access Statement
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)