Next, we need to build the architecture of our CNN classifier. In this tutorial, we will use transfer learning. It is rare to build a CNN entirely from scratch; instead, we can take a CNN model that was trained on a very large dataset and apply that knowledge to our own use case, which is exactly what transfer learning does.

ResNet

In this tutorial, we will start with the ResNet18 model pretrained on the ImageNet dataset. Below we instantiate this model and load the pretrained weights.

Transfer Learning

With transfer learning there are two common ways we can utilize previously optimized weights for specific CNN architectures:

  • Start the optimization of the model from the previously optimized weights instead of from random weights. This accelerates training of the entire network.
  • Use the previously optimized weights as a fixed feature extractor. That is, we freeze all the previously learned weights except those of the final fully connected layer.

In this tutorial, we will use ResNet18 as a fixed feature extractor. Let’s start by freezing all the weights in our network:

Note: if we wanted to fine-tune the entire model, we could skip the above step.

Then, we can replace the final fully connected layer with a new one. In PyTorch, the parameters of a newly constructed layer have `requires_grad=True` by default, so when we replace the final fully connected layer, only its parameters will be learned during training.

© Chishiki-AI  |   Cornell University    |   Center for Advanced Computing    |   Copyright Statement    |   Access Statement
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)