Convolutional Neural Network (CNN)
From MLP to CNN
MLP can be trained to classify a simple image dataset, such as the MNIST dataset which contains handwritten digits, with each image having 28 pixels in width, 28 pixels in height, and 1 greyscale channel. However, as for more complex image recognition tasks, its performance is not so good. The main reason is that the fully connected layers fail to learn the spatial information. Objects of a particular class usually have specific pixel patterns gathered. When the object moves to another place in the image, a model should still be able to recognize it, even if it might not have seen such a training image before.

Image source: [1]
References:
©
Chishiki-AI
|
Cornell University
|
Center for Advanced Computing
|
Copyright Statement
|
Access Statement
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)