The softmax operation converts the MLP's output into probabilities. A commonly used loss for outputs produced via softmax regression is cross-entropy loss, which measures the discrepancy between the true class distribution and the predicted class distribution. The formulas below define softmax regression and the corresponding cross-entropy loss:

Softmax regression:

\(p_i = (\text{softmax}(x))_i = \frac{\exp(x_i)}{\sum_{j=1}^{c} \exp(x_j)}\)

Cross-entropy loss:

\(L(x, y) = -\sum_{i=1}^{c} t_i \log(p_i)\)

Where we define \(t_i\) and \(p_i\) as:

\(t_i\): true class distribution
\(p_i\): predicted class distribution
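
To make these formulas concrete, here is a minimal NumPy sketch; the logit values and the one-hot target are hypothetical, chosen only to illustrate the computation:

```python
import numpy as np

def softmax(x):
    # Subtract the max logit for numerical stability; softmax is unchanged
    # by adding a constant to every logit.
    z = x - np.max(x)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z)

def cross_entropy(t, p):
    # L(x, y) = -sum_i t_i * log(p_i); a small epsilon guards against log(0).
    return -np.sum(t * np.log(p + 1e-12))

# Hypothetical logits from the output layer for a 3-class problem
x = np.array([2.0, 1.0, 0.1])
# One-hot true class distribution: the correct class is class 0
t = np.array([1.0, 0.0, 0.0])

p = softmax(x)
loss = cross_entropy(t, p)
print("probabilities:", p)  # approximately [0.659, 0.242, 0.099]
print("loss:", loss)        # -log(0.659), approximately 0.417
```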

In softmax regression, each neuron in the output layer produces the probability of the corresponding class. The class with the highest probability is the label predicted by the model.
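
Continuing the hypothetical example above, the predicted label is simply the index of the largest probability:

```python
# The predicted label is the class with the highest probability.
predicted_class = int(np.argmax(p))
print("predicted class:", predicted_class)  # 0 in the example above
```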

 