This interactive demo illustrates how combining a linear transformation with a non-linearity can transform data in ways that linear transformations alone cannot. Observe how the data, initially not linearly separable in the input space (X), becomes separable after passing through a linear layer (Y) and then a non-linear activation (Z).

This provides intuition for why layers with non-linear activations are powerful: they can map data into a new space where complex patterns become simpler (potentially linearly separable), making them learnable by subsequent layers.
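As a concrete (if stylized) illustration of this idea, the classic XOR pattern cannot be split by any single line in its original space, but a small linear layer followed by a ReLU maps it into a space where one line suffices. The NumPy sketch below uses hand-picked weights, bias, and decision rule chosen purely for illustration; they are not the demo's configuration (the demo fixes the bias to 0 and uses a Leaky ReLU).

import numpy as np

# XOR-style data: class 1 where exactly one coordinate is 1.
# No single line in X-space separates the two classes.
X = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [1.0, 0.0],
              [1.0, 1.0]])
labels = np.array([0, 1, 1, 0])

# Hand-picked hidden layer (hypothetical values, not taken from the demo).
# Row-vector convention: y = x @ W + b.
W = np.array([[1.0, 1.0],
              [1.0, 1.0]])
b = np.array([0.0, -1.0])

Y = X @ W + b            # linear transform
Z = np.maximum(Y, 0.0)   # ReLU non-linearity

# In Z-space a single linear rule now classifies every point correctly.
scores = Z[:, 0] - 2.0 * Z[:, 1]
pred = (scores > 0.5).astype(int)
print(Z)      # [[0. 0.] [1. 0.] [1. 0.] [2. 1.]]
print(pred)   # [0 1 1 0] -- matches labels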

Adjust the sliders to see how a linear transformation (rotation + scaling) followed by a non-linear activation (Leaky ReLU) can make the data separable. Or, press "Solve" to see a working solution.

Mathematical Transformations

1. Linear Transformation:
Y = WᵀX + b
Where W is the transformation matrix (rotation + scaling) and b is the bias (set to 0 here)
2. Non-linear Transformation (Leaky ReLU):
Z = f(Y) = max(αY, Y)
Where α is the negative slope parameter (adjustable with the slider)
Applied element-wise: f(y) = y if y > 0, else α × y (both steps are combined in the sketch below)
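The following minimal NumPy sketch applies these two formulas in sequence: a rotation-plus-scaling matrix W, then the Leaky ReLU element-wise. The sample points, rotation angle, scaling factors, composition order of W, and α value are illustrative placeholders, not the demo's "Solve" settings.

import numpy as np

def leaky_relu(y, alpha=0.1):
    """Element-wise f(y) = y if y > 0, else alpha * y (equivalently max(alpha*y, y) for 0 < alpha < 1)."""
    return np.where(y > 0, y, alpha * y)

# Illustrative 2-D points stored as columns of X (shape 2 x N).
X = np.array([[ 1.0, -1.0,  0.5],
              [ 0.5,  2.0, -1.5]])

# W assembled from a rotation and a per-axis scaling (the composition order
# is an assumption; the demo only exposes the two sliders).
theta = np.deg2rad(45.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
scaling = np.diag([2.0, 0.5])
W = rotation @ scaling
b = np.zeros((2, 1))      # bias fixed at 0, as in the demo

Y = W.T @ X + b           # 1. linear transform:  Y = WᵀX + b
Z = leaky_relu(Y)         # 2. non-linearity:     Z = f(Y) = max(αY, Y)

print(Y)
print(Z)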
[Demo plots: 1. Original Data (Input Space X); 2. After Linear Transform (Y = WᵀX); 3. After Non-linearity (Z = f(Y))]
 