Neural Networks
Neural networks are flexible computational models that can process inputs and produce outputs, with parameters that can be fit or estimated from data. They are a type of computational graph inspired by the action of neurons in the brain. The structure of these models reflects a directed network of artificial "neurons" or nodes: data are presented as inputs to the network, fed through one or more hidden layers, and ultimately transformed into outputs. This general architecture is represented schematically in the left panel of the figure below. It is conventional to draw neural networks as flowing from left to right, with the input nodes on the far left and the output nodes on the far right. Within the illustration of the entire neural network, we highlight a subset of nodes and edges in red.
Just as the network as a whole has a set of inputs and outputs, so does each individual node in the network, as illustrated in the right panel of the figure below. Each edge that connects two nodes carries an associated weight, indicating the strength of influence between the two nodes. In the network depicted below, each node $j$ combines the outputs $a_i$ of the nodes feeding into it as a weighted sum $z_j = \sum_i w_{ij} a_i + b_j$, where $w_{ij}$ is the weight on the edge from node $i$ to node $j$ and $b_j$ is a bias term; the node then produces its own output by applying a nonlinear activation function $\sigma$, giving $a_j = \sigma(z_j)$.

The action of the network is parameterized by the edge weights $w_{ij}$ and the biases $b_j$; these are the parameters that the learning algorithm adjusts during training.
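To make this computation concrete, the following short sketch implements the forward pass of a single layer of nodes in NumPy, with each node forming a weighted sum of its inputs and applying a sigmoid activation. The function names, layer sizes, and numerical values are our own illustrative choices, not taken from the text above.

```python
import numpy as np

def sigmoid(z):
    """A common nonlinear activation function, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(a_in, W, b):
    """Outputs of one layer: node j computes sigmoid(sum_i W[i, j] * a_in[i] + b[j])."""
    return sigmoid(a_in @ W + b)

# Three input values feeding two nodes; weights and biases chosen arbitrarily.
a_in = np.array([0.5, -1.0, 2.0])
W = np.array([[ 0.1, -0.3],
              [ 0.8,  0.2],
              [-0.5,  0.4]])
b = np.array([0.0, 0.1])
print(layer_forward(a_in, W, b))  # outputs of the two nodes
```

Stacking several such layers, with the outputs of one layer serving as the inputs to the next, yields the full feed-forward computation of the network sketched above.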
In a supervised learning problem, each input has an associated label or value that is to be learned by the network. The input layer encodes the data on which the network is trained, and the output layer produces predictions of the associated labels or values. The job of the learning algorithm is to tune the parameters of the neural network so that it does a good job of associating inputs with outputs, and does so in a generalizable fashion, so that it can make accurate predictions about labels associated with new data on which it has not been trained.

In an unsupervised learning problem, patterns in the data are to be learned without any explicit labels. These patterns might consist of clusters of similar data points or a reduction to a lower-dimensional space that allows for easier interpretation. One common deep learning approach to unsupervised learning is to construct an autoencoder, which compresses the inputs to a lower-dimensional space and then attempts to reconstruct the original data from that lower-dimensional representation. Structure detected in the lower-dimensional embedding space can serve as a useful summary of patterns in the original data.
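As an illustration, here is a minimal autoencoder sketch in PyTorch. The layer sizes, learning rate, and random data are arbitrary assumptions made for the example, not details from the text.

```python
import torch
from torch import nn

# Toy autoencoder: compress 20-dimensional inputs to a 2-dimensional
# embedding, then try to reconstruct the original 20 dimensions.
encoder = nn.Sequential(nn.Linear(20, 8), nn.ReLU(), nn.Linear(8, 2))
decoder = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 20))
autoencoder = nn.Sequential(encoder, decoder)

x = torch.randn(64, 20)  # a batch of unlabeled data (random, for illustration)
optimizer = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
for step in range(1000):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(autoencoder(x), x)  # reconstruction error
    loss.backward()
    optimizer.step()

with torch.no_grad():
    embedding = encoder(x)  # 2-D representations of the inputs
```

After training, the two-dimensional embedding produced by the encoder can be inspected for clusters or other structure, serving as the kind of low-dimensional summary described above.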
Designing an appropriate network architecture for a given problem is itself both an art and a science, the details of which are beyond the scope of this tutorial. For certain classes of problems, however, there are architectural elements that are used regularly, such as convolutional layers for image-processing applications and transformer architectures for large language models. On the page on Models and Predictions, we describe these in the context of models trained on large datasets.
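As a brief taste of such architectural elements (their details are, as noted, beyond this tutorial's scope), here is how a single convolutional layer might be declared in PyTorch; the channel counts and image sizes are illustrative assumptions.

```python
import torch
from torch import nn

# A single convolutional layer: maps 3-channel (RGB) images to 16 feature
# maps using 3x3 filters. Channel counts and image sizes are illustrative.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
images = torch.randn(8, 3, 32, 32)  # a batch of 8 RGB 32x32 images
features = conv(images)             # shape: (8, 16, 32, 32)
```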