Patterns in data
Historically, much of the process of science has involved identifying patterns in data, which then provide insights into the construction of models and theories. Brahe collects detailed astronomical data about the motion of planets; Kepler analyzes those data to reveal that the planets revolve around the sun in elliptical orbits; Newton uses Kepler's findings to produce a synthesis of gravity and mechanics that explains how these elliptical orbits arise mechanistically.
Once patterns in data are identified, they can be used for other purposes. Someone interested in detecting fraud in bank transactions, for example, might develop a statistical model of anomalies, flagging unusual transactions for further investigation. Identifying bird species from audio recordings of their songs might work by characterizing a "fingerprint" unique to each species and then matching recordings of unknown birds against the known fingerprints. Much of the field of signal processing, as well as earlier work in machine learning, focused on this sort of feature identification or feature engineering: doing the hard work of scientific experimentation and data modeling to identify the data features that matter for characterizing some process of interest.
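As a minimal sketch of this kind of hand-engineered feature, consider flagging anomalous transactions by their z-score, the distance from the mean in units of the standard deviation. The data here are synthetic and the threshold is an illustrative choice, not a recommendation:

```python
import numpy as np

# Hypothetical transaction amounts; in practice these would come from
# real bank records.
rng = np.random.default_rng(0)
amounts = rng.normal(loc=50.0, scale=10.0, size=1000)
amounts[::100] += 200.0  # inject a few unusually large transactions

# Hand-engineered anomaly feature: the z-score of each amount.
# Transactions beyond a chosen threshold are flagged for review.
z_scores = (amounts - amounts.mean()) / amounts.std()
flagged = np.flatnonzero(np.abs(z_scores) > 4.0)

print(f"flagged {flagged.size} of {amounts.size} transactions")
```

The feature (a z-score) and the threshold both encode human judgment about what "anomalous" means, which is exactly the step that later learning-based approaches try to automate.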
In our current era of big data, computational methods are becoming indispensable to finding such patterns. And the move towards deep learning has in many ways altered the processes by which we work with data. As noted previously, researchers are increasingly able to offload to deep learning tools the task of learning important features in data, rather than having to first identify those features themselves and encode them explicitly in a pipeline for data characterization.
In supervised learning problems, such as image classification, researchers have studied how successive layers of deep neural networks progressively learn recurring patterns in data and then combine them into more complex representations: early layers respond to simple features such as edges, while later layers combine these into more elaborate structures.
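The idea of layers composing simple features into more complex ones can be sketched with a toy two-layer convolutional forward pass. The filters here are hand-set stand-ins for what a trained network would learn, chosen so that the second "layer" responds only where two first-layer features coincide:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation of a single-channel image with a kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A toy 8x8 image containing a bright square; its boundary has both
# horizontal and vertical edges.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0

# "Layer 1": filters standing in for learned low-level features.
horiz = np.array([[1.0, 1.0], [-1.0, -1.0]])  # bright-above-dark edges
vert = np.array([[1.0, -1.0], [1.0, -1.0]])   # bright-left-of-dark edges
h_map = np.maximum(conv2d(img, horiz), 0.0)   # ReLU nonlinearity
v_map = np.maximum(conv2d(img, vert), 0.0)

# "Layer 2": combine the two edge maps; the response is large only
# where a horizontal and a vertical edge meet, i.e., at a corner.
corner_map = h_map * v_map
peak = tuple(int(i) for i in
             np.unravel_index(np.argmax(corner_map), corner_map.shape))
print(peak)  # → (5, 5): the square's bottom-right corner
```

A real network learns such filters from labeled data across many channels and layers, but the compositional structure, simple detectors feeding into detectors of their conjunctions, is the same.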
In unsupervised machine learning, deep learning methods can play an important role in helping us to distill large, high-dimensional datasets down to simpler representations. Big data are sometimes not as big as they seem: even though data might be embedded in a large number of dimensions due to the large number of data features captured, they often lie on some lower-dimensional subspace or manifold due to correlations or other sorts of relationships among those features. Methods for developing low-dimensional representations of data, such as variational autoencoders that project data down to a latent space, can be effective in such distillations. Such methods are complemented by approaches that systematize the characterization of low-dimensional structure in models and theories built from data.
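The point that high-dimensional data often occupy a low-dimensional subspace can be made concrete with a small synthetic example. Here PCA via the singular value decomposition serves as a linear stand-in for the nonlinear projection a variational autoencoder would learn; the data are generated from only two latent variables, and the singular-value spectrum reveals this:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "big" data: 500 points described by 50 features, but every
# feature is a linear mixture of just 2 underlying latent variables, so
# the data lie on a 2-dimensional subspace of the 50-dimensional space.
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 50))
X = latent @ mixing

# PCA via the SVD of the centered data: the fraction of variance
# captured by the leading components exposes the true dimensionality.
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
explained = s**2 / np.sum(s**2)
print(f"variance captured by first 2 components: {explained[:2].sum():.4f}")
```

With noisy real data the drop in the spectrum would be less sharp, and a nonlinear manifold would require a nonlinear method such as an autoencoder, but the diagnosis is the same: far fewer coordinates than measured features suffice to describe the data.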