Machine learning (ML) involves the use of algorithms that can learn patterns and structure in data without being explicitly instructed about the details of those patterns. ML is a major component of the broad arena of Artificial Intelligence (AI), although there are aspects of AI that do not involve ML. Deep learning (DL) is generally considered a subfield of ML, focused more specifically on the use of neural networks (sometimes referred to as artificial neural networks) to solve ML problems. All these fields mix and mingle with elements of the broadly defined field of Data Science, although much of data science involves the human-guided, rather than machine-guided, processing of data. ML is an umbrella that comprises many different types of problems, and many different types of algorithms designed to solve those problems. Deep learning is broadly applicable to many problems arising in ML, but for some problems, so-called "classical" ML methods might be preferable.

Very broadly speaking, ML and DL aim to "learn" how to map a set of inputs to a set of outputs, typically via repeated iteration through data with some sort of feedback to guide the learning. The inputs are often data that are presented to us, and the outputs involve predictions that we want to make about those data. In this sense, ML/DL "machines" are like the mathematical functions that we use in analysis or the computational functions that we write in software, but they are not constructed with explicit instructions about what sorts of functions to implement. Rather, ML/DL machines are built with flexible and expressive computational elements that contain parameters which can be modified to produce different mappings between inputs and outputs. At its core, learning in ML and DL is about modifying or "fitting" these parameters in order to produce useful functional mappings. Within deep learning, the flexible and expressive computational elements used to construct functions are artificial "neurons" (modeled on the action of real neurons in the brain, and wired together in large neural networks), which we will discuss in more detail in what follows.
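The idea of "learning as parameter fitting" can be sketched in a few lines of plain Python. Here we fit the two parameters (w, b) of a linear mapping y = w*x + b by gradient descent, with the squared error on the data serving as the feedback that guides each update. The toy data, learning rate, and iteration count are illustrative choices, not anything specified in the text.

```python
# Toy data generated from the mapping y = 2x + 1; the machine is never
# told this rule, only shown input/output pairs.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0   # parameters start with no knowledge of the mapping
lr = 0.01         # learning rate: how far to move per feedback step

for _ in range(5000):
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Nudge each parameter against its gradient to reduce the error
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # parameters recovered from data alone: w ≈ 2.0, b ≈ 1.0
```

Deep learning works on the same principle, except the function being fitted is a large neural network with many parameters rather than a two-parameter line.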

Machine learning comprises a variety of different types of problems, such as those involving supervised learning, unsupervised learning, and reinforcement learning.

Supervised learning involves data that are labeled, with the aim of training a system to develop a mapping from the underlying data elements to their associated labels, so that predictions about new and unseen data can be made based on this mapping. Supervised learning divides generally into classification — if the data labels represent discrete categorical classes — and regression — if the data labels represent continuous numerical values. Classification involves problems such as identifying letters and digits in images of handwritten text, or distinguishing cancerous from normal cells based on their gene expression patterns. Regression involves problems such as predicting crop yield based on climate and soil conditions, or predicting stock returns based on prior performance and other economic factors.
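As a minimal sketch of supervised classification, the following uses a 1-nearest-neighbor rule on made-up 2-D points (the data and labels are illustrative, not from the text): labeled examples define a mapping from inputs to class labels, which is then used to predict the label of a new, unseen point.

```python
import math

# Labeled training data: (feature vector, class label)
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

def predict(point):
    """Return the label of the training example closest to `point`."""
    return min(train, key=lambda ex: math.dist(point, ex[0]))[1]

print(predict((1.2, 1.8)))  # near the "A" cluster -> "A"
print(predict((6.8, 6.2)))  # near the "B" cluster -> "B"
```

A regression variant would be structured the same way, but would predict a continuous value (e.g., an average of nearby training outputs) instead of a discrete class label.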

Unsupervised learning involves data that are not labeled, with the aim of discovering patterns inherent in the data themselves. Such a discovery process is a bit more open-ended in practice: it might involve methods such as clustering, in order to identify subgroups of related items within a large dataset, or dimensionality reduction, to discover a lower-dimensional subspace or representation in which some high-dimensional dataset lies. A powerful set of techniques involves the use of autoencoders, which aim to generate efficient representations (encodings) of unlabeled data.
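Clustering can be illustrated with a bare-bones k-means loop on unlabeled 1-D data. No labels are given; the algorithm discovers two subgroups on its own by alternately assigning points to the nearest center and re-centering. The data, the choice k=2, and the initial guesses are illustrative, not from the text.

```python
# Unlabeled data containing two visually obvious groups
data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centers = [0.0, 10.0]  # rough initial guesses for the two cluster centers

for _ in range(10):
    # Assignment step: attach each point to its nearest center
    clusters = [[], []]
    for x in data:
        nearest = min(range(2), key=lambda i: abs(x - centers[i]))
        clusters[nearest].append(x)
    # Update step: move each center to the mean of its cluster
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(round(c, 1) for c in centers))  # -> [1.0, 8.1]
```

The same assign/update structure generalizes to higher-dimensional data and more clusters; library implementations (e.g., in scikit-learn) add better initialization and convergence checks.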

Reinforcement learning involves a system that changes its behavior over time by getting feedback about what does or does not work well for a given task. Feedback is in the form of positive rewards or negative penalties associated with an agent's behavior in an environment. Strategies or configurations that work well to achieve the task are kept and further modified, while those that do not perform well tend to be discarded.
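A minimal sketch of this reward-driven loop is an epsilon-greedy agent facing a two-armed bandit: the agent receives reward feedback for each action, keeps running value estimates of what works well, and gradually favors the better arm. The reward probabilities, epsilon, and step count are illustrative choices, not from the text.

```python
import random

random.seed(0)
true_reward_prob = [0.2, 0.8]   # arm 1 pays off far more often (unknown to the agent)
estimates = [0.0, 0.0]          # agent's learned value estimate per arm
counts = [0, 0]
epsilon = 0.1                   # fraction of steps spent exploring at random

for _ in range(2000):
    # Explore occasionally; otherwise exploit the currently best-looking arm
    if random.random() < epsilon:
        arm = random.randrange(2)
    else:
        arm = max(range(2), key=lambda a: estimates[a])
    # Environment feedback: a reward drawn from the chosen arm
    reward = 1.0 if random.random() < true_reward_prob[arm] else 0.0
    counts[arm] += 1
    # Incremental running average of observed rewards for this arm
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(max(range(2), key=lambda a: estimates[a]))  # the agent has learned to prefer arm 1
```

Full reinforcement learning adds states and sequences of actions, but the core cycle of act, receive reward, and update estimates is the same.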

Within the confines of these broad classes of problems, there are a number of different algorithms that can be employed to carry out such learning, each with its own assumptions, biases, strengths, and weaknesses. In addition, there are a number of different packages and libraries that support different ML algorithms. In this tutorial, we will focus on the use of popular deep learning packages such as TensorFlow/Keras and PyTorch.

Other packages, such as scikit-learn, offer excellent support for many classical Machine Learning algorithms. In our companion material on Python for Data Science, we describe the use of scikit-learn to solve ML problems in classification, clustering, and dimensionality reduction.

Since deep learning represents a class of algorithmic approaches that use multilayer neural networks to solve a variety of problems in machine learning, we will, in the following pages, give an overview of some of the key elements of how neural networks are used, both in terms of their underlying mathematics and their implementation in software. Readers might also be interested in this influential review article highlighting important concepts.

©   |   Cornell University   |   Center for Advanced Computing   |   Copyright Statement   |   Inclusivity Statement