AI in Science
Although AI currently receives most of its public attention because of applications that generate text, images, and other content, the underlying suite of tools and techniques from machine learning (ML) and deep learning (DL) is also increasingly being integrated into scientific research in interesting and innovative ways. This includes the use of specific ML and DL techniques to address important problems in science and engineering, as well as broader methodological efforts that go by names such as "Scientific Machine Learning", "Scientific AI", and "AI for Science".
Many uses of machine learning in the physical sciences were summarized in a comprehensive 2019 review article. A more recent review article summarized important trends in scientific discovery in the age of artificial intelligence. These trends involve not just using ML and DL tools to analyze data and build predictive models (which we will discuss more on the following pages), but also integrating such tools into the scientific research process, broadly construed. That second review emphasized the following elements:
- AI-aided data collection and curation for scientific research
  - Data selection
  - Data annotation
  - Data generation
  - Data refinements
- Learning meaningful representations of scientific data
  - Geometric priors
  - Geometric deep learning
  - Self-supervised learning
  - Language modelling
  - Transformer architectures
  - Neural operators
- AI-based generation of scientific hypotheses
  - Black-box predictors of scientific hypotheses
  - Navigating combinatorial hypothesis spaces
  - Optimizing differentiable hypothesis spaces
- AI-driven experimentation and simulation
  - Efficient evaluation of scientific hypotheses
  - Deducing observables from hypotheses using simulations
The integration of AI into the scientific process will surely continue. In many cases, insights and techniques from one problem domain can be productively recast for use in another domain. For example, researchers have leveraged techniques from Natural Language Processing (NLP) to build protein language models and chemical language models that generate candidate sequences for proteins and candidate chemical structures for small molecules, with the aim of accelerating the development of new molecules for biological and medical therapies.
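To make the "sequence as language" idea concrete, here is a deliberately tiny sketch in Python. It is not how real protein language models work (those are typically large neural networks trained on millions of sequences). It only illustrates the shared principle: learn which symbol tends to follow which, then sample new sequences one symbol at a time. The training strings, the `train_bigram` and `sample_sequence` helpers, and the start/end markers are all made up for this illustration.

```python
import random
from collections import defaultdict

# Toy "corpus": made-up amino-acid strings for illustration only, not real proteins.
TOY_SEQUENCES = ["MKTAYIAKQR", "MKTLLVAGQR", "MKAAYLAKQK"]

# Hypothetical markers for the start and end of a sequence.
START, END = "^", "$"


def train_bigram(sequences):
    """Count how often each residue follows another (a bigram 'language model')."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        tokens = [START] + list(seq) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample_sequence(counts, rng, max_len=50):
    """Generate a candidate sequence by sampling one residue at a time."""
    out, prev = [], START
    while len(out) < max_len:
        choices = list(counts[prev].keys())
        weights = list(counts[prev].values())
        # Pick the next residue in proportion to how often it followed `prev`.
        nxt = rng.choices(choices, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


model = train_bigram(TOY_SEQUENCES)
candidate = sample_sequence(model, random.Random(0))
print(candidate)
```

The same counting-and-sampling loop would work unchanged on English text split into characters, which is the sense in which techniques can be recast from one domain to another. Whether a generated candidate is a plausible protein is a separate question that this sketch does not address.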