Although traditional physics simulators are powerful, they have two important drawbacks: (1) obtaining high-quality results from large-scale simulations is expensive and time-consuming; and (2) setting up a simulator requires full knowledge of the physical parameters of the object and the environment, which can be extremely hard to obtain in some cases.

A Graph Neural Network (GNN) is a type of neural network that takes less structured data, such as a graph, as input, whereas other architectures such as Convolutional Neural Networks (CNNs) and Transformers accept only more structured data (e.g., grids and sequences). By “less structured”, we mean that the input can have arbitrary shape and size and can encode complex topological relations.

In particle-based physics simulation, we have the unstructured position information of all the particles as the input, which inspires the idea of using a GNN.

Permutation equivariance

One key characteristic of GNNs that distinguishes them from other neural networks is permutation equivariance. That is to say, the nodes in a graph have no canonical order, so how we “order” the nodes in a graph does not affect the results produced by a GNN.

Since the particles of an object are “identical” in particle-based simulation, the physics laws applied to them do not depend on how the particles are ordered. Therefore, a permutation-equivariant model such as a GNN is well suited to simulating the interactions between particles.

Graphs

Graphs are a powerful means of representing interactions in physical systems. A granular material medium can be represented as a graph \( G=\left(V,E\right) \) consisting of a set of vertices (\(\mathbf{v}_i \in V\)) representing the soil grains and edges (\(\mathbf{e}_{i,j} \in E\)) connecting pairs of vertices (\(\mathbf{v}_i\) and \(\mathbf{v}_j\)) and representing the interactions between grains. We describe how graphs work with a simple example involving balls interacting in a box (Figure 1a).

The state of the physical system (Figures 1a and 1d) can be encoded as a graph (Figures 1b and 1c). The vertices describe the balls, and the edges describe the directional interactions between them, shown as arrows in Figures 1b and 1c. The state of ball \(i\) is represented as a vertex feature vector \(\mathbf{v}_i\). The feature vector includes properties such as velocity, mass, and distance to the boundary. The edge feature vector \(\mathbf{e}_{i,j}\) contains information about the interaction between balls \(i\) and \(j\), such as their relative distance.
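As a concrete (and purely hypothetical) illustration, the sketch below encodes a three-ball state as plain arrays: one feature vector per vertex containing velocity, mass, and distance to the boundary, and one relative-position feature vector per directed edge. The specific numbers, the unit box, and the fully connected edge set are assumptions for illustration, not values from Figure 1.

```python
import numpy as np

# Hypothetical encoding of a balls-in-a-box state as a graph.
# Vertex features: [velocity_x, velocity_y, mass, distance_to_boundary] per ball.
# Edge features:   [relative_x, relative_y] for each directed pair (sender -> receiver).
positions = np.array([[0.2, 0.3], [0.7, 0.4], [0.5, 0.8]])       # 3 balls in a unit box
velocities = np.array([[0.1, 0.0], [-0.2, 0.1], [0.0, -0.1]])
masses = np.array([[1.0], [1.0], [2.0]])
dist_to_boundary = np.minimum(positions, 1.0 - positions).min(axis=1, keepdims=True)

vertex_features = np.hstack([velocities, masses, dist_to_boundary])   # shape (3, 4)

# Fully connected directed edges (i != j), as in Figure 1b.
senders, receivers = zip(*[(j, i) for i in range(3) for j in range(3) if i != j])
senders, receivers = np.array(senders), np.array(receivers)
edge_features = positions[receivers] - positions[senders]             # shape (6, 2)
```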

Graphs offer a permutation-invariant way of encoding data: the interactions between vertices do not depend on the order of the vertices or their position in Euclidean space. Instead, graphs represent interactions through the edge connections, which are unaffected by permutations of the vertices. Therefore, graphs can efficiently represent the physical state of granular flow, where numerous orderless particles interact, by using vertices to represent the particles and edges to represent their interactions.

Graph neural networks (GNNs)

GNNs are a state-of-the-art deep learning architecture that can operate on a graph and learn local interactions. A GNN takes a graph \(G=\left(\mathbf{V},\mathbf{E}\right)\) at time \(t\) as input, computes properties and propagates information through the network (termed message passing), and outputs an updated graph \(G^\prime=\left(\mathbf{V}^\prime,\mathbf{E}^\prime\right)\) with an identical structure, where \(\mathbf{V}^\prime\) and \(\mathbf{E}^\prime\) are the sets of updated vertex and edge features (\(\mathbf{v}_i^\prime\) and \(\mathbf{e}_{i,j}^\prime\)). In the balls-in-a-box example, the GNN first takes the original graph \(G=\left(\mathbf{V},\mathbf{E}\right)\) (Figure 1b) that describes the current state of the physical system (\(\mathbf{X}^t\)). The GNN then updates the state of the physical system through message passing, which models the exchange of energy and momentum between the balls communicating through the edges, and returns an updated graph \(G^\prime=\left(\mathbf{V}^\prime,\mathbf{E}^\prime\right)\) (Figure 1c). After the GNN computation, we may decode \(G^\prime\) to extract information about the future state of the physical system (\(\mathbf{X}^{t+1}\)), such as the next position or acceleration of the balls (Figure 1d).

Figure 1. An example of a graph and a graph neural network (GNN) that processes the graph (modified from Battaglia et al. (2018)): (a) A state of the current physical system (\(\mathbf{X}^t\)) where the balls are bouncing inside a box boundary; (b) Graph representation of the physical system (\(G\)). There are three vertices representing the balls and six edges representing their directional interactions, shown as arrows; (c) The updated graph (\(G^\prime\)) that the GNN outputs through message passing; (d) The predicted future state of the physical system (\(\mathbf{X}^{t+1}\)) (i.e., the positions of the balls at the next timestep) decoded from the updated graph.
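To connect the updated graph back to the physical state, the following minimal sketch assumes that per-ball accelerations have already been decoded from the updated vertex features of \(G^\prime\), and integrates them with a simple explicit Euler update to obtain \(\mathbf{X}^{t+1}\). The numbers, the time step, and the choice of Euler integration are illustrative assumptions rather than the scheme of any particular simulator.

```python
import numpy as np

# Hypothetical decoding step: accelerations assumed to be decoded from G',
# then integrated to produce the next state X^{t+1}.
dt = 0.1
positions_t = np.array([[0.2, 0.3], [0.7, 0.4], [0.5, 0.8]])        # X^t (3 balls, 2D)
velocities_t = np.array([[0.1, 0.0], [-0.2, 0.1], [0.0, -0.1]])
accelerations = np.array([[0.0, -9.8], [0.0, -9.8], [0.0, -9.8]])    # decoded from G'

velocities_t1 = velocities_t + dt * accelerations                    # explicit Euler update
positions_t1 = positions_t + dt * velocities_t1                      # X^{t+1}
```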

Message passing

Message passing consists of three operations: message construction (Eq. 1), message aggregation (Eq. 2), and vertex update (Eq. 3).

\[ \begin{equation} \mathbf{e}_{i,j}^\prime=\phi_{\mathbf{\Theta}_\phi}\left(\mathbf{v}_i,\mathbf{v}_j,\mathbf{e}_{i,\ j}\right) \tag{1} \end{equation} \]
\[ \begin{equation} {\bar{\mathbf{v}}}_i=\Sigma_{j \in N\left(i\right)}\ \mathbf{e}_{i,j}^\prime \tag{2} \end{equation} \]
\[ \begin{equation} \mathbf{v}_i^\prime=\gamma_{\mathbf{\Theta}_\gamma}\left(\mathbf{v}_i,{\bar{\mathbf{v}}}_i\right) \tag{3} \end{equation} \]

The subscripts \(\mathbf{\Theta}_\phi\) and \(\mathbf{\Theta}_\gamma\) represent the sets of learnable parameters in each computation. The message construction function \(\phi_{\Theta_{\phi}}\) (Eq. 1) takes the feature vectors of the receiver and sender vertices (\(\mathbf{v}_i\) and \(\mathbf{v}_j\)) and the feature vector of the edge connecting them (\(\mathbf{e}_{i,j}\)), and returns an updated edge feature vector \(\mathbf{e}_{i,j}^\prime\) as the output.

\(\phi_{\Theta_{\phi}}\) is a vector or matrix operation that includes the learnable parameters \(\mathbf{\Theta}_\phi\). The updated edge feature vector \(\mathbf{e}_{i,j}^\prime\) is the message sent from vertex \(j\) to vertex \(i\). Figure 2a shows an example of constructing the messages on the edges directed to vertex 0 from vertices 1, 2, and 3 (\(\mathbf{e}_{0,1}^\prime, \mathbf{e}_{0,2}^\prime, \mathbf{e}_{0,3}^\prime\)). Here, we define the message construction function \(\phi_{\Theta_{\phi}}\) as \(\left(\left(\mathbf{v}_i+\mathbf{v}_j\right)\times\mathbf{e}_{i,j}\right)\times\mathbf{\Theta}_\phi\), where \(\times\) denotes element-wise multiplication. The updated feature vector \(\mathbf{e}_{0,1}^\prime\) is computed as \(\left(\left(\mathbf{v}_0+\mathbf{v}_1\right)\times\mathbf{e}_{0,1}\right)\times\mathbf{\Theta}_\phi\), where \(\mathbf{v}_0\) and \(\mathbf{v}_1\) are the receiver and sender vertex feature vectors, and \(\mathbf{e}_{0,1}\) is the feature vector of the edge connecting them. If we assume that all values of \(\mathbf{\Theta}_\phi\) are 1.0 for simplicity, we obtain \(\mathbf{e}_{0,1}^\prime=\left(\left(\left[1,\ 0,\ 2\right]+\left[1,\ 3,\ 2\right]\right)\times\left[2,\ 1,\ 0\right]\right)\times1=[4,\ 3,\ 0]\). Similarly, we compute the messages \(\mathbf{e}_{0,2}^\prime=\left[0,\ 3,\ 9\right]\) and \(\mathbf{e}_{0,3}^\prime=\left[3,\ 4,\ 9\right]\).

The next step in message passing is message aggregation \(\Sigma_{j \in N\left(i\right)}\) (Eq. 2), where \(N\left(i\right)\) is the set of sender vertices \(j\) connected to vertex \(i\). It collects all the messages directed to vertex \(i\) and aggregates them into a single vector, the aggregated message \({\bar{\mathbf{v}}}_i\), which has the same dimension as each individual message. The aggregation rule can be an element-wise vector summation or averaging; hence, it is a permutation-invariant computation. In Figure 2a, the aggregated message \({\bar{\mathbf{v}}}_0=\left[7,10,18\right]\) is the element-wise summation of the messages directed to vertex 0: \({\bar{\mathbf{v}}}_0=\mathbf{e}_{0,1}^\prime+\mathbf{e}_{0,2}^\prime+\mathbf{e}_{0,3}^\prime\).
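Because the aggregation is a plain element-wise summation, its permutation invariance is easy to check numerically. The short sketch below shuffles the three messages from the example and confirms that the aggregated vector is unchanged.

```python
import numpy as np

# Messages directed to vertex 0, taken from the worked example above.
messages = np.array([[4, 3, 0],   # e'_{0,1}
                     [0, 3, 9],   # e'_{0,2}
                     [3, 4, 9]])  # e'_{0,3}

v0_bar = messages.sum(axis=0)     # element-wise summation -> [7, 10, 18]

rng = np.random.default_rng(0)
shuffled = messages[rng.permutation(len(messages))]
assert np.array_equal(v0_bar, shuffled.sum(axis=0))  # order of senders does not matter
```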

The final step of message passing is updating the vertex features using Eq. 3. It takes the aggregated message (\({\bar{\mathbf{v}}}_i\)) and the current vertex feature vector \(\mathbf{v}_i\), and returns an updated vertex feature vector \(\mathbf{v}_i^\prime\), using a predefined vector operation that includes the learnable parameters \(\mathbf{\Theta}_\gamma\). Figure 2b shows an example of the update at vertex 0. Here, we define the update function \(\gamma_{\Theta_{\gamma}}\) as \(\mathbf{\Theta}_\gamma\left(\mathbf{v}_i+{\bar{\mathbf{v}}}_i\right)\). The updated feature vector \(\mathbf{v}_0^\prime\) is computed as \(\mathbf{\Theta}_\gamma\left(\mathbf{v}_0+{\bar{\mathbf{v}}}_0\right)\).

Assuming all parameters in \(\mathbf{\Theta}_\gamma\) are 1.0 for simplicity, we obtain \(\mathbf{v}_0^\prime=\left[1,\ 0,\ 2\right]+\left[7,\ 10,\ 18\right]=\left[8,\ 10,\ 20\right]\). Similarly, we update the other vertex features (\(\mathbf{v}_1^\prime, \mathbf{v}_2^\prime, \mathbf{v}_3^\prime\)).
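Putting the three operations together, the following sketch reproduces the numbers of Figures 2a and 2b: it computes \(\mathbf{e}_{0,1}^\prime\) with the example construction function, reuses the values of \(\mathbf{e}_{0,2}^\prime\) and \(\mathbf{e}_{0,3}^\prime\) quoted above (the feature vectors of vertices 2 and 3 are not repeated here), and then aggregates and updates the feature vector of vertex 0.

```python
import numpy as np

theta_phi = np.ones(3)     # learnable parameters, set to 1.0 as in the example
theta_gamma = np.ones(3)

v0 = np.array([1, 0, 2])   # receiver vertex feature (vertex 0)
v1 = np.array([1, 3, 2])   # sender vertex feature (vertex 1)
e01 = np.array([2, 1, 0])  # edge feature between vertices 0 and 1

# Eq. 1: message construction, phi = ((v_i + v_j) * e_ij) * theta_phi
e01_new = ((v0 + v1) * e01) * theta_phi            # -> [4, 3, 0]

# Messages from vertices 2 and 3 (values quoted from the worked example).
e02_new = np.array([0, 3, 9])
e03_new = np.array([3, 4, 9])

# Eq. 2: message aggregation by element-wise summation.
v0_bar = e01_new + e02_new + e03_new               # -> [7, 10, 18]

# Eq. 3: vertex update, gamma = theta_gamma * (v_i + v_bar_i)
v0_new = theta_gamma * (v0 + v0_bar)               # -> [8, 10, 20]
```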

At the end of the message passing, the graph vertex and edge features (\(\mathbf{v}_i\) and \(\mathbf{e}_{i,\ j}\)) are updated to \(\mathbf{v}_i^\prime\) and \(\mathbf{e}_{i,\ j}^\prime\). The GNN may include multiple message passing steps to propagate the information further through the network.

Figure 2. An example of message passing on a graph: (a) message construction directed to receiver vertex 0 (\(\mathbf{e}_{0,\ 1}^\prime, \mathbf{e}_{0,\ 2}^\prime, \mathbf{e}_{0,\ 3}^\prime\)) and the resulting aggregated message (\({\bar{\mathbf{v}}}_0\)); (b) feature update at vertex 0 using \({\bar{\mathbf{v}}}_0\). Note that we assume all values of \(\mathbf{\Theta}_\phi\) and \(\mathbf{\Theta}_\gamma\) are 1.0 for convenience of calculation.
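For completeness, the sketch below is one possible generic implementation of Eqs. 1–3, applied for several message-passing steps over a small fully connected graph. The element-wise functions and per-component parameter vectors mirror the toy example above; a practical GNN would typically replace them with small neural networks, and the random features here are purely illustrative.

```python
import numpy as np

def message_passing_step(vertex_feats, senders, receivers, edge_feats,
                         theta_phi, theta_gamma):
    """One message-passing step following Eqs. 1-3 (element-wise toy version)."""
    # Eq. 1: message construction for every directed edge (j -> i).
    messages = ((vertex_feats[receivers] + vertex_feats[senders]) * edge_feats) * theta_phi

    # Eq. 2: sum-aggregate the messages arriving at each receiver vertex.
    aggregated = np.zeros_like(vertex_feats)
    np.add.at(aggregated, receivers, messages)

    # Eq. 3: update every vertex feature from its current value and its aggregate.
    updated_vertices = theta_gamma * (vertex_feats + aggregated)
    return updated_vertices, messages

# Small fully connected graph with 4 vertices and 3-dimensional features.
rng = np.random.default_rng(0)
V = rng.normal(size=(4, 3))
senders, receivers = zip(*[(j, i) for i in range(4) for j in range(4) if i != j])
senders, receivers = np.array(senders), np.array(receivers)
E = rng.normal(size=(len(senders), 3))

theta_phi, theta_gamma = np.ones(3), np.ones(3)
for _ in range(3):                      # multiple message-passing steps
    V, E = message_passing_step(V, senders, receivers, E, theta_phi, theta_gamma)
```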

Unlike the example shown above, where we assume a constant value of 1.0 for the learnable parameters, in a supervised learning setting an optimization algorithm finds the best set of learnable parameters (\(\mathbf{\Theta}_\phi, \mathbf{\Theta}_\gamma\)) for the message passing operations.
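As a rough sketch of how such an optimization could look, the example below uses gradient descent (here via PyTorch autograd) to adjust \(\mathbf{\Theta}_\phi\) and \(\mathbf{\Theta}_\gamma\) so that the updated feature of vertex 0 matches a target vector; the target, learning rate, and single-edge setup are assumptions for illustration, not the training setup of an actual GNN simulator.

```python
import torch

# Illustrative sketch only: fit the message-passing parameters by gradient
# descent so the updated feature of vertex 0 matches a made-up target vector.
v0 = torch.tensor([1., 0., 2.])    # receiver vertex feature
v1 = torch.tensor([1., 3., 2.])    # sender vertex feature
e01 = torch.tensor([2., 1., 0.])   # edge feature

theta_phi = torch.nn.Parameter(torch.ones(3))    # parameters of Eq. 1
theta_gamma = torch.nn.Parameter(torch.ones(3))  # parameters of Eq. 3
optimizer = torch.optim.SGD([theta_phi, theta_gamma], lr=0.01)

target = torch.tensor([6., 9., 6.])              # hypothetical ground-truth v0'

for step in range(500):
    message = ((v0 + v1) * e01) * theta_phi      # Eq. 1 (single incoming message)
    v0_bar = message                             # Eq. 2 (only one neighbor here)
    v0_new = theta_gamma * (v0 + v0_bar)         # Eq. 3
    loss = torch.mean((v0_new - target) ** 2)    # mismatch with the target feature
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```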

 