Particle collisions as graphs: machine learning from data and topology

Jacan Chaplais

Srinandan Dasmahapatra

Stefano Moretti

Wednesday, 7th September 2022

Why graphs?

Introduction to graph structured data and analysis.

Graph definition

A graph minimally comprises two pieces of information:

  1. Set of data points, known as nodes or vertices \(V\)
  2. Set of relations between points, known as links or edges \(E\)

\[ G = \left( V, E, u \right) \text{ or } G\left( V, E, u \right) \qquad(1)\]

Edges between nodes can be described by an adjacency matrix, \[ A_{ij} = \begin{cases} 1 & v_j \in \mathcal{N}(v_i) \\ 0 & \text{else} \end{cases} \qquad(2)\] where \(\mathcal{N}(v_i)\) is the neighbourhood of node \(v_i\). The adjacency matrix is therefore a binary matrix of size \(|V| \times |V|\), and it is typically sparse, since most pairs of nodes are not connected.
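As a minimal sketch of Eq. (2), the adjacency matrix can be built from a list of edges. The function names `adjacency_matrix` and `neighbourhood`, and the 4-node example graph, are illustrative assumptions and not part of the talk.

```python
def adjacency_matrix(num_nodes, edges, directed=False):
    """Build a binary adjacency matrix as a list of rows.

    A[i][j] = 1 when node j is in the neighbourhood of node i, else 0.
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            # Undirected graphs have a symmetric adjacency matrix.
            A[j][i] = 1
    return A


def neighbourhood(A, i):
    """Return the indices j for which A[i][j] == 1."""
    return [j for j, a_ij in enumerate(A[i]) if a_ij == 1]


# Hypothetical 4-node graph: the path 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
A = adjacency_matrix(4, edges)
```

Here only 6 of the 16 entries are non-zero, which illustrates why a sparse representation is usually preferred for larger graphs.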

Images are graphs

Image convolution operation.
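One way to see an image as a graph is to treat each pixel as a node whose neighbourhood is the surrounding 3x3 patch. The sketch below, under that assumption, writes a 3x3 convolution (in the unflipped, cross-correlation form used by CNNs) as a weighted sum over each pixel's neighbours. The function name `convolve_as_graph` and the example image are hypothetical.

```python
def convolve_as_graph(image, kernel):
    """Apply a 3x3 kernel by visiting each pixel's grid neighbours.

    Each pixel is a node; its neighbourhood is the (up to) 8 surrounding
    pixels plus itself. Positions outside the image are skipped, which is
    equivalent to zero padding.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        # The weight depends only on the relative position.
                        total += kernel[dr + 1][dc + 1] * image[nr][nc]
            out[r][c] = total
    return out


# Hypothetical 3x3 image of ones with a kernel of ones: the output
# counts how many neighbours (including itself) each pixel has.
image = [[1.0] * 3 for _ in range(3)]
kernel = [[1.0] * 3 for _ in range(3)]
out = convolve_as_graph(image, kernel)
```

Unlike a general graph, the grid gives every neighbour a fixed relative position, so the kernel can assign a distinct weight to each one.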

Generalising the update

Each node has its own computation: an order-invariant aggregation function is applied over the node's neighbours, and the result is then embedded with an update function.
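The aggregate-then-update pattern can be sketched as follows. The sum aggregation and the averaging update are illustrative choices only (the talk does not fix either function), and the names `aggregate_sum`, `update`, and `message_passing_step` are hypothetical.

```python
def aggregate_sum(messages):
    """Order-invariant aggregation: element-wise sum of feature vectors."""
    return [sum(components) for components in zip(*messages)]


def update(node_feature, aggregated):
    """Hypothetical update: average the node's feature with the aggregate."""
    return [0.5 * (x + a) for x, a in zip(node_feature, aggregated)]


def message_passing_step(features, neighbours):
    """Compute new features for every node from the previous features."""
    new_features = []
    for i, feature in enumerate(features):
        messages = [features[j] for j in neighbours[i]]
        if messages:
            aggregated = aggregate_sum(messages)
        else:
            # A node without neighbours receives a zero message.
            aggregated = [0.0] * len(feature)
        new_features.append(update(feature, aggregated))
    return new_features


# Hypothetical 3-node graph with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
neighbours = {0: [1, 2], 1: [0], 2: [0]}
new_features = message_passing_step(features, neighbours)
```

Because the aggregation is a sum, listing the neighbours of node 0 as `[2, 1]` instead of `[1, 2]` gives the same result, which is what order invariance requires.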

Edges are powerful

Edges can do more than just hold parametrised weights: they can model pairwise relations. These can be embedded, just like the nodes.

\[ \mathbf{e}^{(l)}_{s r} = \operatorname{ACT}(\mathbf{W}_e^{(l)} \cdot [ \mathbf{v}_r^{(l - 1)} || \mathbf{v}_s^{(l - 1)} || \mathbf{e}_{s r}^{(l - 1)} ]) \qquad(3)\]

The learned edge features can then be aggregated to form messages, to update the nodes.

\[ \mathbf{v}_r^{(l)} = \operatorname{ACT}(\mathbf{W}_v^{(l)} \cdot [ \mathbf{v}_r^{(l - 1)} || \operatorname{AGG}( \{\mathbf{e}_{s r}^{(l)}, \forall v_s \in \mathcal{N}(v_r)\}) ]) \qquad(4)\]
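A minimal sketch of one layer of Eqs. (3) and (4) is given below, assuming tanh for ACT and an element-wise sum for AGG. Both are illustrative choices, as are the fixed weight matrices and the 3-node example graph. The helper names `matvec`, `act`, `agg`, and `layer` are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def act(x):
    """ACT: element-wise tanh, chosen here purely for illustration."""
    return [math.tanh(xi) for xi in x]


def agg(vectors, dim):
    """AGG: order-invariant element-wise sum (zero vector if empty)."""
    total = [0.0] * dim
    for vec in vectors:
        total = [t + v for t, v in zip(total, vec)]
    return total


def layer(nodes, edges, W_e, W_v):
    """Apply Eq. (3) to every edge, then Eq. (4) to every node.

    nodes: list of node feature vectors
    edges: dict mapping (s, r) -> edge features, sender s and receiver r
    """
    # Eq. (3): embed each edge from the concatenation [v_r || v_s || e_sr].
    new_edges = {
        (s, r): act(matvec(W_e, nodes[r] + nodes[s] + e_sr))
        for (s, r), e_sr in edges.items()
    }
    edge_dim = len(W_e)
    # Eq. (4): embed each node from [v_r || AGG of its incoming edges].
    new_nodes = []
    for r, v_r in enumerate(nodes):
        incoming = [e for (_, rr), e in new_edges.items() if rr == r]
        new_nodes.append(act(matvec(W_v, v_r + agg(incoming, edge_dim))))
    return new_nodes, new_edges


# Hypothetical inputs: 2-dimensional nodes, 1-dimensional edges.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = {(0, 1): [0.5], (2, 1): [0.5], (1, 0): [0.5]}
W_e = [[0.1, 0.1, 0.1, 0.1, 0.1]]            # shape 1 x (2 + 2 + 1)
W_v = [[0.1, 0.0, 1.0], [0.0, 0.1, -1.0]]    # shape 2 x (2 + 1)
new_nodes, new_edges = layer(nodes, edges, W_e, W_v)
```

Note that the node update uses the edge embeddings from the current layer, matching the superscript \((l)\) on \(\mathbf{e}_{s r}\) in Eq. (4). In a trained network the weight matrices would be learned rather than fixed.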

Collider data as a graph

Using graphs to formulate the physics problem.

Interactions during decay

A top and anti-top pair is produced by a proton collision. The pair decays into a bottom and an anti-bottom quark and two W bosons, and the decay products continue to evolve through showering and hadronisation until they are detected.

Data incident on the detector

Flow tracing

Flow tracing diffuses quantities among descendants proportionally.
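One possible reading of proportional diffusion is sketched below: a quantity carried by an ancestor is split among its children in proportion to a weight on each edge, and the shares are passed on recursively. The talk does not specify the weights or the exact scheme, so the edge weights, the function name `trace_flow`, and the example graph are all assumptions.

```python
def trace_flow(children, source, amount=1.0):
    """Propagate `amount` from `source` down a directed acyclic graph.

    children: dict mapping node -> {child: weight}. Each node's inflow is
    split among its children in proportion to these weights.
    Returns the total amount received by each node reachable from `source`.
    """
    received = {}

    def push(node, amt):
        received[node] = received.get(node, 0.0) + amt
        outgoing = children.get(node, {})
        total_weight = sum(outgoing.values())
        for child, weight in outgoing.items():
            # Each child gets a share proportional to its edge weight.
            push(child, amt * weight / total_weight)

    push(source, amount)
    return received


# Hypothetical graph: "a" splits 1:3 between "b" and "c", and "c"
# splits evenly between "d" and "e". Node "d" has two parents.
children = {
    "a": {"b": 1.0, "c": 3.0},
    "b": {"d": 1.0},
    "c": {"d": 1.0, "e": 1.0},
}
flow = trace_flow(children, "a")
```

In this example the final-state nodes "d" and "e" together receive the full amount that left "a", so the traced quantity is conserved.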

Learning flow tracing with GNNs

Message passing

Each node's feature vector is made up of the particle's 4-momentum and charge,

\[ \mathbf{v}^i = \begin{bmatrix}E^i & p_x^i & p_y^i & p_z^i & Q^i\end{bmatrix}^T \qquad(5)\]

and edges are formed between particles separated by a distance \(\Delta R_{i j}\) of less than the radius \(R_0\),

\[ A_{ij} = \begin{cases} 1 & \Delta R_{i j} < R_0 \\ 0 & \text{else} \end{cases} \qquad(6)\]
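Eq. (6) can be sketched from the feature vectors of Eq. (5), assuming the common collider definition \(\Delta R = \sqrt{\Delta\eta^2 + \Delta\phi^2}\) with pseudorapidity \(\eta\) and azimuth \(\phi\). The talk does not state which definition is used (rapidity is sometimes used in place of \(\eta\)), so this choice, the function names, and the example momenta are assumptions.

```python
import math


def eta_phi(v):
    """Pseudorapidity and azimuth from v = [E, px, py, pz, Q].

    Assumes non-zero transverse momentum.
    """
    _, px, py, pz, _ = v
    pt = math.hypot(px, py)
    return math.asinh(pz / pt), math.atan2(py, px)


def delta_r(v_i, v_j):
    """Distance between two particles in the (eta, phi) plane."""
    eta_i, phi_i = eta_phi(v_i)
    eta_j, phi_j = eta_phi(v_j)
    # Wrap the azimuthal difference into [-pi, pi).
    d_phi = (phi_i - phi_j + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta_i - eta_j, d_phi)


def radius_adjacency(particles, r0):
    """Binary adjacency matrix with A[i][j] = 1 when Delta R_ij < r0."""
    n = len(particles)
    return [
        [1 if delta_r(particles[i], particles[j]) < r0 else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical particles [E, px, py, pz, Q]; energies are approximate
# and are not used in the distance.
particles = [
    [1.0, 1.0, 0.0, 0.0, 1.0],
    [1.005, 1.0, 0.1, 0.0, -1.0],
    [1.0, -1.0, 0.0, 0.0, 0.0],
]
A = radius_adjacency(particles, r0=0.4)
```

As written, Eq. (6) gives every node a self-loop, since \(\Delta R_{ii} = 0 < R_0\). The first two particles are nearly collinear and are connected, while the third points in the opposite direction and is isolated.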

Message passing is used to create node embeddings based on community structure, and regression is then used to predict the flow tracing of particles from their ancestors.

Results

References