Artificial Intelligence

Aspen Open Jets

Merging data and simulation for machine learning

Project: Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics


In 2024, members of the CMS collaboration released a new dataset called Aspen Open Jets, combining simulated jets with real data from proton-proton collisions at the Large Hadron Collider. This dataset consists of approximately 180 million jets and is designed to be used for training high-energy physics foundation models — powerful computer programs that can learn from large amounts of data and then be adapted to new tasks.

Researchers are using this data to develop models that predict the behavior of jets, the collimated sprays of particles that are key to understanding particle collisions. The Aspen Open Jets dataset allows the training of these foundation models, and early results show significant improvements in jet prediction, opening up new possibilities for exploring the fundamental physics of the universe.

Image credit: CERN

Making complex physics simulation more efficient

Project: Efficient many-jet event generation with Flow Matching

High-energy physics requires large amounts of highly accurate simulated data, which is computationally expensive to produce. Using Flow Matching combined with Continuous Normalizing Flows, an AI method that can run data through the model in both directions, simulations can run up to 150 times more efficiently without losing accuracy.
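As a rough illustration of the idea, the sketch below shows one common form of flow matching: a model learns the velocity that moves random noise toward data along a straight line, and new samples are produced by integrating that velocity. The function names, the straight-line path and the simple Euler integrator are assumptions made for this example and are not taken from the project itself.

```python
import random


def interpolate(x0, x1, t):
    """Point at time t in [0, 1] on the straight line from noise x0 to data x1."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def target_velocity(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]


def flow_matching_loss(velocity_model, data_batch, rng):
    """Monte Carlo estimate of the mean squared error between the model's
    velocity and the straight-line target velocity."""
    total = 0.0
    for x1 in data_batch:
        x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # random starting noise
        t = rng.random()                        # random time in [0, 1)
        xt = interpolate(x0, x1, t)
        pred = velocity_model(xt, t)
        target = target_velocity(x0, x1)
        total += sum((p - q) ** 2 for p, q in zip(pred, target)) / len(x1)
    return total / len(data_batch)


def integrate(velocity_model, x_start, steps=100, reverse=False):
    """Euler integration of dx/dt = v(x, t), forward (noise to data)
    or in reverse (data to noise)."""
    x = list(x_start)
    dt = 1.0 / steps
    sign = -1.0 if reverse else 1.0
    for i in range(steps):
        t = 1.0 - i * dt if reverse else i * dt
        v = velocity_model(x, t)
        x = [xi + sign * dt * vi for xi, vi in zip(x, v)]
    return x
```

In practice the velocity model would be a neural network trained to minimize this loss. The `reverse` flag illustrates what running data both ways through the model means: the same learned velocity can carry noise to data or data back to noise.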

AI models speed up particle simulations at the HL-LHC

Project: CaloDiffusion: Denoising diffusion models with geometry adaptation for high fidelity calorimeter simulation

Researchers are using innovative computing programs, called generative machine learning models, to simulate data from the upcoming High-Luminosity Large Hadron Collider. These models are faster and more efficient than traditional physics-based simulations, which is important given that the new collider will generate huge amounts of data from complex detectors.

One of these programs, called CaloDiffusion, uses 3D computer models to simulate how particles interact with detectors. It belongs to a new class of machine learning models known as denoising diffusion models, which have recently become the leading approach in image generation tasks. CaloDiffusion adapts to the detector’s geometry, including its irregular structures, and produces results that closely match traditional simulations, offering a powerful and scalable solution for future collider experiments.
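The core idea behind denoising diffusion models can be sketched in a few lines: clean data is gradually mixed with random noise, and a model is trained to predict that noise so it can be removed again. The sketch below is a generic illustration only. The simplified cosine-shaped noise schedule and all function names are assumptions for this example, not details of CaloDiffusion.

```python
import math
import random


def signal_fraction(t):
    """Fraction of the signal variance that remains at time t in [0, 1).
    A simplified cosine-shaped schedule, chosen here for illustration."""
    return math.cos(0.5 * math.pi * t) ** 2


def add_noise(x0, t, rng):
    """Mix clean values x0 with Gaussian noise; returns the noisy values
    and the noise that was used."""
    a = signal_fraction(t)
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * n for v, n in zip(x0, noise)]
    return xt, noise


def estimate_clean(xt, predicted_noise, t):
    """Invert the mixing step, given a prediction of the noise (valid for t < 1)."""
    a = signal_fraction(t)
    return [(v - math.sqrt(1.0 - a) * n) / math.sqrt(a)
            for v, n in zip(xt, predicted_noise)]
```

A trained network would supply `predicted_noise`. If its prediction were perfect, `estimate_clean` would recover the original values exactly, which is why prediction quality translates directly into the fidelity of the simulated detector response.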

Comparison between traditional simulation (solid) and CaloDiffusion

AI-enabled detectors for faster particle tracking in physics experiments

Project: Towards on-sensor inference of charged particle track parameters and uncertainties

Tracking particles in high-energy physics experiments presents a unique challenge. Next-generation detectors will enable more precise measurement of the angles at which particles pass through them. While this technology enhances offline tracking, its full potential remains limited by constraints at the lowest-level hardware trigger. To address this, Fermilab researchers are integrating mixture density networks directly into the detector hardware. These networks can estimate particle angles and positions, along with the associated uncertainties, which greatly improves the speed of the tracking process.
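A mixture density network outputs the parameters of a probability distribution rather than a single number, which is how it can report an uncertainty alongside each estimate. The sketch below assumes a one-dimensional Gaussian mixture and hypothetical function names. It shows the training loss and how a mean and standard deviation can be read off the network outputs, and it is not the project's actual implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into mixture weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def mixture_nll(y, logits, means, log_sigmas):
    """Negative log-likelihood of the true value y under the predicted
    Gaussian mixture; this is the quantity minimized during training."""
    density = 0.0
    for w, mu, ls in zip(softmax(logits), means, log_sigmas):
        sigma = math.exp(ls)  # predicting log(sigma) keeps sigma positive
        z = (y - mu) / sigma
        density += w * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return -math.log(density)


def mixture_mean_and_std(logits, means, log_sigmas):
    """Overall estimate and its uncertainty from the mixture parameters."""
    weights = softmax(logits)
    mean = sum(w * mu for w, mu in zip(weights, means))
    second = sum(w * (math.exp(ls) ** 2 + mu ** 2)
                 for w, mu, ls in zip(weights, means, log_sigmas))
    return mean, math.sqrt(max(second - mean ** 2, 0.0))
```

For example, `mixture_mean_and_std([0.0, 0.0], [-1.0, 1.0], [0.0, 0.0])` describes two equally likely hypotheses centered at -1 and 1, and returns a mean of 0 with a correspondingly larger standard deviation than either component alone.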

Inference-as-a-service for offloading AI computing

Project: SuperSONIC

Modern scientific experiments generate vast amounts of data, placing growing demands on computing resources that traditional processors can no longer meet.

The SuperSONIC project addresses this challenge by creating a shared server infrastructure that seamlessly integrates advanced compute accelerators designed for computationally intensive tasks, especially those involving machine learning. These include graphics processing units (GPUs), neural processing units (NPUs), field-programmable gate arrays (FPGAs) and other cutting-edge technology. Designed for reuse across multiple scientific experiments, the system offers a scalable solution to resource allocation challenges and helps optimize data processing workflows.

Chart showing SuperSONIC performance

Finding anomalous physics events at the LHC

Project: Real-time Anomaly Detection at the L1 Trigger of CMS Experiment


Scientists at CERN’s CMS experiment are using artificial intelligence to identify unusual particle collisions in real time, sifting through data from 40 million collisions per second. This allows them to detect potential signs of new physics while managing a data stream in which only approximately 1,000 events per second can be saved for further study.
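The scale of the selection problem follows directly from the two rates quoted above. The short calculation below uses only those figures; the variable names are illustrative.

```python
COLLISION_RATE_HZ = 40_000_000  # collisions per second, as quoted above
SAVED_RATE_HZ = 1_000           # approximate events per second that can be saved

# How many collisions occur for every event that is kept.
reduction_factor = COLLISION_RATE_HZ // SAVED_RATE_HZ

# Fraction of collisions that can be saved.
fraction_kept = SAVED_RATE_HZ / COLLISION_RATE_HZ

# Average time between consecutive collisions, in nanoseconds.
spacing_ns = 1e9 / COLLISION_RATE_HZ

print(f"1 event kept per {reduction_factor:,} collisions")
print(f"fraction kept: {fraction_kept:.4%}")
print(f"time between collisions: {spacing_ns:.0f} ns")
```

Roughly one collision in 40,000 can be kept, and a new collision arrives every 25 nanoseconds, which is why the selection has to happen in fast, dedicated hardware.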

AXOL1TL, short for Anomaly eXtraction Online Level-1 Trigger aLgorithm, uses a type of neural network called an autoencoder to identify atypical events in the particle collider. Running on an FPGA, it makes real-time decisions within nanoseconds of a collision in the detector, capturing events that would otherwise be missed and ultimately increasing the chance of discovery.
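An autoencoder flags anomalies by compressing each event and then trying to rebuild it: typical events are rebuilt well, while unusual ones are not. The toy sketch below stands in for a trained network with a fixed one-number compression of two-dimensional points. The `DIRECTION` constant, the function names and the threshold rule are all hypothetical and chosen only to illustrate the principle, not to describe AXOL1TL itself.

```python
import math

# Hypothetical "learned" direction along which typical events lie.
DIRECTION = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))


def encode(x):
    """Compress a 2D point to a single number (the bottleneck)."""
    return x[0] * DIRECTION[0] + x[1] * DIRECTION[1]


def decode(z):
    """Rebuild a 2D point from the compressed value."""
    return (z * DIRECTION[0], z * DIRECTION[1])


def anomaly_score(x):
    """Squared reconstruction error; large values indicate atypical events."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def threshold_for_rate(reference_scores, keep_fraction):
    """Score at or above which roughly keep_fraction of reference events fall,
    so the trigger stays within its output budget."""
    ranked = sorted(reference_scores, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[n_keep - 1]
```

Here a point such as `(1, 1)` lies along the typical direction and scores close to zero, while `(1, -1)` cannot be represented by the bottleneck and scores 2. An event would be kept when its score is at or above the chosen threshold.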

Event display