11 Jul/24
13:30 - 15:30 (Europe/Zurich)

Reinforcement learning and its applications at CERN


31/3-004 at CERN


Reinforcement Learning (RL) has emerged as a powerful paradigm in artificial intelligence and has found exciting applications in various fields, including particle accelerators at CERN. This introductory lecture aims to provide an overview of RL and its application in optimizing beam steering in the AWAKE beamline and automating bunch splitting in the Proton Synchrotron (PS) at CERN.

The lecture will begin with an introduction to the fundamentals of RL, where we will explore the concept of agents learning to make decisions through interaction with an environment. Central to RL is the concept of Markov Decision Processes (MDPs), which model sequential decision-making problems. We will discuss the components of an MDP, including states, actions, rewards, and transition probabilities.
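The MDP ingredients listed above can be sketched in a few lines of Python. All states, actions, probabilities, and rewards below are illustrative toy values, not taken from the lecture:

```python
import random

# A toy two-state MDP with explicit transition probabilities and rewards.
STATES = ["s0", "s1"]
ACTIONS = ["stay", "move"]

# P[(s, a)] -> list of (next_state, probability) pairs
P = {
    ("s0", "stay"): [("s0", 0.9), ("s1", 0.1)],
    ("s0", "move"): [("s1", 0.8), ("s0", 0.2)],
    ("s1", "stay"): [("s1", 1.0)],
    ("s1", "move"): [("s0", 1.0)],
}
# R[(s, a, s')] -> reward; unlisted transitions pay 0
R = {("s0", "move", "s1"): 1.0}

def step(state, action, rng):
    """Sample one transition (s, a) -> (s', r) from the MDP."""
    next_states, probs = zip(*P[(state, action)])
    s2 = rng.choices(next_states, weights=probs)[0]
    return s2, R.get((state, action, s2), 0.0)
```

An agent interacting with this environment would call `step` repeatedly, observing only the sampled next state and reward rather than the underlying tables.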

Next, we will delve into sample-based methods, such as Monte Carlo and Temporal Difference (TD) learning, which are essential for learning in uncertain and dynamic environments. These techniques allow RL agents to estimate value functions and improve their policies through experience.
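The key contrast is that Monte Carlo waits for the full return at the end of an episode, while TD(0) bootstraps from the current value estimate after every step. As a minimal sketch of the tabular TD(0) update (the episode data below are illustrative, not from the lecture):

```python
def td0(episodes, alpha=0.1, gamma=0.9):
    """Tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s)).

    Each episode is a list of (state, reward, next_state) transitions,
    with next_state = None at the terminal step.
    """
    V = {}
    for episode in episodes:
        for s, r, s_next in episode:
            v = V.get(s, 0.0)
            v_next = 0.0 if s_next is None else V.get(s_next, 0.0)
            V[s] = v + alpha * (r + gamma * v_next - v)
    return V

# Toy data: from "a" we reach "b" with no reward, then terminate with reward 1.
episodes = [[("a", 0.0, "b"), ("b", 1.0, None)]] * 300
V = td0(episodes)
```

After enough passes, V("b") approaches 1 and V("a") approaches the discounted value gamma * V("b") = 0.9, illustrating how value propagates backward through bootstrapping.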

Model-based RL will be introduced as an approach to learn a model of the environment to aid in decision-making. Dyna, a well-known model-based RL algorithm, will be presented as an example of how this integration can be achieved.
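Dyna's integration of learning and planning can be sketched as follows: each real transition both updates Q directly and is stored in a learned model, which is then replayed for extra simulated updates. The corridor environment and hyperparameters below are illustrative toys, not the lecture's example:

```python
import random

# Dyna-Q sketch on a deterministic corridor 0..4; reaching state 4 pays 1.
ACTIONS = (-1, +1)  # move left / move right

def env_step(s, a):
    s2 = min(max(s + a, 0), 4)
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4  # (next state, reward, done)

def greedy(Q, s, rng):
    best = max(Q[(s, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if Q[(s, a)] == best])  # random tie-break

def dyna_q(episodes=30, planning_steps=10, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(5) for a in ACTIONS}
    model = {}  # learned deterministic model: (s, a) -> (reward, next state)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(Q, s, rng)
            s2, r, done = env_step(s, a)
            # Direct RL: one-step Q-learning update from real experience.
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
            model[(s, a)] = (r, s2)
            # Planning: replay simulated transitions drawn from the learned model.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
            s = s2
    return Q

Q = dyna_q()
```

The planning loop is what distinguishes Dyna from plain Q-learning: the same update rule is applied to model-generated transitions, so the reward at state 4 propagates back to earlier states after far fewer real interactions.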

Function approximation techniques applied to both parametric value functions and policies will be covered. These methods enable RL agents to handle high-dimensional state and action spaces, making them valuable tools in complex applications.
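For the parametric value-function case, a simple instance is semi-gradient TD(0) with a linear approximator V(s) = w · x(s) over hand-picked features. The feature map and training data below are illustrative assumptions, not from the lecture:

```python
def features(s):
    """Toy polynomial feature vector x(s) for an integer state."""
    return [1.0, float(s), float(s) * s]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def semi_gradient_td0(transitions, alpha=0.01, gamma=0.9, sweeps=500):
    """Semi-gradient TD(0): w <- w + alpha * (r + gamma*V(s') - V(s)) * x(s)."""
    w = [0.0] * len(features(0))
    for _ in range(sweeps):
        for s, r, s_next in transitions:
            target = r + (gamma * dot(w, features(s_next)) if s_next is not None else 0.0)
            delta = target - dot(w, features(s))
            w = [wi + alpha * delta * xi for wi, xi in zip(w, features(s))]
    return w

# Toy chain: state 0 -> state 1 (no reward), state 1 -> terminal (reward 1).
transitions = [(0, 0.0, 1), (1, 1.0, None)]
w = semi_gradient_td0(transitions)
```

Because the gradient is taken only through V(s) and not through the bootstrapped target, this is a "semi-gradient" method; with a linear approximator and on-policy data it still converges to values near V(1) = 1 and V(0) = 0.9 here.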

The theoretical part of the lecture will conclude with actor-critic methods, which combine the advantages of policy-gradient and value-based methods. We will briefly discuss how these algorithms facilitate efficient learning and convergence in RL tasks.
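The division of labour between the two components can be sketched in its simplest setting, a two-armed bandit: the actor is a softmax policy updated along the policy gradient, and the critic is a value baseline whose TD error drives both updates. All rewards and hyperparameters below are illustrative, not from the lecture:

```python
import math
import random

def softmax(prefs):
    m = max(prefs)
    e = [math.exp(p - m) for p in prefs]
    z = sum(e)
    return [x / z for x in e]

def actor_critic_bandit(rewards=(0.2, 0.8), steps=2000,
                        alpha_actor=0.1, alpha_critic=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0, 0.0]  # actor parameters: one preference per action
    v = 0.0             # critic: baseline estimate of the expected reward
    for _ in range(steps):
        pi = softmax(prefs)
        a = 0 if rng.random() < pi[0] else 1
        r = rewards[a] + rng.gauss(0.0, 0.1)  # noisy reward signal
        delta = r - v                          # TD error (no next state in a bandit)
        v += alpha_critic * delta              # critic update
        # Actor update along grad log pi(a): 1[i == a] - pi[i]
        for i in range(2):
            prefs[i] += alpha_actor * delta * ((1.0 if i == a else 0.0) - pi[i])
    return softmax(prefs)

pi = actor_critic_bandit()
```

The critic's baseline reduces the variance of the policy-gradient estimate: actions are reinforced only when they do better than expected (delta > 0), so the policy concentrates on the higher-paying arm.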

Finally, we will transition to the exciting RL applications in particle accelerators at CERN. Specifically, we will showcase RL's potential in optimizing beam steering in the AWAKE beamline and automating bunch splitting in the Proton Synchrotron (PS). By applying the theory developed earlier, we will demonstrate how RL can lead to significant performance improvements and operational efficiencies in these critical accelerator systems.

Short bio:

Matteo Bunino earned an MSc in Data Science and Computer Engineering from both the Polytechnic University of Turin (Italy) and EURECOM (France) in the context of a double-degree program. He wrote his thesis at Huawei's Munich Research Center (MRC), where he developed a prototype for analyzing dynamically evasive malware with the aid of reinforcement learning. Afterward, he continued working in Huawei's AI4Sec research team, at the intersection of cybersecurity and machine learning, developing intelligent methods for analyzing malware.

Currently, Matteo is a fellow in the IT department at CERN, where he is working on interTwin, a European project aimed at developing a unified digital twin engine (DTE) for science. In particular, he is focusing on the development of "itwinai", a framework for advanced MLOps on cloud and HPC.