Processing: What to record?

Kilometres of cabling connect the LHC's four main detectors with servers in the CERN Data Centre (Image: CERN)

The volume of data produced at the Large Hadron Collider (LHC) presents a considerable processing challenge.

Particles collide at high energies inside CERN's detectors, creating new particles that decay in complex ways as they move through layers of subdetectors. The subdetectors register each particle's passage, and microprocessors convert the particles' paths and energies into electrical signals, combining the information to create a digital summary of the "collision event". The raw data per event is around one million bytes (1 MB), produced at a rate of about 600 million events per second.
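To give a sense of scale, the two figures quoted above can be multiplied to estimate the raw data rate before any filtering. The short Python sketch below does only that arithmetic; the variable and function names are illustrative, and both inputs are the approximate values from the text, so the result is an order-of-magnitude estimate rather than a measured figure.

```python
# Approximate figures quoted in the text.
BYTES_PER_EVENT = 1_000_000        # about 1 MB of raw data per collision event
EVENTS_PER_SECOND = 600_000_000    # about 600 million events per second


def raw_data_rate(bytes_per_event: int, events_per_second: int) -> int:
    """Return the unfiltered data rate in bytes per second."""
    return bytes_per_event * events_per_second


rate_bytes = raw_data_rate(BYTES_PER_EVENT, EVENTS_PER_SECOND)
rate_terabytes = rate_bytes / 1e12  # decimal terabytes (1 TB = 10**12 bytes)

print(f"Unfiltered rate: about {rate_terabytes:.0f} TB per second")
```

Under these assumptions the unfiltered rate comes to roughly 600 terabytes per second.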

The Worldwide LHC Computing Grid tackles this mountain of data in a two-stage process. Dedicated algorithms filter out events that CERN physicists are either already familiar with or consider uninteresting, so that the physicists can focus their analysis on the most important data – that which could bring new physics discoveries.

In the first stage of the selection, the event rate is cut from the 600 million or so per second picked up by the detectors to 100,000 per second sent for digital reconstruction. In the second stage, more specialized algorithms further process the data, leaving only 100 or 200 events of interest per second. This raw data is recorded onto servers at the CERN Data Centre at a rate of around 1.5 CDs per second (approximately 1050 megabytes per second). Computer scientists at CERN work continuously to improve detector-calibration methods and to refine processing algorithms to detect ever more interesting events.
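The reduction achieved at each stage can be worked out directly from the rates quoted above. The Python sketch below is only that arithmetic, with illustrative variable names. The CD capacity of 700 megabytes is an assumption, chosen because it is the common nominal size and it reproduces the 1050 megabytes per second given in the text.

```python
# Event rates quoted in the text (events per second).
DETECTED = 600_000_000          # picked up by the detectors
AFTER_STAGE_ONE = 100_000       # sent for digital reconstruction
AFTER_STAGE_TWO = (100, 200)    # range of events of interest that are kept

# How many events are discarded for each one kept, per stage and overall.
stage_one_factor = DETECTED // AFTER_STAGE_ONE
stage_two_factors = tuple(AFTER_STAGE_ONE // kept for kept in AFTER_STAGE_TWO)
overall_factors = tuple(DETECTED // kept for kept in AFTER_STAGE_TWO)

# Assumption: one CD holds 700 MB (not stated in the text).
CD_MEGABYTES = 700
recording_rate_mb = 1.5 * CD_MEGABYTES

print(f"Stage one keeps 1 event in {stage_one_factor:,}")
print(f"Stage two keeps 1 event in {stage_two_factors[1]:,} to {stage_two_factors[0]:,}")
print(f"Overall: 1 event in {overall_factors[1]:,} to {overall_factors[0]:,}")
print(f"Recording rate: about {recording_rate_mb:.0f} MB per second")
```

With these figures, the first stage keeps about one event in 6,000, and the two stages together keep roughly one event in every three to six million.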