Experiments at CERN generate colossal amounts of data. The Data Centre stores it and sends it around the world for analysis.

Approximately 600 million times per second, particles collide within the Large Hadron Collider (LHC). Each collision generates particles that often decay in complex ways into even more particles. Electronic circuits record the passage of each particle through a detector as a series of electronic signals, and send the data to the CERN Data Centre (DC) for digital reconstruction. The digitized summary is recorded as a "collision event". Physicists must sift through the 30 petabytes or so of data produced annually to determine if the collisions have thrown up any interesting physics.

CERN does not have the computing or financial resources to crunch all of the data on site, so in 2002 it turned to grid computing to share the burden with computer centres around the world. The Worldwide LHC Computing Grid (WLCG) – a distributed computing infrastructure arranged in tiers – gives a community of over 8000 physicists near real-time access to LHC data. The Grid builds on the technology of the World Wide Web, which was invented at CERN in 1989.

The server farm in the 1450 m² main room of the DC (pictured) forms Tier 0, the first point of contact between experimental data from the LHC and the Grid. As well as servers and data storage systems for Tier 0 and further physics analysis, the DC houses systems critical to the daily functioning of the laboratory. The servers undergo continual maintenance and upgrades to make sure that they will operate in the event of a serious incident such as an extended power cut. Critical servers are held in their own room, powered and cooled by dedicated equipment.


By early 2013 CERN had increased the power capacity of the centre from 2.9 MW to 3.5 MW, allowing the installation of more computers. In parallel, energy-efficiency improvements implemented in 2011 have led to an estimated energy saving of 4.5 GWh per year.

In a complementary effort to cope with the increasing requirements for LHC computing, the Wigner Research Centre for Physics in Budapest, Hungary, operates as an extension to the DC. The Wigner Data Centre acts as a remote Tier 0, hosting CERN equipment to extend the Grid's capabilities. The site also ensures full business continuity for the critical systems in case of a major problem on CERN's site at Meyrin in Switzerland. The Meyrin site currently provides some 45 petabytes of data storage on disk, and includes the majority of the 100,000 processing cores in the CERN DC. The Wigner DC will extend this capacity with 20,000 cores and 5.5 petabytes of disk storage, an addition that is expected to double after three years.

The Data Centre processes about one petabyte of data every day – the equivalent of around 210,000 DVDs. The centre hosts 11,000 servers with 100,000 processor cores. Some 6000 changes in the database are performed every second.
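The DVD comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a decimal petabyte (10^15 bytes) and a single-layer 4.7 GB DVD, neither of which is stated in the article, and the function names are made up for this example.

```python
# Rough check of the "one petabyte is about 210,000 DVDs" comparison.
# Assumptions (not stated in the article): decimal units and a
# single-layer DVD holding 4.7 GB.
PETABYTE = 10**15              # bytes, decimal (SI) petabyte
DVD_CAPACITY = 4_700_000_000   # bytes, single-layer DVD
SECONDS_PER_DAY = 24 * 60 * 60


def dvd_equivalent(num_bytes: int, dvd_capacity: int = DVD_CAPACITY) -> float:
    """Return how many DVDs of the given capacity would hold num_bytes."""
    return num_bytes / dvd_capacity


def per_day(rate_per_second: int) -> int:
    """Scale a steady per-second rate up to a full day."""
    return rate_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"{dvd_equivalent(PETABYTE):,.0f} DVDs per petabyte")
    print(f"{per_day(6000):,} database changes per day")
```

Under these assumptions one petabyte comes to roughly 213,000 DVDs, in line with the rounded figure quoted above, and 6000 database changes per second adds up to about half a billion changes per day.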

The Grid runs more than two million jobs per day. At peak rates, 10 gigabytes of data may be transferred from its servers every second.
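To put these rates in perspective, the short sketch below converts them to other time scales. It assumes decimal units (1 GB = 10^9 bytes) and, purely for illustration, treats the 10 GB/s peak as if it were sustained for a whole day, which the article does not claim. The helper names are hypothetical.

```python
# Back-of-the-envelope conversions for the Grid figures quoted above.
# Assumptions: decimal units; the peak transfer rate is held constant
# for a full day only as an upper-bound illustration.
SECONDS_PER_DAY = 24 * 60 * 60
GIGABYTE = 10**9    # bytes
PETABYTE = 10**15   # bytes


def average_rate_per_second(count_per_day: float) -> float:
    """Average per-second rate for a quantity counted over one day."""
    return count_per_day / SECONDS_PER_DAY


def bytes_moved(rate_bytes_per_second: int, seconds: int) -> int:
    """Total bytes transferred at a constant rate over a time span."""
    return rate_bytes_per_second * seconds


if __name__ == "__main__":
    jobs = average_rate_per_second(2_000_000)
    daily = bytes_moved(10 * GIGABYTE, SECONDS_PER_DAY)
    print(f"about {jobs:.1f} jobs started per second on average")
    print(f"up to {daily / PETABYTE:.3f} PB per day at the peak rate")
```

Two million jobs per day works out to roughly 23 jobs per second on average, and a full day at the peak transfer rate would move a little under one petabyte.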



Processing: What to record?

Collisions in the LHC produce too much data to record. To tackle the torrent, the Grid homes in on interesting physics in a two-step selection process

The Worldwide LHC Computing Grid

A global collaboration of computer centres distributes and stores LHC data, giving near real-time access to physicists around the world

The Grid: A system of tiers

The Grid is composed of four levels, or "tiers". Between them they process, store and analyse all the data from the Large Hadron Collider

The Grid: Software, middleware, hardware

Find out how physics software, middleware, hardware and networking all contribute to the Worldwide LHC Computing Grid

Net neutrality and CERN

CERN maintains that access to scientific data for its entire scientific community should be determined solely by the scientific process
