Preparing for tomorrow's big data

Last week, the inaugural ISC Big Data conference was held in Heidelberg, Germany. The event was chaired by Sverre Jarpchief technology officer of CERN openlab, and CERN was the focus of two case studies presented during the two-day conference. Frank Würthwein, from the University of California at San Diego, US, discussed how CERN handles big data today and looked forward to how the organization will have to adapt these processes to cope with increased peak data rates from the experiments on the Large Hadron Collider (LHC) after upgrade works are completed as part of the first long shutdown (LS1).

Until recently, the large CERN experiments, ATLAS and CMS, owned and controlled the computing infrastructure they operated on in the US, and accessed data only when it was locally available on the hardware they operated. However, Würthwein explains, with data-taking rates set to increase dramatically by the end of the first long shutdown in 2015, the current operational model is no longer viable to satisfy peak processing needs. Instead, he argues, large-scale processing centers need to be created dynamically to cope with spikes in demand. To this end, Würthwein and colleagues carried out a successful proof-of-concept study, in which the Gordon Supercomputer at the San Diego Supercomputer Center was dynamically and seamlessly integrated into the CMS production system to process a 125-terabyte data set.

Read more: "Preparing for tomorrow's big data" – International Science Grid This Week