Big data takes ROOT

Example of a plot created with the help of the ROOT tool (Image: ROOT)

Particle physicists don’t break into a sweat when they face big data. On the contrary: they need it to tell a rare process from a common one. Reliable statistics are essential here, and physicists gather statistics by producing as many particle collisions as possible. At the Large Hadron Collider (LHC), protons collide some 1 billion times per second, and the CERN data centre stores more than 30 petabytes of data per year from the LHC experiments.

Collecting data is one thing; actually analysing it is another. For 20 years now, scientists have been using an open-source tool set called ROOT, developed at CERN and Fermilab, to process vast amounts of data very efficiently. If you have ever seen a high-energy physics plot with curves, a physics histogram or a sophisticated 3D function, you can be almost sure that it was made with ROOT. Every particle physics graduate student who has ever run a data analysis has learned to work with ROOT, and this is what makes it so powerful: everybody in particle physics knows how to use it, it defines the visuals physicists use to communicate, it handles large data easily, and it is versatile.

However, it is not perfect, so it is constantly updated and improved with new features. The ROOT team based at CERN has just released version 6 and is already starting to discuss, prototype and develop version 7 with many improved features. “The new version is going to be faster and simpler,” says Axel Naumann, core developer on the CERN ROOT team. The release of version 7 is planned for the next long shutdown of the LHC so that users have time to get to know it. Until then, some features will continue to be rolled out individually. The team consists of six people, and while they like to spend their time developing the analysis tool set further, a huge part of their everyday work is support. “We serve a community of many thousands of people,” Naumann says.

Big data is not just a hot topic in particle physics. Other areas of science, such as astronomy and biology, handle comparably large amounts of data and have also started using ROOT, and it is taking root in the aerospace industry as well as in the finance sector. The team also sees a lot of potential for industry to benefit from ROOT. “We’ve got a world of statistical analysis possibilities and we’ve got experience with vast amounts of data – all that can feed into a business plan, improve logistics and thus turnover,” says Naumann. The ROOT team is happy to discuss options.