CERN hosts live reddit Q&A on CMS open data

Last week, CERN conducted its fourth "Ask Me Anything" (AMA) on the social media platform reddit, to discuss the public release of 300 terabytes of data from the CMS collaboration.

The latest release of research-quality open data received a huge response from the media as well as on reddit. With a large portion of the traffic to the CERN Open Data Portal in the last month coming from reddit itself, the AMA provided an opportunity to discuss the open-data release directly with interested members of this community.

For the AMA, reddit users posed questions to a panel that included Tibor Simko from CERN’s IT department, the lead developer of the CERN Open Data Portal that serves the data; Anxhela Dani from the CERN Scientific Information Services; CMS Data Preservation Coordinator Kati Lassila-Perini; and Tom McCauley from CMS.

Questions ranged from what the biggest challenges were in making 300 TB of data available to the public, to whether any of the public analyses have produced surprising results.

One user wanted to know what CMS hoped the open data would be used for and how the publicly released data were selected.

“We are hoping to see scientific studies and to see them used in education. The data made open is not actually a small selection, it is approximately half of the collision data we've collected for each year of data taking.” - Kati Lassila-Perini

The CERN Open Data Portal makes public the data from experiments at the world's most powerful particle accelerator, the Large Hadron Collider (LHC), along with the software and documentation needed to analyse them.

