Tuesday, August 26, 2008

Large Hadron Collider as Massive Grid Computer - O'Reilly News

This article is being published alongside a 45-minute interview with Brian Cox who works at CERN on the ATLAS and CMS experiments. To listen to this interview, click here.

Part of the O'Reilly News mission is to dig deeper into stories like the Large Hadron Collider (LHC) at CERN and get a more concrete sense of the technology behind the story. Everyone seems to know what the LHC is and that it is going to be switched on later this year, and many of us watched the amazing presentation by Brian Cox at TED 2008. Yet most of the information you find about the experiment has been distilled for consumption by the general public. To use an anthropology term, it has been fetishized. Everyone "knows" that the LHC is going to answer age-old mysteries about the structure of matter and space-time, but few have a grasp of the concrete experiments and esoteric science behind the general-audience news stories. When NPR or the network news reports on a particle accelerator, it treats it as a quasi-religious artifact: awe-inspiring "magic". We wanted to cover this story from a technology perspective, make it more concrete for a technical audience, and, in doing so, uncover some of the interesting stories other news outlets might have missed. Did you know that the main analysis package used at CERN is freely available, more than 10 years old, and covered by an LGPL license? Do you know how many CPUs make up 1 MSI2K? What do you do when your experiment generates 2 GB every ten seconds?

1. The Large Hadron Collider: Computing on a Massive Scale

CERN has already demonstrated an ability to dramatically affect computing: the World Wide Web was created by Tim Berners-Lee (and Robert Cailliau) to support the documentation required for CERN operations. As you'll see in this article, the data processing requirements of the ATLAS and CMS experiments at CERN's LHC push the envelope of modern-day computing and force scientists and engineers to create new software and hardware to address the unique requirements at the leading edge of science. There's an availability and network engineering challenge that dwarfs anything you'll ever work on, and there are people working on systems at a scale familiar only to people who happen to work at Google (or the secret caverns of the National Security Agency). There will, no doubt, be other unintended consequences of the systems that are about to be switched on as this, the largest scientific experiment in history, gets underway.

When the LHC is turned on, it will be more than just a 27-km-circumference particle accelerator buried 100 m deep near Geneva colliding protons. When the LHC is running it will be colliding millions of protons per second, and all of this data will need to be captured and processed by a world-wide grid of computing resources. Correction: by a massively awe-inspiring and mind-numbingly vast array of computing resources, grouped into tiers so large they are measured in units like Petabytes and MSI2K.

1.1 Tier-0: Geneva

The detectors in the Compact Muon Solenoid (CMS) are sensitive enough to capture the tiniest sub-atomic particles. The detectors will capture anywhere from 2 to 30 proton-proton interactions per event snapshot, and they will be generating anywhere from 100 to 200 event snapshots per second. The detector will be creating 2 GB of data every 10 seconds, stored in what is called the Tier-0 data center in Geneva. The Tier-0 data center is going to make heavy use of tape, and one slide deck from 2005 states that a Tier-0 data center needs 0.5 PB of disk storage, CPU capacity of 4.6 MSI2K, and a WAN with a capacity greater than 5 Gbps. Once the data is collected it is streamed to seven Tier-1 data centers, which take on much of the responsibility for maintaining the data.
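To get a feel for what those per-second figures mean over a year of running, here is a quick back-of-the-envelope calculation in Python. The 10^7 seconds of effective beam time per year is my assumption (a common accelerator rule of thumb), not a figure from the article.

    # Back-of-the-envelope Tier-0 data volume, using only the figures quoted above.
    # The 1e7 seconds of beam time per year is an assumption (a common accelerator
    # rule of thumb), not a number taken from the article.
    BYTES_PER_10_SECONDS = 2 * 1024**3   # "2 GB of data every 10 seconds"
    EVENTS_PER_SECOND = 150              # mid-point of the 100-200 snapshots/s range
    SECONDS_OF_BEAM_PER_YEAR = 1e7       # assumed effective running time per year

    rate_bytes_per_s = BYTES_PER_10_SECONDS / 10
    yearly_bytes = rate_bytes_per_s * SECONDS_OF_BEAM_PER_YEAR

    print(f"Sustained rate : {rate_bytes_per_s / 1024**2:.0f} MB/s")
    print(f"Events per year: {EVENTS_PER_SECOND * SECONDS_OF_BEAM_PER_YEAR:.1e}")
    print(f"Data per year  : {yearly_bytes / 1024**5:.1f} PB")

Under those assumptions the detector is writing roughly 200 MB/s and about 2 PB per year, which lines up with the annual Tier-1 storage requirement discussed below.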

What does MSI2K stand for? "Mega SPECint 2000". SPECint 2000 is a standard measure of the power of a CPU; for an in-depth explanation see Wikipedia. If we assume a 2 x 3.0 GHz Xeon CPU is 2.3 kSI2K, then it would take about 430 of those machines to equal 1 MSI2K. 4.6 MSI2K is going to involve thousands of CPUs dedicated to data extraction and analysis.
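Using that same assumption of 2.3 kSI2K per dual 3.0 GHz Xeon box, a rough sketch of the machine counts behind these MSI2K figures (illustrative arithmetic, not an official sizing):

    # Rough conversion from MSI2K to physical machines, assuming the figure of
    # 2.3 kSI2K for one dual 3.0 GHz Xeon box quoted above.
    KSI2K_PER_DUAL_XEON = 2.3

    def boxes_needed(msi2k):
        """Number of dual-Xeon boxes required to provide the given MSI2K."""
        return round(msi2k * 1000 / KSI2K_PER_DUAL_XEON)

    for label, msi2k in [("1 MSI2K", 1.0), ("Tier-0, 4.6 MSI2K", 4.6), ("Tier-1, 2.5 MSI2K", 2.5)]:
        print(f"{label:18s} ~ {boxes_needed(msi2k):>4d} dual-Xeon boxes")

That works out to roughly 435 boxes per MSI2K, about 2000 for Tier-0 and about 1000 per Tier-1 site.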

1.2 Tier-1: Fermilab (US), RAL (UK), GridKa, others

This raw data must then be analyzed to identify different particles and "jets" (collections of particles associated with interactions). After the raw data is analyzed and reconstructed, it is archived in Tier-1 data centers distributed throughout the world (such as Fermilab near Chicago). The CMS Twiki page on Tier-1 data centers says that the annual requirements for a Tier-1 data center are 2.2 Petabytes of storage (yes, Petabytes), and each Tier-1 data center needs to be able to handle about 200 MB/s from Tier-0 (Geneva), which works out to something like a 2.5 Gbps dedicated line used only for LHC experimental data (some documents suggest that a Tier-1 data center needs > 10 Gbps, as it also has to support connections to multiple Tier-2 data centers). A Tier-1 data center also needs to dedicate about 2.5 MSI2K (~1000 high-end CPUs) to the data analysis and extraction effort and maintain 1.2 Petabytes of disk storage and 2.8 Petabytes of tape storage. It looks like Tier-1 data centers are going to act as the archives and central collaboration hubs for an even larger number of Tier-2 data centers.
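As a sanity check on those bandwidth numbers, converting the 200 MB/s Tier-0 stream into a line rate shows how much of a dedicated 2.5 Gbps link it would already consume (plain unit arithmetic, ignoring protocol overhead):

    # Convert the quoted Tier-0 -> Tier-1 transfer rate into a line rate and
    # compare it with the 2.5 Gbps dedicated link mentioned above.
    MEGABYTES_PER_SECOND = 200      # sustained stream from Tier-0 (Geneva)
    LINK_GBPS = 2.5                 # dedicated line quoted in the article

    payload_gbps = MEGABYTES_PER_SECOND * 8 / 1000   # decimal gigabits per second
    utilisation = payload_gbps / LINK_GBPS

    print(f"Payload rate     : {payload_gbps:.1f} Gbps")
    print(f"Link utilisation : {utilisation:.0%} of a {LINK_GBPS} Gbps line")

The Tier-0 stream alone fills roughly two thirds of that line before any Tier-2 traffic is added, which helps explain why some documents call for more than 10 Gbps per Tier-1 site.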

2.5 Gbps across the Atlantic? I can't even get Comcast to come fix my broken cable modem. How's this going to work? There is a project called DataTAG which aims to create an advanced, high-performance data link for research between the EU and the US. The participating organizations are laboratories, universities, and networks like Internet2, which already offers 10 Gbps connections to research universities and organizations.

1.3 Tier-2 Data Centers

According to a recent newsletter from Fermilab, there are over one hundred Tier-2 data centers. When you finally hear about some huge breakthrough in particle physics, it will be because someone ran an analysis at a Tier-2 data center that churned through millions (or billions) of particle interactions and identified some events that fit a theory or a model. A Tier-2 data center needs at least a couple hundred Terabytes of disk, just shy of 1 MSI2K of CPU, and something like 500 Mbps of sustained bandwidth to support operations.
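Multiplying those per-site figures by the "over one hundred" Tier-2 sites gives a rough sense of the aggregate capacity of the Tier-2 layer (my arithmetic on the article's approximate numbers, nothing official):

    # Rough aggregate capacity of the Tier-2 layer, using the per-site figures
    # quoted above and a site count of 100. Purely illustrative arithmetic.
    NUM_TIER2_SITES = 100
    DISK_TB_PER_SITE = 200      # "at least a couple hundred Terabytes"
    MSI2K_PER_SITE = 0.9        # "just shy of 1 MSI2K"
    MBPS_PER_SITE = 500         # "something like 500 Mbps sustained"

    print(f"Aggregate disk : {NUM_TIER2_SITES * DISK_TB_PER_SITE / 1000:.0f} PB")
    print(f"Aggregate CPU  : {NUM_TIER2_SITES * MSI2K_PER_SITE:.0f} MSI2K")
    print(f"Aggregate WAN  : {NUM_TIER2_SITES * MBPS_PER_SITE / 1000:.0f} Gbps")

Taken together, that is on the order of 20 PB of disk, close to 100 MSI2K of CPU, and tens of Gbps of network capacity spread across the Tier-2 layer alone.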

1.4 Most Distributed Scientific Computing System Ever

When you hear that they've finally flipped the switch, you'll have an idea of the heavy computing that is going on every single second. This isn't just a 27-km ring in Geneva smashing protons together; it is the most complex scientific computing system to date. For more information, see the CMS Computing Model page on the CMS Twiki.

2. What's in this Data?

We've discussed the architecture and organization of the computing resources, but what about the data that is being stored and analyzed? For clues about the data format and storage medium we can look to the web. I found a talk titled CMS 'AOD' Model Presentation from March 2007, in which Lista discusses the CMS Event Data Model (EDM) and how to access data from a CMS event. The presentation includes some technical specifics; on slide four you'll see the statement "Events are written using POOL with ROOT as underlying technology."

It appears that POOL and ROOT are two software projects used by the CMS (Compact Muon Solenoid) experiment at the LHC. It also looks like many of these projects are open source and freely available.
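ROOT comes with Python bindings (PyROOT), so you can poke at a ROOT-format event file from a few lines of Python. The sketch below is hypothetical: the file name is made up, the "Events" tree name follows the CMS EDM convention mentioned in the presentation, and reading the actual physics objects requires the CMS software framework on top of plain ROOT.

    # Minimal PyROOT sketch for inspecting a ROOT-format event file.
    # The file name is hypothetical; "Events" is the tree name used by CMS EDM
    # files. Interpreting the stored physics objects needs the CMS framework.
    import ROOT

    f = ROOT.TFile.Open("cms_events.root")   # hypothetical example file
    tree = f.Get("Events")                    # the per-event TTree

    print(f"File contains {tree.GetEntries()} events")
    tree.Print()                              # list the stored branches/collections

    f.Close()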

2.1 Tracking Down POOL and ROOT

A quick Google search for "LHC POOL ROOT" brings up various references, one of which is a paper published in IEEE Transactions on Nuclear Science. Typical of most LHC-related papers in peer-reviewed journals, this paper has more than ten authors. The Chytracek, et al. paper is entitled "POOL Development Status and Production Experience", and this three-year-old paper has the following abstract:


(read more)
