ACAT 2002, Moscow, June 24-28th. J. Hernández, DESY-Zeuthen

Offline Mass Data Processing using Online Computing Resources at HERA-B
José Hernández, DESY-Zeuthen

Motivation
- Commodity hardware (PC farms, Ethernet networks) and the Linux OS, used in the online environment, make it possible to blur the sharp online/offline border
  - HERA-B successfully uses Linux PC farms in its Trigger and Data Acquisition systems
- Traditionally, online and offline computing and software are sharply separated in HEP experiments
  - Different environments and requirements
  - Dedicated hardware and software in DAQ and trigger
- Reconstruction, typically an offline task, is done online at HERA-B
- HERA-B uses online computing and software resources to perform offline data reprocessing and MC production on the online PC farms

HERA-B DAQ
[Diagram labels]
- DSP switch: high bandwidth (10 Gbps), low latency (<10 µs)
- Online PC farms

Online PC Farms

L2/L3 Farm
- L2/L3 trigger step
- 240 nodes: Intel Celeron 1.3 GHz, 256 MB RAM, Fast Ethernet NIC
- Linux OS, no real-time extensions
- CAN card for slow control (temperature, power up/down)

L4 Farm
- Online reconstruction and L4 trigger
- 100 dual-CPU nodes: Intel PIII 550 MHz, 256 MB RAM, Fast Ethernet NIC
- Linux OS, no real-time extensions
- CAN card for slow control (temperature, power up/down)

- Diskless PCs: a PROM in the NIC loads Linux, which makes maintenance extremely easy
- DSP-to-PCI interface: data link to the DSP switch (40 MB/s, 1 µs driver latency)

L4 Farm Tasks
- Full online event reconstruction
  - Allows immediate physics analysis
  - Avoids relatively slow access to tape (20 TB/year)
  - Allows online Data Quality Monitoring and online Calibration and Alignment
- Online event classification and selection
  - Marks events in physics categories (event directories)
  - L4 trigger step
- Data logging
  - Adds reconstruction info to the event and sends it to the logger

L4 Farm Software
- Linux environment
- Process server
- Frame program ARTE
  - Reconstruction, analysis and MC
  - Same code online and offline
  - Data I/O is shared-memory based (online) and file based (offline)
- Event reconstruction time ~4 s
- 50 Hz output rate
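
The quoted output rate is consistent with the farm size and the per-event reconstruction time. A minimal sketch of that arithmetic, assuming each of the 200 L4 CPUs (100 dual-CPU nodes) reconstructs one event at a time with no scheduling or I/O overhead; the function name is illustrative and not part of the HERA-B software:

```python
def farm_output_rate_hz(n_cpus: int, seconds_per_event: float) -> float:
    """Steady-state event rate of a farm where each CPU handles one event at a time."""
    return n_cpus / seconds_per_event


# 100 dual-CPU nodes and ~4 s per event are the figures quoted on the slides.
l4_rate = farm_output_rate_hz(n_cpus=100 * 2, seconds_per_event=4.0)
print(f"L4 farm output rate: {l4_rate:.0f} Hz")
```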

Online DQM and CnA
(DQM = Data Quality Monitoring, CnA = Calibration and Alignment)
- Online CnA maintains trigger performance and online reconstruction quality
- DQM is performed on reconstructed data
- Gathering system to increase statistics
- CnA version tag in the event data
- CnA constants are multicast to the L2 nodes by the DAQ
- CnA constants are retrieved from the DB by the L4 nodes when a new CnA tag appears in the events
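
The tag-driven retrieval on the L4 nodes can be pictured as a small cache keyed on the CnA tag. The following is only a sketch of that idea: the class, the `fake_db` lookup and the constant names are hypothetical and do not reflect the actual HERA-B database interface.

```python
from typing import Callable, Dict, Optional


class ConstantsCache:
    """Keeps the constants for the most recent CnA tag seen in the event stream."""

    def __init__(self, fetch_from_db: Callable[[str], Dict[str, float]]) -> None:
        self._fetch = fetch_from_db  # hypothetical database lookup
        self._tag: Optional[str] = None
        self._constants: Dict[str, float] = {}
        self.db_reads = 0  # counts how often the database was queried

    def constants_for(self, event_tag: str) -> Dict[str, float]:
        # Query the database only when the event carries a tag
        # that differs from the one currently cached.
        if event_tag != self._tag:
            self._constants = self._fetch(event_tag)
            self._tag = event_tag
            self.db_reads += 1
        return self._constants


def fake_db(tag: str) -> Dict[str, float]:
    """Stand-in for the real database; returns made-up values."""
    return {"offset": float(len(tag))}


cache = ConstantsCache(fake_db)
for tag in ["v1", "v1", "v1", "v2", "v2"]:
    cache.constants_for(tag)
print(cache.db_reads)  # one read per distinct consecutive tag
```

With this pattern the database is touched once per tag change rather than once per event, which matters when events arrive at tens of Hz on every node.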

Booting and State Machine
- Each run has ~2000 processes (~400 are under the State Machine)
- The run is booted in 3 minutes (~10 processes/s)
- Different machine types (Linux, Lynx and DSP) use the same protocol
- The State Machine maps different transitions to different levels in the State Machine tree
- All processes are booted remotely on different machines using the messaging system
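
As a quick consistency check of the boot figures, the average boot rate can be computed from the two approximate numbers on the slide (variable names are illustrative):

```python
n_processes = 2000      # approximate number of processes per run
boot_time_s = 3 * 60    # the run is booted in about 3 minutes

boot_rate = n_processes / boot_time_s
print(f"{boot_rate:.1f} processes/s")  # in line with the quoted ~10 processes/s
```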

Offline Online
- Idea: use online idle time to perform offline mass data processing with the online computing resources
  - Shutdown periods, time between spills, accelerator down time
- Use the vast online computing resources
  - 440 CPUs, high network bandwidth
- Use not only the online hardware but also the online processes and protocols:
  - Use the online boot and control systems
  - Use the online data transmission protocols
  - Perform "online" Data Quality Monitoring
- Run "quasi-online" data re-processing and Monte Carlo production
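
The figure of 440 CPUs matches the farm sizes given on the Online PC Farms slide, assuming the 240 L2/L3 nodes are single-CPU machines (the slides state "dual-CPU" only for the L4 farm). A short sketch of that count:

```python
l2l3_cpus = 240 * 1   # 240 L2/L3 nodes, assumed to have one CPU each
l4_cpus = 100 * 2     # 100 dual-CPU L4 nodes

total_cpus = l2l3_cpus + l4_cpus
print(total_cpus)
```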

Data Taking
[Diagram labels: L2 Buffers, DSP switch, L2/L3 Farm, Ethernet switch, L4 Farm, EVC, L4C, Archiver, TAPE]

Data Re-processing
[Diagram labels: L2/L3 Farm, Ethernet switch, L4 Farm, Archiver, TAPE, Provider]

Monte Carlo Production
[Diagram labels: L2/L3 Farm, Ethernet switch, L4 Farm, Archiver, TAPE]
- Full Monte Carlo production: generation, detector simulation, digitization, trigger simulation and full reconstruction
- [...] GHz node
- 300 KB/evt
- 1 million evts/day, 300 GB/day
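
The daily data volume follows directly from the event size and the production rate. A minimal sketch of the arithmetic, assuming decimal units (1 KB = 1000 bytes, 1 GB = 10^9 bytes) and production spread evenly over 24 hours; the variable names are illustrative:

```python
KB = 1_000
MB = 1_000_000
GB = 1_000_000_000
SECONDS_PER_DAY = 86_400

events_per_day = 1_000_000
event_size_bytes = 300 * KB

# Total volume written per day and the implied average rates.
daily_volume_gb = events_per_day * event_size_bytes / GB
avg_event_rate_hz = events_per_day / SECONDS_PER_DAY
avg_rate_mb_s = events_per_day * event_size_bytes / SECONDS_PER_DAY / MB

print(f"{daily_volume_gb:.0f} GB/day")
print(f"{avg_event_rate_hz:.1f} events/s, {avg_rate_mb_s:.2f} MB/s on average")
```

Averaged over a full day this is a few MB/s in total, which is small compared with the 40 MB/s link figure quoted for the DSP-to-PCI interface.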

Quasi-Online Processing
- System fully integrated in the Run Control System
  - The shift crew can use the online idle time efficiently
- The same online processes and protocols are used for booting, control, monitoring, data reconstruction, data quality, logging and archiving
- Data reprocessing = online reconstruction

Summary
- Efficient use of the online computing resources at HERA-B to perform offline mass data processing
- Not only the online hardware is used, but also the online boot, control, monitoring and data transmission processes and protocols
- LHC experiments might consider including their online computing power as GRID resources, in order to use the online idle time for offline mass data processing