CDF Farms in Run II
Stephen Wolbers, CHEP2000, February 7-11, 2000
CDF Farms Group: Jaroslav Antos, Antonio Chan, Paoti Chang, Yen-Chu Chen, Stephen Wolbers, GP Yeh, Ping Yeh
Fermilab Computing Division: Mark Breitung, Troy Dawson, Jim Fromm, Lisa Giacchetti, Tanya Levshina, Igor Mandrichenko, Ray Pasetes, Marilyn Schweitzer, Karen Shepelak, Dane Skow

Outline
Requirements for Run 2 computing
Design
Experience, including Mock Data Challenge 1
Future: MDC2, Run 2, higher rates and scaling

Requirements
The CDF farms must have sufficient capacity for Run 2 raw data reconstruction
The farms also must provide capacity for any reprocessing needs
Farms must be easy to configure and run
The bookkeeping must be clear and easy to use
Error handling must be excellent

Requirements
Capacity
– Rates: 75 Hz max (28 Hz average), 250 Kbyte input event size, 60 Kbyte output event size
– Translates to 20 Mbyte/s input (max) and 5 Mbyte/s output (max)
– Substantially greater than Run I but not overwhelming in modern architectures
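
A quick back-of-the-envelope check of these rate numbers (a minimal Python sketch; the rates and event sizes are taken from the bullets above, and the slide's 20 and 5 Mbyte/s figures are rounded design ceilings):

```python
# Back-of-the-envelope I/O check using the rates and event sizes above.
MAX_RATE_HZ = 75       # peak event rate
AVG_RATE_HZ = 28       # average event rate
RAW_EVENT_KB = 250     # raw (input) event size
RECO_EVENT_KB = 60     # reconstructed (output) event size

def mbyte_per_s(rate_hz, event_kb):
    """Convert an event rate and per-event size into Mbyte/s."""
    return rate_hz * event_kb / 1000.0

print("peak input :", mbyte_per_s(MAX_RATE_HZ, RAW_EVENT_KB), "Mbyte/s")   # ~18.75, quoted as 20
print("peak output:", mbyte_per_s(MAX_RATE_HZ, RECO_EVENT_KB), "Mbyte/s")  # ~4.5, quoted as 5
print("avg  input :", mbyte_per_s(AVG_RATE_HZ, RAW_EVENT_KB), "Mbyte/s")   # ~7.0
```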

Requirements (CPU)
CPU goal is <5 seconds/event on a PIII/500
Assuming 70% efficiency, this translates to
– 200 PIII/500 equivalents
– 4200 SpecInt95
Adding in reprocessing, simulation, responding to peak rates:
– PIII/500 equivalents ( duals)
– SpecInt95
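
The 200 PIII/500 figure follows directly from the average rate, the per-event CPU goal, and the assumed efficiency. A minimal sketch of that arithmetic, using only the numbers quoted above:

```python
# How the ~200 PIII/500 equivalents follow from the slide's inputs.
AVG_RATE_HZ = 28        # average event rate the farm must keep up with
SEC_PER_EVENT = 5.0     # CPU goal per event on a PIII/500
EFFICIENCY = 0.70       # assumed farm duty factor

cpus_needed = AVG_RATE_HZ * SEC_PER_EVENT / EFFICIENCY
print(round(cpus_needed))          # -> 200 PIII/500 equivalents

# The quoted 4200 SpecInt95 then implies ~21 SpecInt95 per PIII/500 equivalent.
print(4200 / round(cpus_needed))   # -> 21.0
```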

Design/Model
Hardware
– Choose the most cost-effective CPUs for the compute-intensive computing
– This is currently the dual-Pentium architecture
– The network is Fast and Gigabit Ethernet, with all machines connected to a single large switch, or at most two
– A large I/O system to handle the buffering of data to/from mass storage and to provide a place to split the data into physics datasets

Simple Model

Run II CDF PC Farm

Software Model
Software consists of independent modules
– Well defined interfaces
– Common bookkeeping
– Standardized error handling
Choices
– Python
– MySQL database (internal database)
– FBS (Farms Batch System) *
– FIPC (Farms Interprocessor Communication) *
– CDF Data Handling Software *
* Discussed in other CHEP talks
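
As a rough illustration of this design, a module skeleton in the chosen Python/MySQL environment might look like the following sketch; the class, method, and table names are hypothetical and not the actual CDF farm code:

```python
# Illustrative only: a farm module with a well defined interface, common
# bookkeeping, and standardized error handling, in the spirit of the design
# above.  Class, method, and table names are hypothetical.
import logging

class FarmModule:
    """Base class giving every module the same run/bookkeep/error contract."""

    def __init__(self, name, db):
        self.name = name
        self.db = db            # e.g. a DB-API connection to the internal MySQL database
        self.log = logging.getLogger(name)

    def record(self, fileset, status):
        """Common bookkeeping: one row per fileset and processing step."""
        self.db.cursor().execute(
            "INSERT INTO bookkeeping (module, fileset, status) VALUES (%s, %s, %s)",
            (self.name, fileset, status))
        self.db.commit()

    def run(self, fileset):
        """Standardized error handling wrapped around module-specific work."""
        try:
            self.process(fileset)
            self.record(fileset, "done")
        except Exception as exc:
            self.log.error("fileset %s failed: %s", fileset, exc)
            self.record(fileset, "error")
            raise

    def process(self, fileset):
        raise NotImplementedError  # each concrete module overrides this
```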

List of Software Modules
Coordinator
Stager
Dispatcher
Reconstructor
Collector
Splitter
Spooler
Exporter
Database Retriever
Disk Manager
Tape Manager
Worker Node Manager
Bookkeeper
Messenger
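
To give a rough sense of how these modules could line up along the data path, here is a hypothetical ordering with guessed one-line roles; the real responsibilities and control flow (driven by the Coordinator, FBS, and FIPC) are described in the CHEP talks referenced above:

```python
# Hypothetical data path through some of the modules listed above; the role
# comments are inferences for illustration, not the documented design.
PIPELINE = [
    "Stager",         # pull raw-data files from mass storage onto farm disk
    "Dispatcher",     # hand filesets to free worker nodes
    "Reconstructor",  # run the offline reconstruction executable on the workers
    "Collector",      # gather reconstructed output back to the I/O node
    "Splitter",       # route events into the ~8 physics datasets per stream
    "Spooler",        # concatenate split output toward the target file size
    "Exporter",       # write finished files back to mass storage
]

for step, module in enumerate(PIPELINE, 1):
    print(f"{step}. {module}")
```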

Physics Analysis Requirements and Impact
Raw data files come in ~8 flavors, or streams
– 1 Gbyte input files
Reconstruction produces inclusive summary files
– 250 Mbyte output files
Output files must be split into ~8 physics datasets per input stream
– Target 1 Gbyte files
– About 20% overlap
Leads to a complicated splitting/concatenation problem, as input and output streams range from tiny (less than a few percent) to quite large (tens of percent)
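
A minimal sketch of the splitting/concatenation bookkeeping this implies; the event and trigger-bit structures here are hypothetical stand-ins for the real CDF data handling interfaces, and only the ~1 Gbyte target and ~20% overlap come from the slide above:

```python
# Illustrative sketch: route reconstructed events into physics datasets and
# close output files near the 1 Gbyte target.
TARGET_BYTES = 1_000_000_000   # ~1 Gbyte output files

class DatasetWriter:
    def __init__(self, dataset):
        self.dataset = dataset
        self.bytes_written = 0
        self.file_index = 0

    def add(self, event_bytes):
        if self.bytes_written + len(event_bytes) > TARGET_BYTES:
            self.file_index += 1      # close the current file, start a new one
            self.bytes_written = 0
        self.bytes_written += len(event_bytes)
        # real code would append event_bytes to the open output file here

def split(events, writers):
    """Route each event to every dataset whose trigger bits it satisfies.
    Roughly 20% of events satisfy more than one dataset, hence the overlap."""
    for trigger_bits, payload in events:
        for dataset, writer in writers.items():
            if dataset in trigger_bits:
                writer.add(payload)
```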

Farms input stream splitting (diagram): input stream A fans out into physics datasets A1-A7, repeated for each of the 8 input streams

Prototype Farm / Mock Data Challenge 1
A small prototype farm has been used to test software, study hardware performance, and provide small but significant CPU resources
4 I/O + 14 worker PII/400 dual Linux PCs
Gigabit Ethernet (I/O nodes/prototype farm)
100 Mbit Ethernet (worker nodes and SGI)
A useful and necessary step before scaling up to larger farms
Expected data rates were achieved (20 Mbyte/s aggregate data transfer)

Prototype Farm (diagram): I/O nodes, worker nodes, console

Mock Data Challenge 1 (CDF)
Primarily a connectivity test, with the following components:
– Level 3 (Monte Carlo input)
– Data Logger
– Tape robot/mass storage system
– File catalog (database)
– Production executable (with all detector components and reconstruction)
– Farm I/O + worker nodes
– CDF Data Handling System

Mock Data Challenge 1
Monte Carlo events
– 2 input streams
– 6 output streams
– 1 Gbyte files, trigger bits set in L3 for splitting events
Data flow
– Approx. 100 Gbyte of data was generated
– Processed through L3 using the L3 executable
– Logged and processed on the farms through the full offline reconstruction package

MDC1 - Data Flow (diagram): Tape Robot, L3, MC Files, Logger (Local), Logger (FCC), Farms, File Catalog

MDC1 Farms (diagram): I/O node (SGI O2000), control node (PC), worker nodes, tape drives

Future
50 new PCs are in place (PIII/500 duals); acquisition and testing were interesting
A new I/O node is being acquired (SGI O2200)
Plan to integrate the I/O systems, the Cisco 6509 switch, and the 50 PCs
Prepare for the rate test and MDC2 (April/May 2000)
Use the same system for the CDF Engineering Run in Fall 2000

Future (cont.)
Run II begins March 1, 2001
Farms must grow to accommodate the expected data rate
All PCs will be purchased as late as possible
– PCs will increase from 50 to 150 or more
– I/O systems should be adequate for the full Run II rate
– The switch should have sufficient capacity

Future (software and process)
Software must be completed, debugged, and made user-friendly and supportable
CDF experimenters will monitor the farm
CDF farm experts will debug, tune, and watch over the farm
Looking for a smooth, easy-to-run system
– Must run for many years
– Would like it to run with little intervention

Far Future / Possible Scaling
The system should scale beyond the design
I/O capacity can be added by increasing disk storage, tape drives, and CPU speed, and by adding Gigabit or 10 Gigabit Ethernet when available
More PCs can be purchased; the switch has capacity
If all else fails, the entire system can simply be cloned, but overall control and database access are potential issues in that case

Conclusion
The CDF farm is rapidly coming together:
– Hardware
– Software design and implementation
Capacity for large-scale tests is almost in place
Plans for the full Run II rate are firm
An upgrade path to go beyond Run II nominal rates exists