Central Reconstruction System on the RHIC Linux Farm in Brookhaven Laboratory HEPIX - BNL October 19, 2004 Tomasz Wlodek - BNL.

Central Reconstruction System on the RHIC Linux Farm in Brookhaven Laboratory HEPIX - BNL October 19, 2004 Tomasz Wlodek - BNL

Background Brookhaven National Laboratory (BNL) is a multi-disciplinary research laboratory funded by the US government. BNL is the site of the Relativistic Heavy Ion Collider (RHIC) and four of its experiments: BRAHMS, PHENIX, PHOBOS and STAR. The RHIC Computing Facility (RCF) was formed in the mid-1990s to address the computing needs of the RHIC experiments.

Background (cont.) The data collected by the RHIC experiments are written to tape in the HPSS mass storage facility, to be reprocessed at a later time. To automate the data reconstruction process, a home-grown batch system ("old CRS") was developed. The old CRS manages data staging to and from HPSS, schedules jobs and monitors their execution. As the farm has grown in size, the old CRS system no longer scales well and needs to be replaced.

RCF facility (photos: server racks, HPSS storage)

The RCF Farm Batch Systems – present (diagram): the data analysis farm (LSF batch system) and the reconstruction farm (CRS system), separated by a "Berlin wall".

New CRS – requirements: stage input files from HPSS and Unix (and, in the future, the Grid); stage output files to HPSS and Unix (and, in the future, the Grid); capable of running mass production with a high degree of automation; error diagnostics; bookkeeping.

Condor as the scheduler of the new CRS system: Condor comes with DAGMan, a meta-scheduler that allows one to build "graphs" of interdependent batch jobs. DAGMan lets us construct jobs consisting of several subjobs that perform the data staging operations and the data reconstruction separately, which in turn allows us to optimize the staging of data tapes and minimize the number of tape mounts.

New CRS batch system (diagram): a CRS job, the HPSS interface, the MySQL database, the logbook server and HPSS.

Anatomy of a CRS job (diagram: several parent jobs, one per input file, feeding one main job). Each CRS job consists of several subjobs. Parent jobs (one per input file) are responsible for locating the input data and, if necessary, staging it from tape to the HPSS disk cache. The main job is responsible for the actual data reconstruction and is executed if and only if all parent jobs completed successfully.
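As a rough illustration (not the actual CRS code), the sketch below generates a Condor DAGMan description with this shape: one staging node per input file and a main reconstruction node that DAGMan releases only after all parents succeed. The file names and the submit-description files (stage.sub, recon.sub) are hypothetical.

```python
# Sketch: build a DAGMan .dag file for one CRS job (hypothetical names/paths).
# Each parent node stages one input file; the main node reconstructs the data
# and becomes runnable only when every parent has finished successfully.

input_files = ["run1001.raw", "run1002.raw", "run1003.raw"]  # example inputs

def write_dag(job_name, inputs, dag_path):
    lines = []
    for i, f in enumerate(inputs):
        # One staging subjob per input file; stage.sub is an assumed submit
        # file whose executable asks HPSS to pull the file into disk cache.
        lines.append("JOB stage_%d stage.sub" % i)
        lines.append('VARS stage_%d input_file="%s"' % (i, f))
    # The main reconstruction subjob (recon.sub is likewise assumed).
    lines.append("JOB main_%s recon.sub" % job_name)
    # DAGMan dependency: main runs only after all staging parents succeed.
    parents = " ".join("stage_%d" % i for i in range(len(inputs)))
    lines.append("PARENT %s CHILD main_%s" % (parents, job_name))
    with open(dag_path, "w") as out:
        out.write("\n".join(lines) + "\n")

write_dag("crs_0001", input_files, "crs_0001.dag")
# The resulting DAG would then be handed to DAGMan via condor_submit_dag.
```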

Lifecycle of the Main Job (flowchart): check whether all parent jobs completed successfully; if so, import the input files from NFS or HPSS to local disk, run the user's executable, check the exit code and that all required data files are present, and export the output files. If any step fails, perform error diagnostics and recovery and update the job databases. At all stages of execution, keep track of the production status and update the job/file databases.
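A minimal sketch of what such a main-job wrapper could look like; the `db` and `logbook` helper objects, the output-file naming and the return conventions are assumptions for illustration, not the actual CRS implementation.

```python
import os
import shutil
import subprocess

def run_main_job(inputs, executable, output_dir, db, logbook):
    """Sketch of the main-job lifecycle: import inputs, run the user's
    executable, verify the results, export outputs, update the database."""
    workdir = os.getcwd()

    # Import input files from NFS or the HPSS disk cache to local disk.
    for src in inputs:
        shutil.copy(src, workdir)
        db.update_file_status(os.path.basename(src), "imported")

    # Run the user's reconstruction executable.
    db.update_job_status("running")
    result = subprocess.run([executable], cwd=workdir)

    # Check the exit code and that the expected output files are present
    # (".root" is an assumed output naming convention).
    outputs = [f for f in os.listdir(workdir) if f.endswith(".root")]
    if result.returncode != 0 or not outputs:
        db.update_job_status("failed")
        logbook.report("main", "reconstruction failed, starting diagnostics")
        return False

    # Export the output files (e.g. to HPSS or NFS) and record success.
    for f in outputs:
        shutil.copy(os.path.join(workdir, f), output_dir)
        db.update_file_status(f, "exported")
    db.update_job_status("done")
    logbook.report("main", "job completed successfully")
    return True
```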

Lifecycle of the HPSS interface subjob (flowchart): check whether HPSS is available; once it is, submit a "stage file" request. If the stage is successful, notify the CRS system and update the databases. If not, notify the system about the error and perform error diagnostics (for example: would resubmitting the request help?).
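A hedged sketch of the staging loop described above; the `hpss`, `db` and `logbook` objects, the retry limits and the delay are illustrative placeholders rather than the real HPSS interface.

```python
import time

MAX_ATTEMPTS = 3    # assumed retry policy
RETRY_DELAY = 600   # seconds to wait before asking HPSS again (assumed)

def stage_file(tape_file, cache_path, hpss, db, logbook):
    """Sketch of the HPSS-interface subjob: wait for HPSS, submit a stage
    request, retry on failure, and report the outcome to the CRS system."""
    # Wait until HPSS is available before submitting the request.
    while not hpss.available():
        logbook.report("hpss", "HPSS unavailable, waiting")
        time.sleep(RETRY_DELAY)

    for attempt in range(1, MAX_ATTEMPTS + 1):
        ok = hpss.submit_stage_request(tape_file, cache_path)
        if ok:
            # Stage successful: notify CRS and update the databases.
            db.update_file_status(tape_file, "staged")
            logbook.report("hpss", "staged %s to %s" % (tape_file, cache_path))
            return True
        # Stage failed: simple diagnostics, i.e. decide whether resubmitting
        # the request could help, and notify the system about the error.
        logbook.report("hpss", "stage attempt %d failed for %s"
                       % (attempt, tape_file))
        time.sleep(RETRY_DELAY)

    db.update_file_status(tape_file, "stage_failed")
    return False
```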

Logbook manager: Each job should provide its own logfile for monitoring and debugging purposes. It is easy to keep a logfile for an individual job running on a single machine, but CRS jobs consist of several subjobs which run independently, on different machines and at different times. To synchronize the record keeping, a dedicated logbook manager is needed, responsible for compiling the reports from individual subjobs into one human-readable logfile.
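One simple way such a logbook manager could merge subjob reports into a single per-job logfile is a small network collector, as sketched below; the socket-based design, message format and port number are assumptions, not a description of the actual CRS logbook server.

```python
import socketserver
from datetime import datetime

class LogbookHandler(socketserver.StreamRequestHandler):
    """Receives one-line reports of the form '<job_id> <subjob> <message>'
    and appends them, time-stamped, to a single per-job logfile."""
    def handle(self):
        for raw in self.rfile:
            job_id, subjob, message = raw.decode().rstrip("\n").split(" ", 2)
            stamp = datetime.now().isoformat(timespec="seconds")
            with open("%s.log" % job_id, "a") as log:
                log.write("%s [%s] %s\n" % (stamp, subjob, message))

if __name__ == "__main__":
    # Subjobs running on different farm nodes connect to this (assumed) port
    # and send their reports; the manager serializes them into one logfile.
    with socketserver.TCPServer(("", 9120), LogbookHandler) as server:
        server.serve_forever()
```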

CRS Database: To keep track of production, CRS is interfaced to a MySQL database. The database stores the status of each job and subjob, the status of data files, HPSS staging requests, open data transfer links, and statistics of completed jobs.
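A hedged sketch of what part of such a schema might look like; the table and column names are purely illustrative and are not the actual CRS database layout.

```python
# Illustrative schema only; the real CRS bookkeeping runs on MySQL and its
# actual table layout is not shown in this talk.
SCHEMA = """
CREATE TABLE jobs (
    job_id      INTEGER PRIMARY KEY,
    job_name    VARCHAR(64),
    status      VARCHAR(16),      -- e.g. submitted / running / done / failed
    submitted   DATETIME,
    finished    DATETIME
);

CREATE TABLE subjobs (
    subjob_id   INTEGER PRIMARY KEY,
    job_id      INTEGER REFERENCES jobs(job_id),
    kind        VARCHAR(16),      -- 'parent' (staging) or 'main' (reconstruction)
    status      VARCHAR(16)
);

CREATE TABLE files (
    file_name   VARCHAR(255) PRIMARY KEY,
    job_id      INTEGER REFERENCES jobs(job_id),
    location    VARCHAR(16),      -- tape / hpss_cache / local / exported
    status      VARCHAR(16)       -- e.g. staging / staged / exported / failed
);
"""
```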

CRS control panel

The RCF Farm – future (diagram): reconstruction farm (Condor system) and analysis farm (Condor system).

General Remarks

Condor as a batch system for HEP: Condor has several nice features that make it very useful as a batch system (for example, DAGs). However, submitting a very large number of Condor jobs (O(10k)) from a single node can put a very high load on the submit machine, leading to a potential Condor meltdown. This is not a problem when many users submit jobs from many machines, but for centralized services (data reconstruction, MC production, ...) it can become a very difficult issue.

Status of Condor – cont'd: People from the Condor team were extremely helpful in solving Condor problems when they were on site at BNL. Remote support is slow.

Management barrier: So far, HEP experiments have been relatively small (approximately 500 people) by business standards; everybody knows everybody. This is going to change: the next generation of experiments will have thousands of physicists.

Management barrier – cont'd: In such big communities it will become increasingly hard to introduce new software products. Users have an (understandable) tendency to be conservative and do not want changes in the environment in which they work. Convincing users to switch to a new product, and aiding them in the transition, will become a much more complex task than inventing the product itself!