3rd June 2004, CDF Grid
SAM: Metadata and Middleware Components
Mòrag Burgon-Lyon, University of Glasgow


Contents
– CDF Computing Goals
– SAM
– CAF
– DCAF
– JIM
– How it all fits together
– SAM TV

CDF Computing Goals
The CDF experiment intends to have:
– 25% of computing offsite by June 2004
– 50% by June 2005
To achieve these goals, several components are being developed and deployed:
– SAM – data handling system
– CAF & DCAF – batch systems
– JIM – Grid extension to SAM
– SAM TV – monitoring for SAM stations

SAM
Sequential Access via Metadata: a mature data handling system.
Users can start SAM projects, e.g. running AC++Dump. Large volumes of files (grouped into datasets) may be requested through SAM and are processed by the SAM projects. The files are transferred either from the main cache at Fermilab or from neighbouring SAM stations.
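A rough sketch of that delivery logic in Python (illustrative only: the function names, caches and file names are invented, not the real SAM client API):

```python
# Illustrative sketch, not the real SAM API: a project works through a
# dataset file by file, and each file is served from the local station
# cache, a neighbouring station, or the main cache at Fermilab.

LOCAL_CACHE = {"file_a.root", "file_b.root"}           # already at this station
NEIGHBOUR_CACHES = {"file_c.root": "station-oxford"}   # pinned at another station

def locate(filename):
    """Prefer the local cache, then neighbouring stations, then Fermilab."""
    if filename in LOCAL_CACHE:
        return "local cache"
    if filename in NEIGHBOUR_CACHES:
        return "neighbouring station " + NEIGHBOUR_CACHES[filename]
    return "main Fermilab cache"

def run_project(dataset):
    """Deliver every file in the dataset to the consumer, e.g. AC++Dump."""
    for filename in dataset:
        print(f"processing {filename} from {locate(filename)}")

run_project(["file_a.root", "file_c.root", "file_d.root"])
```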

CAF
The original CDF Analysis Farm.
The CAF is a 600-CPU farm of computers running Linux, with access to the CDF data handling system and databases so that CDF collaborators can run batch analysis jobs.
Since standard Unix accounts are not created for users (i.e. you cannot "log into" the CAF), custom software provides remote job submission, control, monitoring and output retrieval for the user.
Access is strongly authenticated via Kerberos.

CAF
Users compile and link their analysis jobs on their desktop. The required files are archived into a temporary tar file and copied to the CAF head node. Jobs are executed using a distributed batch system, Farm Batch System Next Generation (FBSNG). Output is tarred up and either returned to the user's desktop or saved to scratch space on the CAF FTP server for later retrieval.
A cdfsoft installation is required to submit jobs. Two 8-way Linux SMP systems are provided for users without cdfsoft on their local desktops, and for general reference for users having problems with their local installations.
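As a rough illustration of that round trip (archive the build, ship it to the head node, retrieve the tarred output later), here is a minimal Python sketch; the paths are invented, and the real CAF uses Kerberos-authenticated submission tools rather than plain file copies:

```python
# Illustrative only: packaging a job for the CAF head node and unpacking
# the output tarball retrieved afterwards. Paths are invented; submission
# and retrieval themselves go through the CAF's authenticated tools.
import tarfile

def package_job(build_dir, tarball="job.tar"):
    """Archive the user's compiled analysis job for copying to the head node."""
    with tarfile.open(tarball, "w") as tar:
        tar.add(build_dir, arcname="job")
    return tarball

def unpack_output(tarball, dest="results"):
    """Unpack output returned to the desktop or fetched from CAF FTP scratch."""
    with tarfile.open(tarball) as tar:
        tar.extractall(dest)

# package_job("my_analysis")   # on the desktop, before submission
# unpack_output("output.tar")  # after the FBSNG job finishes
```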

CAF (figure)

CAF
Initially configured to favour large reads and small writes (e.g. producing small skims, histograms, etc. from official secondary datasets). Extensions have been made to allow users to store their output files back into the SAM data handling system, so jobs with larger writes can run easily. The CAF has also been used for large-scale Monte Carlo and tertiary dataset production.
Users typically use the CAF GUI, though command-line job submission is also possible.

CAF Monitoring (figure)

DCAF
Decentralised CDF Analysis Farm: the CAF implemented at several remote sites, from Taiwan to Canada.
Rollout began in January 2004. A core set of 6 DCAF sites provides the backbone, and new sites are continually being added. The user selects the site on which to run.

DCAF Hardware Resources (GHz / TB now and expected by summer, where known):
– INFN: priority to INFN users; pinned datasets exist
– Taiwan: pinned datasets exist
– Korea: running MC only now
– UCSD: pools resources from several US groups; minimum guaranteed from a x2 larger farm (CDF+CMS)
– Rutgers: in-kind, will do MC production
– TTU: DCAFs, test site + CDF+CMS cluster
– GridKa (Germany): ~200 GHz / 16 TB now, ~240 GHz / 18 TB by summer; minimum guaranteed CPU from a x8 larger pool; open to all by ~Dec (JIM)
– Canada: in-kind, doing MC production, + common pool
– Japan: under construction
– Cantabria: ~1 month away
– MIT: ~1 month away
– UK: open to all by ~Dec (JIM), + common pool

DCAF
Recent DCAF report (1st June):
– Taiwan DCAF has finished copying and pinning 3 large muon datasets with no major problems.
– A request for ~600 GHz of MC production for June has been received.
– Storing MC results in a timely way was a priority.
– The MC producers have been educated in storing files through SAM (web pages, tutorials), requiring only the CDF dataset name or MC request ID.

JIM
Job and Information Management: a Grid extension to SAM allowing users to submit jobs using a local thin client.
A remote broker assigns each job to an execution site based on where the most data is present and the queue is shortest. Job progress can be monitored through a web page, and job output can be downloaded using a web browser.
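The brokering rule (most of the job's data already present, then the shortest queue) can be sketched as below; the scoring and the site data are invented for illustration, not taken from the JIM implementation:

```python
# Illustrative sketch of the brokering rule described above: prefer the
# site holding the most of the job's input data, then the shortest queue.
# The scoring is invented, not the actual JIM algorithm.

def choose_site(job_files, sites):
    """sites maps name -> {"files": cached file set, "queue": jobs waiting}."""
    def score(name):
        cached = len(set(job_files) & sites[name]["files"])
        return (cached, -sites[name]["queue"])   # more data, then shorter queue
    return max(sites, key=score)

sites = {
    "fnal":    {"files": {"a.root", "b.root"}, "queue": 40},
    "glasgow": {"files": {"a.root"},           "queue": 3},
}
print(choose_site(["a.root", "b.root"], sites))  # -> fnal (holds both files)
```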

JIM (figure)

JIM
JIM can run on shared resources and can interface with most batch systems. The CDF environment can be tar-balled for running Monte Carlo on non-CDF equipment.
D0 have successfully run large Monte Carlo productions. CDF Monte Carlo has been run interactively on a D0 cluster; the next step is JIM submission.

How it all fits together (figure)

SAM TV
Adam Lyon at Fermilab has created a set of web pages that can be used to monitor SAM stations and projects.
Demo: /samTV.html

SAM TV
– Snapshot summaries – lists the stations, with a pie chart showing the number of file transfers.
– SAM project snapshot – all the projects on the selected station, with a plot of file delivery over time.
– Project details – including the time and a plot of the last file delivery.
– Consumer and process – consumer and process IDs, application, node, user, etc.
– Files – the list of files desired by a project.

SAM TV (figure)

Challenges and Future Work
– Implementation and rollout of JIM for MC
– More DCAF installations
– Encourage user migration
– Solve the fragmented disks and caches problem (suggestions welcome!)