

A Distributed Analysis Demonstrator Using pathena Analysis Queues on Tier2 Facilities
Nurcan Ozturk, University of Texas at Arlington
US ATLAS Transparent Distributed Facility Workshop, University of North Carolina, March 4, 2008

Introduction
- Goal:
  - Send an analysis job to the analysis queues at Tier2's using pathena as a distributed analysis tool
  - Run on the FDR data for this demo
  - Retrieve and analyze the output
- How to submit an analysis job:
  - Setup athena
  - Check out PandaTools package (for pathena)
  - Use HighPtView package as an analysis package
  - Find the FDR data
  - Find out which analysis queue will be used
  - Submit a pathena job
- Monitor job's status in PanDA monitor
- Get the output of pathena job and make plots

Setup Athena and Work Area
- Instructions are given to run on acas machines at BNL
- Create a directory (called pathenaDemo) and get the requirements file from:
- Make a sub-directory for (called ) under pathenaDemo
- Setup CMT:
    source /afs/usatlas.bnl.gov/cernsw/contrib/CMT/v1r20p /mgr/setup.sh
    cmt config
- Setup athena for release :
    source setup.sh -tag= ,32
- Check out Tools/Scripts package to setup your work area (easy way of checking out and compiling multiple packages):
    cd
    cmt co -r Scripts Tools/Scripts
- Setup work area and create run area:
    ./Tools/Scripts/share/setupWorkArea.py
    cd WorkArea/cmt
    cmt bro cmt config
    cmt bro gmake
    source setup.sh

Check Out Necessary Packages
- Check out PandaTools for pathena:
    cd to directory
    cmt co PhysicsAnalysis/DistributedAnalysis/PandaTools
- Run every time new package(s) checked out:
    ./Tools/Scripts/share/setupWorkArea.py
- It prints:
    WorkAreaMgr : INFO ################################################################################
    WorkAreaMgr : INFO Creating a WorkArea CMT package under: [/usatlas/u/nurcan/pathenaDemo/ ]
    WorkAreaMgr : INFO Scanning [/usatlas/u/nurcan/pathenaDemo/ ]
    WorkAreaMgr : INFO Found 2 packages in WorkArea
    WorkAreaMgr : INFO => 0 package(s) in suppression list
    WorkAreaMgr : INFO Generation of WorkArea/cmt/requirements done [OK]
    WorkAreaMgr : INFO ################################################################################
- Compile PandaTools package from WorkArea:
    cd WorkArea/cmt
    cmt bro cmt config
    cmt bro gmake
    source setup.sh
- Use the HighPtView package from the release and get the jobOption file into your run area:
    cd WorkArea/run
    get_files HighPtViewNtuple_topOptions.py
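The WorkAreaMgr printout shown on this slide can be checked automatically. The sketch below is only an illustration: the function name `summarize_workarea_log` is hypothetical, and it assumes the log keeps the wording shown on the slide (the directory path is truncated in the transcript and is copied as-is).

```python
import re

# Sample log copied from the slide; the directory path is truncated in the transcript.
SAMPLE_LOG = """\
WorkAreaMgr : INFO Creating a WorkArea CMT package under: [/usatlas/u/nurcan/pathenaDemo/ ]
WorkAreaMgr : INFO Scanning [/usatlas/u/nurcan/pathenaDemo/ ]
WorkAreaMgr : INFO Found 2 packages in WorkArea
WorkAreaMgr : INFO => 0 package(s) in suppression list
WorkAreaMgr : INFO Generation of WorkArea/cmt/requirements done [OK]
"""


def summarize_workarea_log(log_text):
    """Extract package counts and the requirements status from the log text.

    Hypothetical helper: it assumes the message wording seen on the slide.
    """
    found = re.search(r"Found (\d+) packages? in WorkArea", log_text)
    suppressed = re.search(r"=> (\d+) package\(s\) in suppression list", log_text)
    return {
        "found": int(found.group(1)) if found else None,
        "suppressed": int(suppressed.group(1)) if suppressed else None,
        "requirements_ok": "requirements done [OK]" in log_text,
    }


summary = summarize_workarea_log(SAMPLE_LOG)
print(summary)
```

With Tools/Scripts and PandaTools checked out, two packages are expected, which matches the "Found 2 packages" line.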

Setup Grid and DQ2, Find FDR Datasets
- Set up Grid:
    source /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
- Set up DQ2:
    source /afs/usatlas.bnl.gov/Grid/Don-Quijote/dq2_user_client/setup.sh.BNL
- Look at the FDR datasets available at Tier2's from the Panda monitor:
- Pick one dataset: fdr08_run StreamEgamma.merge.AOD.o1_r6_t1
- One can also list the replicas for a given dataset:
    source /afs/usatlas.bnl.gov/Grid/Don-Quijote/DQ2_0_3_client/dq2.sh
    dq2-list-dataset-replicas fdr08_run StreamEgamma.merge.AOD.o1_r6_t1
    INCOMPLETE:
    COMPLETE: IJST2,TIER0TAPE,TW-FTT,CYF,DESY-HH,DESYZN,PNPI,JINR,TORON,NAPOLI,
    LIP-LISBON,IFICDISK,LIV,RALPP,ICL,MWT2_IU,WISC,SLACXRD,BU_DDM,MCGILL,
    AGLT2_SRM,SWT2_CPB,BNLXRDHDD1,INFN-T1_DATADISK,FZK-LCG2_DATADISK,
    TRIUMF-LCG2_DATADISK,NDGF-T1_DATADISK,PIC_DATADISK,IN2P3-LPC_DATADISK,
    RAL-LCG2_DATADISK,SARA-MATRIX_DATADISK,TAIWAN-LCG2_DATADISK,
    IN2P3-CC_DATADISK,BNL-OSG2_DATADISK
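The replica listing is a comma-separated list that wraps over several lines, so it is convenient to split it into site names before looking for a Tier2. The following is a minimal sketch, assuming the output keeps the `INCOMPLETE:` / `COMPLETE:` layout shown on the slide; `parse_replica_listing` is a hypothetical helper, not part of the DQ2 client tools, and the sample text is a shortened excerpt of the listing.

```python
def parse_replica_listing(text):
    """Split a replica listing into INCOMPLETE and COMPLETE site lists.

    Assumes the two section labels shown on the slide; site names are
    comma-separated and may continue on following lines.
    """
    replicas = {"INCOMPLETE": [], "COMPLETE": []}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        for key in replicas:
            if line.startswith(key + ":"):
                current = key
                line = line[len(key) + 1:]
                break
        if current is None:
            continue  # ignore anything printed before the first section label
        replicas[current].extend(s.strip() for s in line.split(",") if s.strip())
    return replicas


# Shortened excerpt of the listing on the slide.
SAMPLE_LISTING = """\
INCOMPLETE:
COMPLETE: MWT2_IU,WISC,SLACXRD,BU_DDM,MCGILL,
AGLT2_SRM,SWT2_CPB,BNLXRDHDD1, BNL-OSG2_DATADISK
"""

replicas = parse_replica_listing(SAMPLE_LISTING)
print(replicas["COMPLETE"])
```

A membership test such as `"SWT2_CPB" in replicas["COMPLETE"]` then tells whether a given site holds a complete copy.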

Name Association Between DDM and Analysis Queue Names

DDM Name     Analysis Queue Name
SWT2_CPB     ANALY_SWT2_CPB
OU           ANALY_OU_OCHEP_SWT2
AGLT2_SRM    ANALY_AGLT2
MWT2_UC *    ANALY_MWT2
SLACXRD      ANALY_SLAC
BU_DDM       ANALY_NET2
WISC         ANALY_GLOW-ATLAS

* MWT2_UC and MWT2_IU share the FDR data; however, the analysis queue is set up to use the former.
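The name association above can be kept as a small lookup table so that a DDM site name from a replica listing translates directly into the value for `--site`. This is a sketch only: the dictionary copies the slide, while the `ALIASES` entry for MWT2_IU is an interpretation of the footnote (shared data, queue attached to MWT2_UC) and `analysis_queue_for` is a hypothetical helper.

```python
# DDM site name -> analysis queue name, copied from the slide.
DDM_TO_ANALYSIS_QUEUE = {
    "SWT2_CPB": "ANALY_SWT2_CPB",
    "OU": "ANALY_OU_OCHEP_SWT2",
    "AGLT2_SRM": "ANALY_AGLT2",
    "MWT2_UC": "ANALY_MWT2",
    "SLACXRD": "ANALY_SLAC",
    "BU_DDM": "ANALY_NET2",
    "WISC": "ANALY_GLOW-ATLAS",
}

# Assumption based on the footnote: MWT2_IU shares the FDR data with MWT2_UC,
# and the analysis queue is set up against MWT2_UC.
ALIASES = {"MWT2_IU": "MWT2_UC"}


def analysis_queue_for(ddm_name):
    """Return the analysis queue for a DDM site name, or None if unknown."""
    canonical = ALIASES.get(ddm_name, ddm_name)
    return DDM_TO_ANALYSIS_QUEUE.get(canonical)


print(analysis_queue_for("SWT2_CPB"))
print(analysis_queue_for("MWT2_IU"))
print(analysis_queue_for("MCGILL"))
```

Returning `None` for sites outside the table keeps the caller responsible for deciding what to do when no analysis queue is known.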

Analysis Queues from Panda Monitor

Run pathena (1)
- Run pathena with a one-line command:
    $ pathena -c "Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']" HighPtViewNtuple_topOptions.py --inDS fdr08_run StreamEgamma.merge.AOD.o1_r6_t1 --outDS user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3 --nfiles 1 --site ANALY_SWT2_CPB
- HighPtView options:
    Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']
- pathena options:
  - Specify the input dataset with --inDS
  - Specify the output dataset with --outDS
  - Specify the number of files to run on with --nfiles (here --nfiles 1)
  - Specify the analysis queue name with --site siteName
- More pathena options are available at:
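When the same job is submitted to several queues, it can help to assemble the command line programmatically. The sketch below uses only the options that appear on this slide (`-c`, `--inDS`, `--outDS`, `--nfiles`, `--site`); the function `build_pathena_command` is hypothetical, and `INPUT_DATASET_NAME` is a placeholder because the run number in the dataset name is missing from the transcript.

```python
import shlex


def build_pathena_command(job_options, in_ds, out_ds, site, nfiles=1, pre_exec=None):
    """Return the pathena argument list for one submission.

    Only options shown on the slide are used; nothing is executed here.
    """
    cmd = ["pathena"]
    if pre_exec:
        cmd += ["-c", pre_exec]
    cmd += [
        job_options,
        "--inDS", in_ds,
        "--outDS", out_ds,
        "--nfiles", str(nfiles),
        "--site", site,
    ]
    return cmd


cmd = build_pathena_command(
    job_options="HighPtViewNtuple_topOptions.py",
    in_ds="INPUT_DATASET_NAME",  # placeholder, not a real dataset name
    out_ds="user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3",
    site="ANALY_SWT2_CPB",
    nfiles=1,
    pre_exec="Mode=['FullReco'];DetailLevel=['FullStandardAOD'];Branches=['StacoTauRec']",
)

# Print a copy-and-paste form with shell quoting applied to each argument.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list avoids quoting mistakes around the `-c` string if it is later passed to `subprocess.run`.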

Run pathena (2)
- The following will be printed on the screen:
    Your identity: /DC=org/DC=doegrids/OU=People/CN=Nurcan Ozturk
    Enter GRID pass phrase for this identity:
    Creating proxy Done
    Your proxy is valid until: Tue Mar 4 00:50:
    extracting run configuration
    ConfigExtractor > No Input
    ConfigExtractor > Output=AANT EVAANtupleDump0Stream AANT0
    archive sources
    archive InstallArea
    post sources/jobO
    query files in dataset:fdr08_run StreamEgamma.merge.AOD.o1_r6_t1
    submit
    ===================
    JobID : 8202
    Status : 0
    > build
    PandaID=
    > run
    PandaID=
- The build job builds the athena environment at the remote site. It produces a library dataset.
- The run job runs athena and produces the output files.
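The JobID printed at submission is what one looks up later in the PanDA monitor, so a script that wraps the submission may want to capture it. A minimal sketch, assuming the `JobID : N` and `Status : N` lines keep the form shown on the slide; `parse_submission` is a hypothetical helper, and the slide does not say how the status value should be interpreted.

```python
import re

# Tail of the submission printout, copied from the slide.
SAMPLE_OUTPUT = """\
submit
===================
JobID : 8202
Status : 0
> build
PandaID=
> run
PandaID=
"""


def parse_submission(output):
    """Return (job_id, status) as integers, or None for a missing field."""
    job_id = re.search(r"JobID\s*:\s*(\d+)", output)
    status = re.search(r"Status\s*:\s*(-?\d+)", output)
    return (
        int(job_id.group(1)) if job_id else None,
        int(status.group(1)) if status else None,
    )


job_id, status = parse_submission(SAMPLE_OUTPUT)
print(job_id, status)
```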

Monitor Job's Status in PanDA Monitor (1)
- Go to the "List users" link at the top right corner of the PanDA monitor:

Monitor Job's Status in PanDA Monitor (2)

Monitor Job's Status in PanDA Monitor (3)

Retrieve Results and Make Plots
- Use the dq2 client tools to retrieve the output dataset:
    dq2_get -rv user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3
- This copies the output files:
    user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3._ log.tgz
    user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3.AANT0._00001.root
- One user needed to use "-s OU" to retrieve the output dataset from ANALY_OU_OCHEP_SWT2; the reason is under investigation
- The Wisconsin site (ANALY_GLOW-ATLAS) added all ATLAS users to its gridmap file so that they can retrieve the files
- Open the file in root and make some plots:
    root user.NurcanOzturk.pathenaDemo_StreamEgamma_SWT2_CPB_mar3.AANT0._00001.root
    root [1] FullRec0->GetListOfLeaves()->Print();
    root [2] FullRec0->Draw("El_N", "El_N>0");
    root [3] FullRec0->Draw("El_p_T", "El_N>0");
    root [4] FullRec0->Draw("Jet_C4_N", "Jet_C4_N>0");
    root [5] FullRec0->Draw("Jet_C4_p_T", "Jet_C4_N>0");
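The interactive ROOT session can also be scripted. The following PyROOT sketch draws the same four variables with the same selections as on the slide; it assumes a ROOT installation with Python bindings, the tree name `FullRec0` is taken from the slide, and the PNG file names produced by `output_plot_name` are a hypothetical choice.

```python
# (variable, selection) pairs copied from the ROOT session on the slide.
PLOTS = [
    ("El_N", "El_N>0"),
    ("El_p_T", "El_N>0"),
    ("Jet_C4_N", "Jet_C4_N>0"),
    ("Jet_C4_p_T", "Jet_C4_N>0"),
]


def output_plot_name(variable):
    """Hypothetical naming scheme: one PNG per plotted variable."""
    return variable + ".png"


def draw_all(root_file_path, tree_name="FullRec0"):
    """Draw every entry of PLOTS from the ntuple and save each as a PNG."""
    import ROOT  # imported here so the rest of the module loads without ROOT

    ROOT.gROOT.SetBatch(True)  # no graphics window needed
    root_file = ROOT.TFile.Open(root_file_path)
    tree = root_file.Get(tree_name)
    canvas = ROOT.TCanvas("c", "c")
    for variable, selection in PLOTS:
        tree.Draw(variable, selection)
        canvas.SaveAs(output_plot_name(variable))
    root_file.Close()


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        draw_all(sys.argv[1])
```

Passing the retrieved `.root` file as the first command-line argument produces one image per variable in the current directory.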

Some Plots

Future Developments with pathena
- Near-term PanDA activities and plans were presented by Torre Wenaus at the Software & Computing Workshop last week. Among those related to analysis:
  - Automatic redirection of analysis jobs within a cloud
  - That is, there is no need to specify a site: pathena will choose the best site based on data availability and available CPUs
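To make the idea of automatic site selection concrete, here is a toy sketch of choosing a queue from data availability and free CPUs. It is purely illustrative and is not PanDA's brokerage algorithm: the function `choose_site`, the "most free CPUs wins" rule, and the CPU counts in the example are all made-up assumptions.

```python
def choose_site(sites_with_data, free_cpus, queue_for_site):
    """Pick the analysis queue of the site that holds the data and has most free CPUs.

    Toy rule for illustration only. Returns None when no site qualifies.
    """
    candidates = [
        site
        for site in sites_with_data
        if site in queue_for_site and free_cpus.get(site, 0) > 0
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda site: free_cpus[site])
    return queue_for_site[best]


# Queue names come from the name-association slide; CPU counts are invented.
queue_for_site = {
    "SWT2_CPB": "ANALY_SWT2_CPB",
    "SLACXRD": "ANALY_SLAC",
    "BU_DDM": "ANALY_NET2",
}
free_cpus = {"SWT2_CPB": 12, "SLACXRD": 40, "BU_DDM": 0}

chosen = choose_site(["SWT2_CPB", "SLACXRD", "BU_DDM", "MCGILL"], free_cpus, queue_for_site)
print(chosen)
```

A real brokerage would weigh more inputs than this, but the sketch shows why the user would no longer need to pass `--site` by hand.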

References
- Athena software releases and how to use them:
- FDR datasets available at Tier2's:
- pathena wiki page "Distributed Analysis on Panda":
- How to submit same pathena job on multiple datasets: e_same_ana
- HighPtView wiki page:
- Wiki pages by Akira Shibata: