PanDA Integration with the SLAC Pipeline Torre Wenaus, BNL BigPanDA Workshop October 21, 2013.

Slides:



Advertisements
Similar presentations
EGEE-II INFSO-RI Enabling Grids for E-sciencE The gLite middleware distribution OSG Consortium Meeting Seattle,
Advertisements

Testing PanDA at ORNL Danila Oleynik University of Texas at Arlington / JINR PanDA UTA 3-4 of September 2013.
CONDOR DAGMan and Pegasus Selim Kalayci Florida International University 07/28/2009 Note: Slides are compiled from various TeraGrid Documentations.
The ATLAS Production System. The Architecture ATLAS Production Database Eowyn Lexor Lexor-CondorG Oracle SQL queries Dulcinea NorduGrid Panda OSGLCG The.
IPlant Collaborative Tools and Services Workshop iPlant Collaborative Tools and Services Workshop Collaborating with iPlant.
Rsv-control Marco Mambelli – Site Coordination meeting October 1, 2009.
DESC mtg U Penn June, 2012 Computing Infrastructure Computing Parallel Session R.Dubois
| nectar.org.au NECTAR TRAINING Module 5 The Research Cloud Lifecycle.
OSG Area Coordinator’s Report: Workload Management February 9 th, 2011 Maxim Potekhin BNL
Grid Status - PPDG / Magda / pacman Torre Wenaus BNL U.S. ATLAS Physics and Computing Advisory Panel Review Argonne National Laboratory Oct 30, 2001.
PanDA Multi-User Pilot Jobs Maxim Potekhin Brookhaven National Laboratory Open Science Grid WLCG GDB Meeting CERN March 11, 2009.
GILDA testbed GILDA Certification Authority GILDA Certification Authority User Support and Training Services in IGI IGI Site Administrators IGI Users IGI.
Sep 21, 20101/14 LSST Simulations on OSG Sep 21, 2010 Gabriele Garzoglio for the OSG Task Force on LSST Computing Division, Fermilab Overview OSG Engagement.
OSG Area Coordinator’s Report: Workload Management April 20 th, 2011 Maxim Potekhin BNL
Grid job submission using HTCondor Andrew Lahiff.
MAGDA Roger Jones UCL 16 th December RWL Jones, Lancaster University MAGDA  Main authors: Wensheng Deng, Torre Wenaus Wensheng DengTorre WenausWensheng.
1 Resource Provisioning Overview Laurence Field 12 April 2015.
And Tier 3 monitoring Tier 3 Ivan Kadochnikov LIT JINR
Open Science Grid OSG CE Quick Install Guide Siddhartha E.S University of Florida.
Enabling Grids for E-sciencE System Analysis Working Group and Experiment Dashboard Julia Andreeva CERN Grid Operations Workshop – June, Stockholm.
OSG Area Coordinator’s Report: Workload Management Maxim Potekhin BNL
NA-MIC National Alliance for Medical Image Computing UCSD: Engineering Core 2 Portal and Grid Infrastructure.
CERN Using the SAM framework for the CMS specific tests Andrea Sciabà System Analysis WG Meeting 15 November, 2007.
David Adams ATLAS DIAL/ADA JDL and catalogs David Adams BNL December 4, 2003 ATLAS software workshop Production session CERN.
Nurcan Ozturk University of Texas at Arlington US ATLAS Transparent Distributed Facility Workshop University of North Carolina - March 4, 2008 A Distributed.
Event Service Intro, plans, issues and objectives for today Torre Wenaus BNL US ATLAS S&C/PS Workshop Aug 21, 2014.
Role Based VO Authorization Services Ian Fisk Gabriele Carcassi July 20, 2005.
EGEE User Forum Data Management session Development of gLite Web Service Based Security Components for the ATLAS Metadata Interface Thomas Doherty GridPP.
Derek Wright Computer Sciences Department University of Wisconsin-Madison Condor and MPI Paradyn/Condor.
DGC Paris WP2 Summary of Discussions and Plans Peter Z. Kunszt And the WP2 team.
A PanDA Backend for the Ganga Analysis Interface J. Elmsheuser 1, D. Liko 2, T. Maeno 3, P. Nilsson 4, D.C. Vanderster 5, T. Wenaus 3, R. Walker 1 1: Ludwig-Maximilians-Universität.
Trusted Virtual Machine Images a step towards Cloud Computing for HEP? Tony Cass on behalf of the HEPiX Virtualisation Working Group October 19 th 2010.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
Network awareness and network as a resource (and its integration with WMS) Artem Petrosyan (University of Texas at Arlington) BigPanDA Workshop, CERN,
AliEn AliEn at OSC The ALICE distributed computing environment by Bjørn S. Nilsen The Ohio State University.
USATLAS deployment We currently use VOMS Role based authorization in production within USATLAS. In the VO we have defined 4 groups/roles that satisfy our.
ClearQuest XML Server with ClearCase Integration Northwest Rational User’s Group February 22, 2007 Frank Scholz Casey Stewart
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
PanDA & BigPanDA Kaushik De Univ. of Texas at Arlington BigPanDA Workshop, CERN October 21, 2013.
PERFORMANCE AND ANALYSIS WORKFLOW ISSUES US ATLAS Distributed Facility Workshop November 2012, Santa Cruz.
Korea Workshop May GAE CMS Analysis (Example) Michael Thomas (on behalf of the GAE group)
Testing and integrating the WLCG/EGEE middleware in the LHC computing Simone Campana, Alessandro Di Girolamo, Elisa Lanciotti, Nicolò Magini, Patricia.
The GridPP DIRAC project DIRAC for non-LHC communities.
K. Harrison CERN, 22nd September 2004 GANGA: ADA USER INTERFACE - Ganga release status - Job-Options Editor - Python support for AJDL - Job Builder - Python.
+ AliEn site services and monitoring Miguel Martinez Pedreira.
SPI NIGHTLIES Alex Hodgkins. SPI nightlies  Build and test various software projects each night  Provide a nightlies summary page that displays all.
OSG Area Coordinator’s Report: Workload Management Maxim Potekhin BNL May 8 th, 2008.
David Adams ATLAS ATLAS-ARDA strategy and priorities David Adams BNL October 21, 2004 ARDA Workshop.
Pavel Nevski DDM Workshop BNL, September 27, 2006 JOB DEFINITION as a part of Production.
Open Science Grid Build a Grid Session Siddhartha E.S University of Florida.
OSG Area Coordinator’s Report: Workload Management October 6 th, 2010 Maxim Potekhin BNL
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
David Adams ATLAS ATLAS Distributed Analysis (ADA) David Adams BNL December 5, 2003 ATLAS software workshop CERN.
Site Authorization Service Local Resource Authorization Service (VOX Project) Vijay Sekhri Tanya Levshina Fermilab.
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
David Adams ATLAS ATLAS Distributed Analysis and proposal for ATLAS-LHCb system David Adams BNL March 22, 2004 ATLAS-LHCb-GANGA Meeting.
Future of Distributed Production in US Facilities Kaushik De Univ. of Texas at Arlington US ATLAS Distributed Facility Workshop, Santa Cruz November 13,
The GridPP DIRAC project DIRAC for non-LHC communities.
OSG Area Coordinator’s Report: Workload Management February 9 th, 2011 Maxim Potekhin BNL
David Adams ATLAS ADA: ATLAS Distributed Analysis David Adams BNL December 15, 2003 PPDG Collaboration Meeting LBL.
INFSO-RI Enabling Grids for E-sciencE File Transfer Software and Service SC3 Gavin McCance – JRA1 Data Management Cluster Service.
Breaking the frontiers of the Grid R. Graciani EGI TF 2012.
CMS Experience with the Common Analysis Framework I. Fisk & M. Girone Experience in CMS with the Common Analysis Framework Ian Fisk & Maria Girone 1.
Claudio Grandi INFN Bologna Virtual Pools for Interactive Analysis and Software Development through an Integrated Cloud Environment Claudio Grandi (INFN.
Joe Foster 1 Two questions about datasets: –How do you find datasets with the processes, cuts, conditions you need for your analysis? –How do.
INFSO-RI Enabling Grids for E-sciencE Padova site report Massimo Sgaravatto On behalf of the JRA1 IT-CZ Padova group.
Scientific Data Processing Portal and Heterogeneous Computing Resources at NRC “Kurchatov Institute” V. Aulov, D. Drizhuk, A. Klimentov, R. Mashinistov,
U.S. ATLAS Grid Production Experience
Grid Portal Services IeSE (the Integrated e-Science Environment)
“Running Monte Carlo for the Fermi Telescope using the SLAC farm”
Presentation transcript:

PanDA Integration with the SLAC Pipeline Torre Wenaus, BNL BigPanDA Workshop October 21, 2013

PanDA and LSST/DESC Gave a PanDA overview at the January 2013 LSST Dark Energy Sciences Collaboration (DESC) meeting Decision made to prototype PanDA as (another) submission engine for Tony Johnsons pipeline workload manager used by Fermi and now DESC (among others) – Gain experience in both directions – Gain DESC access to more resources via PanDA – Evaluate PanDAs usefulness In February Tony provided a Java interface to the pipeline for submission engines Torre Wenaus, BNL2 Oct

SLAC Workflow Pipeline and PanDA Prototype PanDA as distributed job manager and gateway to grid, cloud, HPC resources Torre Wenaus, BNL3 Oct

PanDA System Overview Torre Wenaus, BNL4 Oct

PanDA and LSST Pipeline => PanDA submission working, tested with HelloPanda and a standard DESC simulation workflow (phosim, DM stack) – Anyone who can submit a pipeline task can submit to PanDA Initial tests used ATLAS PanDA queue at BNL (epsilon level) and CERN based standalone pilot submission (no APF) – BNL PanDA instance being set up for non-ATLAS use in US – Discussions started with OSG (Miron so far) re: an OSG supported PanDA instance once PanDA is cleanly packaged and (re)ported to MySQL Currently borrowing BNL ATLAS resources during testing phase – Can submit jobs (pilots) as LSST VO as soon as desirable Plenty of US ATLAS resources to be had opportunistically right now; a quiet period (there will regularly be quiet periods like this until 2015) Torre Wenaus, BNL5 Oct

Pipeline/PanDA integration: Pipeline interface Java daemon JobControlClient accepts 3 control commands from the pipeline: submit, status, kill – Submit Receive command and arguments Submit the job Job s the pipeline server on job start and finish – Status Report job status to pipeline server – Kill Kill job PanDA implementation extends the JobControlClient class – Communicates with pipeline server via Java RMI Tested first with an emulator for the pipeline server (provided so a JobControlClient can be tested standalone) then PanDA daemon was connected to the real pipeline service at SLAC Torre Wenaus, BNL6 Oct

Pipeline/PanDA integration PandaJobControlService implemented supporting the submit, status, cancel functions Submit: – Service creates and submits a PanDA job configured with the specified command/arguments Job is configured to invoke a DESC-specific transformation (payload script) Following pipeline conventions, the script s the pipeline server on payload job start/finish. Depends on ing always working from WNs. – Returns PandaID as the ID identifying the job – LSST jobs are picked up from the PanDA job queue by standard PanDA production pilots – Amazon S3 used as staging area for job files – No special provision for data management at present; jobs are on their own in accessing data Torre Wenaus, BNL7 Oct

Pipeline/PanDA integration (2) Status: – Service queries PanDA for status of current pipeline jobs – Programmatic API to the PanDA monitor used to obtain job status (curl query returns json result) – Service maps PanDA job states, etc. onto pipeline conventions and returns pipeline JobStatus Cancel: – Service directs PanDA to cancel the job (request to PanDA server) – The PanDA server immediately sets the job state to cancelled – The job itself will initiate self-destruct when it gets its next heartbeat reply (every 30min) that will contain the cancel command Torre Wenaus, BNL8 Oct

LSST PanDA jobs in pipeline and PanDA monitors (June) Torre Wenaus, BNL9 Oct

Activities Since June Work program established with Open Science Grid (OSG) to move PanDA towards becoming an OSG service – US PanDA instance supporting OSG VOs Predicated on generalizing PanDA for non-ATLAS use: modularity, packaging, documentation, no Oracle requirement: the program of the BigPanDA project – Jarka in particular is working to take up LSST DESC as the first supported BigPanDA VO using the new BigPanDA infrastructure Jarkas work program since June: – Implement MySQL based PanDA server as a tandem (with Oracle) implementation for OSG and other non-ATLAS use (done) – Establish Amazon EC2 based MySQL PanDA service for OSG pre- production and development (done) to be cloned as stable production instance when need arises – Establish a VO-neutral PanDA monitor skeleton operating against the EC2 PanDA service for OSG VO PanDA monitoring (in progress) Oct Torre Wenaus, BNL10

Current Activities With the infrastructure now in place for OSG VO support, the next step and present work-in-progress is the LSST VO implementation PanDA-pipeline daemon (Tonys pipeline interface) moved from a dev EC2 instance to the PanDA OSG VO server Jarka and I working on implementing the pipeline-triggered PanDA job submission on the PanDA OSG VO server BNL now supports the LSST VO; a new PanDA site BNL-LSST created for LSST jobs – Submission to the BNL-LSST site being debugged Oct Torre Wenaus, BNL11

Next Steps Finish setting up the BNL-LSST PanDA site to receive real jobs from the pipeline – Software installed and DESC expert identified to help with simu workload Finish setting up LSST PanDA monitoring alpha version Enlist and work with users to get their workloads up and running – One or two DESCers at BNL are likely first candidates Get LSST user input to tailor LSST PanDA monitoring Extend to support opportunistic OSG (or non-OSG PanDA) sites – ie sites that dont necessarily have a local LSST person helping out – Key issues are software and data availability – For software, set up OSGs Oasis service for CVMFS based distribution – For data access, xrootd a good candidate Integrate the SLAC data catalog – The developer presented last week that it is a few months away from the new VO-neutral REST interface being ready Oct Torre Wenaus, BNL12

Utility of the pipeline/data catalog for BigPanDA? Pipeline provides a user-friendly web based interface to defining potentially very complex DAG workflows, saved and managed as tasks in projects Similar, at least superficially, to what we want from DEFT – Could it be useful? Data catalog similarly user-friendly, supports datasets and folders, dataset and file based metadata, VO- neutral… potentially interesting as a data management offering in association with BigPanDA? Torre Wenaus, BNL13 Oct