
1 Develop High Throughput solutions using ARC. Sergio Maffioletti, Grid Computing Competence Center GC3, University of Zurich. Swiss Grid School 2012.

2 Why is development needed? Most current scientific research deals with very large amounts of data to analyse. This is not just the parameter-sweep use case, but High Throughput Computing (HTC) in general. Tools are needed to efficiently address both the computing and the data-handling issues. Single-operation client tools are not always adequate to cope with the high-throughput requirements of even a single experiment. Dedicated end-to-end tools should be developed and tailored to end-user requirements (there is no general solution that fits all).

3 Why is development needed? ARC provides API interfaces that can be used to address the computing and data-handling parts of any end-to-end solution that aims to leverage grid capabilities. ARC1 provides a revised API model centred around plugins (for example, it provides plugins for ARC0, ARC1 and CREAM CE compute elements). The Python bindings have turned out to be extremely practical for rapid development of prototype HTC solutions.

4 What is the added value over my beloved bash/perl scripts? Using the API allows you to directly control and use the data structures ARC provides, such as the list of computing resources (ExecutionTargets), or to easily create several JobDescription objects from a template. It allows you to manipulate these data structures directly and/or integrate them in a control/driver script (for example, it is very easy to obtain a list of Job objects, each representing a submitted job). It gives finer-grained control over these data structures, e.g. optimized bulk submission that minimizes the LDAP searches on the remote ends. It allows you to implement your own allocation and resubmission strategies (see the sketch below).
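As an illustration of what manipulating these data structures directly buys you, the sketch below filters a plain Python list of arc.Job objects by their state; the jobs list and the resubmission step are hypothetical, and it assumes JobState.GetGeneralState() and Job.Name behave as in the ARC client library.

import arc

jobs = []  # assumed to be filled with arc.Job objects collected at submission time
failed = [j for j in jobs if j.State.GetGeneralState() == "Failed"]
for job in failed:
    print("job %s failed, a custom resubmission strategy could act here" % job.Name)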

5 A typical high-throughput use case Run a generic application on a range of different inputs, where each input is a different file (or a set of files). Then collect the output files and post-process them, e.g. gather some statistics. Typically implemented by a set of sh or Perl scripts that drive execution on a local cluster.

6 A programming example From a folder containing 20 .inp input files, search for the one that contains a particular pattern. Each input file will be submitted as a separate job. The driver script should handle: 1. all preparatory steps (create one JobDescription per input file) 2. bulk submission 3. global control of all submitted jobs 4. result retrieval 5. checking which job found the pattern. (A sketch of steps 1-2 follows below.)
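A minimal sketch of the preparatory steps (one JobDescription per input file): the xRSL template, file names and pattern are example values, not part of the slides, and it assumes the JobDescription_Parse helper used in the ARC client Python examples.

import glob, os
import arc

xrsl_template = ('&(executable="/bin/grep")(arguments="%(pattern)s" "%(name)s")'
                 '(inputFiles=("%(name)s" "%(path)s"))'
                 '(stdout="out.txt")(stderr="err.txt")(jobName="%(name)s")')

jobdescs = arc.JobDescriptionList()
for path in glob.glob("inputs/*.inp"):
    name = os.path.basename(path)
    xrsl = xrsl_template % {"pattern": "FINISHED", "name": name, "path": path}
    # JobDescription_Parse appends the parsed description(s) to the list
    if not arc.JobDescription_Parse(xrsl, jobdescs):
        print("could not parse the job description for %s" % name)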

7 Basic ARC library data structures: 1. User configuration 2. Resource discovery and information retrieval 3. Job submission 4. Job management
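The following slides do not show how the user configuration object is created; a minimal sketch, assuming the client defaults and an already-generated proxy certificate (the proxy path is an example value):

import arc

uc = arc.UserConfig()              # picks up client defaults (e.g. ~/.arc/client.conf)
uc.ProxyPath("/tmp/x509up_u1000")  # assumption: path of an existing proxy certificate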

8

9 Resource Discovery and Information Retrieval ComputingServiceRetriever ServiceEndpointRetriever (REGISTRY) TargetInformationRetriever (COMPUTINGINFO)

10 Resource Discovery and Information Retrieval

11 Retrieve ExecutionTargets Retrieve the available execution services. The endpoints can be specified either in the UserConfig object or by creating arc.Endpoint objects directly and passing them to the ComputingServiceRetriever. The discovery and information retrieval of targets is carried out in parallel threads to speed up the process. If an endpoint is an index service, each execution service registered with it will be queried. The list of ExecutionTargets can be accessed by invoking GetExecutionTargets().

12 ComputingServiceRetriever
endpoints = [arc.Endpoint("aio.grid.zoo", arc.Endpoint.COMPUTINGINFO),
             arc.Endpoint("giis.smscg.ch/Mds-Vo-Name=Switzerland,o=grid", arc.Endpoint.REGISTRY)]
# The constructor starts contacting the remote services to update the endpoint information
retriever = arc.ComputingServiceRetriever(uc, endpoints)
targets = retriever.GetExecutionTargets()
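A quick way to check what was found is to loop over the returned targets; this sketch only assumes the ComputingService.Name attribute that is used again on slide 16:

for target in targets:
    print("found execution target: %s" % target.ComputingService.Name)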

13 Job Submission

14 Job Submission Job submission starts with resource discovery and target preparation. Only when a list of possible targets is available is the job description read, and a brokering method is applied to rank the ExecutionTargets according to the JobDescription's requirements. Note: this allows bulks of jobs to be submitted without having to re-perform the resource discovery.

15

16 Job Submission The ComputingServiceRetriever has prepared a list of ExecutionTargets. In order to rank the found computing services (ExecutionTargets), the Broker needs detailed knowledge about the job requirements, thus the JobDescription is passed as input to the brokering process.
loader = arc.BrokerLoader()
broker = loader.load('Random', uc)
broker.PrefilterTargets([ExecutionTargets], JobDescription)
target = broker.GetBestTarget()
target.ComputingService.Name

17 Job Submission target.Submit(arc.UserConfig, JobDescription, arc.Job) Returns True/False and fills in the passed arc.Job object.
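Putting the last two slides together, a minimal submission loop might look like the sketch below; jobdescs and targets are the objects built earlier, the method names follow the slides, and the loop itself is only an assumption about how bulk submission could be driven.

jobs = []
for jobdesc in jobdescs:
    broker.PrefilterTargets(targets, jobdesc)   # rank the targets for this description
    target = broker.GetBestTarget()
    job = arc.Job()
    if target.Submit(uc, jobdesc, job):         # fills 'job' on success
        jobs.append(job)
    else:
        print("submission failed for one job description")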

18 Job Management The main component for handling jobs is the JobSupervisor. Underneath, it uses a plugin model to interface with different types of interfaces (as for the Endpoint). The JobSupervisor has an internal list of jobs it knows about; the list is updated with JobSupervisor.AddJob(arc.Job). A JobSupervisor may be responsible for jobs running on completely different interfaces, and thus job management is handled using plugins, similarly to resource discovery and job submission.

19 Job Management

20 Job Management
jobsupervisor = arc.JobSupervisor(usercfg)
The job list is used extensively by the JobSupervisor to identify which JobController flavours are to be loaded. Jobs are managed directly by the JobSupervisor by filtering the jobs it knows:
jobsupervisor.Update()
jobsupervisor.SelectByStatus(["Finished"])
jobsupervisor.Retrieve(...)

21 Job Management Retrieving a job's session directory is done through the JobSupervisor:
JobSupervisor.Retrieve(download_dir_prefix, usejobname, force, arc.StringList)
arc_sl = arc.StringList()
jobsupervisor.Retrieve(download_folder, True, True, arc_sl)
# Jobs are cleaned up on the endpoint
jobsupervisor.Clean()
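To close the loop on the example from slide 6 (checking which job found the pattern), a plain-Python sketch over the retrieved session directories; the stdout file name, the pattern value and the download folder layout are assumptions:

import glob, os

pattern = "FINISHED"   # the pattern we were looking for (example value)
for outfile in glob.glob(os.path.join(download_folder, "*", "out.txt")):
    if pattern in open(outfile).read():
        print("pattern found by the job retrieved into %s" % os.path.dirname(outfile))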

22 Links Main ARC entry point: Discussion mailing list: GC3, Grid Computing Competence Center: