

Connecting OurGrid & GridSAM A Short Overview

Content
- Goals
- OurGrid: architecture overview
- OurGrid: short overview
- GridSAM: short overview
- GridSAM: example deployment with Condor
- Different paradigms: OurGrid
- Different paradigms: GridSAM
- Issues: File staging
- Issues: many related job submissions
- OurGrid<>GridSAM connector

Goals
- Maintain two grid environments in parallel: OurGrid and Condor
- Handle the job submission process through a common interface: JSDL, using GridSAM
- Build a connector that lets GridSAM talk to OurGrid
- GridSAM can already talk to Condor through an existing connector, so no work is needed there

OurGrid: architecture overview

OurGrid: short overview
- Workers are typically desktop computers that run jobs either directly in their OS or through virtualization (Xen, VMware, VirtualBox etc.)
- "Clouds of Workers" are controlled by Peers
- Jobs are submitted through Brokers
- There are two possibilities here:
  - A Broker can be a dedicated web site interfacing with specific Peers
  - A Broker can be any machine with the MyGrid tool installed that communicates with specified Peers

GridSAM: short overview
- Web Service-type middleware sitting between the job submitter and the core grid machinery
- Modular architecture: can talk to many grid infrastructures through specific connectors
- Collects job submissions sent as XML JSDL files
- Manages multiple submissions thanks to persistence and monitors the lifecycle of each submission
- After accepting JSDL documents, re-submits the jobs directly to the underlying grid machinery as defined in the specific connectors

GridSAM: example deployment with Condor
- Machine (B) runs a GridSAM instance in a secured OMII container
- Machine (B) can re-submit jobs directly to the Condor Pool (C)
- An authorized job submitter (A) can submit jobs over the internet to the GridSAM instance running on (B)

Different paradigms: OurGrid
- Designed for labs that have access to a pool of desktop machines whose free CPU cycles can be utilized
- Bag-of-Tasks (BoT): jobs are usually disjoint units with independent input and output
- Data sets are often small enough to be transferred many times across many machines
- As end-user friendly as possible: asks the job submitter only for the JDL job submission specification, input files and output files
- All details of job scheduling and file transfer are hidden from the job submitter

Different paradigms: GridSAM
- Designed primarily for labs using high performance computing (HPC) techniques on a few powerful machines
- HPC is typically used for CPU-demanding computations that use extensive data sets
- Every millisecond is important: job specification, input and output files must be handled with minimum human and OS intervention
- Jobs often depend on very large data sets, so file transfer should be minimized
- Data must be accessed in a fast and secure way, preferably through URIs, which require minimum external intervention
- The URIs must be specified directly in the JSDL file

Issues: File staging
- In OurGrid, the MyGrid tool takes care of transferring input files, distributing them according to the BoT paradigm, and transferring output files back to the job submitter
- When submitting through a web site, feedback is also sent when the output files are available for download
- The job submitter can simply point to files on their own machine, or upload them to a storage server accessible to MyGrid
- No dedicated storage is needed for MyGrid to work

Issues: File staging
- GridSAM does not handle input and output files by itself; it delegates this subtask to yet another piece of middleware, Apache VFS
- VFS was designed to access resources identified by URIs based on fully qualified hostnames and a few recognized protocols (FTP/SFTP, HTTP, GridFTP, WebDAV etc.)
- When submitting JSDL using the GridSAM client on a particular machine, one cannot simply point to local files; they must be uploaded to dedicated storage space that the VFS machinery can identify through a URI
- Only when files are correctly specified in the JSDL (reliable URIs!) and uploaded to dedicated storage can they be further processed by GridSAM
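The point about URIs in JSDL can be illustrated with a minimal sketch that builds a JSDL document whose staging entries name remote URIs rather than local paths. It uses the JSDL 1.0 namespaces and the DataStaging element; the function name build_jsdl, the host storage.example.org and the file names are assumptions made for illustration, not part of the original slides or of the GridSAM client.

```python
import xml.etree.ElementTree as ET

JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(executable, stage_in, stage_out):
    """Return a JSDL document (as a string) with explicit staging URIs.

    stage_in and stage_out map a file name in the job's working
    directory to the remote URI it is fetched from or delivered to.
    build_jsdl is a hypothetical helper, not a GridSAM API.
    """
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable

    # Source entries are staged in before the job runs,
    # Target entries are staged out after it finishes.
    for direction, files in (("Source", stage_in), ("Target", stage_out)):
        for name, uri in files.items():
            staging = ET.SubElement(desc, f"{{{JSDL_NS}}}DataStaging")
            ET.SubElement(staging, f"{{{JSDL_NS}}}FileName").text = name
            ET.SubElement(staging, f"{{{JSDL_NS}}}CreationFlag").text = "overwrite"
            endpoint = ET.SubElement(staging, f"{{{JSDL_NS}}}{direction}")
            ET.SubElement(endpoint, f"{{{JSDL_NS}}}URI").text = uri

    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical storage server; any URI reachable by VFS would do.
    print(build_jsdl(
        "/bin/sort",
        {"input.txt": "sftp://storage.example.org/data/input.txt"},
        {"output.txt": "sftp://storage.example.org/results/output.txt"},
    ))
```

Every file the job touches appears as a URI in the document itself, which is why a submitter cannot refer to a file that exists only on their own machine.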

Issues: File staging
- Possible solution 1: define dedicated storage in the form of an SFTP/GridFTP file server accessible to both OurGrid and GridSAM, and write all URIs in JSDL files relative to this dedicated storage
- Possible solution 2: let the job submitter choose their own storage mechanism; accept a URI if it is accessible (readable/writable), process the job as usual, and let VFS do the rest
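For solution 2, a connector would need some rule for deciding whether to accept a submitter's URI. The sketch below covers only the syntactic part of that decision (recognized scheme, fully qualified host, a file path); it does not test whether the resource is actually readable or writable. The function name check_staging_uri and the exact scheme names in ACCEPTED_SCHEMES are assumptions based on the protocols listed in the slides.

```python
from urllib.parse import urlsplit

# Assumed scheme names for the protocols named in the slides.
ACCEPTED_SCHEMES = {"ftp", "sftp", "http", "https", "gsiftp", "webdav"}


def check_staging_uri(uri):
    """Return (accepted, reason) for a staging URI.

    Purely syntactic: a URI that passes may still be unreachable.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in ACCEPTED_SCHEMES:
        return False, f"unsupported scheme: {scheme or '(none)'}"
    host = parts.hostname
    if not host or "." not in host:
        return False, "host must be a fully qualified name"
    if not parts.path or parts.path == "/":
        return False, "URI must point at a file"
    return True, "ok"


if __name__ == "__main__":
    for candidate in (
        "sftp://storage.example.org/data/input.txt",
        "/home/user/input.txt",
        "sftp://localhost/input.txt",
    ):
        print(candidate, check_staging_uri(candidate))
```

A real accessibility check would have to go through the same VFS layer that GridSAM uses, with the submitter's credentials, which is where the security concerns on the next slide come in.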

Issues: File staging
- In both cases, security is an important feature to consider
- JSDL processing is secure enough in GridSAM, but secure access to external storage must be maintained separately

Issues: many related job submissions
- In OurGrid, the job submitter can submit a JDL job specification with many jobs defined
- Specific environment variables set by OurGrid can also be used to differentiate between multiple jobs and multiple input/output files
- No specific support for the parameter sweep concept is provided, but the job submitter can simulate it with a properly written JDL job specification

Issues: many related job submissions
- With GridSAM, the job submitter submits a JSDL document that contains the details of a single job only
- In theory, it is possible to submit multiple JSDL documents within a short time; GridSAM should schedule them internally using its persistence mechanisms, then gradually re-submit them to the grid machinery according to the specified queuing strategy
- The parameter sweep JSDL extension is currently not supported in GridSAM; in theory, the job submitter can submit a batch of JSDL documents that simulates it

Issues: many related job submissions
- Possible solution 1: rely on GridSAM scheduling mechanisms; allow it to accept multiple submissions within a very short time and let GridSAM re-submit them according to its own strategies
- Possible solution 2: implement the parameter sweep JSDL extension in the OurGrid connector or even in the GridSAM core module itself
- Solution 1 is very straightforward; however, the behaviour of GridSAM under those conditions needs to be examined closely
- Solution 2 is feasible, but requires considerable time and resources
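Solution 1 amounts to simulating a parameter sweep on the client side: one job description is expanded into many single-job documents, which are then submitted one after another. A minimal sketch of the expansion step follows, assuming a template with ${name} placeholders; expand_sweep, the template text and the parameter names alpha and seed are all hypothetical, and the actual submission call is left out.

```python
from itertools import product
from string import Template


def expand_sweep(template_text, parameters):
    """Expand one job template into one document per parameter combination.

    parameters maps a placeholder name to the list of values it takes.
    Combinations are generated with parameter names in sorted order.
    """
    names = sorted(parameters)
    template = Template(template_text)
    documents = []
    for values in product(*(parameters[name] for name in names)):
        binding = {name: str(value) for name, value in zip(names, values)}
        documents.append(template.substitute(binding))
    return documents


if __name__ == "__main__":
    # Hypothetical fragment of a job description; a real one would be full JSDL.
    template_text = "<Argument>--alpha=${alpha}</Argument><Argument>--seed=${seed}</Argument>"
    jobs = expand_sweep(template_text, {"alpha": [0.1, 0.5], "seed": [1, 2, 3]})
    for job in jobs:
        print(job)
```

Each resulting document describes exactly one job, which matches what GridSAM accepts; how GridSAM behaves when many of them arrive within a short time is the open question noted above.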

OurGrid<>GridSAM connector
- For OurGrid, a MyGrid tool instance (either installed on a local machine or running as a component of a job submission web site) is the single "contact point" for the job submitter, hiding all the underlying grid-specific mechanisms
- The connector should be a wrapper over a MyGrid instance
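The wrapper idea can be sketched as a thin class that translates what GridSAM knows about a job into a description for MyGrid and remembers which MyGrid job belongs to which GridSAM job. Everything here is an assumption made for illustration: the names JobRequest and OurGridConnector, the shape of the description dictionary, and the submit callable that stands in for the real call into MyGrid. It does not reflect the actual GridSAM connector interface or the MyGrid API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class JobRequest:
    """The parts of a JSDL submission the connector needs (assumed subset)."""
    executable: str
    arguments: List[str] = field(default_factory=list)
    stage_in: Dict[str, str] = field(default_factory=dict)   # file name -> URI
    stage_out: Dict[str, str] = field(default_factory=dict)  # file name -> URI


class OurGridConnector:
    """Hypothetical wrapper that hands GridSAM jobs to a MyGrid instance."""

    def __init__(self, submit: Callable[[dict], str]):
        # submit stands in for whatever call reaches MyGrid; it takes a
        # job description and returns MyGrid's own job identifier.
        self._submit = submit
        self._jobs: Dict[str, str] = {}  # GridSAM job id -> MyGrid job id

    def submit_job(self, gridsam_id: str, request: JobRequest) -> str:
        description = {
            "command": " ".join([request.executable, *request.arguments]),
            "inputs": dict(request.stage_in),
            "outputs": dict(request.stage_out),
        }
        mygrid_id = self._submit(description)
        self._jobs[gridsam_id] = mygrid_id
        return mygrid_id

    def lookup(self, gridsam_id: str) -> str:
        """Return the MyGrid job id recorded for a GridSAM job id."""
        return self._jobs[gridsam_id]


if __name__ == "__main__":
    # A fake MyGrid that only numbers the jobs it receives.
    received = []

    def fake_submit(description):
        received.append(description)
        return f"mygrid-{len(received)}"

    connector = OurGridConnector(fake_submit)
    request = JobRequest("/bin/sort", ["input.txt"],
                         {"input.txt": "sftp://storage.example.org/data/input.txt"})
    print(connector.submit_job("gridsam-42", request))
```

Passing the MyGrid call in as a parameter keeps the translation logic testable without a running grid, and leaves open whether the connector reaches MyGrid on the local machine or through a web site.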