GRAM: Software Provider Forum. Stuart Martin, Computation Institute, University of Chicago & Argonne National Lab. TeraGrid 2007, Madison, WI.


2 GRAM - Basic Job Submission and Control Service
• A uniform service interface for remote job submission and control
  – Includes file staging and I/O management
  – Includes reliability features
  – Supports basic Grid security mechanisms
  – Asynchronous monitoring
  – Interfaces with local resource managers, simplifies the job of metaschedulers/brokers
• GRAM is not a scheduler.
  – No scheduling
  – No metascheduling/brokering

3 GRAM Versions in GT4
• GRAM2 (Pre-WS GRAM)
  – Proprietary protocol-based implementation
  – Gatekeeper and Job Manager
• GRAM4 (WS GRAM)
  – Web Services-based implementation
  – Managed Job Factory Service (MJFS)
  – Managed Executable Job Service (MEJS)

4 Performance Comparisons

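5 Concurrent Jobs (as in paper)
Table: average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM.
Columns: Stage In, Stage Out, File Clean Up, Unique Job Dir, GRAM2, GRAM4.
Surviving row labels: None / No; X10KB / No; X10KB / Yes. The timing values were not preserved in this transcript.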

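6 Concurrent Jobs (as will be in GT 4.0.5)
Table: average seconds per 1000 jobs, Condor-G to GRAM to Condor LRM.
Columns: Stage In, Stage Out, File Clean Up, Unique Job Dir, GRAM2, GRAM4.
Surviving row labels: None / No; X10KB / No; X10KB / Yes. The timing values were not preserved in this transcript.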

7 Improving performance for staging jobs
• Adding local method call mechanism for general use in Java WS Core (4.0.5)
  – GRAM is doing this with RFT
  – Any service which calls another in-process service could make similar modifications for local calls and likely benefit from improved performance
• Adding caching of the GridFTP server connections in RFT (4.0.6)
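The connection caching idea on this slide can be illustrated with a small sketch. This is not RFT's code (RFT is a Java service); the class `ConnectionCache`, the `fake_connect` stand-in, and the host name are all hypothetical, and the only point shown is that repeated requests for the same server reuse one connection instead of opening a new one per transfer.

```python
from typing import Callable, Dict, Tuple


class ConnectionCache:
    """Reuse one connection per (host, port) instead of reconnecting each time."""

    def __init__(self, connect: Callable[[str, int], object]) -> None:
        self._connect = connect
        self._cache: Dict[Tuple[str, int], object] = {}
        self.opened = 0  # number of real connections opened so far

    def get(self, host: str, port: int) -> object:
        key = (host, port)
        if key not in self._cache:
            # Cache miss: pay the connection cost once and remember the result.
            self._cache[key] = self._connect(host, port)
            self.opened += 1
        return self._cache[key]


def fake_connect(host: str, port: int) -> dict:
    # Hypothetical stand-in for opening a GridFTP control connection.
    return {"host": host, "port": port}


cache = ConnectionCache(fake_connect)
first = cache.get("gridftp.example.org", 2811)
second = cache.get("gridftp.example.org", 2811)
```

A production cache would also need eviction and a check that a cached connection is still alive; both are left out of this sketch.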

8 Sequential Jobs
Average seconds per job (Fork)

Delegation   Stage In   Stage Out   GRAM2   GRAM4
None                                N/A     1.70
Per Job      None
Per Job      1X10KB     None
Shared       1X10KB     None        N/A     5.41
Per Job      1X10KB
Shared       1X10KB                 N/A     7.91

Cells left blank were not preserved in this transcript.

9 Sequential Jobs
Average seconds per job (Fork)

Delegation   Stage In   Stage Out   GRAM2   GRAM4
None                                N/A     1.46
Per Job      None
Per Job      1X10KB     None
Shared       1X10KB     None        N/A     3.51
Per Job      1X10KB
Shared       1X10KB                 N/A     3.67

Cells left blank were not preserved in this transcript.

10 GRAM Auditing

11 TG Gateways
• Lower the barrier for scientists and their applications to use TeraGrid resources
• Provide an application or domain-specific interface that a scientist can easily understand
• Each gateway may have 100s or 1000s of users accessing TG resources
• Must be efficient and scale

12 Use Cases
• Group Access
  – For efficiency, a “community” credential is used to multiplex many users over a single ID
• Query Job Accounting
  – Gateways need a remote interface to obtain the TG units charged for their users’ jobs
• Auditing
  – Grid services provide access to resources
  – TG Resource Providers need a record of actions performed by services

13 Requirements From Use Cases
• Grid Job Identifier
• Remote client interface to auditing and accounting information
• Creation of service audit and accounting information
• Access to remote LRM accounting information from the audit service
• Scalability in storing information/records
• Secure access (authentication and authorization) to audit and accounting information

14 Grid Job Identifier
• Uniquely identifies a job
• Shared between the client (Gateway) and service (TG RP)
• Obtained in the normal service interaction/protocol
• In GRAM4 it is derived by converting the EPR
• In GRAM2 it is the job contact (as is)
• GRAM4 example: see next slide

15 GRAM4 EPR: <ns4:ReferenceParameters xmlns:ns4="
Grid Job ID: QDzjbFVYImtVg8
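The slides say only that the GRAM4 Grid Job ID is the EPR "converted"; the exact rule is not given here. The sketch below assumes one plausible rule, joining the service address and the resource key with `?`. The function `epr_to_gjid`, the example namespaces, the service URL, and the key are all hypothetical and chosen only so the example runs.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def epr_to_gjid(epr_xml: str) -> str:
    """Assumed conversion: '<service address>?<resource key>'."""
    root = ET.fromstring(epr_xml.strip())
    address = None
    key = None
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "Address":
            address = (elem.text or "").strip()
        elif name == "ReferenceParameters":
            children = list(elem)
            if children:
                # Treat the first reference parameter as the resource key.
                key = (children[0].text or "").strip()
    if not address or not key:
        raise ValueError("EPR is missing an address or a resource key")
    return f"{address}?{key}"


# Hypothetical EPR; the namespaces and values are placeholders, not GT4's.
EXAMPLE_EPR = """
<epr:EndpointReference xmlns:epr="urn:example:addressing">
  <epr:Address>https://gram.example.org:8443/wsrf/services/ExampleJobService</epr:Address>
  <epr:ReferenceParameters>
    <job:ResourceKey xmlns:job="urn:example:job">example-key-123</job:ResourceKey>
  </epr:ReferenceParameters>
</epr:EndpointReference>
"""

gjid = epr_to_gjid(EXAMPLE_EPR)
```

Because both the gateway and the service can apply the same conversion to the same EPR, they arrive at the same identifier without exchanging anything extra.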

16 Remote Client Interface
• Flexible query interface to retrieve audit and accounting records
• Define an operation “getChargeForJob” to return the units consumed by a Grid Job ID
• Keep audit service interface separate from GRAM service to allow flexible deployment scenarios
  – Allow a single audit service for multiple GRAM services
  – Same client interface could be used for other services, for example, charging for data storage or transfers
• OGSA-DAI satisfies these requirements

17 Creation of Service Auditing Information
• Added GRAM audit record creation upon job termination
  – Record fields: Job_grid_id, local_job_id, submission_job_id, subject_name, username, creation_time, queued_time, stage_in_gid, stage_out_gid, clean_up_gid, gt_version, rm_type, job_description, success_flag
  – Gerson Galang (APAC) contribution for GRAM4: audit record creation at beginning of job, update after LRM submission, and final update upon termination
  – Records are needed soon after job termination
• Accounting information is created by the local resource managers
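The three-step record lifecycle described for GRAM4 (create at job start, update after LRM submission, final update at termination) might look like the sketch below. It uses an in-memory SQLite database as a stand-in for the real audit database and only a subset of the listed fields; the table name `gram_audit`, the helper functions, and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real audit database
conn.execute(
    """
    CREATE TABLE gram_audit (
        job_grid_id   TEXT PRIMARY KEY,
        local_job_id  TEXT,
        subject_name  TEXT NOT NULL,
        username      TEXT NOT NULL,
        creation_time TEXT NOT NULL,
        queued_time   TEXT,
        rm_type       TEXT NOT NULL,
        success_flag  INTEGER
    )
    """
)


def create_record(job_grid_id, subject_name, username, rm_type, creation_time):
    """Step 1: insert the record when the job is created."""
    conn.execute(
        "INSERT INTO gram_audit "
        "(job_grid_id, subject_name, username, rm_type, creation_time) "
        "VALUES (?, ?, ?, ?, ?)",
        (job_grid_id, subject_name, username, rm_type, creation_time),
    )


def record_submission(job_grid_id, local_job_id, queued_time):
    """Step 2: fill in the LRM job id once the job has been submitted."""
    conn.execute(
        "UPDATE gram_audit SET local_job_id = ?, queued_time = ? "
        "WHERE job_grid_id = ?",
        (local_job_id, queued_time, job_grid_id),
    )


def record_termination(job_grid_id, success):
    """Step 3: final update when the job terminates."""
    conn.execute(
        "UPDATE gram_audit SET success_flag = ? WHERE job_grid_id = ?",
        (1 if success else 0, job_grid_id),
    )


GJID = "https://gram.example.org:8443/wsrf/services/ExampleJobService?example-key-123"
create_record(GJID, "/O=Example/CN=Gateway Community User", "gateway",
              "Condor", "2007-06-04T10:00:00Z")
record_submission(GJID, "1234.0", "2007-06-04T10:00:02Z")
record_termination(GJID, success=True)

row = conn.execute(
    "SELECT local_job_id, queued_time, success_flag FROM gram_audit "
    "WHERE job_grid_id = ?",
    (GJID,),
).fetchone()
```

Writing the record early means a row exists even if the service fails before the job ends, at the cost of two extra updates per job.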

18 Access to LRM Accounting Information
• TeraGrid uploads all LRM accounting information from each TG site to a central DB (TGCDB)
• The OGSA-DAI service can be configured to access the remote TGCDB

19 Scalability in Storing Information/Records
• Estimated that system should handle 100,000+ records
• GRAM service inserts records directly into audit DB
• Audit DB must be local to GRAM service to assure reliability
• Implemented to use either PostgreSQL or MySQL

20 Secure access
• Standard authentication and authorization methods should be used to limit access to the audit and accounting information
  – Clients must present a valid X.509 certificate
  – Access can be controlled based on a range of policies
• Current policy is to allow access if and only if the DN of the requestor matches the DN in the audit record
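The current policy on this slide reduces to a string comparison between the requestor's DN and the DN stored with the record. The sketch below assumes the record's DN lives in the `subject_name` field and that the requestor's DN has already been taken from a validated certificate; the function names and sample records are hypothetical.

```python
from typing import Dict, Iterable, List


def may_read(requestor_dn: str, record: Dict[str, str]) -> bool:
    """Allow access if and only if the requestor's DN equals the record's DN."""
    return requestor_dn == record["subject_name"]


def visible_records(requestor_dn: str,
                    records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return only the audit records the requestor is allowed to see."""
    return [r for r in records if may_read(requestor_dn, r)]


# Hypothetical audit records keyed by illustrative job ids.
RECORDS = [
    {"job_grid_id": "job-1", "subject_name": "/O=Example/CN=Gateway A"},
    {"job_grid_id": "job-2", "subject_name": "/O=Example/CN=Gateway B"},
    {"job_grid_id": "job-3", "subject_name": "/O=Example/CN=Gateway A"},
]

mine = visible_records("/O=Example/CN=Gateway A", RECORDS)
```

Under this policy, a gateway that submits every job with one community credential would presumably see the records for all of those jobs, since they share a single DN.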

21 [Architecture diagram. Labels shown: LEAD Gateway; Resource Provider Site; GT4 Java Container; WS GRAM; RFT; Delegation; OGSA-DAI; GRAM Audit Table; RFT Audit Table; Resource Manager; RM Accounting; Compute Cluster; AMIE; TG Central Accounting DB.]

22 Sequence Description
1. Gateway submits job and gets an EPR on the reply
2. Gateway controls and monitors job with EPR
3. GRAM submits and monitors job in RM
4. GRAM inserts audit record at end of job
5. RM writes job accounting record
6. AMIE uploads RM accounting records to TGCDB. The RM accounting record is converted to TG accounting units.
7. Gateway locally converts EPR to GJID
8. Gateway calls OGSA-DAI getChargeForJob with GJID and gets the job usage on the reply
9. OGSA-DAI processes remote join between GRAM audit and TGCDB
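Steps 8 and 9 amount to a lookup that joins the GRAM audit record to the central accounting record for the same local job. The sketch below puts both tables in one in-memory SQLite database as a stand-in for the remote join; the table and column names (`gram_audit`, `tgcdb_jobs`, `rm_host`, `charge`), the join key, and the sample rows are assumptions, not the real TGCDB schema.

```python
import sqlite3
from typing import Optional

db = sqlite3.connect(":memory:")  # one local DB standing in for two remote ones
db.executescript(
    """
    CREATE TABLE gram_audit (
        job_grid_id  TEXT PRIMARY KEY,
        local_job_id TEXT,
        rm_host      TEXT
    );
    CREATE TABLE tgcdb_jobs (
        local_job_id TEXT,
        rm_host      TEXT,
        charge       REAL  -- usage already converted to TG accounting units
    );
    INSERT INTO gram_audit VALUES ('gjid-charged', '1234.0', 'cluster.example.org');
    INSERT INTO gram_audit VALUES ('gjid-pending', '1235.0', 'cluster.example.org');
    INSERT INTO tgcdb_jobs VALUES ('1234.0', 'cluster.example.org', 12.5);
    """
)


def get_charge_for_job(gjid: str) -> Optional[float]:
    """Return the charge for a Grid Job ID, or None if no accounting row exists."""
    row = db.execute(
        """
        SELECT t.charge
        FROM gram_audit AS a
        JOIN tgcdb_jobs AS t
          ON t.local_job_id = a.local_job_id AND t.rm_host = a.rm_host
        WHERE a.job_grid_id = ?
        """,
        (gjid,),
    ).fetchone()
    return row[0] if row else None


charge = get_charge_for_job("gjid-charged")
```

In this sketch a job whose accounting record has not yet been uploaded (`gjid-pending`) returns `None`, so a gateway calling shortly after termination would need to retry later.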