Virtual Batch Queues: A Service Oriented View of "The Fabric"
Rich Baker, Brookhaven National Laboratory, April 4, 2002


Slide 2: Fabrics Session of LCG Launch Workshop
- Strict Uniformity is Impossible
  - Multiple Implementations Will Exist Even Within a Single Site
  - Different Economics Drive Different Choices at Different Sites
- Expose Services, Not Facilities
  - Users Should Expect Uniform Interfaces to Services
- Define Boundaries
  - Site Can't be a Black Box
  - Internal View May Vary From Site to Site

Slide 3: LHC/iVDGL Facilities Workshop
- Prototype Batch Queues to be Implemented
  - BNL, FNAL, UCSD, JHU
  - ATLAS, CMS, SDSS
- First (Trivial) Implementation: Fully Preconfigured
  - Queue is Described Only by Name; Advertised via MDS
  - Requires User Pre-Knowledge of Queue Details
- Evolve Towards a More Abstract Implementation (sketched below)
  - Advertise Enough Information to Fully Describe the Queue
  - Requires No User Pre-Knowledge
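
A minimal Python sketch of the two stages above. The attribute names are illustrative assumptions, not the actual MDS/GLUE schema; the point is only the contrast between a name-only advertisement and a fully described virtual batch queue.

    # Trivial implementation: the queue is known only by name; the user must
    # already know what hardware, OS, and software sit behind it.
    trivial_advertisement = {
        "queue": "bnl-atlas-prod",
    }

    # More abstract implementation: publish enough attributes that a job manager
    # can judge suitability with no pre-knowledge of the site.
    full_advertisement = {
        "queue": "bnl-atlas-prod",
        "os": "RedHat-7.2",                    # unchangeable configuration
        "cpu_arch": "i686",
        "licensed_products": ["objectivity"],  # locally licensed software
        "min_scratch_gb": 10,                  # guaranteed scratch space per job
        "local_storage_access": ["nfs", "hpss"],
        "preinstalled_libs": {"libC": ["x", "y"]},
    }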

Slide 4: Job Manager's View of a Computing Element
- "Computing Elements, distributed in possibly different administrative domains, can be very different and can rely on different mechanisms, policies, implementations: they can be different in hardware, they can run different operating systems, they can be managed by different local resource management systems, they can use different authentication and authorization mechanisms, etc."
- "These issues will be addressed relying on standard protocols: 'forcing' the Computing Elements to use standard protocols..."
- (From the EDG JSS Architecture and APIs document, July 2001)

Slide 5: Various Views of a Compute Element
- Pre-Grid Paradigm
  - User Aware of All Local Resources
  - Jobs Can (Must) Use Local Configuration/Resources
- Condor Standard Universe
  - Just a CPU; Local Resources Irrelevant
  - Jobs Cannot Use Local Resources, Leading to Inefficiency
- Virtual Batch Queue
  - Advertise CPU Plus Local Resources
  - Jobs Can Take Advantage, for Improved Efficiency

Slide 6: Some Thoughts
- Local Administration of Hardware
  - Remote Job Manager Cannot Reinstall the OS
  - Local Monitoring and Security Must Be Respected
- Must Advertise Enough Information for the Job Manager to Determine Suitability (see the sketch below)
  - Unchangeable Configuration (OS, etc.)
  - Licensed Products
  - Minimum Scratch Space Available
  - Access Methods for Local Storage
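
A hedged sketch of the suitability check: the job manager compares a job's needs against an advertised queue description. Attribute names follow the sketch after Slide 3 and are assumptions, not a defined schema.

    def queue_is_suitable(job, queue):
        """Return True if the advertised virtual batch queue can run the job as-is."""
        if job["os"] != queue["os"]:                         # unchangeable configuration
            return False
        if job["min_scratch_gb"] > queue["min_scratch_gb"]:  # guaranteed scratch space
            return False
        if set(job["licensed_products"]) - set(queue["licensed_products"]):
            return False                                     # licensed software cannot be shipped
        if job["storage_access"] not in queue["local_storage_access"]:
            return False                                     # how the job reaches local storage
        return True

    queue_ad = {"os": "RedHat-7.2", "min_scratch_gb": 10,
                "licensed_products": ["objectivity"],
                "local_storage_access": ["nfs", "hpss"]}
    job_needs = {"os": "RedHat-7.2", "min_scratch_gb": 5,
                 "licensed_products": [], "storage_access": "nfs"}
    print(queue_is_suitable(job_needs, queue_ad))   # True for this pair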

Slide 7: Additional Considerations
- A Typical Job Sets Dozens of Environment Variables
  - All of These Must Be Abstracted and Discoverable
  - Some Can Be Discovered at Job Initiation
  - Input and Output "Sandboxes" Are Local Directories
  - Setting "PATH" Requires Information (see the sketch below)
- What Defines a Single "VBQ"?
  - Same Unchangeable Environment
  - Same View of Local (Non-WAN) Storage
- APIs for Interactions Between Job and Remote Manager
- APIs for Interactions Between Job and Local Manager
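
A hedged Python sketch of mapping abstract job parameters to local reality at job initiation: create the input/output sandboxes and build PATH from locally advertised directories. The variable names and directory layout are illustrative, not a defined VBQ interface.

    import os
    import tempfile

    def build_local_environment(job_id, advertised):
        """Turn abstract parameters (sandboxes, tool paths) into a concrete environment."""
        env = dict(os.environ)
        scratch = os.path.join(advertised["scratch_root"], job_id)
        for sandbox in ("input", "output"):                   # local sandbox directories
            os.makedirs(os.path.join(scratch, sandbox), exist_ok=True)
        env["JOB_INPUT_DIR"] = os.path.join(scratch, "input")
        env["JOB_OUTPUT_DIR"] = os.path.join(scratch, "output")
        # Prepend the locally advertised tool directories so the job finds them first.
        env["PATH"] = os.pathsep.join(advertised["tool_dirs"] + [env.get("PATH", "")])
        return env

    env = build_local_environment(
        "job-0001",
        {"scratch_root": tempfile.gettempdir(), "tool_dirs": ["/usr/libC-x/bin"]},
    )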

Slide 8: For Example
- A Site May Have Two Different libC Versions
  - Virtual Queue 1 Advertises libC-x: Set the Path to Use the /usr/libC-x Directory
  - Virtual Queue 2 Advertises libC-y: Set the Path to Use the /usr/libC-y Directory
- The Job Manager "Knows" Which Version the User Needs (see the sketch below)
  - If libC-x or libC-y, Use the Local Installation
  - If libC-z, No Problem: Bring It with You and Set the Path
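
A sketch of the libC decision on this slide: pick a virtual queue that already advertises the needed version, otherwise ship the library with the job and point the path at the staged copy. The queue names and paths come from the slide; the dictionary layout and the staging mechanism are assumptions.

    queues = {
        "virtual_queue_1": {"libC": "x", "lib_path": "/usr/libC-x"},
        "virtual_queue_2": {"libC": "y", "lib_path": "/usr/libC-y"},
    }

    def plan_libc(needed):
        for name, ad in queues.items():
            if ad["libC"] == needed:                 # local installation exists
                return {"queue": name, "path": ad["lib_path"], "stage_with_job": False}
        # No queue advertises it (e.g. libC-z): no problem, ship it in the sandbox.
        return {"queue": "any", "path": "$JOB_INPUT_DIR/libC-" + needed,
                "stage_with_job": True}

    print(plan_libc("x"))   # use /usr/libC-x at the site
    print(plan_libc("z"))   # stage libC-z with the job and set the path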

Slide 9: "The Big Picture"
- Fully Integrated, from Compile Through Results (see the skeleton below)
  - User Builds the Application; Dependencies Tracked (CMT)
  - Simple User Interface with a Portal (Grappa)
  - Job Manager Learns the Job's Dependencies
  - Available VBQs Discovered; "Best" Match Selected
  - User Environment Deployed (PacMan)
  - Abstract Job Parameters Mapped to Local Reality
  - Job Interactions with Local and Remote Managers
  - Error Handling
  - Local Clean Up
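
A heavily simplified skeleton of the flow on this slide, under the assumption that every step can be driven from one orchestration routine. All function bodies are stand-ins; CMT, Grappa, and PacMan are the real tools named above, but nothing here is their actual API.

    def learn_dependencies(job):           # dependencies tracked by the CMT-built application
        return job["dependencies"]

    def discover_vbqs():                   # advertised virtual batch queues (e.g. via MDS)
        return [{"queue": "bnl-atlas-prod", "preinstalled": ["libC-x"]}]

    def select_best_match(vbqs, deps):     # suitability check plus "best" ranking
        return vbqs[0]

    def deploy_environment(vbq, deps):     # e.g. a PacMan-style install of what is missing
        print("would deploy:", [d for d in deps if d not in vbq["preinstalled"]])

    def run_job(job):
        deps = learn_dependencies(job)
        vbq = select_best_match(discover_vbqs(), deps)
        deploy_environment(vbq, deps)
        try:
            print("running on", vbq["queue"])   # job talks to local and remote managers
        finally:
            print("local clean up")             # always clean up local scratch space

    run_job({"dependencies": ["libC-x", "libC-z"]})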

Slide 10: Immediate Work for US ATLAS
- Develop an AFS-Free Run-Time Environment
  - Deploy and Test at US ATLAS Testbed Sites
  - Use the Trivial Implementation of the Queue Description
  - Use FNAL, UCSD, and JHU for Proof of Portability
- Start to Define/Enumerate Details
  - What Information Is Needed to "Fully" Describe a Queue?
  - How to Take Advantage of Local Resources?