LIGO LSC DataGrid Workshop March 24-26, 2005 Livingston Observatory
Part One: Introduction A: Workshop Agenda and Pragmatics B: Defining “the Grid” C: Who’s Who in the Grid World D: Overview of the LSC DataGrid E: Lab 1: Getting Started
A: Workshop Agenda and Pragmatics
Workshop Agenda Thursday, March 24: Introduction; Grid Security; Data Management. Friday, March 25: Job Management; Workflow Management; MyProxy (Coming Attractions!). Saturday, March 26: Local Presentations.
Preparation for the Labs We assume a Red Hat 9 installation, although it’s not impossible that other platforms will work just as well. We assume you’ve installed the LSC DataGrid Client Toolkit. We assume your security credentials are already in place.
Bio-Imperatives Food Lunches Dinner Plumbing
Temporal Disclaimer The state of the art is: the art is always changing. Grid infrastructure standards are, however, firming up. For the most part, we’re going to be talking about how things work at the moment. We’ll warn you when we go into Coming Attractions mode.
Who Are Those Guys? GRIDS Center David Gehrig, NCSA-UIUC Mike Freemon, NCSA-UIUC Jaime Frey, University of Wisconsin-Madison
B: Defining “the Grid”
“Grid” Buzzword of the year(s). In enterprise computing, different meanings at different times. It often simply means “cluster computing.” In research, it usually means…
Definition: 1998 “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure
Definition: 2002 “A Grid is a system that coordinates resources that are not subject to centralized control using standard, open, general-purpose protocols and interfaces to deliver nontrivial qualities of service.” Ian Foster, ANL: What is the Grid? A Three-Point Checklist
A Working Definition A distributed computing environment that coordinates Computational jobs Data placement Information management Scales from one computer to thousands Capable of working across many administrative domains
C: Who’s Who in the Grid World
NSF Middleware Initiative (NMI) Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange. Funds GRIDS Center. Funds Open Grid Computing Environment.
GRIDS Center Grid Research Integration, Deployment, and Support Center Mission: making grid technology deployable and useful outside the development labs Packaging Education
The Globus Alliance Creates core infrastructure services. Sponsors include: DARPA, DoE, NSF, NASA; e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm); IBM, Microsoft Research, Cisco Systems
Globus: Participating Institutions Argonne National Laboratory; Information Sciences Institute/USC; University of Chicago; University of Edinburgh (UK); Center for Parallel Computers (Sweden); “Globus Academic Affiliates”
Globus Toolkit: GT3 Software services and libraries Resource monitoring, discovery, and management Security File management Note! GT4: Expected release sixth quarter of 2004
PyGlobus www-itg.lbl.gov/gtg/projects/pyGlobus/ Lawrence Berkeley National Laboratory An interface to the Globus toolkit using the Python scripting language
Condor A serial/parallel job management system for a pool of compute nodes: job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Can be used with Globus Toolkit We’ll use “local Condor” and Condor-G
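To make "job queueing mechanism" and "priority scheme" a little more concrete, here is a toy sketch in Python. It is not Condor's algorithm or API: the class name `ToyJobQueue` and the rule "lower number runs first" are assumptions made purely for illustration, and real Condor scheduling also weighs resource requirements, policy, and monitoring data.

```python
import heapq
import itertools


class ToyJobQueue:
    """Toy queue (hypothetical, not Condor): lower priority number runs first.

    Jobs with equal priority run in the order they were submitted.
    """

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker to preserve submit order.
        self._counter = itertools.count()

    def submit(self, name, priority=0):
        """Add a job name to the queue with the given priority."""
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def next_job(self):
        """Remove and return the next job name, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, name = heapq.heappop(self._heap)
        return name


if __name__ == "__main__":
    queue = ToyJobQueue()
    queue.submit("analysis-run", priority=5)
    queue.submit("calibration", priority=1)
    print(queue.next_job())
```

The counter is there so that two jobs submitted at the same priority never fall back to comparing job names, which would silently reorder them.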
iVDGL: International Virtual Data Grid Laboratory Goals Deploy a Grid laboratory Use Grid software tools in experiments Support delivery of Grid technologies Education and outreach iVDGL pacman and VDT LSC is an active participant
GriPhyN: Grid Physics Network Coalesced around four experiments: Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN; Laser Interferometer Gravitational-wave Observatory; Sloan Digital Sky Survey. Petabytes of data annually
VDT: Virtual Data Toolkit Goal: to make it as easy as possible for users to deploy, maintain and use grid middleware. Initially developed by GriPhyN and iVDGL. Now includes LHC Computing Grid (LCG) and Particle Physics Data Grid (PPDG).
VDT: Components Basic Grid Services Condor, Globus Virtual Data Tools Virtual Data System Utilities Such as GSI-OpenSSH
D: Overview of the LSC DataGrid
What is the LSC DataGrid? A collection of LSC computational and storage resources… … linked through Grid middleware… … into a uniform LSC data analysis environment.
LSC DataGrid Sites Tier 1: Caltech. Tier 2: UWM and PSU. Tier 3: UT-Brownsville and Salish Kootenai College (SKC). Linux clusters at GEO sites: Birmingham, Cardiff and the Albert Einstein Institute (AEI). LDAS instances at Caltech, MIT, PSU, and UWM.
For this Workshop LSC DataGrid Sites ldas-grid.ligo.caltech.edu ldas-grid.ligo-wa.caltech.edu ldas-grid.ligo-la.caltech.edu We’ll use ldas-grid.ligo-la.caltech.edu as our head node Full list of LSC DataGrid resources at group.phys.uwm.edu/lscdatagrid/resources More discussion of LSC DataGrid later
E: Lab 1 — Getting Started
Lab 1 — Getting Started This lab will verify: Your software is installed correctly Your sacrifices have pleased the webgod Ping Your security credential (i.e. proxy certificate) is okay Your environment variables won’t suddenly go away
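A rough idea of what such a preflight check might look like, sketched in Python. The command names (`grid-proxy-info`, `globus-url-copy`, `condor_q`) and the `GLOBUS_LOCATION` variable are assumptions about what a client installation typically provides, not the lab's actual checklist, and the `preflight` function is hypothetical. The sketch only confirms that the tools are on the `PATH` and that the variables are set; it does not validate the proxy certificate itself, which is normally inspected by running `grid-proxy-info` by hand.

```python
import os
import shutil

# Assumed names: typical of a Globus Toolkit / Condor client install,
# but the workshop's real checklist may differ.
REQUIRED_COMMANDS = ["grid-proxy-info", "globus-url-copy", "condor_q"]
REQUIRED_ENV_VARS = ["GLOBUS_LOCATION"]


def preflight(commands=REQUIRED_COMMANDS, env_vars=REQUIRED_ENV_VARS, environ=None):
    """Return a dict mapping each check to True (found) or False (missing)."""
    environ = os.environ if environ is None else environ
    report = {}
    for cmd in commands:
        # shutil.which returns None when the command is not on PATH.
        report["command:" + cmd] = shutil.which(cmd) is not None
    for var in env_vars:
        # An unset or empty variable is treated as missing.
        report["env:" + var] = bool(environ.get(var))
    return report


if __name__ == "__main__":
    for check, ok in preflight().items():
        print(("ok      " if ok else "MISSING ") + check)
```

Passing `environ` explicitly makes it possible to try the check against a saved environment instead of the live shell, which helps when variables seem to "go away" between logins.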
Credits Some slides in this presentation were adapted from presentations from: GriPhyN Grid Summer Workshop 2004; The Globus Consortium