LIGO LSC DataGrid Workshop March 24-26, 2005 Livingston Observatory.

3 Part One: Introduction. A: Workshop Agenda and Pragmatics. B: Defining “the Grid”. C: Who’s Who in the Grid World. D: Overview of the LSC DataGrid. E: Lab 1: Getting Started.

4 A: Workshop Agenda and Pragmatics

5 Workshop Agenda. Thursday, March 24: Introduction; Grid Security; Data Management. Friday, March 25: Job Management; Workflow Management; MyProxy (Coming Attractions!). Saturday, March 26: Local Presentations.

6 Preparation for the Labs. We assume a Red Hat 9 installation, although other platforms may work just as well. We assume you’ve installed the LSC DataGrid Client Toolkit. We assume your security credentials are already in place.

7 Bio-Imperatives. Food: lunches, dinner. Plumbing.

8 Temporal Disclaimer The state of the art is: the art is always changing. Grid infrastructure standards are, however, firming up. For the most part, we’re going to be talking about how things work at the moment. We’ll warn you when we go into Coming Attractions mode.

9 Who Are Those Guys? GRIDS Center: David Gehrig, NCSA-UIUC; Mike Freemon, NCSA-UIUC; Jaime Frey, University of Wisconsin-Madison.

10 Now, everybody—

11 B: Defining “the Grid”

12 “Grid” Buzzword of the year(s). In enterprise computing, different meanings at different times. It often simply means “cluster computing.” In research, it usually means…

13 Definition: 1998 “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Ian Foster and Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure

14 Definition: 2002 “A Grid is a system that coordinates resources that are not subject to centralized control using standard, open, general-purpose protocols and interfaces to deliver nontrivial qualities of service.” Ian Foster, ANL: What is the Grid? A Three-Point Checklist

15 A Working Definition. A distributed computing environment that coordinates computational jobs, data placement, and information management; scales from one computer to thousands; and is capable of working across many administrative domains.

16 C: Who’s Who in the Grid World

17 NSF Middleware Initiative (NMI). Middleware: an evolving layer of services that resides between the network and more traditional applications for managing security, access, and information exchange. Funds the GRIDS Center. Funds the Open Grid Computing Environment.

18 GRIDS Center: Grid Research Integration, Deployment, and Support Center. Mission: making grid technology deployable and useful outside the development labs. Packaging. Education.

19 The Globus Alliance. Creates core infrastructure services. Sponsors include: DARPA, DoE, NSF, NASA; e-Science (UK), Vetenskapsrådet (Sweden), KTH (Royal Institute of Technology, Stockholm); IBM, Microsoft Research, Cisco Systems.

20 Globus: Participating Institutions. Argonne National Laboratory; Information Sciences Institute/USC; University of Chicago; University of Edinburgh (UK); Center for Parallel Computers (Sweden); “Globus Academic Affiliates”.

21 Globus Toolkit: GT3. Software services and libraries: resource monitoring, discovery, and management; security; file management. Note! GT4: Expected release sixth quarter of 2004.

22 PyGlobus. Lawrence Berkeley National Laboratory. An interface to the Globus Toolkit using the Python scripting language.

23 Condor. A serial/parallel job management system for a pool of compute nodes: job queueing mechanism, scheduling policy, priority scheme, resource monitoring, and resource management. Can be used with the Globus Toolkit. We’ll use “local Condor” and Condor-G.
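To make the job-queueing idea concrete, here is a minimal sketch of how a job for a local Condor pool might be described. It is an illustration, not workshop material: the helper name `render_submit_description`, the `basename` parameter, and the example executable are all hypothetical, and the output uses only the basic submit-description keywords (`universe`, `executable`, `arguments`, `output`, `error`, `log`, `queue`).

```python
def render_submit_description(executable, arguments="", basename="job", count=1):
    """Return the text of a simple Condor submit description.

    Hypothetical helper for illustration; it only builds a string and
    does not talk to Condor.
    """
    lines = [
        "universe = vanilla",              # plain serial job on the local pool
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        f"output = {basename}.out",        # where the job's stdout is written
        f"error = {basename}.err",         # where the job's stderr is written
        f"log = {basename}.log",           # Condor's event log for the job
        f"queue {count}",                  # number of job instances to enqueue
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example: describe one run of /bin/hostname (an assumed test executable).
    print(render_submit_description("/bin/hostname", basename="hostname"))
```

Saved to a file, text like this would typically be handed to `condor_submit`; Condor-G jobs use a different universe and additional settings that this sketch does not attempt to show.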

24 iVDGL: International Virtual Data Grid Laboratory. Goals: deploy a Grid laboratory; use Grid software tools in experiments; support delivery of Grid technologies; education and outreach. iVDGL pacman and VDT. LSC is an active participant.

25 GriPhyN: Grid Physics Network. Coalesced around four experiments: Compact Muon Solenoid and ATLAS (“A Toroidal LHC ApparatuS”) at LHC/CERN; Laser Interferometer Gravitational-wave Observatory; Sloan Digital Sky Survey. Petabytes of data annually.

26 VDT: Virtual Data Toolkit. Goal: to make it as easy as possible for users to deploy, maintain, and use grid middleware. Initially developed by GriPhyN and iVDGL. Now includes LHC Computing Grid (LCG) and Particle Physics Data Grid (PPDG).

27 VDT: Components. Basic Grid Services: Condor, Globus. Virtual Data Tools: Virtual Data System. Utilities: such as GSI-OpenSSH.

28 D: Overview of the LSC DataGrid

29 What is the LSC DataGrid? A collection of LSC computational and storage resources… … linked through Grid middleware… … into a uniform LSC data analysis environment.

30 LSC DataGrid Sites. Tier 1: Caltech. Tier 2: UWM and PSU. Tier 3: UT-Brownsville and Salish Kootenai College (SKC). Linux clusters at GEO sites: Birmingham, Cardiff, and the Albert Einstein Institute (AEI). LDAS instances at Caltech, MIT, PSU, and UWM.

31 For this Workshop LSC DataGrid Sites We’ll use as our head node Full list of LSC DataGrid resources at More discussion of LSC DataGrid later

32 E: Lab 1 — Getting Started

33 Lab 1: Getting Started. This lab will verify: your software is installed correctly; your sacrifices have pleased the webgod Ping; your security credential (i.e., proxy certificate) is okay; your environment variables won’t suddenly go away.
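As a rough companion to the checklist above, the following sketch shows one way to test two of those items, installed software and environment variables, from a script. The lists of variable and command names are assumptions chosen for illustration (they are not taken from the lab handout) and should be adjusted to match the actual toolkit installation.

```python
import os
import shutil

# Assumed names for illustration only; edit to match your installation.
REQUIRED_ENV_VARS = ["GLOBUS_LOCATION"]
REQUIRED_COMMANDS = ["grid-proxy-info", "globus-url-copy", "condor_q"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


def missing_commands(names, path=None):
    """Return the command names that cannot be found on the search path."""
    return [name for name in names if shutil.which(name, path=path) is None]


def report(environ=None, path=None):
    """Collect human-readable problems; an empty list means both checks passed."""
    problems = []
    for name in missing_env_vars(REQUIRED_ENV_VARS, environ):
        problems.append(f"environment variable not set: {name}")
    for name in missing_commands(REQUIRED_COMMANDS, path):
        problems.append(f"command not found on PATH: {name}")
    return problems


if __name__ == "__main__":
    found = report()
    print("\n".join(found) if found else "Basic checks passed.")
```

This only confirms that the tools and variables are present. Whether the proxy certificate itself is valid still has to be checked with the toolkit's own commands, such as `grid-proxy-info`.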

34 Credits. Some slides in this presentation were adapted from presentations from: GriPhyN Grid Summer Workshop 2004; The Globus Consortium.

