Introduction to Grid Computing. Slides adapted from the Midwest Grid School Workshop 2008 (https://twiki.grid.iu.edu/bin/view/Education/MWGS08).


Overview
- Supercomputers, clusters, and grids
- Motivations and applications
- Grid middleware and architectures
- Job management and data management
- Grid security
- National grids
- Grid workflow

Computing "clusters" are today's supercomputers. A typical cluster comprises:
- A cluster management "frontend" node
- A few head nodes, gatekeepers, and other service nodes
- Lots of worker nodes
- I/O servers, typically RAID fileservers with disk arrays
- Tape backup robots

Cluster architecture
- Cluster users connect over Internet protocols to one or more head nodes, which provide login access (ssh), the cluster scheduler (PBS, Condor, SGE), web service access (http), and remote file access (scp, FTP, etc.)
- The head nodes pass job execution requests to, and collect status from, the compute nodes (Node 0 ... Node N: 10 to 10,000 PCs with local disks)
- A shared cluster filesystem provides storage for applications and data
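To make the scheduler's role concrete, here is a minimal sketch of a PBS batch script; the job name, resource request, and executable are hypothetical:

    #!/bin/bash
    #PBS -N myjob
    #PBS -l nodes=1:ppn=1,walltime=00:10:00
    # Run from the directory the job was submitted from
    cd $PBS_O_WORKDIR
    ./analyze input.dat > output.dat

A user would submit this from a head node with "qsub myjob.pbs" and check its progress with "qstat"; Condor and SGE play the same role with their own submit syntax.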

Scaling up science: citation network analysis in sociology. Work of James Evans, University of Chicago, Department of Sociology.

Scaling up the analysis
- Query and analysis of 25+ million citations
- Work started on desktop workstations; queries grew to month-long duration
- With data distributed across the U of Chicago TeraPort cluster, 50 (faster) CPUs gave a 100x speedup
- Many more methods and hypotheses can now be tested
Higher throughput and capacity enable deeper analysis and broader community access.

Grids consist of distributed clusters
- A grid client runs the application and user interface on top of grid client middleware, consulting resource, workflow, and data catalogs
- The client speaks grid protocols to the grid sites (Site 1: Fermilab; Site 2: Sao Paolo; ... Site N: UWisconsin)
- Each site runs grid service middleware in front of its compute cluster and grid storage

Initial grid driver: high energy physics
- There is a "bunch crossing" every 25 nsec and about 100 "triggers" per second; each triggered event is ~1 MByte in size
- Data flows from the online system (~PBytes/sec at the detector, ~100 MBytes/sec out) to the Tier 0 CERN Computer Centre and its ~20 TIPS offline processor farm
- Tier 1 regional centres (FermiLab ~4 TIPS; France, Italy, and Germany regional centres) receive data at ~622 Mbits/sec (or by air freight, now deprecated)
- Tier 2 centres (~1 TIPS each, e.g. Caltech) feed institute servers (~0.25 TIPS) and physicist workstations at ~1 MByte/sec
- Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, and data for these channels should be cached on the institute's physics data cache
- 1 TIPS is approximately 25,000 SpecInt95 equivalents
Image courtesy Harvey Newman, Caltech

Grids Provide Global Resources To Enable e-Science

Grids can process vast datasets. Many HEP and astronomy experiments consist of:
- Large datasets as inputs (find datasets)
- "Transformations" that work on the input datasets (process)
- Output datasets (store and publish)
The emphasis is on the sharing of these large datasets, and workflows of independent programs can be parallelized.
Montage workflow: ~1200 jobs, 7 levels (NVO, NASA, ISI/Pegasus; Deelman et al.). A mosaic of M42 created on TeraGrid; in the workflow graph, edges are data transfers and nodes are compute jobs.

For example: digital astronomy
- Digital observatories provide online archives of data at different wavelengths
- They let you ask questions such as: what objects are visible in the infrared but not in the visible spectrum?

Virtual organizations
- Groups of organizations that use the grid to share resources for specific purposes
- Support a single community
- Deploy compatible technology and agree on working policies (security policies are the difficult part)
- Deploy different network-accessible services: grid information, grid resource brokering, grid monitoring, grid accounting

The grid middleware stack (and course modules), from bottom to top:
- Standard network protocols and web services (M1)
- Core Globus services (M1)
- Job management (M2), data management (M3), and grid information services (M5), all resting on the Grid Security Infrastructure (M4)
- Workflow system, explicit or ad hoc (M6)
- The grid application itself (M5), often including a portal

Globus and Condor play key roles
- The Globus Toolkit provides the base middleware: client tools you can use from a command line; APIs (scripting languages, C, C++, Java, ...) to build your own tools or call directly from applications; web service interfaces; and higher-level tools built from these basic components, e.g. Reliable File Transfer (RFT)
- Condor provides both client and server scheduling; in grids, Condor provides an agent to queue, schedule, and manage work submission
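To make the Condor side concrete, here is a minimal submit description file; the executable and file names are hypothetical:

    # job.sub - minimal Condor submit description
    universe   = vanilla
    executable = analyze
    arguments  = input.dat
    output     = job.out
    error      = job.err
    log        = job.log
    queue

The user submits it with "condor_submit job.sub" and watches the queue with "condor_q".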

Provisioning
- Grid architecture is evolving toward a service-oriented approach
- Service-oriented grid infrastructure provisions physical resources to support application workloads
- Service-oriented applications wrap applications as services and compose them into workflows, which application service users then invoke
(See "The Many Faces of IT as Service" by Foster and Tuecke, and "Service-Oriented Science" by Ian Foster; a full treatment is beyond this workshop's scope.)

Job and resource management (Workshop Module 2)

GRAM: Globus Resource Allocation Manager
- A user in Organization A runs "globusrun myjob ...", which speaks the Globus GRAM protocol to the gatekeeper and job manager in Organization B
- The job manager then submits the work to the site's local resource manager (LRM)

GRAM provides a uniform interface to diverse cluster schedulers
- A user in a grid VO submits through GRAM, and the request is dispatched to whatever scheduler each site runs locally: Condor at Site A, PBS at Site B, LSF at Site C, a plain UNIX fork() at Site D
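For instance, with the pre-web-services Globus client tools, a GRAM submission can be written as a one-line command or as an RSL expression; the gatekeeper host name below is a placeholder:

    # Run a command at a remote site through its PBS job manager
    globus-job-run gatekeeper.siteb.example.org/jobmanager-pbs /bin/hostname

    # The same submission expressed in RSL via globusrun
    # (-o streams the job's output back to the client)
    globusrun -o -r gatekeeper.siteb.example.org/jobmanager-pbs \
        '&(executable=/bin/hostname)'

Either way, the user never needs to know whether the site runs PBS, Condor, or LSF.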

DAGMan: Directed Acyclic Graph Manager
- DAGMan lets you specify the dependencies between your Condor jobs, so it can manage them automatically for you (e.g., "don't run job B until job A has completed successfully")
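That dependency is expressed in a small DAG input file; the submit file names here are hypothetical:

    # diamond.dag - B runs only after A completes successfully
    JOB A a.submit
    JOB B b.submit
    PARENT A CHILD B

The DAG is submitted with "condor_submit_dag diamond.dag", which runs DAGMan itself as a Condor job that submits A and B at the right times.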

Data management (Workshop Module 3)

Data management services provide the mechanisms to find, move, and share data
- GridFTP: fast, flexible, secure, ubiquitous data transport; often embedded in higher-level services
- RFT: reliable file transfer service built on GridFTP
- Replica Location Service: tracks multiple copies of data for speed and reliability
- Storage Resource Manager: manages storage space allocation, aggregation, and transfer
- Metadata management services are still evolving
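As a concrete GridFTP example, a transfer with the standard globus-url-copy client might look like the following; the host and paths are placeholders:

    # Pull a file from a site's GridFTP server to local disk,
    # with verbose per-transfer progress (-vb)
    globus-url-copy -vb \
        gsiftp://gridftp.site1.example.org/data/run042.dat \
        file:///scratch/run042.dat

The same client can also drive third-party transfers between two remote GridFTP servers, which is how higher-level services like RFT move data.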

Grids replicate data files for faster access
- Effective use of grid resources: more parallelism
- Each logical file can have multiple physical copies
- Avoids single points of failure
- Replication can be manual or automatic; automatic replication considers the demand for a file, transfer bandwidth, etc.
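A sketch of how the logical-to-physical mapping might be managed with the GT4 Replica Location Service client; the server URL and file names are placeholders, and the exact syntax should be checked against your RLS version:

    # Register a logical-to-physical mapping in the local replica catalog
    globus-rls-cli create run042.dat \
        gsiftp://gridftp.site1.example.org/data/run042.dat \
        rls://rls.example.org

    # Look up all physical replicas of the logical file
    globus-rls-cli query lrc lfn run042.dat rls://rls.example.org

A client or broker can then pick the "closest" or least-loaded replica from the returned list.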

Grid security (Workshop Module 4)

Grid security is a crucial component
- The problems being solved may be sensitive
- Resources are typically valuable
- Resources live in distinct administrative domains; each resource has its own policies, procedures, security mechanisms, etc.
- The implementation must be broadly available and applicable: standard, well-tested, well-understood protocols, integrated with a wide variety of tools
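In the Grid Security Infrastructure (GSI), this is handled with short-lived proxy certificates derived from a user's long-lived grid certificate; a typical session with the standard GSI command-line tools looks like this:

    # Create a short-lived proxy credential from your grid certificate
    # (prompts for your private-key passphrase; default lifetime is ~12 hours)
    grid-proxy-init

    # Inspect the proxy: subject, issuer, time remaining
    grid-proxy-info

    # Destroy the proxy when finished
    grid-proxy-destroy

Tools such as globus-job-run and globus-url-copy then authenticate with the proxy, so the private key itself never travels across sites.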

National grid cyberinfrastructure (Workshop Module 5)

Open Science Grid: a diverse job mix
- 50 sites (15,000 CPUs) and growing
- 400 to more than 1,000 concurrent jobs
- Many applications plus CS experiments, including long-running production operations
- Up since October

TeraGrid provides vast resources via a number of huge computing facilities.

To use a grid efficiently, you must locate and monitor its resources:
- Check the availability of different grid sites
- Discover different grid services
- Check the status of "jobs"
- Make better scheduling decisions with information maintained on the "health" of sites
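With Condor-based middleware, for example, the standard client commands cover the last two points; the job ID shown is hypothetical:

    # List the resources (slots) the pool knows about, with their state
    condor_status

    # Show the status of your queued and running jobs
    condor_q

    # Explain why a particular job is still idle (matching diagnostics)
    condor_q -analyze 1234.0

Grid-level information services aggregate the same kind of data across many sites.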

Virtual organization (VO) concept
- A VO for each application or workload
- Carve out and configure resources for a particular use and set of users

OSG Resource Selection Service: VORS

Grid workflow (Workshop Module 6)

A typical workflow pattern in image analysis runs many filtering apps. Workflow courtesy James Dobson, Dartmouth Brain Imaging Center.

Swift scripts: parallelism via foreach

    type imagefile;

    // Name outputs based on inputs: flip one image by shelling out
    // to ImageMagick's convert (arguments are representative; the
    // transcript preserved only the command name)
    (imagefile output) flip(imagefile input) {
      app { convert "-rotate" "180" @input @output; }
    }

    imagefile observations[];  // input dataset (mapper spec lost in transcription)
    imagefile flipped[];       // output dataset

    // Process all dataset members in parallel
    foreach obs, i in observations {
      flipped[i] = flip(obs);
    }

Conclusion: why grids?
- New approaches to inquiry based on deep analysis of huge quantities of data, interdisciplinary collaboration, large-scale simulation and analysis, and smart instrumentation
- The ability to dynamically assemble the resources to tackle a new scale of problem
- Enabled by access to resources and services without regard for location and other barriers

Based on: Grid Intro and Fundamentals Review, Dr. Gabrielle Allen, Center for Computation & Technology, Department of Computer Science, Louisiana State University.