TeraGrid Information Services: Building on Globus MDS4


1 TeraGrid Information Services: Building on Globus MDS4
NSF TeraGrid Review January 10, 2006
John-Paul “JP” Navarro, TeraGrid Grid Infrastructure Group “GIG” Area Co-Director for Software Integration
University of Chicago, Argonne National Laboratory
GlobusWorld May 13, 2008
Good afternoon, I’m JP Navarro. I’m a member of the TeraGrid’s Grid Infrastructure Group, where I focus on Software Integration and Information Services activities. This talk will introduce the TG’s information services motivation, goals, architecture, current capabilities, and plans, and how these build on Globus MDS4.
Charlie Catlett

2 What is the TeraGrid?
1 NSF funded facility with 11 resource providers and several other partners
[Map of TeraGrid sites: Grid Infrastructure Group (UChicago), UW, UC/ANL, PSC, NCAR, PU, NCSA, UNC/RENCI, Caltech, IU, ORNL, Tennessee, USC/ISI, SDSC, LSU, TACC. Legend: Resource Provider (RP), Software Integration Partner, Network Hub]

3 What is the TeraGrid?
Operating a coordinated high-performance compute, network, storage, and visualization infrastructure
[Map of TeraGrid sites and legend, repeated from slide 2]

4 What is the TeraGrid?
To enable Open Scientific Discovery
[Map of TeraGrid sites and legend, repeated from slide 2]

5 One facility in what sense?
Common People Interfaces, examples:
  Request allocations & accounts (“POPS”)
  Identify available resources
  Learn how to use resources
  Find resource status
  Ask for help
  Community events
[Diagram labels: Proposal System “POPS”, User Portal, User Documentation & Knowledge Base, Helpdesk, Events]
What does the TeraGrid provide as a Grid that is different from what independent facilities would offer? First, we’re providing COMMON interfaces intended for PEOPLE:
  POPS: single interface to request allocations on any or all resources (“roaming”)
  User Portal: standard user-aware interface
  User Documentation: public documentation and information

6 (Tera)Grid in what sense?
Coordinated/Standard (User) Software Interfaces:
  Coordinated software
  Coordinated development & runtime
  Unix environment
  Standardized (Grid) service interfaces
[Diagram, two columns]
Grid Services:
  Remote Login (GSI-OpenSSH)
  Data Movement (GridFTP, RFT)
  Data Management (SRB)
  Remote Execution (GRAM)
  Information/Discovery (MDS4, Tomcat, Apache2)
Coordinated Software:
  Unix Environment
  Grid clients & tools
  Development tools & languages
  Communication, Data, and Math tools & libraries
What else does the TeraGrid provide as a Grid that is different from what independent facilities would offer? The second major thing we provide is coordinated/standard software interfaces.

7 Challenges
Integrating many more types of infrastructure resources,
Including more service providers,
Providing more types of services,
With more resource, provider, and service diversity,
Supporting specialization.
How can users keep it all straight?
The TG started as 4 institutions running the same hardware, OS, and software. Since then we’ve diversified to 11 resource providers, about a dozen platforms (Linux and vendor Unix), 20 machines, and both traditional and community/gateway users, with an expanding variety of service types, service providers, and specialization amongst the service providers.

8 Meeting the Challenge 1 Re-architect CTSS
Took Coordinated TeraGrid Software and Services “CTSS” v3:
  Monolithic coordinated capabilities
  In production since mid 2006
Re-architected into CTSS v4 “capability kits”:
  Small required core integration kit with minimum needed to integrate resources
  Plus ~10 optional user capability kits (at least 1 required)
[Diagram: CTSS v3 (with exceptions) becomes CTSS v4: TeraGrid Core Integration plus the kits Remote Login, Data Movement, Data Management, Application Development & Runtime, Science Workflow, Remote Compute, Wide Area GPFS, Parallel Application (MPI), Visualization]
To address this challenge the TeraGrid took a two-pronged approach. First we took the Coordinated TeraGrid Software and Services

9 Meeting the Challenge 2 Information Services
Information Services Vision:
  Create a coordinated way for TeraGrid participants to publish information about the services they offer,
  Create a way for the TeraGrid to aggregate and index the information from TeraGrid participants, and to publish this information to the public in a form that can easily be used by user software, user interfaces, and TeraGrid service providers themselves to discover capabilities and how to access them.
Our motivating vision: major improvements to how TeraGrid Service Providers communicate information about their service offerings to the User Community.

10 Information Services Design Goals
Applies Grid concepts to original information publishing:
  Publishing is the responsibility of the information owner
  Publishing is done using standard (content) schemas
  Publishing is done through standard interfaces regardless of content and where the data comes from
  Publishing services should be available globally (subject to authentication/authorization)
  Information owners publish to EVERYONE, not just the TeraGrid
  Publishing is a grid service
Applies Grid concepts to aggregated information publishing:
  Aggregation uses standard information services interfaces to retrieve information
  Publishing aggregated information is done exactly like original information publishing
  This is how a collaboration, such as the TeraGrid, aggregates participant information
Applies Grid concepts to querying information:
  Querying can use standard interfaces regardless of content
  Querying can use standard interfaces for original or aggregated information
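The design goals above say that aggregated information is published exactly like original information, and that clients query both the same way. The following is a minimal Python sketch of that idea, not the MDS4 interface: the `InformationService` protocol, the class names, the `kind` and `endpoint` record fields, and the example hosts are all hypothetical.

```python
from typing import Protocol


class InformationService(Protocol):
    """Hypothetical common interface: anything that can answer a query."""

    def query(self, kind: str) -> list[dict]:
        ...


class ProviderService:
    """Publishes information owned by a single service provider."""

    def __init__(self, owner: str, records: list[dict]) -> None:
        self.owner = owner
        self._records = records

    def query(self, kind: str) -> list[dict]:
        # Tag each matching record with its owner so the origin survives aggregation.
        return [dict(r, owner=self.owner) for r in self._records if r["kind"] == kind]


class AggregateService:
    """Retrieves from other services and publishes through the same interface."""

    def __init__(self, sources: list[InformationService]) -> None:
        self._sources = sources

    def query(self, kind: str) -> list[dict]:
        results: list[dict] = []
        for source in self._sources:
            results.extend(source.query(kind))
        return results


def endpoints(service: InformationService, kind: str) -> list[str]:
    """Client code is identical for original and aggregated information."""
    return sorted(r["endpoint"] for r in service.query(kind))


# Hypothetical example data: two providers and one collaboration-wide aggregate.
site_a = ProviderService("site-a", [{"kind": "gridftp", "endpoint": "gsiftp://a.example.org:2811"}])
site_b = ProviderService("site-b", [{"kind": "gridftp", "endpoint": "gsiftp://b.example.org:2811"}])
central = AggregateService([site_a, site_b])
```

Because the aggregate exposes the same `query` method as a provider, `endpoints(site_a, "gridftp")` and `endpoints(central, "gridftp")` use the same client code, and an aggregate can itself be a source for another aggregate.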

11 High-Level Architecture
[Architecture diagram]
TeraGrid Wide Information Services: Apache 2.0 (WS/REST, HTTP GET clients), Cache, Tomcat WebMDS, WS MDS4 (WS/SOAP clients), TeraGrid Wide Repositories
Service Provider Information Services: WS MDS4 (WS/SOAP clients), Adapter, Local Info

12 Information Services Tooling
WS/SOAP (Globus 4.0.x MDS4)
  Benefits
  Indexing, Trigger Registration, Publish, Subscribe
  Security/Authorization
  Robust WSRF interface
  Content XML
WS/* (Tomcat 5.0, Apache 2.0)
  Very common web services platform
  Supports several web service interfaces (including simple)
  Supports multiple styles like REST, Web 2.0
  Can be highly scalable
  Many formats: HTML, XHTML/XML, XML, RSS/Atom, …
WebMDS (Globus 4.0.x)
  Live MDS4 content access
  XPath support
  XSLT transforms
  Many formats: HTML, XHTML/XML, XML, RSS/Atom
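The tooling above lists XPath support for selecting parts of XML content. As a rough client-side analogue, the sketch below applies an XPath-style selection to an XML document with Python's standard library. The `Services`, `Service`, and `Endpoint` element names, the `type` and `resource` attributes, and the hosts are invented for illustration; they are not the TeraGrid or MDS4 schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; a real one would be fetched from an information service.
SAMPLE = """\
<Services>
  <Service type="gridftp" resource="site-a">
    <Endpoint>gsiftp://a.example.org:2811</Endpoint>
  </Service>
  <Service type="gram" resource="site-a">
    <Endpoint>https://a.example.org:8443/gram</Endpoint>
  </Service>
  <Service type="gridftp" resource="site-b">
    <Endpoint>gsiftp://b.example.org:2811</Endpoint>
  </Service>
</Services>
"""


def endpoints_by_type(xml_text: str, service_type: str) -> list[str]:
    """Return the endpoint text of every Service element with the given type."""
    root = ET.fromstring(xml_text)
    # ElementTree implements a limited XPath subset, including [@attr='value'].
    matches = root.findall(f".//Service[@type='{service_type}']/Endpoint")
    return [m.text.strip() for m in matches if m.text]
```

Calling `endpoints_by_type(SAMPLE, "gridftp")` selects the two GridFTP endpoints in document order. A full XPath or XSLT engine would be needed for expressions beyond the subset ElementTree supports.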

13 Service Provider vs TG Wide Services
Content:
  Locally owned and published information
  Can come from existing local systems
Services:
  1 general purpose MDS service
  1 remote execution MDS services

14 Service Provider vs TG Wide Services
Content:
  Aggregate/index service provider information
  Plus central information (TeraGrid databases)
  Cached
  Authenticated registrations
Services:
  Several redundant servers (>99.5% availability)
  Each server:
    Information caching programs
    MDS4 index services (WS/SOAP)
    WebMDS/Tomcat, Apache 2.0, … services (WS/REST)
  Services publish in:
    HTML
    XML
    CSV
    (Atom, JSON, RSS being prototyped)
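To put the 99.5% availability figure in concrete terms, the short Python sketch below converts an availability fraction into hours of downtime per year. The second function is an illustration only: it assumes server failures are independent and that any one working server is enough, which the slides do not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def max_downtime_hours(availability: float) -> float:
    """Hours per year a service may be down at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def combined_availability(per_server: float, servers: int) -> float:
    """Availability of N redundant servers.

    Assumption (not from the slides): failures are independent and any
    single working server can answer queries.
    """
    return 1.0 - (1.0 - per_server) ** servers
```

At exactly 99.5% availability, `max_downtime_hours(0.995)` works out to about 43.8 hours per year, so the stated target of better than 99.5% corresponds to less downtime than that.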

15 High-Availability Design
[Diagram labels: TeraGrid Wide Information Services; Clients; info.teragrid.org; Service Provider Information Services; (commercial XEN server); info.dyn.teragrid.org; TeraGrid Dynamic DNS; Static paths; Dynamic paths]
This is both a high-availability and high-throughput design
Primary is a commercially hosted XEN image (TeraGrid physical server)
Server failover propagates globally in 15 minutes

16 Caching Design
Goal:
  Information persistence when service providers are down and when central information services go down
MDS configuration:
  Service providers register their existence
  Central MDS do not aggregate automatically
  Central MDS publishes from local file-system cache
Caching programs:
  External and asynchronous to MDS
  Query registered service provider MDSs, and other configured MDSs
  Store query results in file-system
  Replace previously cached data only with new data:
    Not if downstream is down
    Or if downstream returns an error
    Or if downstream doesn’t return valid data
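The replacement rule for the caching programs (keep the old copy unless the downstream query succeeds and returns valid data) can be sketched in Python as follows. This is an illustration under stated assumptions, not the TeraGrid caching program: `refresh_cache` and its `fetch` callback are hypothetical names, and "valid data" is taken here to mean well-formed XML.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from typing import Callable


def refresh_cache(fetch: Callable[[], str], cache_path: str) -> bool:
    """Replace the cached file only when fetch() yields well-formed XML.

    Returns True if the cache was updated, False if the old copy was kept.
    """
    try:
        payload = fetch()        # raises if the downstream is down or returns an error
        ET.fromstring(payload)   # raises if the response is not well-formed XML
    except (OSError, ET.ParseError, TypeError, ValueError):
        return False

    # Write to a temporary file in the same directory, then rename it into
    # place, so readers of the cache never observe a partially written file.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(payload)
        os.replace(tmp_path, cache_path)
    except OSError:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return False
    return True
```

In this sketch `fetch` would wrap whatever query the caching program issues against a registered MDS. Because a failed refresh returns without touching `cache_path`, the central service keeps publishing the last good copy while a provider is down.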

17 Primary MDS4 features used
Registration:
  Parallel upstream registration to multiple central servers
  Authenticated/authorized upstream registration
Aggregation:
  Some information is automatically aggregated (WS services registration and default GLUE)
Information providers:
  Useful RP custom information providers
  Scheduling load and queue contents (User Portal)
  CTSS 4 Capability kit registration
  TeraGrid ID and description cross-reference (TGCDB)
Secure MDS:
  Authenticated/authorized queries (non-anonymous)
  Multiple index services in same container
  DefaultIndexService AND SecureIndexService
WebMDS:
  XSLT, XPath
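Parallel upstream registration to multiple central servers can be illustrated with a small Python sketch. It does not use the MDS4 registration mechanism: `register_everywhere` is a hypothetical helper, and the `register` callback stands in for whatever authenticated registration call a provider would actually make.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def register_everywhere(
    servers: list[str],
    register: Callable[[str], None],
) -> dict[str, bool]:
    """Call register(server) for every central server in parallel.

    Returns a map of server -> True if that registration succeeded. A failure
    at one central server does not prevent registration at the others.
    """

    def attempt(server: str) -> bool:
        try:
            register(server)
            return True
        except Exception:
            return False

    if not servers:
        return {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        outcomes = list(pool.map(attempt, servers))
    return dict(zip(servers, outcomes))
```

Registering with every central server independently means each redundant server learns about the provider even when one of its peers is unreachable.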

18 Information Services Users
[Diagram: consumers of info.teragrid.org: User Documentation, User Portal, Inca Testing Harness, Gateways, Peer Grids, User Applications]
We are eating our own dogfood, publishing for internal consumption. Users and developers like yourself are also our target.

19 CTSS Capability Kit Availability

20 Where are the GridFTP services?

21 Queue Contents in User Portal

22 Inca Capability Kit Testing

23 CTSS 4 Capability Kits
For each capability kit on each resource:
  Current support level, and target support level
    Development, Testing, Production
  Support organization and contact
  Inca status URL
Multiple versions of a kit with different support levels
Generic capabilities, web and non-web, local, central, peer grid…

24 CTSS 4 Capability Kit Software
For each kit software component on each resource:
  Name, version, how to access it
Multiple versions of a single component

25 CTSS 4 Capability Kit Services
For each kit service on each resource:
  Name, type, version, and Endpoint (contact location)
  GSI OpenSSH, GridFTP, SRB servers, PreWS & WS GRAM, MDS4
Multiple services of the same type
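The kit registration content described on the last three slides (support level per kit per resource, plus name, type, version, and endpoint per service) maps naturally onto a small record structure. The Python sketch below is one possible shape and is not the published TeraGrid schema: the class names, field names, and example values are assumptions. It also shows how a question such as "Where are the GridFTP services?" becomes a simple filter.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KitService:
    """One service offered by a capability kit (hypothetical field names)."""
    name: str
    type: str
    version: str
    endpoint: str  # contact location


@dataclass
class KitRegistration:
    """One capability kit registered on one resource."""
    resource: str
    kit: str
    support_level: str  # "development", "testing", or "production"
    services: list[KitService] = field(default_factory=list)


def find_services(
    registrations: list[KitRegistration],
    service_type: str,
    support_level: str = "production",
) -> list[tuple[str, str]]:
    """Return (resource, endpoint) pairs for services of the requested type."""
    return [
        (reg.resource, svc.endpoint)
        for reg in registrations
        if reg.support_level == support_level
        for svc in reg.services
        if svc.type == service_type
    ]
```

Keeping services in a list per registration allows multiple services of the same type on a resource, and separate registrations per kit version allow different support levels to coexist.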

26 What’s in Development?
Expanded content:
  TeraGrid Gateways list
  Extended GridFTP service information (striping, bandwidth, etc.)
  Local HPC Software
  (Meta)Scheduling support information
  Information Services Metadata
Enhanced caching, aggregation, and storage
Improved information access:
  tginfo, universal command line query tool
  WS/REST, Web 2.0 style information access
  Multiple formats: CSV TEXT, XML, JSON, RSS/Atom, …
GLUE 2.0
Usage metrics
Coordinated software and services -> CTSS
Local (uncoordinated) HPC software

27 The Team
Information Services design, development, and support team:
  Eric Blau
  Jason Brechin
  Lee Liming
  JP Navarro
  Laura Pearlman
Information Services downstream developers and consumers:
  End-to-end data transfer optimization PSC (Derek Simmel, others)
  INCA (Kate Ericson)
  Operations (Jason Brechin, Tony Rimovsky)
  User Documentation (Mike Dwyer, Diana Diehl)
  User Portal (Maytal Dahan)
  Other lurkers…

28 More Information
Information Services main page: (links to content and documentation)
User Documentation (CTSS 4 kits, software, services):
User Portal (scheduler load & queue contents):
Inca Monitoring Framework

