
1 www.geongrid.org CYBERINFRASTRUCTURE FOR THE GEOSCIENCES
GEONgrid Current Status
Karan Bhatia, SDSC
GRID Center: GRIDS Community Workshop, June 23-24, 2005, Chicago, IL

2 Current Status
Middle of the second year of a 5-year ITR
16 partners spread across the US; additional partners (including international sites) likely to be added
Hardware fully deployed and operating as a grid system (see initial performance results)
Version 1.0 of the software stack developed and deployed
Software management system deployed and working
Production services at SDSC; others actively developing software

3 Design Principles
CI: support the "day to day" conduct of science (e-science), in addition to "hero" computations: the "two-tier" approach
Use best practices, including commercial tools and open standards, where applicable, while developing advanced technology and doing CS research
An equal partnership: IT works in close conjunction with science
Create shared "science infrastructure": integrated online databases with advanced search and query engines; online models, robust tools, and applications
Leverage other intersecting projects: much commonality in the technologies, regardless of science discipline, e.g. BIRN, SEEK, and many others

4 Design Principles: empower vs. control
From "Revolutionizing Science and Engineering Through Cyberinfrastructure", the report of the NSF Blue Ribbon Advisory Panel on Cyberinfrastructure (a.k.a. the Atkins report): "that facilitates the development of new applications, allows applications to interoperate across institutions and disciplines, insures that data and software acquired at great expense are preserved and easily available, and empowers enhanced collaboration over distance, time and disciplines".
From "The Anatomy of the Grid", Foster, Kesselman, Tuecke: "flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources" that is "highly controlled ... with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs."

5 System Deployment
Standard reference systems for the GEON PoP (point of presence): Dell 2650s; additional resources can be attached to the PoP
Software deployment: centralized software stack definition, with locally controlled extensions
Application development and integration: centralized web-based portal for access to core resources; local portals provide customization into users' home environments
Security: centralized user account policies; locally defined "non-grid" user policies

6 Technical Areas
GEONgrid systems and portals
  Balancing coordination and centralization with distribution and autonomy, a CI issue
  Examples: common GEON "software stack" with local customizations; PIs responsible for different software packages via the "Central" mechanism; GEON "reference" portal bundled as part of the standard distribution, with a standard collection of portlets; satellite portals at PI/partner sites, with customized portlets
Ontology-based data registration, data integration, and ontology-based "smart search"
  Geo-ontology development; registering data to ontologies; Data Integration Carts©: ad hoc ("on the fly") integration using ontologies
Map integration
  Presenting geoscience information on GIS layers is useful and intuitive; extending map integration based on shapefiles to ... knowledge-based integration of Web Mapping Services
Scientific workflows: Kepler
  A visual scientific data analysis and modeling environment; mineral classification, gravity modeling, LiDAR data ingestion and data analysis pipelines
Grid-enabled high-performance computing applications
  E.g. SYNSEIS: submit a job to TeraGrid from the GEON Portal, using GSI authentication (see the sketch after this slide)
Visualization
  "GEON Browser", viz services at the GEON portal, augmented-reality displays
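To make the portal-to-TeraGrid path concrete, here is a minimal sketch of a GSI-authenticated GRAM job submission of the kind a SYNSEIS portlet might perform, assuming the Java CoG Kit's GramJob API; the gatekeeper contact, executable path, and arguments are hypothetical, not taken from the deck.

    import org.globus.gram.GramException;
    import org.globus.gram.GramJob;
    import org.ietf.jgss.GSSException;

    public class SubmitSynseisJob {
        public static void main(String[] args) throws GramException, GSSException {
            // RSL describing a (hypothetical) SYNSEIS run on TeraGrid compute nodes
            String rsl = "&(executable=/usr/local/apps/synseis/run.sh)"
                       + "(arguments=event.cfg)(count=4)(jobType=mpi)";

            // Hypothetical TeraGrid gatekeeper contact; the PBS jobmanager queues the run
            String contact = "tg-login.sdsc.teragrid.org/jobmanager-pbs";

            GramJob job = new GramJob(rsl);
            // Uses the default GSI proxy credential available in the calling environment
            job.request(contact);
            System.out.println("Job request submitted to " + contact);
        }
    }

In the portal setting, the proxy credential would be the one delegated to the user's portal session rather than a locally generated one.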

7 Conclusions, Challenges, etc.
Social challenges
  More communication with partners needed
  Clear delineation of responsibilities regarding system administration of partner resources
  Clear communication of how to effectively utilize the local GEON resources: why do partners need to host GEON machines? What do the partners do with the hosted machines?
Technical challenges
  Software stack is very system/developer oriented; not much geoscience-specific software included
  Developing tools for effective management of distributed resources
  Integration with additional non-dedicated resources at partner sites, including TeraGrid integration
  Software stability: current grid software is unstable, incomplete, difficult to deploy, and difficult to use
  Not much guidance on how to build a production-oriented SOA

8 Outline of Talk
Design Principles and Overview
☛ Hardware Deployment
Software Stack
Software Development Activities

9 Hardware Resources
Partners: Arizona State University, Bryn Mawr College, DLESE (Digital Library for Earth System Education), Energy and Geoscience Institute (EGI), University of Utah, Penn State University, Rice University, San Diego State University, San Diego Supercomputer Center, UNAVCO, Inc., University of Arizona, University of Idaho, University of Missouri, University of Texas at El Paso, University of Utah, and Virginia Tech. Industrial partners include ESRI.
Additional sites considering partnership: University of California Davis, Columbia (SESAR), University of Hyderabad (India), and Japan's National Institute of Advanced Industrial Science and Technology (AIST).
GEON-dedicated hardware
◆ 15 GEON PoPs (one per partner site)
◆ 3 GEON data nodes (Idaho, ASU, and SDSC)
◆ 3 small 4-node clusters (ASU, UTEP, Missouri)
◆ 1 medium HP Itanium-based 9-node cluster (SDSC)
◆ 11-node development cluster (SDSC)
◆ 2 production servers and 1 beta operations server (SDSC)
External resources
◆ 30,000 SUs (service units) for use on the TeraGrid cluster
◆ 12 TB of online SAN disk space and 10 TB of additional tape archive at SDSC's production data facility

10 Point-of-Presence (PoP)
Each partner runs a PoP node: a Dell 2650, dual Pentium 4, rack-mounted server system
Purpose:
1. provide a development platform for partners
2. provide local users with a customized view of GEON resources
3. provide a "gateway" for accessing additional local resources
[Diagram: a partner PoP running the GEON software stack connects local developers, local users, and additional local resources to the GEON system]

11 Local Resources (example)
ASU: 8 Beowulf clusters (up to 1024 processors); 48-processor Cray XD1; dedicated 10 Gbit fiber across campus, CENIC and National LambdaRail; ASU Decision Theater, an immersive visualization environment with 270-degree projection; 0.6 PB tape and up to 50 TB SAN storage
UTEP: 128-node cluster (soon); 36-node (dual-core) Cray XD1; 16-processor IBM SMP; SunFire V880 SMP; 5 TB disk storage (to grow)
SDSC: DataStar, TeraGrid (very high-end computing); CalIT2/SDSC Synthesis Center (visualization), large tiled displays, 3-D capabilities; 12 TB SAN, Storage Resource Broker

12 (no text captured for this slide; likely a figure)

13 Outline of Talk
Design Principles and Overview
Hardware Deployment
☛ Software Stack
Software Development Activities

14 GEON Software Stack
Software stack components:
GridSphere 2.0.2 (portal framework)
NWS 2.10.1
INCA 10.2
Condor (Rocks roll), PBS (Rocks roll)
NMI Globus 5.1
OGSA 3.2.1 & Axis 1.1
OGSA-DAI 4.0
Tomcat 5.0.28
Postgres 7.4.2
PostGIS 0.9 (see query sketch below)
Proj 4.4.8, Geos 2.0.1
JDK 1.4.2_04
Apache Ant 1.6.2
Samba 3.11
Tripwire
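As an illustration of how services on the stack use the bundled Postgres/PostGIS layer, the following is a minimal JDBC sketch against a hypothetical spatial table; the database, table, column names, and credentials are invented, and the older non-ST_-prefixed PostGIS functions are assumed for a 0.9-era installation.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class PopSpatialQuery {
        public static void main(String[] args) throws Exception {
            // Load the Postgres JDBC driver (required on JDK 1.4-era setups)
            Class.forName("org.postgresql.Driver");

            // Connect to the PoP's bundled Postgres instance (hypothetical database/user)
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/geon", "geon", "secret");
            Statement stmt = conn.createStatement();

            // Hypothetical table of registered datasets with a PostGIS geometry column;
            // AsText() renders the footprint as WKT (pre-ST_ naming assumed for PostGIS 0.9)
            ResultSet rs = stmt.executeQuery(
                    "SELECT name, AsText(footprint) FROM registered_datasets LIMIT 10");
            while (rs.next()) {
                System.out.println(rs.getString(1) + " : " + rs.getString(2));
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }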

15 Local Extensions
Partner sites can extend the software stack with additional software packaged as "rolls"
"Rolls" can be used by any partner

16 (no text captured for this slide; likely a figure)

17 Network Weather Service

18 INCA/Grasp Grid Benchmarks
100 MB file transfer rings (see the timing sketch below):
SDSC -> UTEP -> ASU -> DLESE -> SDSC
Chronos -> Rice -> UAZ -> Chronos
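A minimal sketch of how such a transfer ring could be timed from a driver program, assuming the endpoints run GridFTP, a valid GSI proxy is in place, and globus-url-copy is on the path; the hostnames and file paths are placeholders, not the actual INCA/Grasp harness used by the project.

    import java.util.Arrays;
    import java.util.List;

    public class TransferRing {
        public static void main(String[] args) throws Exception {
            // Placeholder GridFTP URLs for the 100 MB test file at each hop
            List<String> hops = Arrays.asList(
                    "gsiftp://geon-pop.sdsc.edu/tmp/test100mb.dat",
                    "gsiftp://geon-pop.utep.edu/tmp/test100mb.dat",
                    "gsiftp://geon-pop.asu.edu/tmp/test100mb.dat",
                    "gsiftp://geon-pop.dlese.org/tmp/test100mb.dat",
                    "gsiftp://geon-pop.sdsc.edu/tmp/test100mb.dat");

            for (int i = 0; i + 1 < hops.size(); i++) {
                String src = hops.get(i);
                String dst = hops.get(i + 1);
                long start = System.currentTimeMillis();
                // Copy between the two GridFTP endpoints via the globus-url-copy client
                Process p = new ProcessBuilder("globus-url-copy", src, dst)
                        .inheritIO().start();
                if (p.waitFor() != 0) {
                    System.err.println("Transfer failed: " + src + " -> " + dst);
                    continue;
                }
                double seconds = (System.currentTimeMillis() - start) / 1000.0;
                System.out.printf("%s -> %s : %.1f s (%.1f MB/s)%n",
                        src, dst, seconds, 100.0 / seconds);
            }
        }
    }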

19 Performance, Failures, etc.
Few hardware failures (so far)
Very important to understand the system in order to develop algorithms for data management, service failover, etc.
Working with Henri Casanova, Allan Snavely, and Shava Smallen (SDSC GRAIL lab) on benchmarking
Working with Keith Marzullo (UCSD)

20 Outline of Talk
Design Principles and Overview
Hardware Deployment
Software Stack
☛ Software Development Activities

21 Partner Portals
Why do the partners need a custom portal? (see the portlet sketch below)
Local branding
Customized access to GEON resources
Access to local information/services
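Partner portals built on GridSphere 2.x are assembled from standard JSR-168 portlets; below is a minimal sketch of the kind of customized portlet a partner site might add. The class name and the rendered content are purely illustrative.

    import java.io.IOException;
    import java.io.PrintWriter;
    import javax.portlet.GenericPortlet;
    import javax.portlet.PortletException;
    import javax.portlet.RenderRequest;
    import javax.portlet.RenderResponse;

    // A JSR-168 portlet a partner site could deploy into its local GridSphere portal
    public class LocalResourcesPortlet extends GenericPortlet {
        protected void doView(RenderRequest request, RenderResponse response)
                throws PortletException, IOException {
            response.setContentType("text/html");
            PrintWriter out = response.getWriter();
            // Illustrative local content; a real portlet would query local services
            out.println("<h3>Local GEON Resources</h3>");
            out.println("<ul><li>Local cluster queue status</li>"
                      + "<li>Partner-hosted datasets</li></ul>");
        }
    }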

22 GAMA
Web service-based credential management system with a set of GridSphere portlets (http://grid-devel.sdsc.edu/gama)
Wraps CACL (CA), MyProxy (certificate repository), and CAS (?)
Looking at NAREGI CA (AIST) & LDAP
Supports partner portals (see the credential retrieval sketch below)
Grid and local accounts supported; grid account import (sync) across sites; cluster account integration
GEON, Telescience, BIRN, OptIPuter, TeraGrid, UK eScience, Harvard CrimsonGrid, PRAGMA
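Since GAMA fronts a MyProxy repository, a portal-side service ultimately retrieves a short-lived proxy credential on the user's behalf. Here is a minimal sketch assuming the Java CoG Kit's MyProxy client API; the host name, account, and passphrase are invented for illustration, and the real GAMA flow goes through its web service rather than a direct client call.

    import org.globus.myproxy.MyProxy;
    import org.ietf.jgss.GSSCredential;

    public class FetchProxy {
        public static void main(String[] args) throws Exception {
            // Hypothetical GAMA/MyProxy back-end host on the standard MyProxy port
            MyProxy myproxy = new MyProxy("gama.sdsc.edu", 7512);

            // Retrieve a 12-hour delegated proxy for a (hypothetical) portal account
            GSSCredential proxy = myproxy.get("geonuser", "pass-phrase", 12 * 3600);

            System.out.println("Got proxy for " + proxy.getName()
                    + ", lifetime (s): " + proxy.getRemainingLifetime());
        }
    }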

23 Partner Centrals
Problem: a single software stack is not sufficient for the diverse needs of partners, and the GEON systems group has no expertise in software X or program Y
A GEON partner can package and extend the software stack in a standard way; the packaged software can be used by any other GEON partner
ASU and UTEP already have centrals up and running, with GMT, GRASS, and other geo-specific software packaged

24 Conclusions, Challenges, etc.
Social challenges
  More communication with partners needed
  Clear delineation of responsibilities regarding system administration of partner resources
  Clear communication of how to effectively utilize the local GEON resources: why do partners need to host GEON machines? What do the partners do with the hosted machines?
Technical challenges
  Software stack is very system/developer oriented; not much geoscience-specific software included
  Developing tools for effective management of distributed resources
  Integration with additional non-dedicated resources at partner sites, including TeraGrid integration
  Software stability: current grid software is unstable, incomplete, difficult to deploy, and difficult to use
  Not much guidance on how to build a production-oriented SOA

