1 Page 1 LAITS Laboratory for Advanced Information Technology and Standards Duh 7/10/03
The GMU Geospatial Grid Technology Development and Application Project
Liping Di, Laboratory for Advanced Information Technology and Standards (LAITS), George Mason University

2 Overall Objectives
Develop geospatial extensions of Grid technology to make it geospatially enabled.
Develop virtual geospatial data and information services in the Grid environment.
Demonstrate the geospatial Grid technology in the Earth Observation (EO) environment at NASA data pools.
Contribute the technology, software, and data pool application to the CEOS Grid testbed.

3 The Grid Technology
Grid technology was developed for securely sharing computational resources within a virtual organization:
–Computer CPU cycles
–Storage
–Networks
–Data, information, algorithms, software, services
It was originally motivated and supported by science and engineering disciplines that require high-end computing and need to share geographically distributed high-end computing resources.
The core of the technology is the open source middleware called the Globus Toolkit.
–The latest version of Globus is version 3.0, which implements the Open Grid Services Architecture (OGSA).

4 The Grid Architecture

5 Why is Grid useful to the EO community?
The Earth observation community is one of the key communities collecting, managing, processing, archiving, and distributing geospatial data and information.
Because of the large volumes of EO data and the geographically scattered receiving and processing facilities, EO data and the associated computational resources are naturally distributed.
The multidisciplinary nature of global change research and remote sensing applications requires integrated analysis of huge volumes of multi-source data from multiple data centers.
This requires sharing both data and computing power among data centers.
Therefore, Grid is an ideal technology for the EO community.

6 Why are geospatial extensions of Grid needed?
Geospatial data and information are significantly different from those in other disciplines.
–Very complex and diverse: formats, projections, resolutions; hyper-dimensions (spatial, temporal, spectral, thematic); raster vs. vector.
–Large data volume: more than 80% of the data human beings have collected is spatial data.
The geospatial community has developed a set of standards specifically for geospatial data and information that users are familiar with (e.g., OGC, ISO, FGDC).
Grid technology was developed for general sharing of computational resources and is not aware of the special characteristics of geospatial data.
To make Grid technology applicable to geospatial data, we have to develop geospatial domain-specific extensions.

7 Areas of Extensions
Internally, the Grid has to be spatially aware.
–Extend the Globus Toolkit to handle spatial data and information management based on spatial, spectral, temporal, and thematic attributes.
–Develop enough Grid-enabled tools for geospatial data handling/services.
The Grid must provide data/information access and service interfaces that are standard in the geospatial community.
–The Open GIS Consortium's web data access/service interfaces (e.g., OGC WCS, WMS, WFS, and WRS).

8 The OGC Web Service Specifications
The Web Coverage Service (WCS) specification defines the standard interfaces between web-based clients and servers for accessing coverage data.
–All imagery-type remote sensing data is coverage data.
The Web Feature Service (WFS) specification defines the standard interfaces between web-based clients and servers for accessing feature-based geospatial data.
–Vector and point data are feature data.
The Web Map Service (WMS) specification defines the standard interfaces for accessing and assembling maps from multiple servers.
–Visualization of geospatial data.
The Web Registry Service (WRS) specification defines the interfaces between web-based clients and servers for finding the required data or services in registries.
WCS, WFS, WRS, and WMS form the foundation for the interoperable geospatial data access and service environment.
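
To make the WCS interface concrete, here is a minimal sketch of how a client might assemble a GetCoverage request using key-value-pair encoding. It assumes WCS version 1.0.0 (the slides do not name a version). The endpoint URL, coverage name, and output format are hypothetical placeholders, since real values depend on the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_wcs_getcoverage_url(endpoint, coverage, bbox, width, height,
                              crs="EPSG:4326", fmt="GeoTIFF"):
    """Build a WCS 1.0.0 GetCoverage URL (version is an assumption).

    bbox is (min_x, min_y, max_x, max_y) in the units of `crs`.
    `fmt` must be a format the server advertises; "GeoTIFF" is a placeholder.
    """
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and coverage name, for illustration only.
    url = build_wcs_getcoverage_url(
        "http://example.org/wcs", "MODIS_NDVI",
        (-80.0, 35.0, -75.0, 40.0), 500, 500)
    print(url)
    print(parse_qs(urlsplit(url).query)["BBOX"])
```

The client asks for a spatial subset at a chosen grid size instead of downloading whole files, which is the kind of access the WCS bullet describes.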

10 Data Access Sequences in the Data Grid

11 OGC Interfaces to the Geospatial Data Grid
Diagram components: OGC client; WRS server interfaces; WCS/WFS/WMS server interfaces; Metadata Catalog Service (MCS web server, MCS database); Replica Location Service (replica index node, replica catalog); physical storage system/files.
Diagram flow labels: OGC WRS query; MCS query; logical filenames; physical locations; WRS results; OGC data protocols; data; transformed data.
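
The labels in this diagram suggest a two-step lookup: a discovery query is answered from the metadata catalog with logical filenames, and each logical filename is then resolved to physical locations by the replica location service. The sketch below is one plausible reading of that sequence, using in-memory stand-ins. The class names, attribute names, and `gsiftp://` URLs are hypothetical and are not the Globus MCS or RLS APIs.

```python
class MetadataCatalog:
    """Toy stand-in for MCS: maps descriptive attributes to logical filenames."""

    def __init__(self, records):
        # Each record is (logical_filename, attributes_dict).
        self._records = list(records)

    def query(self, **criteria):
        """Return logical filenames whose attributes match every criterion."""
        return [lfn for lfn, attrs in self._records
                if all(attrs.get(k) == v for k, v in criteria.items())]


class ReplicaLocationService:
    """Toy stand-in for RLS: maps a logical filename to physical locations."""

    def __init__(self, replicas):
        self._replicas = {lfn: list(urls) for lfn, urls in replicas.items()}

    def locate(self, lfn):
        return list(self._replicas.get(lfn, []))


def discover(mcs, rls, **criteria):
    """Resolve a discovery query to {logical filename: [physical locations]}."""
    return {lfn: rls.locate(lfn) for lfn in mcs.query(**criteria)}


if __name__ == "__main__":
    mcs = MetadataCatalog([
        ("lfn:modis_ndvi_001", {"sensor": "MODIS", "product": "NDVI"}),
        ("lfn:aster_l1b_007", {"sensor": "ASTER", "product": "L1B"}),
    ])
    rls = ReplicaLocationService({
        "lfn:modis_ndvi_001": [
            "gsiftp://node-a.example.org/pool/modis_ndvi_001.hdf",
            "gsiftp://node-b.example.org/pool/modis_ndvi_001.hdf",
        ],
    })
    print(discover(mcs, rls, sensor="MODIS"))
```

Keeping logical names separate from physical locations is what lets data be replicated or moved without changing what the client asks for.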

12 Virtual datasets
A virtual dataset is a dataset that:
–does not exist in a data and information system, but
–the system knows how to create on demand.
–Once created, a virtual dataset can be kept to fulfill the same request from later users.
The client/data user will not know the difference between a real dataset and a virtual dataset.
Advantages of virtual datasets
A virtual dataset can be produced (materialized) by:
–running a program dedicated to the production of the virtual dataset (dedicated program approach), or
–running a series of service modules, each of which takes care of a small step in the materialization of the virtual dataset (service approach).
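
The on-demand behavior described above can be sketched as a store that returns archived data directly, materializes a virtual dataset the first time it is requested, and keeps the result for later requests. This is an illustrative sketch only. `VirtualDatasetStore` and its "recipe" callables are hypothetical names, and the recipe stands in for either the dedicated-program or the service approach.

```python
class VirtualDatasetStore:
    """Serve real and virtual datasets through one interface."""

    def __init__(self, real, recipes):
        self._real = dict(real)          # name -> data that already exists
        self._recipes = dict(recipes)    # name -> zero-argument callable producing the data
        self._materialized = {}          # virtual datasets kept after first creation
        self.materialize_count = 0       # how many times a recipe actually ran

    def get(self, name):
        """Return the dataset; the caller cannot tell real from virtual."""
        if name in self._real:
            return self._real[name]
        if name not in self._materialized:
            if name not in self._recipes:
                raise KeyError(name)
            self._materialized[name] = self._recipes[name]()
            self.materialize_count += 1
        return self._materialized[name]


if __name__ == "__main__":
    store = VirtualDatasetStore(
        real={"raw": [1, 2, 3]},
        recipes={"doubled": lambda: [2, 4, 6]},
    )
    print(store.get("raw"), store.get("doubled"), store.get("doubled"))
    print("recipe runs:", store.materialize_count)
```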

13 The Service Approach to Virtual Datasets
A service is a self-contained, self-describing, modular application that can be published, located, and dynamically invoked across a network.
–It performs functions, which can be anything from simple requests to complicated business processes.
–Once a service is deployed, other applications (and other services) can discover and invoke the deployed service.
A service can be implemented in the Web environment, where it is called a web service, or in the Grid environment, where it is called a Grid service.
Standards for service discovery, declaration, binding, and invocation allow individual services across a network to be chained together dynamically to fulfill a complex task.
In the service environment, a virtual dataset is essentially a service chain that describes the steps to be taken to produce the virtual dataset.
With enough elementary service modules, it is possible to provide an unlimited number of virtual datasets just by creating service chains.
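
As a rough illustration of "a virtual dataset is a service chain", the sketch below models each elementary service as a function from a raster to a raster and builds a virtual dataset by composing them in order. The service names (`subset_service`, `scale_service`) and the list-of-lists raster are hypothetical simplifications. Real web or Grid services would be invoked over a network.

```python
from functools import reduce
from typing import Callable, List, Sequence, Tuple

Raster = List[List[float]]
Service = Callable[[Raster], Raster]


def subset_service(rows: Tuple[int, int], cols: Tuple[int, int]) -> Service:
    """Return a service that cuts out rows[0]:rows[1] and cols[0]:cols[1]."""
    def run(raster: Raster) -> Raster:
        return [row[cols[0]:cols[1]] for row in raster[rows[0]:rows[1]]]
    return run


def scale_service(factor: float) -> Service:
    """Return a service that multiplies every cell by `factor`."""
    def run(raster: Raster) -> Raster:
        return [[value * factor for value in row] for row in raster]
    return run


def chain(services: Sequence[Service]) -> Service:
    """Compose services so the output of each feeds the next."""
    def run(raster: Raster) -> Raster:
        return reduce(lambda data, service: service(data), services, raster)
    return run


if __name__ == "__main__":
    archived = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    virtual_dataset = chain([subset_service((0, 2), (1, 3)), scale_service(0.5)])
    print(virtual_dataset(archived))
```

New virtual datasets come from new chains rather than new programs, which is the point made in the last bullet.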

14 Geo-object, Geo-tree, Virtual Dataset, Geospatial Models
Diagram labels: archived geo-object; intermediate geo-object; user geo-object; user requested; user obtained; automated data transformation service (WCS/WFS); no service; data service; modeling and virtual data services; geospatial web/Grid services.

15 User Creation of Geospatial Models
A user-requested product may exist neither physically nor virtually.
If the user knows the thought process for creating the data product step by step from lower-level inputs (the logical geospatial modeling):
–With the help of a good user interface and the availability of service modules and models/submodels, the user can construct a geospatial model/virtual data product interactively.
–The system can then produce the virtual data product for the user.
–The user-created model can be incorporated into the system as part of the virtual datasets the system can provide. This allows the system's capabilities to grow with time.
Advantages
–Allows users to obtain ready-to-use scientific information instead of raw data, significantly reducing the data traffic between users and the geospatial Grid.
–Allows users to explore the huge resources available in a data Grid and to conduct tasks that they were never able to conduct before.

16 Research Issues in Virtual Geospatial Data Services
Representation of the geo-tree
–The geo-tree/model description: a language is needed to describe the geo-tree and its subtrees; it gives only a logical and thematic description of the tree; it is not attached to individual physical files and uses virtual data types.
Service module cataloging
–A catalog is needed in MCS to catalog all available service modules (modules are not necessarily in the same system).
–Describes the inputs, outputs, and how to invoke the service.
–Used for constructing the geo-tree, either manually or automatically.
–Used for instantiation of the virtual dataset (to create the workflow).
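
The slides leave the geo-tree language and the catalog schema as open research issues, so the following is only a sketch of what such descriptions might contain. A catalog entry records a module's input types, output type, and invocation address. A geo-tree node names a virtual data type and the service that produces it from its children. The check confirms that a tree is consistent with the catalog. All class, field, and type names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ServiceModule:
    name: str
    input_types: Tuple[str, ...]   # virtual data types the module consumes, in order
    output_type: str               # virtual data type the module produces
    invoke_url: str                # how to invoke the service (placeholder address)


@dataclass(frozen=True)
class GeoNode:
    data_type: str                         # logical/thematic type, not a physical file
    service: Optional[str] = None          # None marks an archived (leaf) geo-object
    inputs: Tuple["GeoNode", ...] = ()


def check_tree(node: GeoNode, catalog: Dict[str, ServiceModule]) -> bool:
    """Return True if every non-leaf node matches its cataloged service module."""
    if node.service is None:
        return not node.inputs
    module = catalog.get(node.service)
    if module is None or module.output_type != node.data_type:
        return False
    if module.input_types != tuple(child.data_type for child in node.inputs):
        return False
    return all(check_tree(child, catalog) for child in node.inputs)


if __name__ == "__main__":
    catalog = {
        "ndvi": ServiceModule("ndvi", ("red_reflectance", "nir_reflectance"),
                              "ndvi_grid", "http://example.org/services/ndvi"),
    }
    tree = GeoNode("ndvi_grid", "ndvi",
                   (GeoNode("red_reflectance"), GeoNode("nir_reflectance")))
    print(check_tree(tree, catalog))
```

The same input and output declarations could drive automatic tree construction by searching for modules whose output type matches a requested type.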

17 Research Issues in Virtual Geospatial Data Services
Geo-tree/model database
–Contains all available geo-trees.
Virtual dataset cataloging
–All virtual datasets in a geo-tree need to be cataloged (the root of the geo-tree and all intermediate datasets).
–Catalog the virtual datasets together with the real datasets, using the same description as for a real dataset, with two exceptions: there is no description of spatial and temporal coverage, and the entry includes a pointer to the entry in the geo-tree database where the specific geo-tree is located.

18 Additional Required Functional Components
Logical instantiation
–This component checks whether a virtual dataset can be materialized for a specific user search.
–Generates logical filenames that are unique and different from the real logical filenames.
–Generates the logical workflow at the request of physical instantiation (filenames in the workflow are logical names).
Physical instantiation
–This component produces the executable workflow when the user actually requests the virtual dataset.
–What workflow language should be used?
Workflow execution manager
–Manages the execution of the workflow for materializing the virtual dataset.
–Returns the materialized dataset to users.
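
The split between logical and physical instantiation can be sketched as two passes over a geo-tree. The first pass checks that every leaf input is available for the request, invents distinct names for the virtual outputs, and emits workflow steps that mention only logical names. The second pass swaps in physical locations. This is a toy sketch under stated assumptions. The rule table, the `virtual/<request>/<type>` naming scheme, the scratch directory, and the `Step` record are all hypothetical, and the slides leave the workflow language open.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Step:
    service: str
    inputs: Tuple[str, ...]
    output: str


def logical_instantiate(rules: Dict[str, Tuple[str, Sequence[str]]], target: str,
                        archived: Dict[str, str],
                        request_id: str) -> Optional[Tuple[str, List[Step]]]:
    """Return (root logical name, logical workflow), or None if not materializable.

    rules maps a data type to (service, input data types); archived maps a data
    type to the real logical filename that matched the user's search.
    Assumes the rules form a tree (no cycles).
    """
    steps: List[Step] = []

    def resolve(data_type: str) -> Optional[str]:
        if data_type in archived:
            return archived[data_type]
        if data_type not in rules:
            return None
        service, input_types = rules[data_type]
        names = []
        for input_type in input_types:
            name = resolve(input_type)
            if name is None:
                return None
            names.append(name)
        # The "virtual/" prefix keeps generated names apart from real logical filenames.
        output = f"virtual/{request_id}/{data_type}"
        steps.append(Step(service, tuple(names), output))
        return output

    root = resolve(target)
    return None if root is None else (root, steps)


def physical_instantiate(steps: Sequence[Step], locations: Dict[str, str],
                         scratch_dir: str = "/tmp/vds") -> List[Step]:
    """Replace logical names with physical locations; virtual outputs go to scratch."""
    def physical(name: str) -> str:
        if name in locations:
            return locations[name]
        return f"{scratch_dir}/{name.replace('/', '_')}"

    return [Step(s.service, tuple(physical(n) for n in s.inputs), physical(s.output))
            for s in steps]
```

A workflow execution manager would then run the physical steps in list order, since each step appears after the steps that produce its inputs.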

19 Virtual Data Services in the Geospatial Data Grid
Diagram components: OGC client; WRS server; WCS/WFS/WMS server; Metadata Catalog Service (MCS web server, MCS database); Replica Location Service (replica index node, replica catalog); physical storage system/files; geo-tree library; module catalog; logical instantiation; physical instantiation; workflow execution manager.
Diagram flow labels: OGC WRS query; MCS query; LF; PL; WRS results; OGC data protocols; data; transformed data.
Numbered flows: 1. matched virtual datasets; 2. logically instanced virtual filenames (LIVF); 3. logical workflow; 4. physical workflow; 5. data.
Legend: LF: logical filename; PL: physical location; yellow: new component; light yellow: modified MCS component.

20 The Development Team
PI: Liping Di, LAITS/GMU.
Co-I: William Johnston, NASA Ames and DOE LBNL.
Co-I: Dean Williams, DOE LLNL.

21 Implementation Plan
The first phase is the testbed and initial integration, including the setup of the development environment, the preliminary design of the integration, and the implementation of WCS access to Grid-managed data.
The second phase is data naming and location transparency, which includes using Data Grid and Replica Services (metadata catalogs, replica location management, reliable file transfer services, and network caches) to provide naming and location independence for data used by NWGISS, and revising NWGISS to invoke such Grid services.
–The approach to investigating the Data Grid and Replica Services is to configure a Data Grid testbed. This will be followed by integrating the NWGISS data catalogs into a Data Grid catalog and investigating naming approaches, and then by interfacing NWGISS with data generators and the Data Grid Replica Location Service.
The third phase is the virtual dataset research and development.

22 The development environment
A prototype development environment has been set up at LAITS/GMU.
–Three machines: 2 Linux servers and 1 SunFire Unix server.
–Machines are linked through a 100 Mb LAN.
–External link to the Internet through a dedicated T1 line.
The real development/demo environment is being set up.
–GMU will purchase a server with 4-8 TB of disk space. The machine will be hosted at NASA Goddard.
–NASA Ames will provide a machine with 4-8 TB of disk space and a 30 TB near-real-time storage device.
–DOE LLNL will provide a machine with 2-4 TB of disk space.
–1 Gb/s Internet link.
The machines will be populated with NASA EOS data
–e.g., MODIS, ASTER, Landsat, MISR.

23 Current Status
Most of the phase-one developments are nearly complete.
–MCS has been extended to handle spatial, temporal, and parameter-based search by adding a layer on top of MCS.
–A WRS interface has been implemented on top of MCS for data search and discovery.
–The WCS server has been modified to access data within the virtual organization.
–WRS and WCS are connected so that users can search for data and then have it delivered on demand.
A demonstration will show on-demand access to Grid-managed EOS data through an OGC client.
–You will not see the Grid because it is supposed to work invisibly to clients outside the Grid.
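
The slides do not describe how the search layer on top of MCS is built, so the sketch below only illustrates the kind of filtering such a layer would perform: a parameter match, a bounding-box overlap test, and a time-range overlap test. The `Granule` record and `search` function are hypothetical, and the box test assumes boxes that do not cross the antimeridian.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (west, south, east, north) in degrees


@dataclass(frozen=True)
class Granule:
    logical_filename: str
    parameter: str
    bbox: Box
    start: date
    end: date


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if the two boxes share any area or edge (no antimeridian handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(granules: Iterable[Granule], parameter: Optional[str] = None,
           bbox: Optional[Box] = None,
           time_range: Optional[Tuple[date, date]] = None) -> List[str]:
    """Return logical filenames matching every criterion that was supplied."""
    hits = []
    for g in granules:
        if parameter is not None and g.parameter != parameter:
            continue
        if bbox is not None and not boxes_overlap(g.bbox, bbox):
            continue
        if time_range is not None and (g.end < time_range[0] or g.start > time_range[1]):
            continue
        hits.append(g.logical_filename)
    return hits
```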

24 Near-term Plan (within 6 months)
Build the real development/test environment.
Modify the WCS server so that it can work as a client to another WCS server located remotely on another machine
–to fetch just the right amount of data to the requesting machine within the Grid.
Test the service concepts
–Register services in MCS.
–Couple services with data: enable searching for the services available for given data, and for the data available for a given service.
Develop fundamental data services
–Reformatting, subsetting/resampling
–Georectification/reprojection
–Supervised and unsupervised classification services
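
One way to picture "coupling services with data" is a registry that can be queried in both directions through a shared data type. The slides do not say how the coupling will be stored in MCS, so the sketch below is an assumption-laden illustration, and `ServiceDataRegistry` with its method names is hypothetical.

```python
from collections import defaultdict


class ServiceDataRegistry:
    """Link services and datasets through the data types they share."""

    def __init__(self):
        self._input_types = defaultdict(set)  # service name -> accepted data types
        self._data_types = {}                 # dataset name -> its data type

    def register_service(self, service, accepted_types):
        self._input_types[service].update(accepted_types)

    def register_dataset(self, dataset, data_type):
        self._data_types[dataset] = data_type

    def services_for(self, dataset):
        """Services that accept the given dataset's data type, sorted by name."""
        data_type = self._data_types[dataset]
        return sorted(s for s, types in self._input_types.items() if data_type in types)

    def datasets_for(self, service):
        """Datasets whose data type the given service accepts, sorted by name."""
        types = self._input_types.get(service, set())
        return sorted(d for d, t in self._data_types.items() if t in types)


if __name__ == "__main__":
    registry = ServiceDataRegistry()
    registry.register_service("reprojection", {"grid", "swath"})
    registry.register_service("classification", {"grid"})
    registry.register_dataset("modis_ndvi_001", "grid")
    registry.register_dataset("misr_l1_004", "swath")
    print(registry.services_for("modis_ndvi_001"))
    print(registry.datasets_for("classification"))
```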

25 The NASA EOSDIS Data Pools
The NASA EOSDIS project is implementing data pools that contain huge amounts of remote sensing data online for users to access directly and rapidly.
The data pools will be operated at each of NASA's nine Distributed Active Archive Centers (DAACs).
Each data pool will provide discipline-specific EOSDIS data archived at the DAAC.
The DAACs are connected through a high-speed network.
Currently there are four operational data pools in total, at GSFC, Langley, EDC, and NSIDC.
–Both data search through search criteria and data finding through browsing/drilling down are provided.
–FTP for data downloading. No data services are provided.
An OGC WCS interface is being implemented
–to provide better data access than FTP.

26 The CEOS Grid Data Pool Application
Deploy the geospatial Grid software developed by the GMU-led team to the data pools as one of the CEOS Grid applications.
–Initially at the NASA Goddard DAAC.
–Intended to expand to all data pools.
The application will provide
–secure sharing of computing resources among the data pools.
–a single point of entry to all resources in the pools (location transparency).
–geospatial standards-based data discovery and access.
–automatic data transformation services.
–virtual data services.
–interactive geospatial modeling, execution, and model sharing.

27 Contribution to the CEOS Grid Activities
Share the technology, software, and experience with other CEOS Grid application projects.
Contribute the Grid data pool application to the CEOS Grid testbed for technology demonstration.
The technology and software created by this project can be used to create a CEOS-based global EO data and information Grid.
–Support global sharing of EO data, information, and/or computational resources.
–Support international scientific and EO initiatives such as the Integrated Global Observing Strategy (IGOS).
–Support the use of EO data/information in developing countries for environmental monitoring and decision support.

