A Cyberinfrastructure Framework for Discovery, Integration, and Analysis of Earth Science Data A Prototype System A. K. Sinha, Z. Malik, A. Rezgui, A.


1 A Cyberinfrastructure Framework for Discovery, Integration, and Analysis of Earth Science Data: A Prototype System
A. K. Sinha, Z. Malik, A. Rezgui, A. Dalton, K. Lin
Affiliations: Virginia Tech; San Diego Supercomputer Center

2 Hypothesis Evaluation: Are A-Type Rocks in Virginia Related to a Hot Spot Trace?

3 GEON's DIA Engine
Evaluating a hypothesis requires:
- Discovery: access to data
- Integration of data: provide data products
- Analysis of data: verify the hypothesis

4 Data Discovery
Registration of data is a prerequisite for data discovery:
- Level 1 registration: keywords
- Level 2 registration: ontologic classes
- Level 3 registration: item detail level

5 Registration of Data: Key to Discovery, Integration and Analysis
- Level 1: Discovery of data resources (e.g., gravity, geologic maps) requires registration through high-level index terms. GEON has deployed an extension of the AGI index terms, which will be cross-indexed to others such as GCMD and AGU.
- Level 2: Discovering item-level databases requires registration to data-level ontologies (e.g., bulk rock geochemistry, gravity database).
- Level 3: Item detail level registration (e.g., the column in a geochemical database that represents the SiO2 measurement). This level of registration is required for semantic integration.
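
The three registration levels can be pictured as three fields on a registry entry. The following is only a minimal sketch under assumed names: the `Registration` class, its fields, and the sample resources are hypothetical illustrations and are not taken from GEON's actual registry.

```python
from dataclasses import dataclass, field


@dataclass
class Registration:
    """One registered data resource (all names here are illustrative)."""
    name: str
    keywords: set = field(default_factory=set)          # Level 1: index terms
    ontology_classes: set = field(default_factory=set)  # Level 2: data-level classes
    item_map: dict = field(default_factory=dict)        # Level 3: column -> concept


def discover_by_keyword(registry, keyword):
    """Level 1 discovery: resources indexed under a high-level term."""
    return [r.name for r in registry if keyword in r.keywords]


def discover_by_concept(registry, concept):
    """Level 3 discovery: (resource, column) pairs registered to a concept."""
    return [(r.name, col) for r in registry
            for col, c in r.item_map.items() if c == concept]


# Hypothetical sample registry.
registry = [
    Registration("va_geochem", {"geochemistry"}, {"BulkRockGeochemistry"},
                 {"sio2_wt": "SiO2", "sample_id": "SampleIdentifier"}),
    Registration("midatlantic_gravity", {"gravity"}, {"GravityDatabase"},
                 {"bouguer": "BouguerAnomaly"}),
]

print(discover_by_keyword(registry, "gravity"))
print(discover_by_concept(registry, "SiO2"))
```

Keyword lookup only finds whole resources, while the item-level map is what lets a tool locate the exact column holding a given measurement.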

6 Level 1 Registration: AGI Index Terms, GEON Index Ontology (http://www.geoscienceworld.org/)

7 Level 2 Registration: Registration at the Item Level
Mineral, Rock, Element, Isotope, Structure, Location

8 Level 3 Registration: A Section from the Planetary Material Ontology
The GEON approach of registering data to concepts removes structural (format) and semantic heterogeneity.

9 DIA Engine (1): Discovery, Integration and Analysis Engine
How does GEON discover data?
- Keywords, resource type, temporal, spatial
- Invoke the GEON protocol for discovering databases
- Retrieve the discovered data from registered databases
- Emphasize geospatial and aspatial discoveries (not everything needs to be done through a map-based browser)

10 DIA Engine (2): Geospatial Engine, Aspatial Engine

11 High-Level View of the DIA Engine
- The user specifies the class of data for analysis
- The DIA Engine derives and retrieves the data sets needed for the requested analysis
- The DIA Engine applies processing and filtering techniques to generate the requested data product
- Data products and query steps can be saved
Diagram labels: Raw Data, Query Tool, Data Product, Modeling, Computation
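
The retrieve, filter, and save-steps flow described on this slide can be sketched as a small pipeline. This is an illustration under stated assumptions, not the DIA Engine's implementation: the function names, the dictionary layout of `sources`, and the sample rows (including the ages and gravity value) are all made up for the example.

```python
def retrieve(sources, data_class):
    """Collect rows from every source registered under the requested class."""
    return [row for src in sources if src["class"] == data_class
            for row in src["rows"]]


def apply_filters(rows, filters):
    """Keep rows that pass every filter predicate."""
    return [row for row in rows if all(f(row) for f in filters)]


def make_data_product(sources, data_class, filters):
    """Run retrieve -> filter and record the query steps with the result."""
    rows = apply_filters(retrieve(sources, data_class), filters)
    steps = [f"retrieve class={data_class}", f"apply {len(filters)} filter(s)"]
    return {"rows": rows, "steps": steps}


# Made-up sample data, for illustration only.
sources = [
    {"class": "igneous_rock", "rows": [{"id": "R1", "state": "VA", "age_ma": 730},
                                       {"id": "R2", "state": "NC", "age_ma": 350}]},
    {"class": "gravity", "rows": [{"id": "G1", "state": "VA", "mgal": -12.5}]},
]
product = make_data_product(sources, "igneous_rock",
                            [lambda r: r["state"] == "VA"])
print(product["rows"])
print(product["steps"])
```

Keeping the step list next to the rows mirrors the slide's point that both the data product and the query steps can be saved.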

12 Data Products (1)
Data products can take the form of interactive maps, interactive filtering diagrams, or Excel data files. Examples:
- A map showing the A-Type bodies in the Mid-Atlantic region
- An Excel file giving the ages of those A-Type bodies
- A gravity database table spatially related to A-Type bodies, saved as a contoured gravity map

13 Data Products (2)
Data products can be:
- Pre-packaged: quick to query, but inflexible and offering little support for complex scientific discovery
- Created dynamically: may require extensive on-the-fly query processing, but enables far richer possibilities for scientific discovery; requires semantic integration

14 Data Integration (1)
Semantic integration of data products requires:
- Ontologies: a common language to interpret data from different sources
- Data sharing: requires data registration
Fine-grained (i.e., item-level) registration is necessary to enable automatic processing of shared data by tools.
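
To see why item-level registration enables automatic processing, consider two sources that store the same measurement under different column names and units. The sketch below is hypothetical throughout: the source names, column names, and conversion factors in `ITEM_REGISTRY` are assumptions chosen for illustration.

```python
# Hypothetical item-level registrations:
# (source, column) -> (ontology concept, factor converting to weight percent)
ITEM_REGISTRY = {
    ("lab_a", "SiO2_wt_pct"): ("SiO2", 1.0),
    ("lab_b", "silica_frac"): ("SiO2", 100.0),  # stored as a mass fraction
}


def harmonize(source, record):
    """Rename registered columns to ontology concepts and convert units.

    Columns without an item-level registration are dropped, because a tool
    has no way to interpret them automatically.
    """
    out = {}
    for column, value in record.items():
        if (source, column) in ITEM_REGISTRY:
            concept, factor = ITEM_REGISTRY[(source, column)]
            out[concept] = value * factor
    return out


merged = [
    harmonize("lab_a", {"SiO2_wt_pct": 72.5}),
    harmonize("lab_b", {"silica_frac": 0.75, "notes": "unregistered"}),
]
print(merged)
```

Once both records are keyed by the shared concept, a downstream tool can process them without knowing either lab's conventions.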

15 Data Integration (2)

16 Limitations of Current Data Sharing Approaches
- Each research group adopts its own acronyms, notations, conventions, units, etc.
- Data sharing is of limited scope
- Data discovery is ad hoc
- Only a small community of scientists may be aware of and share a given data set
- Integration is difficult
- Extensive conversion efforts may be needed
- Absence of streamlined integration leads to poor ability to answer complex scientific questions
Solution: ontology-based data registration

17 Query Building
Menu-based (used in the demo):
- The GUI lets the user select only specific items, so each query touches only a subset of the data
- A robust system informs the user of any incorrect input and guides them in the right direction
- Results are guaranteed, since every query built from the menus can be answered
Text-based:
- The entire database can be queried
- Result sets may be empty
- Even a small mistake in the query can return incorrect results without the user being able to spot the error

18 Menu-based Query Building
- Within a selected "region of interest", the user is given a number of options (the menu)
- The user clicks through the menus to build an exact query
- Click history is maintained for future reference
Diagram labels: Menu #1, Menu #2, Menu #3, Menu #4, Menu #5
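
A rough sketch of menu-based query building follows: selections are validated against the menus, recorded in a click history, and turned into a parameterized query. The `MENUS` contents, the `MenuQuery` class, and the `samples` table name are hypothetical; the demo's actual menus and schema are not given in the slides.

```python
MENUS = {  # hypothetical menus; each offers only valid choices
    "rock_type": ["A-type", "I-type", "S-type"],
    "state": ["VA", "NC", "MD"],
}


class MenuQuery:
    def __init__(self):
        self.history = []  # click history kept for later reference

    def select(self, menu, choice):
        """Record one click, rejecting anything the menus do not offer."""
        if menu not in MENUS or choice not in MENUS[menu]:
            raise ValueError(f"{choice!r} is not an option in menu {menu!r}")
        self.history.append((menu, choice))
        return self

    def to_sql(self, table="samples"):
        """Build a parameterized query from the selections made so far."""
        latest = dict(self.history)  # a later click on a menu overrides an earlier one
        if not latest:
            return f"SELECT * FROM {table}", []
        where = " AND ".join(f"{menu} = ?" for menu in latest)
        return f"SELECT * FROM {table} WHERE {where}", list(latest.values())


query = MenuQuery().select("rock_type", "A-type").select("state", "VA")
sql, params = query.to_sql()
print(sql, params)
```

Because column names come only from the fixed menus and values are passed as parameters, a mistyped query of the kind possible in a text-based interface cannot be constructed.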

19 Query Tool Selection
- Tools provided by GEON can be used to answer a query, or
- Other geologic tools can be incorporated (invocation interfaces need to be defined)
Example: GCD-Kit can be used for classification, geotectonic, and normative calculations for igneous rocks

20 Analysis
The generated data product(s) can be analyzed using various techniques (modeling, computation)

21 Workflow Associated with the Demo

22 Technologies Used
- User interface: Java / VB Script, ASP.net, VB.net
- Back end: ESRI ArcGIS Server 9.1, ESRI ArcSDE 9.1 (spatial database), Microsoft SQL Server (geochemical database)
- Functionality coding: Visual Basic (to code the discrimination filters)

23 Demo Starts Here

24 Current Tool Sharing Approaches
- Each research group develops its own tools
- Tools developed by one research group are rarely used by other groups
- Development efforts are redundant
- There is little interoperability among tools
- Interaction among different tools is often not possible or requires extensive (re)coding
Solution: wrap tools as Web services accessible to the scientific community worldwide

25 The Future: Integration through Ontologies and Web Services
Benefits of Web services:
- Facilitate integration: tools developed independently may easily be integrated into new applications (example: discrimination tools may be offered as Web services)
- Provide high reusability: more tools available to the research community
- Reduce development time, effort, and cost

26 Web Services Explained (1)
WS standards:
- WSDL: Web Services Description Language
- UDDI: Universal Description, Discovery, and Integration
- SOAP: Simple Object Access Protocol

27 Web Services Explained (2)
- WSDL (the service provider describes the service using WSDL): an XML-based language for describing the capabilities of Web services. The capabilities of a Web service are described as a set of endpoints that can exchange messages. WSDL is a separate specification from UDDI; a UDDI entry can point to a service's WSDL description.
- UDDI (the service provider publishes the service using UDDI): a Web-based directory where service providers may list their services and where service consumers may retrieve the services published by the providers (like the yellow pages).
- SOAP (clients and services communicate using SOAP): an XML-based protocol used to encode the messages (requests and responses) exchanged between a Web service and its clients.
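
As an illustration of what a SOAP message looks like, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. The service namespace `http://example.org/geon/tools`, the `ClassifyRock` operation, and its parameters are hypothetical; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
TOOL_NS = "http://example.org/geon/tools"              # hypothetical service namespace


def build_request(operation, params):
    """Return a SOAP 1.1 request envelope as an XML string."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{TOOL_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{TOOL_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call to a discrimination tool wrapped as a Web service.
xml_text = build_request("ClassifyRock", {"SiO2": 72.5, "K2O": 5.1})
print(xml_text)
```

In practice a client would usually generate such messages from the service's WSDL description rather than by hand; the manual construction here is only meant to show the envelope and body structure.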
