Presentation is loading. Please wait.

Presentation is loading. Please wait.

Voyager Data Services Services for Finding, Exploring and Presenting Distributed Environmental Data Outline Prepared by Voyager Interest Group on Environmental.

Similar presentations


Presentation on theme: "Voyager Data Services Services for Finding, Exploring and Presenting Distributed Environmental Data Outline Prepared by Voyager Interest Group on Environmental."— Presentation transcript:

1 Voyager Data Services Services for Finding, Exploring and Presenting Distributed Environmental Data Outline Prepared by Voyager Interest Group on Environmental Data Integration June 2002 Coordinated by CAPITA Supported by NSF, EPA and NOAA…..

2 Processes of the Information Value Chain Informing Knowledge ActionProductive Knowledge Information Organizing Grouping Classifying Formatting Displaying Analyzing Separating Evaluating Interpreting Synthesizing Judging Options Quality Advantages Disadvantages Deciding Matching goals, Compromising Bargaining Deciding CIRA VIEWSLangley IDEAAQ ManagerWG Summary Rpt Data (after Taylor, 1975) Examples:

3 Data Acquisition and Usage Value Chain Monitor Store Data 1 Monitor Store Data 2 Monitor Store Data n Virtual Int. Data Integrated Data 1 Integrated Data 2 Integrated Data n

4 Data Flow from Providers to AQ Management Environmental data arise from diverse sources, each having specific history, driving forces, delivery formats etc. Data analysis, i.e. turning the raw data into ‘actionable’ knowledge, requires combining data from these sources EPA Networks IMPROVE Visibility Satellites …… Surface Obs. Upper Air Satellites Forecasts …. National Inv. Local Invent. Satellite Fire… Aerosol Data Meteorological Emissions Data Status and Trends AQ Compliance Exposure Assess. Network Assess. Tracking Progress AQ Managm. Reports ‘Knowledge’ from Data Primary Data Diverse Providers Derived Data Filtered, Aggregated, Fused

5 Problems of Current Client-Server Architecture User Tasks: Fi nd servers Reformat data Custom process Server Client-Server Architecture: User accesses, processes and integrates data by customized tools User Problems in combining data: –Legacy data systems can not be altered to support integration –Data systems use different terms or meaning of similar terms –Some data sources, do not have a schema nor formal access methods User Carries the Burden In current client-server architectures the user carries much of the burden of integration.

6 Mediator-Based Integration Architecture (Wiederhold, 1992) Software agents (mediators) can perform many of the data integration chores Heterogeneous sources are wrapped by translation software local to global language Mediators (web services) obtain data from wrappers or other mediators and pass it on … Wrappers remove technical, while mediators resolve the logical heterogeneity The job of the mediator is to provide an answer to a user query (Ullman, 1997)Ullman, 1997 In database theory sense, a mediator is a view of the data found in one or more sources Wrapper Service User Query View Busse et. al, 1999

7 Value-Added Processing in Service Oriented Architecture Control Data Chain 1 Chain 2 Chain 3 Peer-to-peer network representation Data Service Catalog User Data, services and users are distributed throughout the network Users compose data processing chains form reusable services Intermediate data are also exposed for possible further use Chains can be linked to form compound value-adding processes Service chain representation User Tasks: Fi nd data and services Compose service chains Expose output Chain 2 Chain 1 Chain 3 Data Service User Carries less Burden In service-oriented peer-to peer architecture, the user is aided by software ‘agents’

8 Peer-to-Peer Computing (P2P) or Service-Oriented Architecture (??) P2P is an architectural principle based on decentralization and resource sharing It is replacing the current paradigm of client-server computing Data are passed directly from peer-to-peer, rather than through ‘data integrator’ servers The contribution of database community to P2P is the introduction of schemas DATAFED uses the multidimensional data cube as the global schema P2P was made popular by Napster; now it is an intense academic research topic Music files were all uniformly formatted MP3 files; science data need more descriptors Hence the challenge of scientific data sharing – even in P2P environment

9 4 D Geo-Environmental Data Cube (X, Y, Z, T) Environmental data represent measurements in the physical world which has space (X, Y, Z) and time (T) as its dimensions. The specific inherent dimensions for geo- environmental data are: Longitude X, Latitude Y, Elevation Z and DateTime T. Additional dimensions may include parameters, pollutant source, etc. The needs for finding, sharing and integration of geo-environmental data requires that data are coded in this multidimensional data space

10 4+ Dimensional Geo-Environmental Data Space

11 Environmental Data: Multi-Dimensional Data can be distributed over 1,2, …n dimensions 1 Dimensional e.g. Time dimension i j k j i Data Granule i 1 Dimensional e.g. Location & Time 1 Dimensional e.g. Location, Time & Parameter View 1 Data Space View 2 Views are generally orthogonal slices through multidimensional data cubes Spatial and temporal slices through the data space are most common

12 Common Views (slices) through 4D Data Space XY MAP: Z,T fixed Vertical Profile:XYT fixed Time Chart: X,Y,Z fixed Vertical Cross sect: YT fixedVertical Cross sect: XT fixed Vertical Profile Trend: X,Y fixed

13 Generic Data Flow in DATAFED Services DataView 1 DataProcessed Data Portrayed Data Process Data Portrayal/ Render Abstract Data Access View Wrapper Physical Data Abstract Data Physical Data Resides in autonomous servers; accessed by view- specific wrappers which yield abstract data ‘slices’ Abstract Data Abstract data slices are requested by viewers; uniform data are delivered by wrapper services DataView 2 DataView 3 View Data Processed and portrayed data are delivered to the user for building multi-layer views and web applications

14 Render Service Chaining in Spatio-Temporal Data Browser Spatial Slice Find/Bind Data nDim Data Cube Time Slice Time Portrayal Spatial PortrayalSpatial Overlay Time Overlay OGC-Compliant GIS Services Time-Series Services PortrayOverlay Homogenizer Catalog Wrapper Mediator Client Browser Cursor/Controller Maintain Data Vector GIS Data XDim Data SQL Table OLAP Satellite Images Data Sources

15 Overlay of multiple Datasets Each DataCube may have 0-n dimensions Each dimension is assigned a view 3 D DataCube 2 D DataCube DataView 3 Layer 2 Layer 1 DataView 1 DataView 2 In a view, the number of layers is the number of datasets If a DataCube does not have a data for a view, a Null Layer is assigned Null Layer

16 Overlay of multiple Datasets Each DataCube may have 0-n dimensions Each dimension is assigned a view DataView 3 DataView 1 DataView 2 In a view, the number of layers is the number of datasets If a DataCube does not have a data for a view, a Null Layer is assigned 3 D DataCube Data Access Connections Data Render Connections

17 Explain Wrappers

18 An Application Program: Voyager Data Browser The web-programs consists of a stable core and adoptive input/output layers The core maintains the state and executes the data selection, access and render services The adoptive, abstract I/O layers connects the core to evolving web data, flexible displays and to the a configurable user interface: –Wrappers encapsulate the heterogeneous external data sources and homogenize the access –Device Drivers translate generic, abstract graphic objects to specific devices and formats –Ports connect the internal parameters of the program to external controls –WDSL web service description documents Data Sources Controls Displays I/O Layer Device Drivers Wrappers App State Data Flow Interpreter Core Web Services WSDL Ports

19 Publish, Find, Bind Voyager Data Services Architecture Scatter Chart Text, Table Data View & Process Layered Map Cursor Homogeneous data used by Viewing and Processing services in ‘agile’ applications Heterogeneous data: different types, coding and access protocols Connects providers with users and transforms heterogeneous into homogeneous data Time Chart ProvidersUsers XML Web Services Satellite Vector GIS Data XDim Data OLAP Cube SQL Table HTTP, FTP Web Text OpenGIS Services Images Access Services Data & Tool Catalog Uniform Access/Retrieval

20 VOYAGER Web Services Layered Map Time Chart Vector GIS Data XDim Data SQL Tables Web Images Publish, Find, Bind Catalog, Data & Tools Uniform Access Scatter Chart S u p p o r tCoordination T e c h n o l o g i e s Users Select, Overlay, Explore; Multidimensional data Providers Maintain distributed data; Heterogeneous coding, access Voyager Web Services Homogenize data access Catalog, access, transform data C O M M U N I T Y

21 Browser Architecture (0312) View State MapView[1] Services Servicetype1[1] Servicetype2[1] Servicetype2[2] FlowProgram Controllers TimeView[1] Services FlowProgram Controllers MapView[1] TimeView [1] Controllers

22 Distributed Data Processing and Integration User Tasks: Fi nd servers Reformat data Custom process Server Client-Server Architecture: User accesses, processes and integrates data by customized tools User Control Data

23 DataFed Topology: Mediated Peer-to-Peer Network, MP2P MediatorPeers Mediated Peer-to Peer Network Broker maintains a catalog of accessible resources Peers find data and access instructions in the catalog Peers get resources directly from peer providers Google Example: Finding Images on Network Topology Google catalogs the images related to Network Topology User selects an image from the cached image catalogimage catalog User visits the provider web-page where the image residesweb-page SourceSource: Federal Standard 1037C

24 DataFed Topology: Mediated Peer-to-Peer Network, MP2P (Pretentious?) Google-DataFed Comparison Mediated Peer-to Peer Network DataFedGoogleSimilarityDifference Catalogs structured data Catalogs structured data from numerous sources e.g. daily TOMS satellite dataTOMS satellite Catalogs the WEB on data related to a topics, e.g. Network Topology img, HTMLimgHTML Data are cataloged, essence summarized and linked to provider DataFed catalogs only a subset of structured data; Google catalogs the WEB User selects data from the DataFed catalog DataFed catalog User selects an image from the cached image catalogimage catalog (some cashed) User can (1) display raw data (2) use generic viewer (3) custom view (4) XML Viewraw datageneric viewer custom view XML View User visits the provider web- page where the image residesweb- page

25 P2P Network

26 Distributed Data Processing and Integration Service Oriented Architecture: Users compose reconfigurable data processing chains form reusable services User Tasks: Fi nd data and services Compose service chains Expose output DataServices Catalog User User-composed service chains Control Data User Broker Peers Brokered Peer-to Peer Network

27 Data Focus Range Rendering Cursor Viewer Layers Dim1: Lon Dim2: Lat Data provided by each dimension of a View: Dim1.Type, Dim1.Min, Dim1.Max Dim2.Type, Dim2.Min, Dim2.Max …. Current Dim.Types: Latitude, Longitude, DateTime, Elevation


Download ppt "Voyager Data Services Services for Finding, Exploring and Presenting Distributed Environmental Data Outline Prepared by Voyager Interest Group on Environmental."

Similar presentations


Ads by Google