Distributed Data Analysis & Dissemination System (D-DADS)
Special Interest Group on Data Integration, June 2000

Overview
- Environmental data are collected by multiple, disparate data providers, such as individual EMPACT projects.
- Each data provider presents its data in its own format, making the data difficult to find, access, read, and integrate.
- Standardized formats and data dissemination systems are required to make distributed data sets accessible and easy to integrate.
- This proposal presents a distributed data analysis and delivery system that gives users access to data from multiple sources.

The Data Flow Process: From Raw Data to Refined Knowledge
- Primary data are gathered from providers of sensor data.
- Data are integrated, filtered, aggregated, and fused into secondary data.
- Reports are prepared to deliver environmental knowledge to the public.

Data Flow Resistances
The data flow process is hampered by a number of resistances:
- The user does not know what data are available.
- The available data are poorly described (metadata).
- There is a lack of QA/QC information.
- The data come in various formats, requiring hand-crafted code to read and manipulate them.
These resistances can be overcome through a distributed system that catalogs and standardizes the data, allowing easy access for data manipulation and analysis.
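The format resistance can be illustrated with a minimal sketch, assuming two hypothetical providers: one delivers CSV with visual range in kilometers, the other delivers pipe-delimited lines in miles. The Observation schema, the reader functions, and the sample values are assumptions made for illustration and are not part of D-DADS.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Common record shared by all providers (hypothetical schema)."""
    site: str
    time: str        # ISO 8601 timestamp
    variable: str
    value: float
    units: str


def read_provider_a(text: str) -> list[Observation]:
    """Provider A (hypothetical): CSV with header site,time,visual_range_km."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        Observation(r["site"], r["time"], "visual_range",
                    float(r["visual_range_km"]), "km")
        for r in rows
    ]


def read_provider_b(text: str) -> list[Observation]:
    """Provider B (hypothetical): 'time|site|miles' lines, converted to km."""
    out = []
    for line in text.strip().splitlines():
        time, site, miles = line.split("|")
        out.append(Observation(site, time, "visual_range",
                               float(miles) * 1.609344, "km"))
    return out


# Made-up sample data from each provider, merged into one uniform list.
a = read_provider_a("site,time,visual_range_km\nSTL,2000-06-01T12:00,24.0\n")
b = read_provider_b("2000-06-01T12:00|CHI|10\n")
merged = a + b
```

Once every provider is mapped to the same record, downstream tools no longer need a hand-crafted reader per source.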

Interoperability
One requirement for an effective distributed environmental data system is interoperability, defined as “the ability to freely exchange all kinds of spatial information about the Earth and about objects and phenomena on, above, and below the Earth’s surface; and to cooperatively, over networks, run software capable of manipulating such information.” (Buehler & McKee, 1996)
Such a system has two key elements:
- Exchange of meaningful information
- Cooperative and distributed data management

Distributed Data Analysis & Dissemination System: D-DADS
Specifications:
- Uses standardized forms of data, metadata, and access protocols
- Supports distributed data archives, each run by its own provider
- Provides tools for data exploration, analysis, and presentation
Features:
- Data are organized as multidimensional data cubes
- Dimensional data cubes are distributed but shared
- Analysis is supported by built-in and user functions
- Supports other data types, such as images, GIS data layers, etc.

D-DADS Architecture

The D-DADS Components
- Data Providers supply primary data to the system through SQL or other data servers.
- Standardized Description & Format populates and describes the data cubes and other data types using standard metadata.
- Data Access and Manipulation tools provide a unified interface to the data cubes and GIS data layers for accessing and processing (filtering, aggregating, fusing) data and for integrating data into virtual data cubes.
- Users are the analysts who access D-DADS and produce knowledge from the data.
The multidimensional data access and manipulation component of D-DADS can be implemented using OLAP.
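A virtual data cube can be pictured as a thin layer that forwards one query to several provider-run cubes and merges the answers. The following is a minimal sketch under that assumption; the ProviderCube and VirtualCube classes, the provider names, and the values are hypothetical and do not describe the actual D-DADS implementation.

```python
class ProviderCube:
    """A cube held by one provider; cells are keyed by (site, date)."""

    def __init__(self, name, cells):
        self.name = name
        self._cells = cells

    def select(self, site=None):
        # Return all cells, or only those for the requested site.
        return {k: v for k, v in self._cells.items()
                if site is None or k[0] == site}


class VirtualCube:
    """Unified interface: holds no data, only references to provider cubes."""

    def __init__(self, providers):
        self.providers = providers

    def select(self, site=None):
        merged = {}
        for provider in self.providers:
            for key, value in provider.select(site).items():
                # Add the provider as an extra dimension so sources stay distinguishable.
                merged[key + (provider.name,)] = value
        return merged


# Hypothetical providers and values, for illustration only.
stl = ProviderCube("EMPACT-STL", {("STL", "2000-06-01"): 24.0})
improve = ProviderCube("IMPROVE", {("STL", "2000-06-01"): 26.0,
                                   ("GRCA", "2000-06-01"): 150.0})
virtual = VirtualCube([stl, improve])
result = virtual.select(site="STL")
```

Keeping the provider name as a dimension is one possible design choice; it lets an analyst compare sources rather than silently overwrite one with another.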

On-line Analytical Processing: OLAP
- A multidimensional data model that makes it easy to select, navigate, integrate, and explore the data.
- An analytical query language that provides the power to filter, aggregate, and merge data as well as explore complex data relationships.
- The ability to create calculated variables from expressions based on other variables in the database.
- Pre-calculation of frequently queried aggregated values, e.g., monthly averages, which enables fast response to ad hoc queries.
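Two of these OLAP features, pre-calculated aggregates and calculated variables, can be sketched in a few lines. This is an illustration only, not an OLAP engine: the records, the function names, and the "anomaly" variable are assumptions introduced here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily records: (site, "YYYY-MM-DD", extinction coefficient).
records = [
    ("STL", "2000-06-01", 80.0),
    ("STL", "2000-06-02", 100.0),
    ("STL", "2000-07-01", 60.0),
    ("CHI", "2000-06-01", 40.0),
]


def precompute_monthly_means(rows):
    """Aggregate once, ahead of time, so later queries are simple lookups."""
    groups = defaultdict(list)
    for site, date, value in rows:
        groups[(site, date[:7])].append(value)   # date[:7] is "YYYY-MM"
    return {key: mean(values) for key, values in groups.items()}


monthly = precompute_monthly_means(records)


def monthly_mean(site, month):
    """Ad hoc query answered from the pre-calculated table."""
    return monthly[(site, month)]


def anomaly(site, date, value):
    """Calculated variable: observation minus its site's monthly mean."""
    return value - monthly_mean(site, date[:7])
```

For example, monthly_mean("STL", "2000-06") is read directly from the stored table rather than recomputed from the raw records.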

Fast Analysis of Shared Multidimensional Information (FASMI)
An OLAP system is characterized as (Pendse, N., “The OLAP Report”):
- being Fast: the system is designed to deliver relevant data to users quickly and efficiently, making it suitable for “real-time” analysis.
- facilitating Analysis: users can extract not only “raw” data but also data that they “calculate” on the fly.
- being Shared: the data and access to the data are distributed.
- being Multidimensional: the key feature; the system provides a multidimensional view of the data.
- exchanging Information: the ability to disseminate large quantities of various forms of data and information.

Multi-Dimensional Data Cubes
- Multi-dimensional data models use inherent relationships in data to populate multidimensional matrices called data cubes.
- A cube's data can be queried using any combination of dimensions.
- Hierarchical data structures are created by aggregating the data along successively larger ranges of a given dimension; e.g., the time dimension can contain the aggregates year, season, month, and day.
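Both ideas, querying by any combination of dimensions and aggregating along a time hierarchy, can be shown with a small sketch. It assumes a toy cube stored as a dictionary keyed by (site, variable, date); the dimension names, the values, and the helper functions are hypothetical, and the season level is omitted for brevity.

```python
from collections import defaultdict

DIMENSIONS = ("site", "variable", "date")

# Toy cube with made-up values: one cell per (site, variable, date).
cube = {
    ("STL", "visual_range", "2000-06-01"): 20.0,
    ("STL", "visual_range", "2000-06-02"): 30.0,
    ("STL", "visual_range", "2000-07-01"): 40.0,
    ("CHI", "visual_range", "2000-06-01"): 10.0,
}


def query(cube, **fixed):
    """Return the cells that match any combination of fixed dimension values."""
    index = {name: i for i, name in enumerate(DIMENSIONS)}
    return {key: value for key, value in cube.items()
            if all(key[index[dim]] == wanted for dim, wanted in fixed.items())}


# Time hierarchy expressed as prefix lengths of a "YYYY-MM-DD" date.
LEVELS = {"year": 4, "month": 7, "day": 10}


def roll_up(cube, level):
    """Average the cube along the time dimension at the given hierarchy level."""
    n = LEVELS[level]
    groups = defaultdict(list)
    for (site, variable, date), value in cube.items():
        groups[(site, variable, date[:n])].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}
```

Calling query(cube, site="STL") slices on one dimension, query(cube, site="CHI", date="2000-06-01") on two, and roll_up(cube, "month") moves one level up the time hierarchy.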

User Interaction with D-DADS
Diagram labels: Query; Data View (Table, Map, Time Chart, etc.); Distributed Database; XML data
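One plausible reading of this slide is that a query goes to the distributed database, the result comes back as XML, and a data view renders it. The sketch below assumes that flow and shows only the XML step; the element and attribute names (result, obs, site, time) are hypothetical and do not represent a D-DADS schema.

```python
import xml.etree.ElementTree as ET


def result_to_xml(rows):
    """Database side: serialize (site, time, value) rows as XML text."""
    root = ET.Element("result")
    for site, time, value in rows:
        obs = ET.SubElement(root, "obs", site=site, time=time)
        obs.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_table(xml_text):
    """Viewer side: parse the XML back into rows for a table view."""
    root = ET.fromstring(xml_text)
    return [(o.get("site"), o.get("time"), float(o.text))
            for o in root.findall("obs")]


# Made-up query result, for illustration only.
rows = [("STL", "2000-06-01T12:00", 24.0), ("CHI", "2000-06-01T12:00", 16.0)]
xml_text = result_to_xml(rows)
table = xml_to_table(xml_text)
```

Because the exchange format is plain XML, a map view or time chart could parse the same response without knowing how the provider stores its data.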

Example Application: Visibility D-DADS
- Visibility observations (extinction coefficient) are an indicator of air quality and serve as an important data set in the public’s understanding of air quality.
- A visibility D-DADS will consist of multiple forms of visibility data, such as visual range observations and digital images from web cameras.
- Potential visibility data providers include:
  - EMPACT projects and their hourly visual range data
  - The IMPROVE database
  - CAPITA, a warehouse for global surface observation data available every six hours
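Integrating visual range observations with extinction coefficients requires putting both on a common scale. A common way to do this, assumed here and not stated in the slides, is the Koschmieder relation, visual range = K / b_ext, with K = ln(50) (about 3.912) for a 2% contrast threshold. The function names and the sample values are illustrative.

```python
import math

# Koschmieder constant for a 2% contrast threshold: -ln(0.02) = ln(50), about 3.912.
KOSCHMIEDER = math.log(50.0)


def visual_range_km(b_ext_per_km: float) -> float:
    """Convert an extinction coefficient (1/km) to visual range (km)."""
    if b_ext_per_km <= 0:
        raise ValueError("extinction coefficient must be positive")
    return KOSCHMIEDER / b_ext_per_km


def extinction_per_km(visual_range: float) -> float:
    """Convert a visual range (km) to an extinction coefficient (1/km)."""
    if visual_range <= 0:
        raise ValueError("visual range must be positive")
    return KOSCHMIEDER / visual_range


# Example: an extinction coefficient of 0.1 per km gives roughly 39 km of visual range.
vr = visual_range_km(0.1)
```

The relation assumes a uniform atmosphere and a fixed contrast threshold, so it is an approximation rather than an exact equivalence between the two kinds of observation.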

Possible Node in Geography Network
- National Geographic and ESRI are establishing a geography network consisting of distributed spatial databases.
- Some EMPACT projects are participating as nodes in the initial start-up phase.
- The visibility distributed data and analysis system could link to and become another node in the geography network, making use of the geography network’s spatial viewers.
- Other views, such as a time view, could be linked with the spatial viewer to take advantage of the multidimensional visibility data cubes.

Example Viewer
Panels: Map View, Variable View, Time View, WebCam View
The views are linked so that making a change in one view, such as selecting a different location in the map view, updates the other views.
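Linked views of this kind are often built on a shared selection that notifies every view when it changes. The sketch below assumes that observer-style design; the Selection and View classes and the sample state are hypothetical and are not taken from the actual viewer.

```python
class View:
    """A display panel that redraws from the shared selection state."""

    def __init__(self, name):
        self.name = name
        self.shown = None

    def refresh(self, state):
        # A real view would redraw a map or chart; here we just record the state.
        self.shown = dict(state)


class Selection:
    """Shared state (site, variable, time) observed by all attached views."""

    def __init__(self, **state):
        self.state = dict(state)
        self._views = []

    def attach(self, view):
        self._views.append(view)
        view.refresh(self.state)

    def update(self, **changes):
        self.state.update(changes)
        for view in self._views:
            view.refresh(self.state)


selection = Selection(site="STL", variable="visual_range", time="2000-06-01T12:00")
views = [View("map"), View("variable"), View("time"), View("webcam")]
for v in views:
    selection.attach(v)

# Selecting a different location in the map view updates every other view.
selection.update(site="CHI")
```

Routing every change through the shared selection keeps the views independent of one another: adding a fifth view requires no changes to the existing four.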

Summary
In the past, data analysis has been hampered by data flow resistances. Fortunately, the tools and framework to overcome these resistances now exist, including:
- World Wide Web
- XML
- OLAP
- ArcIMS
- Metadata standards
It appears timely to consider a distributed environmental data analysis and dissemination system.