C van Ingen, D Agarwal, M Goode, J Gupchup, J Hunt, R Leonardson, M Rodriguez, N Li. Berkeley Water Center, Johns Hopkins University, Lawrence Berkeley Laboratory.

Presentation transcript:

C van Ingen, D Agarwal, M Goode, J Gupchup, J Hunt, R Leonardson, M Rodriguez, N Li
Berkeley Water Center; Johns Hopkins University; Lawrence Berkeley Laboratory; Microsoft Research; University of California, Berkeley

Introduction
Over the past year, we've been exploring how to build and use a digital watershed in the cloud
Our focus is enabling end-user analysis
Assumes data access will get better (thanks to CUAHSI and others)
Bottom-up approach: start with the database and build up to the tool
Just-in-time approach: build tools to solve science needs
In the cloud, to free the scientist from any operational issues associated with the technology we use

Hydrologic Data Analysis Pipeline
Challenge is to Connect Data, Resources, and People
[Diagram labels: Distributed Data Sets; Analysis Gateway; Data Gateway; Models, Analysis Tools; Knowledge discovery, Hypothesis testing, Water Synthesis; Dissemination; Data Archive; Data Transformations]

Data Flow Pipeline
Agency web site, streaming sensor data, or other source
-> CSV Files
-> BWC SQL Server Database
-> BWC Data Cube
-> Reports, Excel Pivot Table, MATLAB, ArcGIS
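The first stages of this flow (source export, CSV file, relational database, report) can be sketched in a few lines. This is a minimal illustration only, not the BWC implementation: it uses Python's built-in sqlite3 as a stand-in for SQL Server, and the column names, site names, and values are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export from an agency site: one row per observation.
CSV_TEXT = """site,variable,timestamp,value
SiteA,discharge,2007-10-01T00:00,12.5
SiteA,discharge,2007-10-01T01:00,13.0
SiteB,discharge,2007-10-01T00:00,4.2
"""


def load_csv(conn, text):
    """Stage CSV rows into one observations table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "site TEXT, variable TEXT, timestamp TEXT, value REAL)"
    )
    rows = [
        (r["site"], r["variable"], r["timestamp"], float(r["value"]))
        for r in csv.DictReader(io.StringIO(text))
    ]
    conn.executemany("INSERT INTO observations VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def daily_mean(conn, variable):
    """Report-style query: mean value per site per calendar day."""
    cur = conn.execute(
        "SELECT site, substr(timestamp, 1, 10) AS day, AVG(value) "
        "FROM observations WHERE variable = ? "
        "GROUP BY site, day ORDER BY site, day",
        (variable,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the database
loaded = load_csv(conn, CSV_TEXT)
report = daily_mean(conn, "discharge")
```

In the flow described on the slide, the report step would be served by the data cube and client tools such as Excel rather than by direct SQL.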

Key Schema Abstractions
Data, ancillary data, and metadata
  – Analyses often require combining time series data with fixed, or nearly fixed, ancillary data such as river mile, vegetative cover, sediment grain size
  – Ancillary data used as fixed property, time series, or event time window
  – Metadata describing algorithms, measurement techniques, etc.
  – Normalized table structure simplifies adding variables and cube building
Versioning and folder-like collections
  – Accommodate algorithm changes, temporal granularity and derived quantities
  – Track derivations through processing pipeline
  – Define and track analysis “working set”
Namespace translation
  – Data assembly traverses different repositories, each with its own (useful?) name space
  – Some repositories encode metadata in variable name space (e.g. USGS turbidity)
Any access layer shares the same abstractions.
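Namespace translation as described above can be pictured as a lookup from a repository-specific variable name to a canonical name, with any metadata encoded in the source name decoded along the way. The sketch below is an assumption-laden illustration: the repository names, source variable names, and metadata fields are all hypothetical and do not reflect actual USGS or BWC naming.

```python
# Hypothetical translation table:
# (repository, source variable name) -> (canonical name, decoded metadata)
TRANSLATION = {
    ("agency_a", "turbidity_ntu_sensor1"): (
        "turbidity",
        {"units": "NTU", "method": "sensor1"},
    ),
    ("agency_b", "TURB"): ("turbidity", {}),
    ("agency_b", "Q"): ("discharge", {}),
}


def translate(repository, source_name):
    """Map a repository-specific variable name to the canonical name space.

    Unknown names raise KeyError so they are noticed during data assembly
    instead of being loaded under an untranslated name.
    """
    key = (repository, source_name)
    if key not in TRANSLATION:
        raise KeyError(f"no translation for {source_name!r} in {repository!r}")
    canonical, metadata = TRANSLATION[key]
    return canonical, dict(metadata)  # copy so callers cannot edit the table


name_a, meta_a = translate("agency_a", "turbidity_ntu_sensor1")
name_b, meta_b = translate("agency_b", "TURB")
```

Here two differently named source series resolve to the same canonical variable, which is what lets a single access layer treat them alike.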

Database Schema Subset
Star schema for data similar to CUAHSI ODM
Ancillary data shredded like data
  – Active over a time range
  – Numeric or text
  – Flows to the data cube as site attribute or time series data
Two-level versioning maps to data sourcing
  – Bound into a dataset version with spline filter
  – Only the dataset flows to the data cube
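One way to picture a star-like layout with ancillary data that is active over a time range and either numeric or text is sketched below. It is not the BWC or CUAHSI ODM schema: every table name, column name, and value is hypothetical, versioning is omitted, and sqlite3 stands in for SQL Server.

```python
import sqlite3

# Hypothetical tables: two dimensions, one fact table, one ancillary table.
SCHEMA = """
CREATE TABLE site (site_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE variable (variable_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE data (
    site_id INTEGER REFERENCES site(site_id),
    variable_id INTEGER REFERENCES variable(variable_id),
    timestamp TEXT,
    value REAL
);
CREATE TABLE ancillary (
    site_id INTEGER REFERENCES site(site_id),
    name TEXT,
    active_from TEXT,   -- inclusive, ISO 8601
    active_to TEXT,     -- exclusive, ISO 8601
    numeric_value REAL, -- set for numeric ancillary data
    text_value TEXT     -- set for text ancillary data
);
"""


def ancillary_at(conn, site_id, name, timestamp):
    """Return the ancillary value active at `timestamp`, or None."""
    row = conn.execute(
        "SELECT numeric_value, text_value FROM ancillary "
        "WHERE site_id = ? AND name = ? "
        "AND active_from <= ? AND ? < active_to",
        (site_id, name, timestamp, timestamp),
    ).fetchone()
    if row is None:
        return None
    numeric_value, text_value = row
    return numeric_value if numeric_value is not None else text_value


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO site VALUES (1, 'SiteA')")
conn.executemany(
    "INSERT INTO ancillary VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "river_mile", "1900-01-01", "9999-12-31", 42.0, None),
        (1, "vegetative_cover", "2000-01-01", "2005-01-01", None, "grass"),
        (1, "vegetative_cover", "2005-01-01", "9999-12-31", None, "shrub"),
    ],
)
cover_2007 = ancillary_at(conn, 1, "vegetative_cover", "2007-10-01")
river_mile = ancillary_at(conn, 1, "river_mile", "2007-10-01")
```

The half-open range (inclusive start, exclusive end) is a design choice in this sketch so that consecutive ranges never overlap; ISO 8601 date strings compare correctly as text.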

Datacube Basics
A data cube is a database specifically for data mining (OLAP)
  – Organizes data along dimensions such as time, site, or variable type
  – Easy to group, filter, and aggregate data in a variety of ways
  – Simple aggregations such as sum, min, or max can be pre-computed for speed
  – Additional calculations such as median can be computed dynamically
SQL Server Analysis Services (SSAS) provides the OLAP engine
SQL Server Business Intelligence Development Studio is used to define and tune the cube
Excel and other client tools enable simple browsing
  – Minimizes the total software burden of writing queries (SQL or MDX)
[Figure: Discharge and Turbidity variability]
[Figure: Daily Discharge Availability by Site by Year. Each bar is a count of data points, color-coded by reporting per year. The higher the bar, the more reported data.]
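The distinction between pre-computed and dynamically computed aggregates can be illustrated without an OLAP engine. The sketch below is plain Python, not SSAS or MDX, and the fact rows are invented: it pre-aggregates count, sum, min, and max per cell for a chosen set of dimensions, and computes the median on demand from the underlying rows.

```python
from statistics import median

# Hypothetical fact rows: (site, year, variable, value)
FACTS = [
    ("SiteA", 2006, "discharge", 10.0),
    ("SiteA", 2006, "discharge", 14.0),
    ("SiteA", 2007, "discharge", 9.0),
    ("SiteB", 2006, "discharge", 3.0),
    ("SiteB", 2006, "turbidity", 5.0),
]
DIMENSIONS = ("site", "year", "variable")


def precompute(facts, dims):
    """Pre-aggregate count, sum, min, and max for each cell over `dims`."""
    idx = [DIMENSIONS.index(d) for d in dims]
    cells = {}
    for row in facts:
        key = tuple(row[i] for i in idx)
        value = row[-1]
        cell = cells.get(key)
        if cell is None:
            cells[key] = {"count": 1, "sum": value, "min": value, "max": value}
        else:
            cell["count"] += 1
            cell["sum"] += value
            cell["min"] = min(cell["min"], value)
            cell["max"] = max(cell["max"], value)
    return cells


def cell_median(facts, dims, key):
    """Median needs the raw values, so it is computed on demand."""
    idx = [DIMENSIONS.index(d) for d in dims]
    values = [row[-1] for row in facts if tuple(row[i] for i in idx) == key]
    return median(values)


# Count of data points by site by year, in the spirit of an availability chart.
by_site_year = precompute(FACTS, ("site", "year"))
site_a_median = cell_median(FACTS, ("site",), ("SiteA",))
```

Sum, min, and max can be merged cell by cell, which is why they are cheap to pre-compute; the median cannot be derived from those summaries, so it goes back to the rows.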

Learnings and Observations
Simplifying data discovery speeds analysis
  – Discovery is a necessary precursor step to analysis
  – What data where when? At what quality?
Versioning is critical
  – Site-variable most naturally maps to analysis patterns
  – Dataset too coarse; individual measurement too fine
  – Ancillary data must be versioned as well
Matching the scientific notion of time to commercial tools can be problematic
  – Second month of water year has 30 days in US
  – MODIS week
  – Granularity widely varying
Plan on decode stage for name, location, time, quality
Don’t forget historic (non-digital) data
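The water-year remark can be made concrete. The US water year runs from 1 October to 30 September and is named for the calendar year in which it ends, so its second month is November. A minimal sketch of the mapping follows; the function names are hypothetical and the code is illustrative, not part of the BWC tooling.

```python
import calendar
from datetime import date


def water_year(d):
    """US water year containing date `d` (named for the year it ends in)."""
    return d.year + 1 if d.month >= 10 else d.year


def water_year_month(d):
    """Month index within the water year: October is 1, September is 12."""
    return (d.month - 10) % 12 + 1


def days_in_water_year_month(wy, month_index):
    """Number of days in the given month (1..12) of water year `wy`."""
    cal_month = (month_index + 8) % 12 + 1  # 1 -> October, 4 -> January
    cal_year = wy - 1 if cal_month >= 10 else wy
    return calendar.monthrange(cal_year, cal_month)[1]


d = date(2007, 11, 15)
wy = water_year(d)
month_index = water_year_month(d)
november_days = days_in_water_year_month(wy, month_index)
```

A calendar dimension built on January-to-December years would place this date in 2007, month 11, which is the kind of mismatch a time dimension for hydrologic analysis has to account for.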