INTEGRATED DATA SYSTEM FOR CRITICAL ZONE OBSERVATORIES Mark Williams, University of Colorado.

Presentation transcript:


The water information value ladder
Monitoring, Collation, Quality assurance, Aggregation, Analysis, Reporting, Forecasting, Distribution
Done poorly / Done poorly to moderately / Sometimes done well, by many groups, but could be vastly improved
>>> Increasing value >>>
Integration: Data >>> Information >>> Insight
Slide courtesy CSIRO, BOM, WMO, Ilya, Dozier

Provenance and transparency

CZOs as platforms for research
– Integrating satellite & ground measurements with modeling
– CZO measurements provide the basis for advances in multiple Earth sciences
– CZOs are DATA-RICH places to develop & test Earth system models

Challenges to CZO Data Management
Many object & data types!
– Atmosphere, Biosphere, Hydrosphere, Lithosphere
– Diverse media
– Sensor-based: stationary, mobile, spectra/photos
– Sample-based: sub-samples, preparations/fractions
– Numeric & categorical
– Spatial scales: hillslope, catchment, watershed
– Time scales: minutes, decades, millennia, eons

Sample Fractions for Soil Geochemistry: Adapting SESAR IGSN for CZO (Christina River CZO example)
Ziplock (~500 g): bulk soil horizon or depth increment
Al can (~70 g): for gamma counting (137Cs)
DRY SIEVE, 2 mm
– <2 mm: glass vial, fines dry sieved
– >2 mm: (1) pick out plant roots & detritus, rinse with DI water, oven dry, mill (SPEX?); glass vial: plant detritus milled. (2) Remaining pebbles & rocks, hard grind; glass vial: pebbles hard ground <2 mm
WET SIEVE, or DENSITY, or SETTLING (with or without sonication)
– glass vial: sand + small detritus
– glass vial: silt + clay
– The choice here is important. Do we want aggregates or not?
Extractions: dithionite-citrate extraction, Na pyrophosphate extraction, ammonium oxalate extraction
Analyses marked on the fractions: EA-IRMS, FTIR, SA, ICP-MS after Li-borate fusion, XRD, CEC (some after SPEX mill)

Overall Approach
Do not reinvent the wheel! Build on CUAHSI HIS, EarthChemDB, LTER, etc.
Consistent data presentation on web
– Metadata
– Data values
Central data system for data discovery
– Harvested by SDSC (pull system)

CZO data principles and policies
Each CZO will operate and be responsible for its own local data management system for collecting, organizing, quality controlling and publishing data through its web site.
– Different philosophy than CUAHSI ODM
– Each CZO is master of its own data
We don’t care what goes on under the hood
Each site uses its own protocols, databases, etc.
Allows CZO to honor site legacy data

CZO data principles and policies
Each CZO publishes its data on the web in ASCII format with sufficient metadata so that the data can be unambiguously interpreted
Metadata follows a prescribed format
– Data managers just need rules to follow
Easy to harvest by central portal
Makes it simple at the site level so scientists comply
– Addresses the chokepoint that is getting data/metadata from the scientists to data managers

Data Management Team
David Tarboton, Utah State. PI on the CUAHSI Hydrologic Information System (HIS)
Kerstin Lehnert, Columbia. PI on EarthChemDB
Ilya Zaslavsky, Lead, SDSC Spatial Information Systems Lab; hosts CUAHSI HIS
Mark Williams, CU-Boulder. PI Niwot Ridge LTER
Anthony Aufdenkampe, co-I Christina River Basin CZO

Integrated CZO data system Synthesizing information management experience and software from CZO partners and neighboring earth science projects into a standards-based system for publishing environmental data to emphasize the “critical zone” nature of our shared data sets

CZO Data Publication System (diagram)
Data types: spatial, hydrologic, geophysical, geochemical, imagery, spectral…
Components: Local CZO DB; Web site; Standard CZO data display formats; Harvester; Archive; Standard CZO Services; Shared vocabularies; CZO Metadata; Ontology; CZO Data Repository and Indexing (CZO Central); CZO Web-based Data Discovery System; CZO Data Products; CZO Desktop Applications (CZO Desktop, Matlab, R, Excel, ArcGIS, Modeling); External cross-project registries (DataNet, NEON)

Data Publication Process (for hydrologic time series)
Diagram elements: CZO Display File, ODM, WaterML Service, OGC WFS Service, CZO Central Catalog, OGC CSW Service, Catalog Search Service, CZO Desktop
– Raw display file metadata is registered with the CZO data portal, to ensure the original data is discoverable and downloadable.
– The WFS service is registered with the CZO data portal.
– The CZO Portal uses the OGC CSW (Catalog Services for the Web).
– The broader internet community accesses data using standard protocols.

CZO data interoperability: what does it mean?
Levels of interoperability
– Find and download CZO resources: files and file collections, services, documents, organized by CZO thematic category and by type
– Data available in compatible semantics: ontologies, controlled vocabularies
– Data available via the same service interfaces (e.g. WFS, SOS) but different information models
– Compatibility at the level of domain information models and databases
Diagram axes: Deeper integration; Wider variety of data
Data labels: Well-understood data with formal information models available via standard services; Different types of data collected by CZOs
System components: Data discovery portal; Shared vocabularies and ontology management; Service administration (CZOCentral); CZO desktop, others

Data disclaimer

Data Catalogue
Biogeochemistry, including: anything on C (carbon), N (nitrogen), P (phosphorus), nutrients, microbes
Climatology/Meteorology, including: met tower, temps, snow
Ecology/Biology, including: microbial, land use
Geology/Chronology, including: geologic, descriptions of rocks-mineralogy, CRN ages/rates
Geomorphology, including: topography, chronological data, sediment flux, fracture space
Geophysics, including: seismic refraction, etc.
Geospatial, including: GIS/RS, imagery, geologic map, Gordon Gulch and GLV cameras

Water Chemistry
ASCII format, metadata and comma-delimited data. Automatically harvested using WaterML and EML.
Header group (/doc):
– Title, Abstract, Investigator, Variable names, Keywords, Methods, Instrument, Citation, Publications, Comments
Header group, column information:
– COL1. label=ValueAttribute, value=site
– COL2. label=ValueAttribute, value=DateTime, UTCOffset=-7, Timezone=MST, format=”YYYYMMDD hh:mm”
– COL3. label=ValueAttribute, value=pH, units=pH units, SampleMedium=water, missing value indicator=,, methods=method1, etc.
Header group, column (series) defaults that apply to all columns (e.g. site below)
Data (/data):
GREENLAKE4,820311,6.4,18,88.51,0.40,,114.77,24.68,21.75,10.23,25.389,,58.296, ,,,,,,,,,,,,,,,,,,
GREENLAKE4,820422,5.7,18,90.15,2.00,,99.80,24.68,17.40,12.79,9.591,,72.870, ,,,,,,,,,,,,,,,,,,
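A minimal sketch of how a harvester might split such a display file into header metadata and data rows. The `/doc` and `/data` markers and the three sample values come from the slide; the one-entry-per-line `key=value` header layout, the `Title` text, and the function name `parse_display_file` are assumptions made for illustration, not the CZO display file specification.

```python
import csv
import io

# Invented miniature display file; only the /doc and /data markers and the
# first three values of each data row are taken from the slide.
SAMPLE = """/doc
Title=Example water chemistry
COL1=site
COL2=DateTime
COL3=pH
/data
GREENLAKE4,820311,6.4
GREENLAKE4,820422,5.7
"""


def parse_display_file(text):
    """Split a display file into (header dict, list of row dicts)."""
    header, data_lines, section = {}, [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line in ("/doc", "/data"):
            section = line
        elif section == "/doc":
            key, _, value = line.partition("=")
            header[key.strip()] = value.strip()
        elif section == "/data":
            data_lines.append(line)
    # Column names come from the COLn header entries, in numeric order.
    col_keys = sorted(
        (k for k in header if k.startswith("COL")), key=lambda k: int(k[3:])
    )
    columns = [header[k] for k in col_keys]
    rows = []
    for fields in csv.reader(io.StringIO("\n".join(data_lines))):
        # Empty fields are treated as missing values (None).
        rows.append({c: (v if v != "" else None) for c, v in zip(columns, fields)})
    return header, rows


header, rows = parse_display_file(SAMPLE)
```

Keeping the header and the data in one plain-text file is what lets a central harvester interpret the values without access to the site's internal database.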

CZO Data Management Web Administration Interface
CZO data managers use this web-based system to register display files, edit service metadata, initiate data retrieval, validate the data against shared vocabularies, and update hydrologic time series services.
The administration system will be extended to geochemical samples and other data.

Services edited and validated by CZO data managers
Data managers control how their data is annotated.
Ingestion of display files is triggered on the server by the data manager.
Screenshot labels: display file ingestion log; editable service definitions and management interface for each CZO data service
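The ingestion log mentioned on this slide could be sketched roughly as follows. This is an illustration under stated assumptions, not the CZO Central implementation: the entry fields, the SHA-256 change check, the function name `ingest`, and the `example.org` URL are all hypothetical.

```python
import hashlib
from datetime import datetime, timezone


def ingest(url, content, log):
    """Append one ingestion-log entry.

    The file is marked 'unchanged' when its hash matches the most recent
    successful ingest of the same URL, otherwise 'ingested'.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    ingested = [e for e in log if e["url"] == url and e["status"] == "ingested"]
    status = "unchanged" if ingested and ingested[-1]["sha256"] == digest else "ingested"
    entry = {
        "url": url,
        "sha256": digest,
        "status": status,
        "time": datetime.now(timezone.utc).isoformat(),
    }
    log.append(entry)
    return entry


# Hypothetical display file URL; the content is passed in directly so the
# sketch runs without network access.
URL = "https://example.org/czo/ph.csv"
log = []
first = ingest(URL, "GREENLAKE4,820311,6.4\n", log)
second = ingest(URL, "GREENLAKE4,820311,6.4\n", log)
third = ingest(URL, "GREENLAKE4,820311,6.4\nGREENLAKE4,820422,5.7\n", log)
```

Recording a hash and timestamp for every attempt gives the data manager a simple provenance trail for each registered file.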

CZO Central Catalog Statistics, March 24, 2011 (time series services only)
Columns: CZO Service, Sites, Variables, Values
Rows: Jemez River, Boulder Creek, Santa Catalina, Luquillo, Southern Sierra, Shale Hills, Christina River, Total

New Development: Central CZO Data Discovery Portal Registered data are organized by CZO thematic categories

– Display files from CZO web sites are registered to the data discovery portal automatically
– In addition, display files of known types are expressed as data services, which are also registered in the portal
– The portal is CSW-compliant (CSW = Catalog Services for the Web): it can be federated with other catalogs, including data.gov
– Supports search by location, resource type, thematic category, keywords, plus full-text abstract search
– Federation with CUAHSI HydroCatalog, to allow search of hydrologic data from ~70 networks
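Because the portal is described as CSW-compliant, a keyword search could in principle be issued as a standard CSW 2.0.2 `GetRecords` request. The sketch below only builds the request URL using key-value parameters from the OGC CSW 2.0.2 specification as commonly implemented; the endpoint `https://example.org/csw` is a placeholder, and whether the CZO portal accepted exactly this form is an assumption.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_csw_search_url(endpoint, keyword, max_records=10):
    """Build a CSW 2.0.2 GetRecords URL for a full-text keyword search.

    The keyword is inserted into a CQL text constraint as-is, so it must not
    contain single quotes.
    """
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"csw:AnyText LIKE '%{keyword}%'",
        "maxRecords": str(max_records),
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint, not the real CZO catalog address.
url = build_csw_search_url("https://example.org/csw", "snow")
query = parse_qs(urlparse(url).query)
```

Using the same catalog protocol as other registries is what makes the federation mentioned on the slide possible: any CSW client can issue this kind of query.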

Shared Vocabulary (diagram)
Data types: spatial, hydrologic, geophysical, geochemical, imagery, spectral…
Components: Local CZO DB; Web site; Shared Vocabulary; Shared vocabularies; CZO Metadata; Ontology; Archive; Harvester; Standard CZO data display formats; CZO Desktop Applications (CZO Desktop, Matlab, R, Excel, ArcGIS, Modeling); CZO Data Products; CZO Web-based Data Discovery System; External cross-project registries (DataNet); CZO Data Repository and Indexing (CZO Central)

CZO Shared Vocabulary System Purpose: To promote the consistent use of terminology. Builds on CUAHSI HIS

Shared vocabulary (SV) workflow (diagram, steps ❶ to ❸)
Labels: SV Database; Data Managers and SV Data Managers; Local CZO Website; Observation Database; CSV Data File; Unknown Term; Request Term Web Page; XML SV List

Preferred vocabularies
Moderators to be designated by CZO with expertise in each category
– Variable names (extended from CUAHSI HIS)
– Units (extended from CUAHSI HIS) (e.g. m, g/L)
– Value type (from CUAHSI HIS) (e.g. field observation, derived value, model output)
– Sample type (from CUAHSI HIS) (e.g. stream water, ground water, rock, soil)
– Data type (from CUAHSI HIS) (e.g. average over interval, cumulative, continuous, sporadic)
– Data level (based on Ameriflux list) (e.g. level 0 = raw data, level 4 = fully infilled and quality controlled)
– Spatial references (extensible based on EPSG) (e.g. NAD 1983, WGS84, UTM zone 11)
KEY: CZO expands ODM controlled vocabularies to a larger audience using “preferred vocabularies”
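Validation against preferred vocabularies can be sketched as a simple lookup that reports terms a moderator would need to review. The example terms below are the ones listed on the slide; the dictionary layout, the case-insensitive matching, and the function name `find_unknown_terms` are assumptions for illustration and not the shared vocabulary system's actual interface.

```python
# Small excerpt of preferred vocabularies, using only example terms from the slide.
PREFERRED = {
    "units": {"m", "g/L"},
    "value_type": {"field observation", "derived value", "model output"},
    "sample_type": {"stream water", "ground water", "rock", "soil"},
}


def find_unknown_terms(record, vocab=PREFERRED):
    """Return (category, term) pairs that are not in the preferred vocabulary.

    Matching ignores case and surrounding whitespace. Categories without a
    vocabulary are skipped rather than flagged.
    """
    unknown = []
    for category, term in record.items():
        allowed = vocab.get(category)
        if allowed is None:
            continue
        if term.strip().lower() not in {t.lower() for t in allowed}:
            unknown.append((category, term))
    return unknown


# Hypothetical metadata record from a display file.
record = {"units": "g/L", "value_type": "Field observation", "sample_type": "snow"}
unknown = find_unknown_terms(record)
```

In this sketch the unknown term `snow` would be the kind of entry a data manager submits through a term request for a moderator to accept or map to an existing term.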

Methods
1. Major problem for metadata
2. Solution: lookup table that is part of the controlled vocabulary
3. Three parts: sample collection, sample preparation, analytical procedure
4. Up and running, needs moderators

CZO Spatial Data (diagram)
Data types: spatial, hydrologic, geophysical, geochemical, imagery, spectral…
Components: Local CZO DB; Web site; Spatial Data; Shared vocabularies; CZO Metadata; Ontology; Archive; Harvester; Standard CZO data display formats; CZO Desktop Applications (CZO Desktop, Matlab, R, Excel, ArcGIS, Modeling); Standard CZO Services; CZO Web-based Data Discovery System; CZO Data Repository and Indexing (CZO Central)

Metadata and Spatial View
Spatial View Metadata
– Multi-file control
Spatial Extent
– Ex: LiDAR flights, transects, etc.
– Point data (collected at particular location)
– Uses Google Maps API
– KML functionality
Guo lab, UC Merced

Geochemical Samples (based on CZEN) (diagram)
Components: Local CZO DB; Web site; Geochemical samples; Geochemical web services, EarthChemDB; Shared vocabularies; Metadata; IGSN management; Archive; Harvester; Standard CZO data display formats; CZO Desktop Applications (CZO Desktop, Matlab, R, Excel, ArcGIS, Modeling); Depth-resolved geochemistry; CZO Web-based Geochemical DB; EarthChem Data Engine & Portal

CZO Chemistry Database Conceptual Model (CZO CHEM DB), Penn State lead
Diagram entities: Location (Watershed); Sampling Site (Soil / Water); Sample (Layer/Depth); Preparation/Treatment; Sub-sample (Sub-smpl 2 … Sub-smpl n); Analysis (Chemical, Phys., Minr., Others); Lab Analysis; Data; Loc_info/Climate; Methods; Sources; Precision; Var-Lookup/Unit; Meta-Data; Main Data; Geo-Info; Publication; Project; SMPL Time Series; Landuse/Veg.; Lab-Info; Person contributor; Country/State

Progress
– Database is accessible at
– PSU CZO students and post-docs have used template for data entry
– Susan Melzar (Colorado State) has used template and data has been entered into database
– Published data from Muhs et al. (2001), Harden (1987), White et al. (2008)
– Current version contains 1391 records, representing 17,604 data values
– Ran webinar August 24th to show database capabilities and usage of data entry template
– 15 participated with representation from all 6 CZOs
– User guide is in progress

Integration with EarthChemDB (Kerstin Lehnert)
Diagram labels: EarthChem XML; DB; Metadata catalog; datasets (original data & derived products); GCDM DB; EarthChem Portal; USGS; NAVDAT; GEOROC; GfG Data Entry; User Submission; External Databases; Topical Data Collections; Geochemical Resource Library

EarthChem Portal
Diagram labels: PetDB, USGS, GEOROC, NAVDAT, Others; XML; EarthChem Data Engine Database; EarthChem Data Engine Search & Visualization
Partner databases encode their data & metadata in XML and send them to the EarthChem portal database in Kansas. Queries submitted at the EarthChem portal search the contents of the EarthChem Portal Database.
Similar to our ODM hydrology portal

INTERNATIONAL GEOSAMPLE NUMBER
Purpose: Unique identification for samples and related sampling features in the Earth Sciences
– To allow unambiguous referencing of data to samples in publications and data systems
– To allow tracking samples through repositories & labs
– To allow integration of distributed data for samples

Geoinformatics for Geochemistry
Diagram: parent-child sample hierarchy. Core Sections 1, 2 and 3; their samples (Sample 1, Sample 2, Sample 3); and derived materials (rock powder, mineral conc., leachate, fossil separate, microprobe mount).
Identifiers shown: IGSN:XXX IGSN:XXX0065B3 IGSN:XXX9K23G6 IGSN:XXX07ST4K IGSN:XYZ0G693M IGSN:ABC0L98SW IGSN:ABC0L53NW IGSN:ABC0L653X IGSN:ABC078HGB

IGSN International Organization (diagram)
Tier labels: Managing Agent; Registration Agents; Registrants
Boxes: IGSN International Organization; SESAR; Near Space Observatory (invented example); ExoPlanet (invented example); CZO; Geoscience Australia; USGS; IEDA; ICDP; Repository; Analytical Lab; Investigator; Registrar

ADAPTING IGSN for CZO
– Register any type of sample: pedons, hand specimens, mineral concentrates, etc.
– Register any type of material: soil, rock, sediment, fluid, gas, bio …
– Register ‘sample-related features’: sites, wells, cores, dredges …
– Register relations (parent – children): e.g. site → pedon → mineral
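The parent-child registration described here can be sketched as a small in-memory registry that records each identifier with its parent and can trace a sample back to its site. The class `SampleRegistry` and the identifiers `SITE-1`, `PEDON-1` and `MINERAL-1` are made up for illustration; they are not valid IGSNs and this is not the SESAR registration API.

```python
class SampleRegistry:
    """Toy registry mapping each identifier to its parent identifier (or None)."""

    def __init__(self):
        self._parent = {}

    def register(self, identifier, parent=None):
        """Register a sample or sampling feature, optionally under a parent."""
        if identifier in self._parent:
            raise ValueError(f"{identifier} already registered")
        if parent is not None and parent not in self._parent:
            raise KeyError(f"unknown parent {parent}")
        self._parent[identifier] = parent

    def lineage(self, identifier):
        """Return identifiers from the root feature down to the given one."""
        chain = []
        while identifier is not None:
            chain.append(identifier)
            identifier = self._parent[identifier]
        return chain[::-1]

    def children(self, identifier):
        """Return the sorted direct children of an identifier."""
        return sorted(k for k, p in self._parent.items() if p == identifier)


# Hypothetical identifiers following the slide's site -> pedon -> mineral example.
registry = SampleRegistry()
registry.register("SITE-1")
registry.register("PEDON-1", parent="SITE-1")
registry.register("MINERAL-1", parent="PEDON-1")
chain = registry.lineage("MINERAL-1")
```

Requiring the parent to exist before a child is registered keeps every fraction traceable to the feature it was taken from.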

Exploring A More General Data Model: ODM 2.0
– To achieve interoperability between EarthChem, CUAHSI ODM, LTER EML
– Better support for samples and unique identifiers (IGSN/SESAR)
– Extensibility to table attributes
– Better annotation and provenance
– Enable integrated web service based publication of a broader class of CZO data

ODM 2.0 – Field Sensor
Extension to support field sensor deployments and in situ observations
– Sensor deployment details
– Attributes of sensor
– Data series from sensor

ODM 2.0 – Provenance and Annotations Extensions Better support for storing provenance of observational data

General Extensibility
Provides capability to record information (add fields) in tables that was not anticipated a priori
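One common way to record unanticipated fields without altering a core table is a key-value extension table. The sketch below shows that general pattern with SQLite; the table and column names (`samples`, `sample_extension`, `property`, `value`) and the example property are hypothetical and are not taken from the ODM 2.0 schema.

```python
import sqlite3

# In-memory database with one core table and one key-value extension table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE samples (
        sample_id INTEGER PRIMARY KEY,
        label     TEXT NOT NULL
    );
    CREATE TABLE sample_extension (
        sample_id INTEGER NOT NULL REFERENCES samples(sample_id),
        property  TEXT NOT NULL,
        value     TEXT,
        PRIMARY KEY (sample_id, property)
    );
    """
)
conn.execute("INSERT INTO samples (sample_id, label) VALUES (1, 'bulk soil horizon')")
# A property nobody planned for is added as a row, not as a new column.
conn.execute("INSERT INTO sample_extension VALUES (1, 'sieve_size_mm', '2')")


def extension_properties(conn, sample_id):
    """Return the extension properties of one sample as a dict."""
    rows = conn.execute(
        "SELECT property, value FROM sample_extension "
        "WHERE sample_id = ? ORDER BY property",
        (sample_id,),
    )
    return dict(rows.fetchall())


props = extension_properties(conn, 1)
```

The trade-off of this pattern is that extension values are stored as text and are not type-checked by the database, so any validation has to happen in the application or against a controlled vocabulary.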

Integrated geochemistry architecture (diagram)
Labels: CZchemDB; CZO-Central; GeoChemDB [ODM 2.0]; CZO-Services; EarthChem Portal; USGS; NAVDAT; GEOROC; Geochemical database; EarthChemXML; CZO Data Display Format; Geochem Services (IEDA); CZO Web Discovery; GeoChemDB Search; Web-based User Access; CZO Desktop; GfG Data Validation & Ingest; IEDA Long-Term Archiving Service; IEDA Data Publication Service (DataCite); SESAR Sample Registration; Other client systems

Where we are today
– Each site has a data manager
– Data sets are posted to the web; consistent metadata and ASCII format in progress
– We’ve prototyped harvesting data and posting to a central data portal
– Shared vocabulary system in place
– Developed protocol for unique sample ID
– Partnering with EarthChemDB
– Expanding ODM to become more general
Way beyond what I thought possible

Work plan for next two years
Extending the CZO data publication model to geochemical and GIS data; then to other types of data – towards deeper interoperability
Integration based on service and information model standards (WaterML, EarthChemXML, EML, OGC services)
– Requirements gathering from all CZOs, data modeling, display file format specification, services specification, development and validation
– Upgrade to WaterML 2 once approved as international standard (~Q3, 2011)
Registering more hydrologic time series data via CZO Central
– Regularly harvesting registered files and updating CZO services; keeping provenance information
Enhancing parameter-based search across CZOs, with a shared parameter ontology
Making CZO central data system more robust
– Currently a single server with 24/7 monitoring; need redundant setup
Enhancing role of Data Managers