How to share and publish your data using HIS David G Tarboton Jeff Horsburgh Ilya Zaslavsky Tom Whitenack David Valentine Support EAR 0622374

1 How to share and publish your data using HIS David G Tarboton Jeff Horsburgh Ilya Zaslavsky Tom Whitenack David Valentine Support EAR

2 Outline
– CUAHSI HIS data publication system
– Observations Data Model (ODM)
– Using ODM: SQL Management Studio, ODM Tools, ODM Data Loader, Streaming Data Loader, controlled vocabulary editing, SSIS for data loading
– HIS Central Data Publication Workflow
– WaterML and configuring WaterML web services over ODM
– Registration of your data service in HIS Central
Presenters: Tarboton, Horsburgh, Tarboton, Zaslavsky, Valentine, Whitenack
Durations: 5 min, 20 min, 40 min, 10 min, 15 min, 20 min

3 HIS Data Publication System
– Data collection: Sensors, Telemetry Network, Base Station Computer(s)
– Loading: Excel, Text, ODM Data Loader, Streaming Data Loader
– Storage: ODM Database; query, visualize, and edit data using ODM Tools
– Publication: WaterOneFlow Web Service (GetSites, GetSiteInfo, GetVariableInfo, GetValues), WaterML
– HIS Central: Water Metadata Catalog, Harvester, Service Registry, Hydrotagger
– Discovery: Hydroseek
– Access: HydroExcel, HydroGet, HydroLink, HydroObjects
– Analysis: GIS, Matlab, Splus, R, IDL, Java, C++, VB
– Contribute your ODM

4 Steps in publishing data
1. Server setup or access
2. Store observations in ODM
3. Provide access to data through web services (http:// / /cuahsi_1_0.asmx?WSDL)
4. Index the resulting water data service at HIS Central (http://hiscentral.cuahsi.org)

5 Why an Observations Data Model Syntactic heterogeneity (File types and formats) Semantic heterogeneity –Language for observation attributes (structural) –Language to encode observation attribute values (contextual) Publishing and sharing research data Metadata to facilitate unambiguous interpretation Enhance analysis capability

6 Scope Focus on Hydrologic Observations made at a point Exclude Remote sensing or grid data. These are part of a digital watershed but not suitable for an atomic database model and individual value queries Primarily store raw observations and simple derived information to get data into its most usable form. Limit inclusion of extensively synthesized information and model outputs at this stage.

7 What are the basic attributes to be associated with each single data value and how can these best be organized? Value DateTime Variable Location Units Interval (support) Accuracy Offset OffsetType/ Reference Point Source/Organization Censoring Data Qualifying Comments Method Quality Control Level Sample Medium Value Type Data Type

8 Point Observations Information Model
Hierarchy: Data Source → Network → Sites → Variables → Values {Value, Time, Qualifier, Offset}
Example: Utah State Univ → Little Bear River → Little Bear River at Mendon Rd → Dissolved Oxygen → 9.78 mg/L, 1 October 2007, 6PM
– A data source operates and provides data to an observation network
– A network is a set of observation sites (stored in a single ODM instance)
– A site is a point location where one or more variables are measured
– A variable is a measured property (e.g. describing the flow or quality of water)
– A value is an observation of a variable at a particular time
– A qualifier is a symbol that provides additional information about the value
– An offset allows specification of measurements at various depths in water
Web service methods: GetSites, GetSiteInfo, GetVariableInfo, GetValues
Horsburgh, J. S., D. G. Tarboton, D. R. Maidment and I. Zaslavsky, (2008), "A Relational Model for Environmental and Water Resources Data," Water Resour. Res., 44: W05406, doi: /2007WR
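The hierarchy on this slide can be sketched as plain Python data classes. This is only an illustration of the information model, not the ODM schema: the class names, field names, and the `build_example` helper are assumptions chosen to mirror the slide's wording, and the example data is the dissolved oxygen observation quoted on the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Value:
    """One observation of a variable at a particular time."""
    value: float
    time: datetime
    qualifier: Optional[str] = None  # symbol giving extra information about the value
    offset: Optional[float] = None   # e.g. depth in water at which it was measured


@dataclass
class Variable:
    """A measured property, with the values observed for it at one site."""
    name: str
    units: str
    values: List[Value] = field(default_factory=list)


@dataclass
class Site:
    """A point location where one or more variables are measured."""
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class Network:
    """A set of observation sites."""
    name: str
    sites: List[Site] = field(default_factory=list)


@dataclass
class DataSource:
    """An organization that operates and provides data to a network."""
    name: str
    networks: List[Network] = field(default_factory=list)


def build_example() -> DataSource:
    """Build the example from the slide (hypothetical helper)."""
    dissolved_oxygen = Variable(
        "Dissolved Oxygen", "mg/L",
        [Value(9.78, datetime(2007, 10, 1, 18, 0))],
    )
    site = Site("Little Bear River at Mendon Rd", [dissolved_oxygen])
    network = Network("Little Bear River", [site])
    return DataSource("Utah State Univ", [network])
```

Walking from the data source down to a single value follows the same path as the GetSites, GetSiteInfo, GetVariableInfo, and GetValues sequence named on the slide.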

9 CUAHSI Observations Data Model Streamflow Flux tower data Precipitation & Climate Groundwater levels Water Quality Soil moisture data Variables Space Time A relational database at the single observation level (atomic model) Stores observation data made at points Metadata for unambiguous interpretation Traceable heritage from raw measurements to usable information Standard format for data sharing Cross dimension retrieval and analysis

10 CUAHSI Observations Data Model

11 Site Attributes
– SiteCode, e.g. NWIS:
– SiteName, e.g. Logan River Near Logan, UT
– Latitude, Longitude: geographic coordinates of site
– LatLongDatum: spatial reference system of latitude and longitude
– Elevation_m: elevation of the site
– VerticalDatum: datum of the site elevation
– Local X, Local Y: local coordinates of site
– LocalProjection: spatial reference system of local coordinates
– PosAccuracy_m: positional accuracy
– State, e.g. Utah
– County, e.g. Cache

12 Observations Data Model: independent of, but can be coupled to, Geographic Representation [Figure: ODM Sites table (SiteID, SiteCode, SiteName, Latitude, Longitude, …) and a CouplingTable (SiteID, HydroID) with 1:1 relationships linking ODM to Arc Hydro]

13 Variable attributes
– VariableName, e.g. discharge
– VariableCode, e.g. NWIS:0060
– Speciation, e.g. N in “Nitrogen, NH4 as N”
– SampleMedium, e.g. water
– ValueType, e.g. field observation, laboratory sample
– IsRegular, e.g. Yes for regular or No for intermittent
– TimeSupport (averaging interval for observation)
– DataType, e.g. Continuous, Instantaneous, Categorical
– GeneralCategory, e.g. Climate, Water Quality
– NoDataValue, e.g.
– Units, e.g. m³/s (Flow, cubic meters per second)

14 Scale issues in the interpretation of data. The scale triplet: (a) extent, (b) spacing, (c) support [Figure: three panels, each plotting quantity against length or time]. From: Grayson, R. and G. Blöschl, ed. (2000), Spatial Patterns in Catchment Hydrology: Observations and Modelling, Cambridge University Press, Cambridge, 432 p.

15 The effect of sampling for measurement scales not commensurate with the process scale: (a) spacing too large – noise (aliasing); (b) extent too small – trend; (c) support too large – smoothing out. From: Grayson, R. and G. Blöschl, ed. (2000), Spatial Patterns in Catchment Hydrology: Observations and Modelling, Cambridge University Press, Cambridge, 432 p.
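The three sampling effects can be reproduced numerically with a toy example. The sketch below is not from the presentation: it assumes a purely sinusoidal process with a period of 1 (arbitrary units), and the function names are hypothetical. It only illustrates how spacing, extent, and support each distort what is observed.

```python
import math


def process(t: float, period: float = 1.0) -> float:
    """Assumed 'true' process: a sine wave with the given period."""
    return math.sin(2.0 * math.pi * t / period)


def point_samples(spacing: float, extent: float, period: float = 1.0):
    """Instantaneous samples every `spacing` over a record of length `extent`."""
    n = int(round(extent / spacing))
    return [process(i * spacing, period) for i in range(n + 1)]


def supported_sample(t_start: float, support: float,
                     period: float = 1.0, n_sub: int = 1000) -> float:
    """Average of the process over [t_start, t_start + support] (midpoint rule)."""
    dt = support / n_sub
    total = sum(process(t_start + (k + 0.5) * dt, period) for k in range(n_sub))
    return total / n_sub


# (a) Spacing too large: sampling once per period always hits the same phase,
#     so the oscillation is invisible in the samples (aliasing).
aliased = point_samples(spacing=1.0, extent=5.0)

# (b) Extent too small: a record covering a tenth of a period shows only a
#     steady rise, which looks like a trend.
short_record = point_samples(spacing=0.02, extent=0.1)

# (c) Support too large: averaging over a whole period smooths the
#     oscillation away.
smoothed = supported_sample(t_start=0.0, support=1.0)
```

With these settings `aliased` stays at essentially zero, `short_record` increases monotonically, and `smoothed` is essentially zero, even though the underlying process swings between -1 and 1.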

16 Discharge, Stage, Concentration and Daily Average Example

17 Data Types Continuous (Frequent sampling - fine spacing) Sporadic (Spot sampling - coarse spacing) Cumulative Incremental Average Maximum Minimum Constant over Interval Categorical

18 Offset OffsetValue Distance from a datum or control point at which an observation was made OffsetType defines the type of offset, e.g. distance below water level, distance above ground surface, or distance from bank of river

19 Water Chemistry from a profile in a lake

20 Groups and Derived From Associations

21 Stage and Streamflow Example

22 Daily Average Discharge Example Daily Average Discharge Derived from 15 Minute Discharge Data
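The derivation named on this slide can be sketched in a few lines of Python. This is an assumed, minimal approach (a simple arithmetic mean of the readings that fall on each calendar day), not necessarily the procedure used for the slide's example; the function names and the synthetic readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def daily_average(series):
    """Average (timestamp, value) pairs by calendar day.

    Readings whose value is None are skipped, standing in for missing data.
    Returns a dict mapping each date to the mean of that day's readings.
    """
    by_day = defaultdict(list)
    for timestamp, value in series:
        if value is not None:
            by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in sorted(by_day.items())}


def synthetic_15min(start, daily_levels):
    """Made-up test data: 96 constant 15-minute readings per day."""
    series = []
    for day_index, level in enumerate(daily_levels):
        day_start = start + timedelta(days=day_index)
        for step in range(96):
            series.append((day_start + timedelta(minutes=15 * step), level))
    return series


# Two days of hypothetical discharge readings (values are illustrative only).
readings = synthetic_15min(datetime(2007, 10, 1), [2.0, 4.0])
averages = daily_average(readings)
```

In ODM terms the result would presumably be stored as its own data series, with a DataType of Average and a TimeSupport of one day, and associated with the 15-minute series it was derived from.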

23 ValueAccuracy A numeric value that quantifies measurement accuracy, defined as the nearness of a measurement to the standard or true value. This may be quantified as an average or root mean square error relative to the true value. Since the true value is not known, this should be estimated based on knowledge of the method and measurement instrument. Accuracy is distinct from precision, which quantifies reproducibility but does not refer to the standard or true value. [Figure panels: Accurate; Low accuracy, but precise; Low accuracy]
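The distinction between accuracy and precision can be made concrete with a small calculation. The sketch below is illustrative only: the measurement lists and the reference value of 10.0 are made up, and in practice the true value is unknown, so ValueAccuracy would be estimated from knowledge of the method and instrument rather than computed like this.

```python
import math
from statistics import mean, pstdev


def rmse(measurements, true_value):
    """Root mean square error relative to a known reference value (accuracy)."""
    return math.sqrt(mean([(m - true_value) ** 2 for m in measurements]))


def bias(measurements, true_value):
    """Average error relative to the reference value."""
    return mean(measurements) - true_value


def spread(measurements):
    """Standard deviation of repeated measurements (precision)."""
    return pstdev(measurements)


TRUE_VALUE = 10.0  # hypothetical reference value

accurate = [10.0, 10.1, 9.9, 10.0]        # close to the reference, tightly grouped
precise_biased = [12.0, 12.1, 11.9, 12.0]  # tightly grouped, but offset by about 2

accurate_rmse = rmse(accurate, TRUE_VALUE)
biased_rmse = rmse(precise_biased, TRUE_VALUE)
```

Both lists have the same spread, so they are equally precise, but the second has a much larger root mean square error because every reading is offset from the reference value.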

24 Data Quality
– Qualifier Code and Description provide qualifying information about the observations, e.g. Estimated, Provisional, Derived, Holding time for analysis exceeded
– QualityControlLevelCode records the level of quality control that the data have been subjected to: 0. Raw Data; Partly quality controlled; 1. Quality Controlled Data; 2. Derived Products; 3. Interpreted Products; 4. Knowledge Products

25 Series of Observations A “Data Series” is a set of all the observations of a particular variable at a site. The SeriesCatalog is programmatically generated to provide users with the ability to do data discovery (i.e. what data is available and where) without formulating complex queries or hitting the DataValues table which can get very large.
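Generating a series summary programmatically can be sketched with an in-memory SQLite database. This is a simplified stand-in, not the actual ODM implementation: ODM runs on SQL Server, and the reduced column set, the `build_series_catalog` and `demo` functions, and all rows other than the 9.78 mg/L observation quoted earlier are assumptions made for illustration.

```python
import sqlite3


def build_series_catalog(conn):
    """Rebuild a summary table with one row per (site, variable) series."""
    conn.execute("DROP TABLE IF EXISTS SeriesCatalog")
    conn.execute(
        """
        CREATE TABLE SeriesCatalog AS
        SELECT SiteID,
               VariableID,
               MIN(LocalDateTime) AS BeginDateTime,
               MAX(LocalDateTime) AS EndDateTime,
               COUNT(*)           AS ValueCount
        FROM DataValues
        GROUP BY SiteID, VariableID
        """
    )
    conn.commit()


def demo():
    """Load a few made-up observations and return the resulting catalog rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DataValues ("
        "SiteID INTEGER, VariableID INTEGER, LocalDateTime TEXT, DataValue REAL)"
    )
    rows = [
        (1, 1, "2007-10-01 18:00", 9.78),
        (1, 1, "2007-10-01 18:30", 9.75),  # hypothetical value
        (1, 2, "2007-10-01 18:00", 12.3),  # hypothetical value
    ]
    conn.executemany("INSERT INTO DataValues VALUES (?, ?, ?, ?)", rows)
    build_series_catalog(conn)
    catalog = conn.execute(
        "SELECT SiteID, VariableID, BeginDateTime, EndDateTime, ValueCount "
        "FROM SeriesCatalog ORDER BY SiteID, VariableID"
    ).fetchall()
    conn.close()
    return catalog
```

A discovery query then reads the small SeriesCatalog table to find which variables exist at which sites and over what period, without scanning DataValues.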

26 Loading data into ODM
Interactive OD Data Loader (OD Loader)
– Loads data from spreadsheets and comma separated tables in simple format
Scheduled Data Loader (SDL)
– Loads data from datalogger files on a prescribed schedule
– Interactive configuration
SQL Server Integration Services (SSIS)
– Microsoft application accompanying SQL Server, useful for programming complex loading or data management functions

27 Extra examples in case of questions

28 Methods and Samples Method specifies the method whereby an observation is measured, e.g. Streamflow using a V notch weir, TDS using a Hydrolab, sample collected in auto-sampler SampleID is used for observations based on the laboratory analysis of a physical sample and identifies the sample from which the observation was derived. This keys to a unique LabSampleID (e.g. bottle number) and name and description of the analytical method used by a processing lab.

29 Water Chemistry from Laboratory Sample

30 15 min Precipitation from NCDC: Incomplete or Inexact daily total occurring. Value is not a true 24-hour amount. One or more periods are missing and/or an accumulated amount has begun but not ended during the daily period.

31 Irregularly sampled groundwater level

