GCE Data Toolbox -- metadata-based tools for automated data processing and analysis Wade Sheldon University of Georgia GCE-LTER.


1 GCE Data Toolbox -- metadata-based tools for automated data processing and analysis
Wade Sheldon, University of Georgia, GCE-LTER

2 Rationale
- Data processing, quality control, data analysis and metadata generation are traditionally carried out as separate activities, often in different time frames using different technologies
- Problems:
  - Metadata may not reflect all processing steps
  - Much routine data analysis is done without Q/C or metadata
  - No economy of scale, which leads to “one-off” solutions
- Metadata generation should ideally occur throughout the data cycle and “inform” data analysis

3 Design Goals
- Develop Integrated Storage Standard
  - Tabular Data
  - QA/QC Information
  - Metadata (overall data set & columns/attributes)
- Develop Software to Support Standard
  - Code Library/API
  - User Interfaces
- Apply Technology to Acquire, Manage, Distribute GCE-LTER Data
- Explore Use as Prototype Technology for Metadata-based Data Processing, Synthesis

4 Storage Standard
- Developed Using MATLAB®
  - Local expertise, large scientific user base
  - Cross-platform (Win32, Solaris, *nix, Mac OS X)
  - Rapid development environment
  - Supports multiple interfaces (interactive command line, batch-mode scripts, GUI, WWW)
  - Good interoperability with other technologies (Java, PERL, SQL)
- Defined “GCE Data Structure” Spec. (based on MATLAB/C structures)
  - Structure with 17 named fields
  - Specific content rules for each field (software validation)
  - Combines data, metadata, QA/QC, processing history
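To make the idea of one validated structure holding data, metadata, QA/QC flags and processing history more concrete, here is a minimal Python sketch. It is not the GCE Data Structure itself: the slide does not list the 17 fields, so the class name `DataSet`, its field names and the rules in `validate` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DataSet:
    """Hypothetical container; these are NOT the 17 fields of the GCE spec."""
    title: str
    metadata: dict[str, str]        # data-set-level metadata
    columns: list[dict[str, str]]   # per-column metadata (name, units, datatype)
    values: list[list[Any]]         # one list of values per column
    flags: list[list[str]]          # one list of QA/QC flags per column
    history: list[str] = field(default_factory=list)  # processing log


def validate(ds: DataSet) -> list[str]:
    """Return a list of rule violations; an empty list means the structure passed."""
    errors: list[str] = []
    if not (len(ds.columns) == len(ds.values) == len(ds.flags)):
        return ["columns, values and flags must describe the same number of columns"]
    if len({len(v) for v in ds.values}) > 1:
        errors.append("all columns must have the same number of rows")
    for col, vals, flg in zip(ds.columns, ds.values, ds.flags):
        missing = {"name", "units", "datatype"} - set(col)
        if missing:
            errors.append(f"column metadata missing {sorted(missing)}")
        if len(flg) != len(vals):
            errors.append(f"flag count differs from value count in {col.get('name', '?')}")
    return errors
```

Keeping the content rules in one function means any tool that creates or modifies a structure can run the same check before passing it on.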

5 Storage Standard GCE Data Structure Specification (v1.1)

6 Software – GCE Data Toolbox
- Core Function Library
  - Create, Validate Structures
  - Import Data, Metadata (ASCII, MATLAB, SQL)
  - Manipulate Data, Metadata (unit conversions, add/delete/update)
  - Export Data, Metadata (various formats)
  - Dynamic, Rule-based QA/QC Flagging
- Self-documenting Processing
  - Operation Logging (Processing History)
  - Transparent Metadata Creation/Updating
  - Dynamic (JIT) Metadata Generation for Columns
- Support for Metadata “Templating”
  - Application of Boilerplate Metadata based on Parameter Matching
  - Supports Rapid Documentation of Routine Data Sources
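Rule-based flagging combined with operation logging might look roughly like the Python sketch below. The function names, the flag codes "I" and "Q", and the numeric limits are hypothetical and are not taken from the toolbox. Flags are recomputed from the rules on every call, which is one simple way to keep them "dynamic" when data or rules change.

```python
from datetime import datetime, timezone


def apply_flag_rules(values, rules):
    """Test every value against each (predicate, flag) rule and join the flags that match."""
    return ["".join(flag for test, flag in rules if test(v)) for v in values]


def log_step(history, message):
    """Append a UTC-timestamped entry to the processing history list."""
    history.append(f"{datetime.now(timezone.utc).isoformat()} {message}")


# Hypothetical rules for a temperature column: "I" = invalid, "Q" = questionable
rules = [
    (lambda v: v < 0, "I"),
    (lambda v: v > 40, "Q"),
]

history = []
flags = apply_flag_rules([-1.0, 20.0, 45.0], rules)
log_step(history, f"applied {len(rules)} flag rules to column Temp")
```

Because each operation writes its own history entry, the processing record accumulates as a side effect of the work rather than as a separate documentation task.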

7 Software – GCE Data Toolbox
- Support for Analysis
  - Descriptive Statistics, Reports
  - Visualization, Mapping
- Support for Synthesis
  - Composite Data Set Creation
    - Multiple Data Set Merge/Concatenation
    - Relational Join
    - Metadata Content Meshing
  - Data Set Summarization
    - Statistical Data Reduction/Re-sampling
  - Data Set Standardization
    - Unit Conversions (automatic, interactive)
    - Template-based Semantic Mapping
    - Automatic Semantic Mediation (prototype stage)
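As an illustration of metadata-aware standardization, the Python sketch below converts a column's values and updates its unit metadata and processing history in the same step. The conversion table, the dictionary layout of a column and the name `convert_units` are assumptions made for this example only.

```python
# Hypothetical conversion table keyed by (from_units, to_units)
CONVERSIONS = {
    ("ft", "m"): lambda x: x * 0.3048,
    ("degF", "degC"): lambda x: (x - 32.0) * 5.0 / 9.0,
}


def convert_units(column, new_units, history):
    """Return a converted copy of the column and record the change in history."""
    key = (column["units"], new_units)
    if key not in CONVERSIONS:
        raise ValueError(f"no conversion defined from {key[0]} to {key[1]}")
    func = CONVERSIONS[key]
    converted = {
        **column,
        "units": new_units,
        "values": [func(v) for v in column["values"]],
    }
    history.append(f"converted {column['name']} from {column['units']} to {new_units}")
    return converted


history = []
temp_f = {"name": "Temp", "units": "degF", "values": [32.0, 212.0]}
temp_c = convert_units(temp_f, "degC", history)
```

Reading the source units from the column metadata, rather than from a function argument, is what lets a conversion like this run automatically across many data sets.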

8 Software – User Interfaces
- Unattended Batch Mode Processing
- Interactive Command Line Processing (conventional MATLAB UI)
  - Full help text for each function
  - Well-defined input/output arguments
- GUI Applications
  - Standard Forms, Dialogs, Controls
  - No MATLAB Experience Required
- WWW – MATLAB Web Server
  - HTML Forms, Querystring Input
  - HTML Pages and/or Static File Output

9 Command-Line Interface

10 GUI Applications

11 WWW Interface

12 Current Applications
- Automated Data Processing
  - Direct data import from data logger files, WWW data sources (USGS), SQL queries
  - Automatic metadata creation (templates, data mining)
  - Rule-based QA/QC flagging
- Data Set Packaging
  - Batch processing to create/update data, metadata products
  - On-demand generation of data, metadata, stat reports in custom formats (end-user scripts, GUI applications, WWW forms)
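Template-based metadata creation can be pictured as a lookup that matches imported column names against stored boilerplate. The Python sketch below assumes a simple exact-name match; the template contents, the fallback values and the name `apply_template` are hypothetical and do not describe the toolbox's actual matching rules.

```python
# Hypothetical template: boilerplate column metadata keyed by parameter name
TEMPLATE = {
    "Temp": {"units": "degC", "description": "water temperature"},
    "Salinity": {"units": "PSU", "description": "practical salinity"},
}

# Fallback applied when a column name has no template entry
DEFAULT = {"units": "unspecified", "description": ""}


def apply_template(column_names, template):
    """Build per-column metadata by matching each column name against the template."""
    return [{"name": name, **template.get(name, DEFAULT)} for name in column_names]


columns = apply_template(["Temp", "Salinity", "Depth"], TEMPLATE)
```

Unmatched columns receive placeholder metadata instead of raising an error, so a routine import can finish unattended and the gaps can be filled in afterwards.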

13 Current Applications
- Data Exploration/Analysis by PIs
  - Descriptive Statistics based on attribute metadata
  - Visualization with Interactive Filtering (Frequency Histograms, 2D Plots, Map Plots)
  - Data Reduction/Re-sampling to Provide Customized Data at Various “Scales”
    - Aggregated Statistics
    - Binned Statistics
  - Query/Filtering (sub-selection)
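Binned statistics for data reduction can be sketched in a few lines of Python. The example assumes numeric time stamps (for instance, minutes since the start of a deployment) and reduces each fixed-width bin to its mean; the function name `binned_mean` and the sample numbers are illustrative only.

```python
from collections import defaultdict
from statistics import mean


def binned_mean(times, values, bin_width):
    """Group values into fixed-width time bins and return {bin_start: mean_value}."""
    bins = defaultdict(list)
    for t, v in zip(times, values):
        # Floor each time stamp to the start of its bin
        bins[(t // bin_width) * bin_width].append(v)
    return {start: mean(vals) for start, vals in sorted(bins.items())}


# Five readings reduced to hourly means (bin width of 60 minutes)
hourly = binned_mean([0, 10, 20, 60, 70], [1, 2, 3, 10, 20], 60)
```

Changing `bin_width` is all it takes to deliver the same data at a different "scale", such as daily rather than hourly values.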

14 Current Applications
- Data Harvesting (GCE)
  - USGS Data (WWW real-time, daily, finalized data)
  - Campbell Scientific Data Arrays (post-processing triggered after LoggerNet Retrieval)
  - Sea-Bird Hydrographic Data
- USGS Data Harvesting Service for HydroDB
  - Weekly harvest for 31 stations/7 LTER Sites
  - Automatic Resampling, Unit Conversions, Q/C

15 Availability
- Description, Screen-shots, Fully-functional Toolbox Available on WWW: http://gce-lter.marsci.uga.edu/lter/research/tools/data_toolbox.htm
- Requires MATLAB 5.3, 6.0, 6.5 (any platform)
- “Public” Version Compiled
- Source Code Requests Considered on Case-by-Case Basis

16 Future Development Plans
- EML 2.0 Support
- Metadata-mediated Data Set Integration
  - Unit conversions
  - Re-sampling
- More WWW Interface Development

