Problems Xenia intended to address
- Grants for research instrumentation that will collect observation data but lack a data management/sharing component beyond archiving datalogger files
- Low-volume data (< 100,000 records per hour) from in-situ observational platforms or system arrays (e.g. 1 to 1000 platforms collecting observations per hour), collecting data at any geographic scale (local, regional, national, etc.)
- Bridging the gap between raw data collection and the organization and sharing of data using previously developed products, services and standards (leveraging earlier work against new data providers)
- Fostering a standardization of products and services via a common, openly shared technical infrastructure (common database schema and product support scripts)
Problems Xenia not intended to address
- High-volume data (millions of records per hour) such as gridded model outputs, HF radar, etc. At this time, high-volume data problems are better addressed using traditional file processing techniques, where data management can suggest output file formats (such as images, shapefiles, etc.) and metadata that are conducive to search and usage needs.
Table Schema
- Basic tables
- Extended, support tables
Table Schema – Basic
- Main tables used for storing organization->platform->sensor->observation data
- Not using geospatial indexing initially (can be added) to keep things simple
- The current database implementation is in PostgreSQL, but should be portable to MySQL, etc. later. Output products are developed on a Linux system using mostly Perl scripts.
- The data dictionary captured from earlier development lives in the lookup tables for m_type_id (m_* = measurement), which can vary by standard name (sea_water_temperature, sea_water_salinity) and unit of measure (celsius, fahrenheit, psu).
- All measurements are stored in the multi_obs table with their corresponding timestamp, location and QC. Multiple observation types are stored the same way, varying by their m_type_id index.
- Each measurement can/will provide a lookup for sensor id and possibly collection id.
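The organization->platform->sensor->observation layout described above can be sketched as follows. This is a minimal illustration, not the actual Xenia schema: only the table names organization, platform, sensor, multi_obs and the m_type_id lookup idea come from the slides. Every column name, the m_type table name, and the demo rows are assumptions made up for the example, and SQLite stands in for PostgreSQL only so the sketch is self-contained.

```python
import sqlite3

# Hypothetical columns; only the table names and m_type_id come from the slides.
SCHEMA = """
CREATE TABLE organization (row_id INTEGER PRIMARY KEY, short_name TEXT);
CREATE TABLE platform (row_id INTEGER PRIMARY KEY,
                       organization_id INTEGER REFERENCES organization(row_id),
                       short_name TEXT);
CREATE TABLE m_type (row_id INTEGER PRIMARY KEY, standard_name TEXT, uom TEXT);
CREATE TABLE sensor (row_id INTEGER PRIMARY KEY,
                     platform_id INTEGER REFERENCES platform(row_id),
                     m_type_id INTEGER REFERENCES m_type(row_id));
CREATE TABLE multi_obs (row_id INTEGER PRIMARY KEY,
                        sensor_id INTEGER REFERENCES sensor(row_id),
                        m_type_id INTEGER REFERENCES m_type(row_id),
                        m_date TEXT, m_lon REAL, m_lat REAL,
                        m_value REAL, qc_flag INTEGER);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO organization VALUES (1, 'demo_org')")
    conn.execute("INSERT INTO platform VALUES (1, 1, 'buoy_1')")
    conn.executemany("INSERT INTO m_type VALUES (?, ?, ?)",
                     [(1, "sea_water_temperature", "celsius"),
                      (2, "sea_water_salinity", "psu")])
    conn.executemany("INSERT INTO sensor VALUES (?, ?, ?)",
                     [(1, 1, 1), (2, 1, 2)])
    conn.executemany(
        "INSERT INTO multi_obs (sensor_id, m_type_id, m_date, m_lon, m_lat,"
        " m_value, qc_flag) VALUES (?, ?, ?, ?, ?, ?, ?)",
        [(1, 1, "2008-01-01T00:00:00", -79.0, 33.0, 18.5, 0),
         (1, 1, "2008-01-01T01:00:00", -79.0, 33.0, 18.7, 0),
         (2, 2, "2008-01-01T00:00:00", -79.0, 33.0, 35.1, 0)])
    return conn


def observations_by_type(conn, standard_name):
    """Return (platform, time, value, unit) rows for one observation type."""
    return conn.execute(
        "SELECT p.short_name, o.m_date, o.m_value, t.uom"
        " FROM multi_obs o"
        " JOIN m_type t ON t.row_id = o.m_type_id"
        " JOIN sensor s ON s.row_id = o.sensor_id"
        " JOIN platform p ON p.row_id = s.platform_id"
        " WHERE t.standard_name = ? ORDER BY o.m_date",
        (standard_name,)).fetchall()
```

Because every observation type shares the one multi_obs table, adding a new measurement in this sketch means adding a lookup row in m_type rather than a new table.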
Table Schema – Extended
- Additional tables used for supporting quality control tests and user/group notification
- Additional support tables for collections and quality control will be added
[Diagram: data flow into and out of the Xenia relational database (SQL). Input labels: Format Convention (SEACOOS netCDF, XML); No Convention (Web Screen-Scrape, ASCII Fields + Key File); SQL conversion script. Product labels: Time Series Graphs; Maps/WMS; Animations; Archival files by Obs/Platform (CSV, netCDF, shapefile, etc.); Latest Data by Obs/Platform (KML/Google Earth, etc.; XML/RSS/WFS?); Quality Control; Notification.]
Quality Control and Notification
Initial quality control tests are intended to flag/notify on observations by:
- Range tests: values outside of the acceptable range (range low, range high)
- Continuity tests: values that change too much within a specific time interval
- Optional notification of users or user groups when QC tests fail
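The range and continuity tests listed above could look roughly like this. It is a sketch under stated assumptions, not Xenia's own QC code: the function names, the (time, value) tuple layout, and the choice to skip the continuity test when two readings are further apart than the interval are all hypothetical.

```python
from datetime import datetime, timedelta


def range_test(value, range_low, range_high):
    """Pass when the value lies inside [range_low, range_high]."""
    return range_low <= value <= range_high


def continuity_test(previous, current, max_change, interval):
    """Pass unless the value changes by more than max_change within interval.

    previous and current are (time, value) tuples. Readings further apart
    than interval are not compared (an assumption of this sketch).
    """
    prev_time, prev_value = previous
    cur_time, cur_value = current
    if cur_time - prev_time > interval:
        return True
    return abs(cur_value - prev_value) <= max_change


def failed_tests(series, range_low, range_high, max_change, interval):
    """Return (time, test_name) for every failed test in a time-ordered series."""
    failures = []
    previous = None
    for time, value in series:
        if not range_test(value, range_low, range_high):
            failures.append((time, "range"))
        if previous is not None and not continuity_test(
                previous, (time, value), max_change, interval):
            failures.append((time, "continuity"))
        previous = (time, value)
    return failures
```

The list returned by failed_tests is the kind of result that could drive the optional user or group notification step.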
Time Series Graphs/Data
- Web request for the graph only (can be placed as needed in other website contexts), a webpage (graph + data), or a download of time series data at specific platform sensors
- Maps/WMS (Web Map Service) via MapServer
- Map animations via ImageMagick, Gifsicle, AniS
- DODS/OPeNDAP access to basic tables (organization, platform, sensor, multi_obs)
Latest and Archival products
- The guiding concept is to make products available at both regional scale (same observation/product across all platforms) and local scale (same platform across all observations/products).
- Often a regional product can tie into a local one: a regional water temperature map allows a user to select a water temperature graph at a specific platform listed on the map.
- Products and design are divided temporally between latest, recent (0-6 weeks) and archival (3+ weeks and older).
- Latest products are continually generated with new data (hourly), whereas recent and archival products may be generated at periodic intervals (daily, weekly).
Xenia latest, recent, archival table structure for observations. Oldest observations stored to files.
- New data -> Latest: past several hours
- Recent: 0-6 weeks
- Archival: 3+ weeks to 1-2 years (possibly tables separated by year, month, etc.)
- Archival file: 1-2+ years (files separated by product/year/month)
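One way to read the age ranges above is as a function from an observation's age to the tiers that hold it. The sketch below assumes that the overlapping ranges (0-6 weeks and 3+ weeks) mean an observation can sit in two tiers at once, that "past several hours" means 6 hours, and that the move to archival files happens at one year. None of these cutoffs or names is confirmed by the slides.

```python
from datetime import datetime, timedelta

LATEST_WINDOW = timedelta(hours=6)           # assumption for "past several hours"
RECENT_MAX_AGE = timedelta(weeks=6)          # from the slides: recent is 0-6 weeks
ARCHIVAL_MIN_AGE = timedelta(weeks=3)        # from the slides: archival is 3+ weeks
ARCHIVAL_FILE_MIN_AGE = timedelta(days=365)  # assumption for "1-2+ years"


def storage_tiers(obs_time, now):
    """Return the tiers that would hold an observation, youngest tier first."""
    age = now - obs_time
    tiers = []
    if age <= LATEST_WINDOW:
        tiers.append("latest")
    if age <= RECENT_MAX_AGE:
        tiers.append("recent")
    if ARCHIVAL_MIN_AGE <= age < ARCHIVAL_FILE_MIN_AGE:
        tiers.append("archival")
    if age >= ARCHIVAL_FILE_MIN_AGE:
        tiers.append("archival_file")
    return tiers
```

Under this reading, the 3-6 week overlap gives a periodic archiving job time to copy rows into the archival tier before they age out of the recent one.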
Latest data products
XML schema convention (ObsKML – my term/schema)
- Regularly (hourly) produced XML file containing all latest measurements, organized by organization->platform->observations. Designed for cross-system aggregation needs.
- Regularly (hourly) produced XML files (1 per platform) containing all latest measurements within that platform. Designed for local use, similar to an RSS feed for each platform.
- Regularly (hourly) produced XML files (1 per observation) containing all latest measurements of the same observation type. Designed for cross-system aggregation needs focusing on a specific observation.
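The three feed variants above (everything, one per platform, one per observation type) can be sketched with the standard library. The element and attribute names here are hypothetical placeholders and do not reproduce the actual ObsKML schema. The record dictionaries and function names are likewise assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_latest_feed(records):
    """Nest records as organization -> platform -> observation elements.

    Each record is a dict with the keys organization, platform, obs_type,
    uom, time and value (a hypothetical layout for this sketch).
    """
    root = ET.Element("latest")
    orgs = {}
    platforms = {}
    for rec in records:
        org = orgs.get(rec["organization"])
        if org is None:
            org = ET.SubElement(root, "organization", name=rec["organization"])
            orgs[rec["organization"]] = org
        key = (rec["organization"], rec["platform"])
        plat = platforms.get(key)
        if plat is None:
            plat = ET.SubElement(org, "platform", name=rec["platform"])
            platforms[key] = plat
        obs = ET.SubElement(plat, "observation", type=rec["obs_type"],
                            uom=rec["uom"], time=rec["time"])
        obs.text = str(rec["value"])
    return root


def filter_records(records, platform=None, obs_type=None):
    """Select records for a per-platform or per-observation feed."""
    return [r for r in records
            if (platform is None or r["platform"] == platform)
            and (obs_type is None or r["obs_type"] == obs_type)]
```

A per-platform file would then be `ET.tostring(build_latest_feed(filter_records(records, platform="buoy_1")))`, and a per-observation file uses the `obs_type` argument in the same way.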
Latest data products
- Example of the latest XML feed used to populate the Carolinas Coast application and potentially further systems or Xenia instances
Latest data products
- KML (Keyhole Markup Language), the XML format used to visualize data in Google Earth and potentially other 3D globes such as NASA WorldWind and ESRI ArcExplorer
Archival data products
- CSV (Comma Separated Value) files viewable using Excel
- Archival folders/files separated by observation type or platform and by month (or some manageable regular timestep), for file download according to user regional/local interest
- Other output file formats (netCDF, shapefiles, etc.) archived in similarly organized folders/files
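The folder/file separation described above can be illustrated by grouping records into one CSV per observation type (or platform) per month. The path pattern, column list, and function name are assumptions for this sketch. The slides only say that archives are split by observation type or platform and by a regular timestep such as a month.

```python
import csv
import io
from collections import defaultdict
from pathlib import PurePosixPath

FIELDS = ["time", "platform", "obs_type", "value", "uom"]  # hypothetical columns


def archive_csv_files(records, by="obs_type"):
    """Group records into monthly CSV texts keyed by a hypothetical path.

    by is "obs_type" for regional archives or "platform" for local ones.
    Record times are ISO 8601 strings such as "2008-01-15T12:00:00".
    """
    groups = defaultdict(list)
    for rec in records:
        year, month = rec["time"][:4], rec["time"][5:7]
        path = PurePosixPath("archive", by, rec[by], year, f"{year}-{month}.csv")
        groups[str(path)].append(rec)
    files = {}
    for path, rows in groups.items():
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=FIELDS, extrasaction="ignore",
                                lineterminator="\n")
        writer.writeheader()
        writer.writerows(sorted(rows, key=lambda r: r["time"]))
        files[path] = buf.getvalue()
    return files
```

Calling `archive_csv_files(records, by="platform")` produces the local-scale layout from the same records. The function returns the CSV text instead of writing to disk so that the grouping logic is easy to inspect.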
Archival data products
- CSV (Comma Separated Value) files (exchange format) viewable using ODV (Ocean Data View) for CTD/Bottle analysis
Archival data products
- netCDF for analysis using ncBrowse
Xenia aggregation, replication, redundancy
With several distributed Xenia systems, these systems could feed each other using either the same latest XML feed or a direct copy of the table data offered by each Xenia instance.
[Diagram of Xenia instances. Individual: Xenia A, Xenia B, Xenia C, Xenia D, Xenia E, Xenia F. Aggregates: Xenia A,B,C and Xenia D,E,F. Combined: Xenia A,B,C,D,E,F. Backups: Xenia Backup A,B,C,D,E,F and Xenia Backup D,E,F.]