BEER Workshop November 9, 2008 Has Data Management Gone Mainstream? Presented at the BEER Workshop Coconut Grove (Miami), Florida November 9, 2008 Robert C. Groman
Talk Overview
  Has data management gone mainstream?
  Data is a plural noun = facts, statistics, or items of information.
  Metadata = motherhood and apple pie
  Accessing data: Is a picture worth a thousand bytes?
  Data interoperability
Purpose
  Raise the level of awareness (and appreciation) for data management
  Lighter and informative
  Want to use some formulas
  Difference between an engineer and a mathematician
Venn Diagram: Data and Metadata (Set Theory)
  D: all data and information necessary to use the data
  d: data (facts, statistics, or items of information)
  m: metadata
  D m + d
Probability of having all the data and information necessary to reuse someone else's data
  Second-order effects:
    – Length of cruise
    – Success of cruise
    – Participants
    – Immediate activity following the cruise
Theorems
  Theorem 1: The probability that all the necessary data and information are collected and preserved to allow another researcher to properly use your data is inversely proportional to the time since the data were collected.
  Corollary: Unless data and information are collected and preserved during the experiment (cruise), subsequent researchers will have a difficult time using your data.
  Theorem 2: The longer the time since the data were collected, the less likely the data will ever be considered final.
  Proofs are left to the reader as an exercise.
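Theorem 1 can be written as a formula in the spirit of the talk. This is only a sketch: the symbols P, t, and the constant k are assumptions introduced here, not notation from the slides, and the restriction t ≥ k is added so that P stays a valid probability.

```latex
% P(t): probability that everything needed for reuse has been preserved
% t: time since the data were collected; k: an assumed positive constant
\[
\begin{aligned}
P(t) &\propto \frac{1}{t}
  \quad\Longrightarrow\quad
  P(t) = \frac{k}{t}, \qquad t \ge k > 0, \\
\frac{P(2t)}{P(t)} &= \frac{k/(2t)}{k/t} = \frac{1}{2}.
\end{aligned}
\]
```

Read this way, doubling the time since collection halves the chance that another researcher can still use the data, which is the point of the corollary.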
Seeing Versus Using Someone's Data
  Maybe you don't want others to use your data. Hard to believe, but this does happen. For example:
    – I'm not done publishing my papers based on my data
    – My graduate student is almost done analyzing the data
    – It's not final yet (no, but they still may be useful)
    – My dog ate it (no, I haven't heard this one yet.)
  Old policies and practices about data archiving
  New policies about data sharing, data publishing and data archiving
    – Web accessible
    – NSF mandate (It is for real this time.)
    – The whole is greater than the sum of its parts
The more people use your data, the better they get.
  The Heisenberg Uncertainty Principle (HUP) does NOT seem to apply
    – If Δx and Δp are the uncertainties in the measurements of the position and momentum, then the product ΔxΔp is at least on the order of Planck's constant.
    – When measuring conjugate quantities, the product of their standard deviations must be at least h / 4π.
  Not to be confused with the observer effect (OE), which refers to changes that the act of observing makes on the phenomenon being observed.
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
  NSF-funded three-year project to provide short- and medium-term data management, including web-based access, to all NSF-funded projects from its biological and chemical oceanographic programs
  Large NSF projects are expected to have their own data management offices
  Web site: http://www.bco-dmo.org/
Data Stewardship
  A concern for creation and preservation of data and all intermediate phases - focuses …on the management of data over the long term [Baker and Chandler, 2008]
  Data quality control
  Treatment of all information as data fosters data re-use
  Data that lack sufficient metadata have limited value beyond the research program for which they were collected
  Metadata should include sufficient information to support discovery, value assessment, and accurate re-use of the data
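The last point on the Data Stewardship slide, that metadata should support discovery, value assessment, and accurate re-use, can be illustrated with a small completeness check. This is a minimal sketch: every field name below is a hypothetical placeholder rather than part of any BCO-DMO schema, and the example record is made up.

```python
# Hypothetical field names, grouped by the three purposes named on the slide.
REQUIRED_FIELDS = {
    "discovery": ["title", "cruise_id", "start_date", "bounding_box"],
    "value_assessment": ["instrument", "processing_level", "quality_flags"],
    "reuse": ["units", "acquisition_method", "contact"],
}


def missing_metadata(record):
    """Return {purpose: [field names]} for fields that are absent or empty.

    Note: any falsy value (empty string, empty list, 0, None) counts as missing.
    """
    gaps = {}
    for purpose, fields in REQUIRED_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[purpose] = missing
    return gaps


# Made-up record used only to exercise the check.
example = {
    "title": "Example hydrographic profiles",
    "cruise_id": "EXAMPLE01",
    "start_date": "",  # left blank on purpose
    "bounding_box": (-71.0, 40.0, -65.0, 44.0),
    "instrument": "example profiler",
    "units": {"temperature": "degC"},
}

for purpose, fields in missing_metadata(example).items():
    print(f"{purpose}: missing {', '.join(fields)}")
```

Running a check like this at sea, rather than years later, is the practical reading of the corollary to Theorem 1.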
MapServer interface and interoperability enhancements
  Provides access to geo-referenced scientific data and metadata
  Presents distributed data sets in a unified way
  Uses MapServer as the visualization application
  Visualize data with graphics generated on-the-fly
  Request custom subsets of measurements in a variety of file formats
  Compare data from different sources
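The idea of requesting a custom subset of measurements in a chosen file format can be sketched with the Python standard library. This is an illustration only, not the system's actual code: the function name, column names, and sample values are all made up.

```python
import csv
import io


def subset_to_csv(rows, columns, keep):
    """Return CSV text holding only `columns` for the rows where keep(row) is true."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently drop columns that were not requested
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        if keep(row):
            writer.writerow(row)
    return buffer.getvalue()


# Made-up measurements; the column names are hypothetical.
rows = [
    {"station": 1, "depth_m": 5.0, "temp_degC": 12.1, "sal_psu": 32.4},
    {"station": 1, "depth_m": 50.0, "temp_degC": 8.3, "sal_psu": 33.0},
    {"station": 2, "depth_m": 10.0, "temp_degC": 11.7, "sal_psu": 32.6},
]

# Keep only the shallow samples and only three of the four columns.
shallow = subset_to_csv(
    rows,
    ["station", "depth_m", "temp_degC"],
    lambda row: row["depth_m"] <= 10.0,
)
print(shallow)
```

Keeping the column selection and the row filter as separate arguments mirrors the two choices a user makes: which variables to download and which samples to include.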
Interoperability
  Ability to get someone else's data and use it on your system. (How easy is this really?)
  True interoperability: get someone else's data and use it directly in your application.
  Do the units match, and do the data acquisition and processing steps match yours or are they accounted for, including instrumentation differences?
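The "do the units match" question on the Interoperability slide can be made concrete with a small conversion step applied before two data sets are combined. This is a sketch under stated assumptions: the conversion table, unit labels, and function name are hypothetical, and a real system would also need to record acquisition and processing differences, which this example does not attempt.

```python
# Hypothetical lookup: (quantity, unit) -> (canonical unit, conversion function).
TO_CANONICAL = {
    ("depth", "m"): ("m", lambda v: v),
    ("depth", "ft"): ("m", lambda v: v * 0.3048),
    ("temperature", "degC"): ("degC", lambda v: v),
    ("temperature", "degF"): ("degC", lambda v: (v - 32.0) * 5.0 / 9.0),
}


def harmonize(values, quantity, unit):
    """Convert `values` to the canonical unit for `quantity`.

    Raises ValueError when the unit is unknown, so that mismatched data
    are rejected instead of being merged silently.
    """
    try:
        canonical_unit, convert = TO_CANONICAL[(quantity, unit)]
    except KeyError:
        raise ValueError(f"no known conversion for {quantity} in {unit!r}") from None
    return canonical_unit, [convert(v) for v in values]


# Made-up readings from two sources that report temperature differently.
unit_a, temps_a = harmonize([0.0, 100.0], "temperature", "degC")
unit_b, temps_b = harmonize([32.0, 212.0], "temperature", "degF")
print(unit_a, temps_a)
print(unit_b, temps_b)
```

Failing loudly on an unknown unit is a deliberate choice here: a missing unit label is a metadata gap, and guessing would hide it.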
JGOFS/GLOBEC Data Management System
http://globec.whoi.edu/map
Select 5 cruises
Click on the "Show Data" button
Select CD data in EN307
Shows stations and optional grid lines
EN307 "Graph it" options
Depth versus salinity and versus temperature
Select another cruise: AL9906
Select MOC1 data set
"Map it" options for abundances
Interoperability features (for free)
MapServer Supports Interoperability Features
  Open Geospatial Consortium standards
    – Web Map Service (WMS): "Show me the data"
    – Web Feature Service (WFS): "Get me the data"
  Retains the functionality of the JGOFS/GLOBEC Data Management System
    – Download data as ASCII, CSV, MATLAB, NetCDF
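The split between WMS ("show me the data") and WFS ("get me the data") shows up in the requests a client sends. The sketch below builds a WMS 1.1.1 GetMap URL and a WFS 1.0.0 GetFeature URL using the standard key-value parameters. The endpoint URL, layer name, and bounding box are hypothetical placeholders, not the actual GLOBEC or BCO-DMO service, and no request is sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; replace with a real OGC service URL.
BASE_URL = "http://example.org/cgi-bin/mapserv"


def wms_getmap_url(layers, bbox, width, height, base_url=BASE_URL):
    """Build a WMS 1.1.1 GetMap request: the server returns a rendered image."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",  # empty value asks for the default style
        "SRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


def wfs_getfeature_url(type_name, base_url=BASE_URL):
    """Build a WFS 1.0.0 GetFeature request: the server returns the features."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "1.0.0",
        "REQUEST": "GetFeature",
        "TYPENAME": type_name,
    }
    return base_url + "?" + urlencode(params)


# "stations" is a made-up layer name; the bounding box is illustrative only.
map_url = wms_getmap_url(["stations"], (-71, 40, -65, 44), 600, 400)
feature_url = wfs_getfeature_url("stations")
print(map_url)
print(feature_url)

# Round-trip check: decode the query string to confirm what was encoded.
decoded = parse_qs(urlsplit(map_url).query, keep_blank_values=True)
print(decoded["REQUEST"], decoded["BBOX"])
```

The same layer name serves both requests: GetMap yields a picture for viewing, while GetFeature yields the underlying records for use in another application.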
Related Activities
  MMI: Marine Metadata Interoperability
    – "Promoting the exchange, integration and use of marine data through enhanced data publishing, discovery, documentation and accessibility."
  UNOLS Subcommittee to Report on Best Practices for the Collection of Data and Metadata at Sea to Promote Public Dissemination
    – Too new to even have its own web site
  The Working Group on Zooplankton Ecology (WGZE), with guidance from the Working Group on Marine Data Management (WGMDM), is providing general metadata guidelines for plankton data collected and submitted to ICES. (2003)
  Sensor Interoperability Metadata Workshop (2006)
  ICES ASC 2006 and 2008 theme sessions on data management, data sharing and related topics
  NOAA Coastal Services Center Data Transport Laboratory (DTL)
    – Integrated Ocean Observing System (IOOS)
    – Ocean.US data management and communications (DMAC) strategy
  Gulf of Maine Ocean Data Partnership
  Many, many more …
Metadata Schema
  The print size is small to protect the innocent and guilty.
What is the difference between an engineer and a mathematician?
References
  Karen S. Baker and Cynthia L. Chandler, Enabling long-term oceanographic research: Changing data practices, information management strategies and informatics, Deep-Sea Research II, 55 (2008), 2132-2142.