Summary Report from Thursday, 3 March 2011 Pine Room Data Integration Breakout Group Geo-Data Informatics (GDI) Workshop: Exploring the Life Cycle, Citation.

1 Summary Report from Thursday, 3 March 2011 Pine Room Data Integration Breakout Group Geo-Data Informatics (GDI) Workshop: Exploring the Life Cycle, Citation and Integration of Geo-Data

2 Discussion Prompt In your view/experience what parts of data integration implementations/applications or frameworks are well established (or not) in your discipline(s) and what are the common gaps? Moderator: Cyndy Chandler (WHOI, BCO-DMO) Rapporteur: Chris Mattmann (NASA JPL, USC) Discussion notes kept at TWC hosted titanpad site

3 Participants Bob Arko (Lamont-Doherty Earth Observatory); Joanne Luciano (TWC, RPI); Anna Milan (National Geophysical Data Center); Bob Simons (NOAA); Brian Wee (NEON, Inc.); Leslie Hsu (LDEO); Roland Viger (USGS); James Wilson (James Madison University); Tom Narock (NASA/GSFC); Cathy Constable (SIO, UCSD); Ruth Duerr (NSIDC); Yoori Choi (CUAHSI); Lee Allison (Arizona Geological Survey); Erin Robinson (ESIP); Kavitha Chandrasekar (Indiana University); Bob Detrick (NSF); Clifford Jacobs (NSF); Leonard Jonson (NSF)

4 Data Integration What does that mean? Combining more than one data source into a single data object. Different from display of multiple data sources in a single view. Example: a database join. Time series data sets made up of a variety of sources of data often require data integration. Data aggregation and interoperability are related concepts. Group did not come to consensus.
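The database-join example on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the table names, column names, and sample values are hypothetical and do not come from the workshop. It combines two independent time series sources into a single data object, as opposed to displaying them side by side.

```python
import sqlite3

# Hypothetical sample data: two independent sources keyed by station and day.
temperature_rows = [
    ("A", "2011-03-01", 4.1),
    ("A", "2011-03-02", 4.3),
    ("B", "2011-03-01", 5.0),
]
salinity_rows = [
    ("A", "2011-03-01", 35.1),
    ("B", "2011-03-01", 34.8),
    ("B", "2011-03-02", 34.9),
]


def integrate(temperature, salinity):
    """Join two sources on (station, day) into one combined record set."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temperature (station TEXT, day TEXT, temp_c REAL)")
    conn.execute("CREATE TABLE salinity (station TEXT, day TEXT, psu REAL)")
    conn.executemany("INSERT INTO temperature VALUES (?, ?, ?)", temperature)
    conn.executemany("INSERT INTO salinity VALUES (?, ?, ?)", salinity)
    # Inner join: only station/day pairs present in BOTH sources survive.
    rows = conn.execute(
        "SELECT t.station, t.day, t.temp_c, s.psu "
        "FROM temperature AS t JOIN salinity AS s "
        "ON t.station = s.station AND t.day = s.day "
        "ORDER BY t.station, t.day"
    ).fetchall()
    conn.close()
    return rows


integrated = integrate(temperature_rows, salinity_rows)
for row in integrated:
    print(row)
```

An inner join keeps only the observations both sources share. Whether unmatched records should be kept (an outer join) is the kind of decision an integration effort has to make explicitly.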

5 Geo Disciplines Represented Geology Hydrology Oceanography Geophysics Geography Marine geology and geophysics Space science Air quality Computational neuroscience Multi-disciplinary or discipline-agnostic: data management, computer science and archive

6 Geo-Data Integration What aspects are well established or not? Identify common gaps?

7 For many projects, two common themes were associated with some success in data integration: – ‘long-term’ commitment of funding support – Active engagement of funding managers Examples: Unidata (Atmospheric Sciences) CUAHSI (Hydrology) IRIS (Earthquake) US JGOFS, US GLOBEC, US WOCE (Ocean Sciences) ODP (Ocean Drilling) NEON

8 Support for Data Integration Development of community of practice Infrastructure to foster communication (workshops) Mentoring of students and early career PIs Development of tools (e.g. Unidata developed NetCDF which has been adopted by many communities) Education and training The persistence and recognition of a ‘named’ community can enable funds to flow from some agencies to researchers

9 Support for Data Integration Some communities agreed on common data formats that facilitated data integration Pressures from funding agencies or community needs resulted in common software tools Some communities identified ‘primary’ or ‘core’ variables (e.g. common, essential measurements)

10 Summary ‘Long-term’ funding support enables development of a community-of-practice that fosters communication, education and training, development and adoption of common tools and identification of core measurements. Communities-of-Practice can divide up the labor and work collaboratively to address shared challenges (economy of scale).

11 Additional Observations Tension between local and global (single PI to coordinated project to national to international). An awareness of global use of data could help with subsequent data integration. Early planning/specs for data management are important, but traditionally it has been difficult to obtain funding for them.

12 Gaps Lack of awareness/understanding that keeping data ‘alive’ (usable) is not free. Many people think data stewardship and data preservation are "solved problems" (they are not). "Bit-level preservation" has been solved, but what is the useful lifespan of those files? What effort is required to make the archived data compatible with all the latest tools and technology? The ability to use a dataset declines over time without ongoing attention to ensure that it still meets current access requirements.

13 Gaps Historical or legacy data (originating PI is no longer active in the research community). No national policy for scientific preservation. Different disciplines have different interpretations of features in a dataset. Lack of guidelines for best practices regarding the metadata required to document model results (software, methodology, inputs, outputs, etc.).

14 Gaps Misconception that you create metadata one time, and it's forever good – not a true statement – somehow the metadata needs to be updated – systems and the infrastructure need to support this – metadata needs to evolve over time
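One way to picture metadata that evolves over time is a record that keeps a revision history alongside its current values. The sketch below is illustrative only: the class name `MetadataRecord`, its fields, and the sample values are hypothetical and do not describe any system discussed at the workshop.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MetadataRecord:
    """Hypothetical metadata record that retains its revision history."""

    dataset_id: str
    fields: dict
    history: list = field(default_factory=list)

    def update(self, changes: dict, when: date, note: str) -> None:
        # Store the values being replaced so earlier descriptions stay recoverable.
        previous = {key: self.fields.get(key) for key in changes}
        self.history.append(
            {"date": when.isoformat(), "note": note, "previous": previous}
        )
        self.fields.update(changes)


# Assumed example values, not taken from the source.
record = MetadataRecord(
    "hypothetical-dataset-001",
    {"format": "ASCII table", "contact": "originating PI"},
)
record.update({"format": "NetCDF"}, date(2011, 3, 3), "converted to a community format")
print(record.fields["format"], len(record.history))
```

Keeping the old values rather than overwriting them is a design choice. It lets a later user see how the dataset was described at the time it was archived.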

15 Suggestion Group agreed that ESIP would be an appropriate community in which to continue these discussions and to begin the much-needed planning and cross-disciplinary work required to address the gaps and improve infrastructure for geo-data integration.

16 Additional Comments NRC study done 7-8 years ago about the loss of data and samples in the geosciences: http://www.nap.edu/openbook.php?record_id=10348&page=R1 Geoscience Data and Collections: NATIONAL RESOURCES IN PERIL

17 Additional Comments Marine Metadata Interoperability (MMI) http://marinemetadata.org/ Collection of ‘Guides’ on topics including Semantic Web technologies, controlled vocabularies, ontologies, standards, metadata best practices, and much more. MMI Ontology Registry and Repository (ORR) is a web application through which you can create, update, access, and map ontologies and their terms. http://mmisw.org/orr/#b

18 Additional Comments CUAHSI: Hydrologic Ontology System (funded by NSF) http://his.cuahsi.org/ontologyfiles.html http://water.sdsc.edu/hiscentral/startree.aspx "Data Management Plan" template available from CUAHSI (February 2011) at http://www.cuahsi.org/his-dmp.html; it includes data inventory, data and metadata standards, data management life cycle, etc.

19 Additional Comments ELIXIR http://www.bbsrc.ac.uk/science/international/elixir.aspx European life science infrastructure for biological information. Its mission: to construct and operate a sustainable infrastructure for biological information in Europe to support life science research and its translation to medicine and the environment, the bio-industries and society.

