TWC Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies Xiaogang (Marshall) Ma Tetherless World Constellation.

Presentation transcript:

TWC Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies Xiaogang (Marshall) Ma Tetherless World Constellation Rensselaer Polytechnic Institute x.marshall.ma rpi.edu/~max MarshallXMa

William Smith's 1815 geologic map of England and Wales with part of Scotland. William Smith ( ) (Image source: Geological Society of London)

Evolution of the Geological Map of British Islands / UK: 1874, 1906 (Image source: British Geological Survey)

Definition of “Quaternary” in several versions of the International Stratigraphic Chart. Sorry, no Quaternary…

(Haq, 2007) Distributed datasets: Regional geologic time scales

Distributed datasets: Mismatches of geological units across political boundaries
–Italy/France near Cuneo/Colmar: Cambrian / Carboniferous (Asch et al., 2012)
–Wyoming/Colorado: Felsic and hornblendic gneisses / Granitic rocks (Ma et al., 2014)
(Base map courtesy: OneGeology-Europe and USGS)

Data and models, vocabularies, and ontologies
–Have we ever had model-independent datasets?
Ontology dynamics and a data life cycle
CONCEPT: Initial concepts; Questions and answers; Grant info
COLLECTION: Questionnaire; Coded instrument; CAI metadata; Paradata
PROCESSING: Data specs; Recodes; Summary descriptive info
DISTRIBUTION: Terms of use; Citation; Packaging info
DISCOVERY: Catalog record; Indexing; Related publications
ANALYSIS: Replication code; Publications
ARCHIVING: Preservation metadata; Confidentiality; Additional processing
REPURPOSING: Post-hoc harmonization; Data transformations
Diagram reproduced from (Spencer, 2012)

Ontology dynamics (Flouris et al., 2008)
Ontology Mapping, Ontology Morphism, Ontology Matching, Ontology Articulation, Ontology Translation, Ontology Evolution, Ontology Debugging, Ontology Versioning, Ontology Integration, Ontology Merging

Potential challenges (Ma et al., 2014)
Reworking of the extant data in a data center
–e.g. caused by ontology/vocabulary versioning
Semantic mismatch among data sources
–e.g. heterogeneity in ontologies of the same topic
Different understandings of the same dataset between data providers and data users
–e.g. a data provider understands Quaternary as Ma-present, and a data user understands it as Ma-present
Error propagation in cross-discipline data re-use
–e.g. heterogeneous datasets may cause misconceptions in subsequent work
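The provider/user mismatch over "Quaternary" can be made concrete with a small sketch. Everything here is illustrative: the version labels `chart_vA` and `chart_vB` are hypothetical, and the boundary ages (2.588 Ma and 1.806 Ma) are values commonly cited for different chart versions, assumed for this example rather than taken from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A geologic time interval in Ma (millions of years before present)."""
    older: float
    younger: float

    def contains(self, age_ma: float) -> bool:
        return self.younger <= age_ma <= self.older


# Hypothetical version labels; the boundary ages are assumptions for illustration.
QUATERNARY = {
    "chart_vA": Interval(older=2.588, younger=0.0),
    "chart_vB": Interval(older=1.806, younger=0.0),
}


def is_quaternary(age_ma: float, version: str) -> bool:
    """Check an age against the Quaternary definition of one chart version."""
    return QUATERNARY[version].contains(age_ma)


def disputed_range(version_a: str, version_b: str) -> Interval:
    """Ages that only one of the two definitions counts as Quaternary.

    Assumes both definitions share the same younger bound (the present).
    """
    a, b = QUATERNARY[version_a], QUATERNARY[version_b]
    return Interval(older=max(a.older, b.older), younger=min(a.older, b.older))


# A sample dated 2.2 Ma is Quaternary for one party but not for the other.
provider_view = is_quaternary(2.2, "chart_vA")
user_view = is_quaternary(2.2, "chart_vB")
```

Recording which vocabulary version a dataset was coded against is what makes the disputed range computable at all; without it, the two parties have no way to detect that they disagree.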

A few recent works of interest: OneGeology-Europe
20 European nations providing national geologic maps at scale ~1:1M
Harmonized geological terms and map legends
Multilingual labels in 18 languages
Central portal for data browsing/query among distributed data sources
A contribution to INSPIRE
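One way to picture harmonized terms with multilingual labels is a two-step lookup: each national legend term maps to a shared concept, and each shared concept carries labels per language. This is a minimal sketch under stated assumptions; the concept identifiers (`lith:granite`), source names (`survey_de`), and the tiny label table are hypothetical and do not reproduce the OneGeology-Europe vocabulary.

```python
from typing import Optional

# Hypothetical shared concepts with labels in a few languages.
LABELS = {
    "lith:granite": {"en": "granite", "de": "Granit", "fr": "granite", "it": "granito"},
    "lith:limestone": {"en": "limestone", "de": "Kalkstein", "fr": "calcaire", "it": "calcare"},
}

# Hypothetical mapping from (data source, local legend term) to a shared concept.
LOCAL_TO_SHARED = {
    ("survey_de", "Granit"): "lith:granite",
    ("survey_de", "Kalkstein"): "lith:limestone",
    ("survey_fr", "calcaire"): "lith:limestone",
    ("survey_it", "granito"): "lith:granite",
}


def harmonize(source: str, local_term: str) -> Optional[str]:
    """Return the shared concept for a local legend term, or None if unmapped."""
    return LOCAL_TO_SHARED.get((source, local_term))


def label(concept: str, lang: str, fallback: str = "en") -> str:
    """Return the concept label in the requested language, else the fallback."""
    labels = LABELS[concept]
    return labels.get(lang, labels[fallback])


# A German legend entry shown to an Italian-speaking user.
concept = harmonize("survey_de", "Kalkstein")
shown = label(concept, "it")
```

Keeping the mapping and the labels in separate tables means a new language can be added without touching any national dataset.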

Federated query: Result of geologic units with age ‘Cenozoic - from 66 million years to today’
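A query for "Cenozoic" across distributed sources only works if units tagged with narrower terms (Paleogene, Neogene, Quaternary) are also returned. The sketch below illustrates that idea with in-memory data; it is not the portal's implementation, and the source names, unit identifiers, and the `BROADER` table are hypothetical stand-ins for real services and a real time-scale vocabulary.

```python
# Hypothetical fragment of a geologic time vocabulary: term -> broader term.
BROADER = {
    "Quaternary": "Cenozoic",
    "Neogene": "Cenozoic",
    "Paleogene": "Cenozoic",
    "Cretaceous": "Mesozoic",
}

# Hypothetical records from two data sources, already coded with shared terms.
SOURCES = {
    "survey_a": [
        {"unit": "A-001", "age": "Neogene"},
        {"unit": "A-002", "age": "Cretaceous"},
    ],
    "survey_b": [
        {"unit": "B-101", "age": "Quaternary"},
        {"unit": "B-102", "age": "Cenozoic"},
    ],
}


def narrower_or_equal(term):
    """Return the term plus every term beneath it in the hierarchy."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in BROADER.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def federated_query(age_term):
    """Collect (source, unit) pairs whose age falls under the requested term."""
    wanted = narrower_or_equal(age_term)
    hits = []
    for source, records in SOURCES.items():
        for record in records:
            if record["age"] in wanted:
                hits.append((source, record["unit"]))
    return sorted(hits)


cenozoic_units = federated_query("Cenozoic")
```

The term expansion happens once, on the query side, so each source only needs to agree on the shared vocabulary and not on a particular level of detail.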

Recently finished CGI vocabularies
Earth Resource Form; Environmental Impact Value; Exploration Activity Type; Exploration Result; UNFC Value; Earth Resource Expression; Earth Resource Shape; Enduse Potential; Mineral Occurrence Type; Mining Activity Type; Processing Activity Type; Mining Waste Type Value; Commodity Code; Mineral Deposit Group; Mineral Deposit Type; Product Value
Construct a collection of vocabularies for populating information interchange documents and enabling interoperability
Provide labels for concepts, scoped to various communities defined by language, science domain, or application domain
CGI Geoscience Terminology Workgroup geoscience_terminology_working_group.html

USGS Online Geologic Maps
Standardized vocabulary with detailed annotation
Forward and backward queries between spatial data and attribute data
Links to further data sources, e.g. aeromagnetic survey, mineral resources data, soils, geochemical samples, etc.
state/map.html

Records of a point in the San Francisco area

Recommendations
Communities of practice on ontology and vocabulary
–Bottom-up, self-organized, and loose top-down control
Formalize the ‘Concept’ step in a data life cycle
–Top-down, and adopt outputs from the bottom-up approach
Create a virtuous circle between the bottom-up and top-down approaches
Thanks for