TWC Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies Xiaogang (Marshall) Ma Tetherless World Constellation.


1 TWC Knowledge Evolution in Distributed Geoscience Datasets and the Role of Semantic Technologies. Xiaogang (Marshall) Ma, Tetherless World Constellation, Rensselaer Polytechnic Institute. @MarshallXMa max7@rpi.edu x.marshall.ma rpi.edu/~max7 0000-0002-9110-7369 MarshallXMa

2 TWC William Smith's 1815 geologic map of England and Wales with part of Scotland. William Smith (1769-1839). (Image source: Geological Society of London)

3 TWC Evolution of the Geological Map of British Islands / UK: 1874, 1906, 1939, 1969, 2007, 2013 (Image source: British Geological Survey)

4 TWC Definition of “Quaternary” in several versions of the International Stratigraphic Chart: 2004, 2005, 2008, 2009. Sorry, no Quaternary…

5 TWC 5

6 Distributed datasets: Regional geologic time scales (Haq, 2007)

7 TWC Distributed datasets: Regional geologic time scales (Haq, 2007)

8 TWC Distributed datasets: Mismatches of geological units across political boundaries
Italy/France near Cuneo/Colmar: Cambrian / Carboniferous
Wyoming/Colorado: Felsic and hornblendic gneisses / Granitic rocks
(Asch et al., 2012) (Ma et al., 2014) (Base map courtesy: OneGeology-Europe and USGS)

9 TWC Data and models, vocabularies, and ontologies
–Have we ever had model-independent datasets?
Ontology dynamics and a data life cycle
CONCEPT *Initial concepts *Questions and answers *Grant info
COLLECTION *Questionnaire *Coded instrument *CAI metadata *Paradata
PROCESSING *Data specs *Recodes *Summary descriptive info
DISTRIBUTION *Terms of use *Citation *Packaging info
DISCOVERY *Catalog record *Indexing *Related publications
ANALYSIS *Replication code *Publications
ARCHIVING *Preservation metadata *Confidentiality *Additional processing
REPURPOSING *Post-hoc harmonization *Data transformations
Diagram reproduced from (Spencer, 2012)

10 TWC Ontology dynamics: Ontology Mapping, Ontology Morphism, Ontology Matching, Ontology Articulation, Ontology Translation, Ontology Evolution, Ontology Debugging, Ontology Versioning, Ontology Integration, Ontology Merging (Flouris et al., 2008)

11 TWC Potential challenges
Reworking of the extant data in a data center
–e.g. caused by ontology/vocabulary versioning
Semantic mismatch among data sources
–e.g. heterogeneity in ontologies of the same topic
Differing understanding of the same dataset between data providers and data users
–e.g. a data provider understands Quaternary as 1.806 Ma-present, and a data user understands it as 2.588 Ma-present
Error propagation in cross-discipline data re-use
–e.g. heterogeneous datasets may cause misconceptions in subsequent work
(Ma et al., 2014)
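The Quaternary example above can be made concrete with a minimal Python sketch. It assumes two parties each hold their own definition of the same term; the dictionary keys (`data_provider`, `data_user`), the class, and the function names are hypothetical and chosen only for illustration. Only the two boundary values (1.806 Ma and 2.588 Ma) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeInterval:
    label: str
    start_ma: float  # older boundary, in millions of years ago (Ma)
    end_ma: float    # younger boundary; 0.0 means "present"


# Hypothetical registry: one definition of "Quaternary" per party.
# The boundary values are the two quoted on the slide.
QUATERNARY_BY_PARTY = {
    "data_provider": TimeInterval("Quaternary", 1.806, 0.0),
    "data_user": TimeInterval("Quaternary", 2.588, 0.0),
}


def is_in_interval(age_ma: float, interval: TimeInterval) -> bool:
    """True if an age (in Ma) falls inside the interval, boundaries included."""
    return interval.end_ma <= age_ma <= interval.start_ma


def interpretations(age_ma: float) -> dict:
    """How each party would classify a sample of the given age."""
    return {
        party: is_in_interval(age_ma, interval)
        for party, interval in QUATERNARY_BY_PARTY.items()
    }


def is_ambiguous(age_ma: float) -> bool:
    """True if the parties disagree on whether the age is Quaternary."""
    return len(set(interpretations(age_ma).values())) > 1


if __name__ == "__main__":
    for age in (1.0, 2.0, 3.0):
        print(age, interpretations(age), "ambiguous:", is_ambiguous(age))
```

Under these assumptions, a sample dated at 2.0 Ma is "Quaternary" for the data user but not for the data provider, which is the kind of silent mismatch the slide warns about. Recording which definition a dataset was built against is one way to make the disagreement detectable.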

12 TWC A few recent works of interest: OneGeology-Europe
20 European nations providing national geologic maps at scale ~1:1M
Harmonized geological terms and map legends
Multilingual labels in 18 languages
Central portal for data browsing/query among distributed data sources
A contribution to INSPIRE
http://www.onegeology-europe.org

13 TWC Federated query: Result of geologic units with age ‘Cenozoic - from 66 million years to today’
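A federated query of this kind can be sketched in a few lines of Python. This is not the OneGeology-Europe portal's implementation: the sources are in-memory lists, and every source name, unit name, and age range below is a made-up placeholder. Only the Cenozoic range (66 Ma to today) comes from the slide.

```python
from typing import Dict, List, NamedTuple


class GeologicUnit(NamedTuple):
    source: str        # which data provider holds the record
    name: str          # unit label (placeholder values below)
    older_ma: float    # older age bound, in Ma
    younger_ma: float  # younger age bound, in Ma (0.0 = today)


# Range quoted on the slide: from 66 million years ago to today.
CENOZOIC = (66.0, 0.0)

# Hypothetical distributed sources with invented records.
SOURCES: Dict[str, List[GeologicUnit]] = {
    "source_a": [
        GeologicUnit("source_a", "unit A1", 23.0, 5.3),
        GeologicUnit("source_a", "unit A2", 201.0, 145.0),
    ],
    "source_b": [
        GeologicUnit("source_b", "unit B1", 2.6, 0.0),
        GeologicUnit("source_b", "unit B2", 80.0, 60.0),
    ],
}


def within(unit: GeologicUnit, older_ma: float, younger_ma: float) -> bool:
    """True if the unit's age range lies entirely inside the queried range."""
    return unit.older_ma <= older_ma and unit.younger_ma >= younger_ma


def federated_query(
    sources: Dict[str, List[GeologicUnit]], older_ma: float, younger_ma: float
) -> List[GeologicUnit]:
    """Send the same age filter to every source and merge the results."""
    results: List[GeologicUnit] = []
    for records in sources.values():
        results.extend(u for u in records if within(u, older_ma, younger_ma))
    return results


hits = federated_query(SOURCES, *CENOZOIC)
```

The sketch only works because both sources express ages in the same numeric terms. In practice that shared footing is what harmonized terms and map legends provide; without it, the same filter could mean different things at each source.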

14 TWC Recently finished CGI vocabularies
Earth Resource Form Environmental Impact Value Exploration Activity Type Exploration Result UNFC Value Earth Resource Expression Earth Resource Shape Enduse Potential Mineral Occurrence Type Mining Activity Type Processing Activity Type Mining Waste Type Value Commodity Code Mineral Deposit Group Mineral Deposit Type Product Value
Construct a collection of vocabularies for populating information interchange documents and enabling interoperability
Provide labels for concepts, scoped to various communities defined by language, science domain, or application domain
CGI Geoscience Terminology Workgroup
http://cgi-iugs.org/tech_collaboration/geoscience_terminology_working_group.html

15 TWC USGS Online Geologic Maps
Standardized vocabulary with detailed annotation
Forward and backward queries between spatial data and attribute data
Links to further data sources, e.g. aeromagnetic survey, mineral resources data, soils, geochemical samples, etc.
http://mrdata.usgs.gov/geology/state/map.html

16 TWC Records of a point in the San Francisco area

17 TWC Recommendations
Communities of practice on ontology and vocabulary
–Bottom-up, self-organized, and loose top-down control
Formalize the ‘Concept’ step in a data life cycle
–Top-down, and adopt outputs from the bottom-up approach
Make it a virtuous circle among the bottom-up and top-down approaches
Thanks for listening. @MarshallXMa max7@rpi.edu

