1 Representing Contextual Aspects of Data
WP2 - Andreas Harth
1st year review
Luxembourg, December 2011

2 Work Plan View
Task 2.1 Data quality assessment and repair
Task 2.2 Temporal, spatial and social aspects of data
Task 2.3 Recommendations for enhancing best practices for data publishing
D2.1 Conceptual model and best practices for high-quality data publishing
D2.2 Methods for quality repair
D2.3 Modelling and processing contextual aspects of data
D2.4 Update of D2.1
D2.5 Proof-of-concept evaluation for modelling space and time
D2.6 Methods for assessing the quality of sensor data
D2.7 Recommendations for contextual data publishing
Chart labels: FUB, KIT, KIT, 4248

3 Outline
Motivation
NeoGeo Vocabularies
Mappings and Community Activities
Demo
Conclusion and Future Work

4 Motivation

5 Geospatial data is becoming increasingly relevant
  Location-based services, mobile applications
  Ever increasing amount of sensor data (phones, satellites…)
Applications require integrated access to geospatial data
  Spatial querying
  Spatial reasoning

6 Geospatial Data Scenario
[Diagram labels: Source 1, Source 2, ... Source n; Wrapper 1; Mapping 1, Mapping 2, ... Mapping n; Integration]

7 Challenges
Disparate data formats
  Integrated data format (syntax) and access (data transfer protocol): Linked Data (RDF, HTTP)
Most data sources provide just points (geo:lat, geo:long)
  Create vocabulary/method for publishing regions
Each data source uses its own way to encode geospatial data
  Allow for syntax differences, provide mappings where possible (KML, GML, WKT, RDF…)
Instance data not interlinked across sources
  Create mappings between instances
Geospatial data is political

8 "Unpolitical" Datasets
GADM-RDF: RDF representation of the administrative regions of the GADM project.
NUTS-RDF: RDF representation of Eurostat's NUTS nomenclature.
GADM and NUTS serve as:
  New geospatial information on the Semantic Web.
  Bridges between already published spatial datasets.
  Experimentation and evaluation datasets.

9 NeoGeo Vocabularies

10 Prototypical Geo-Ontology
[Diagram labels: Feature, Geometry, geometry, Spatial relations, Spatial functions; example: "Luxembourg" in "Europe", each with a geometry]

11 Features vs. Geometries
NeoGeo vocabularies are based on the General Feature Model.
The General Feature Model distinguishes between the feature (the resource to which the region belongs) and the actual geometry.
The semantics of the feature are more important than the representation of the geometry.
Instances of a feature are related to the type of the feature.
A feature can be related to multiple distinct geometries; this also allows modelling different geometric properties for a single feature (e.g. different scales).
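The feature/geometry split can be illustrated with a minimal Python sketch. The class names, the example.org URIs, the scale labels and the rough coordinates are all assumptions made for illustration; they are not taken from the NeoGeo vocabularies or from real data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Geometry:
    """One geometric representation, identified by its own (hypothetical) URI."""
    uri: str
    scale: str  # illustrative label such as "coarse" or "point"
    wkt: str    # the shape, serialised as Well-Known Text


@dataclass
class Feature:
    """The real-world thing; it carries the semantics (type), not the shape."""
    uri: str
    feature_type: str
    geometries: list[Geometry] = field(default_factory=list)

    def geometry_at(self, scale: str) -> Geometry | None:
        """Return the geometry recorded for the given scale label, if any."""
        return next((g for g in self.geometries if g.scale == scale), None)


# One feature, two distinct geometries (coordinates are rough placeholders).
luxembourg = Feature(
    uri="http://example.org/feature/luxembourg",
    feature_type="Country",
    geometries=[
        Geometry("http://example.org/geometry/lux-coarse", "coarse",
                 "POLYGON((5.7 49.4, 6.5 49.4, 6.5 50.2, 5.7 50.2, 5.7 49.4))"),
        Geometry("http://example.org/geometry/lux-point", "point",
                 "POINT(6.1 49.6)"),
    ],
)
```

Keeping the geometries in a separate list means a consumer can reason about the feature's type without ever loading a shape, and can pick a representation per use case.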

12 NeoGeo Vocabularies
Spatial Vocabulary: representation and reasoning on topological relations based on the Region Connection Calculus (RCC).
Geometry Vocabulary: representation of geo-referenced geometric shapes.
[Diagram labels: spatial:Feature, ngeo:Geometry, ngeo:geometry, spatial:*, RESTful services]
RCC: Randell, D. A., Cui, Z. and Cohn, A. G.: A spatial logic based on regions and connection, 3rd Int. Conf. on Knowledge Representation and Reasoning, 1992.

13 Modelling Spatial Relations
Relation types compared: Disjoint, Touches, Overlaps, Within, Contains, Equals, Nearby
Properties used per dataset:
UN FAO: hasBorderWith, isInGroup
Ordnance Survey: disjoint, touches, partiallyOverlaps, within, contains, Equals
geo.linkeddata.es: formaParteDe, formadoPor
LinkedGeoData: (none listed)
GeoNames.org: neighbour / neighbouringFeatures, parentFeature, childrenFeatures, nearby / nearbyFeatures
Uberblic.org: adjoining_location, containing_location
DBpedia: locatedInArea
NUTS: Part of
GADM: Part of

14 RCC-8 Properties
partially overlapping (PO)
tangential proper part (TPP)
non-tangential proper part (NTPP)
equal (EQ)
tangential proper part inverse (TPPi)
non-tangential proper part inverse (NTPPi)
externally connected (EC)
disconnected (DC)
proper part (PP)
proper part inverse (PPi)
part (P)
part inverse (Pi)
overlaps (O)
connects with (C)
discrete from (DR)
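To make the eight base relations concrete, the sketch below classifies a pair of axis-aligned rectangles. Restricting regions to closed rectangles with positive area is a simplifying assumption, and the names `Box` and `rcc8` are hypothetical; real regions would need a geometry library.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Closed axis-aligned rectangle; assumed to have positive area."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


# Inverse of each RCC-8 base relation.
INVERSE = {"DC": "DC", "EC": "EC", "PO": "PO", "EQ": "EQ",
           "TPP": "TPPi", "NTPP": "NTPPi", "TPPi": "TPP", "NTPPi": "NTPP"}


def _inside(a: Box, b: Box) -> bool:
    """True if a is contained in b (boundaries may coincide)."""
    return (b.min_x <= a.min_x and a.max_x <= b.max_x and
            b.min_y <= a.min_y and a.max_y <= b.max_y)


def _strictly_inside(a: Box, b: Box) -> bool:
    """True if a lies in the interior of b (no shared boundary point)."""
    return (b.min_x < a.min_x and a.max_x < b.max_x and
            b.min_y < a.min_y and a.max_y < b.max_y)


def rcc8(a: Box, b: Box) -> str:
    """Return the RCC-8 base relation that holds from a to b."""
    if a == b:
        return "EQ"
    # The closed boxes share no point at all.
    if (a.max_x < b.min_x or b.max_x < a.min_x or
            a.max_y < b.min_y or b.max_y < a.min_y):
        return "DC"
    interiors_overlap = (a.min_x < b.max_x and b.min_x < a.max_x and
                         a.min_y < b.max_y and b.min_y < a.max_y)
    if not interiors_overlap:
        return "EC"  # only boundary points are shared
    if _inside(a, b):
        return "NTPP" if _strictly_inside(a, b) else "TPP"
    if _inside(b, a):
        return "NTPPi" if _strictly_inside(b, a) else "TPPi"
    return "PO"
```

For any two boxes, `rcc8(b, a)` equals `INVERSE[rcc8(a, b)]`, which is a quick sanity check on the classification.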

15 Spatial Vocabulary
Uses RCC for the representation of topological relations between regions
Supports RCC5 and RCC8 relations
Inference is available for most RCC relations
However, some rules require "negation as failure", which in turn requires a closed-world assumption
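The kind of inference meant here can be sketched with a tiny forward-chaining loop over triples. The rule tables below cover only a few well-known RCC entailments and are not the rule set of the Spatial Vocabulary itself; the function names and the example facts are hypothetical.

```python
# A few standard RCC entailments (illustrative subset, not the full calculus).
SUBPROPERTY = {"NTPP": "PP", "TPP": "PP", "PP": "P", "EQ": "P",
               "P": "O", "O": "C", "EC": "C"}
TRANSITIVE = {"NTPP", "PP", "P"}
SYMMETRIC = {"EQ", "EC", "O", "C"}


def closure(facts):
    """Apply the rules above until no new (subject, relation, object) triple appears."""
    derived = set(facts)
    while True:
        new = set()
        for (x, r, y) in derived:
            if r in SUBPROPERTY:
                new.add((x, SUBPROPERTY[r], y))
            if r in SYMMETRIC:
                new.add((y, r, x))
            if r in TRANSITIVE:
                for (y2, r2, z) in derived:
                    if r2 == r and y2 == y:
                        new.add((x, r, z))
        if new <= derived:
            return derived
        derived |= new


def discrete_from(x, y, derived):
    """DR(x, y) by negation as failure: it holds when O(x, y) is not derivable."""
    return (x, "O", y) not in derived


facts = {("Luxembourg", "PP", "Europe"), ("Europe", "PP", "World")}
derived = closure(facts)
```

`discrete_from` shows why the closed-world assumption matters: a pair of regions with no stated relation is reported as discrete, which is only sound if the fact base is known to be complete.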

16 Modelling Regions
Representation types compared: Point, Bounding Box, Points in Lists, Single predicate, Literal
Vocabularies used per dataset:
UN FAO: Own
Ordnance Survey: W3C Geo / GeoRSS; Own / GML
geo.linkeddata.es: W3C Geo; Own; Own / GML
LinkedGeoData: W3C Geo; Own
GeoNames.org: W3C Geo
Uberblic.org: Own
DBpedia.org: W3C Geo
NUTS: (none listed)
GADM: (none listed)

17 Modelling Regions - Alternatives
RDF List of W3C Geo Points
RDF List of latitude/longitude pairs
RDF Literal of latitude/longitude
RDF Literal in GML/WKT format (GeoSPARQL)
…
Geometries have their own URI, content format negotiated (KML, GML, WKT…)
Geometry vocabulary describes high-level features of geospatial regions
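Two of the literal-based alternatives can be compared with a short sketch that serialises the same ring of points both ways. The function names are hypothetical, the plain latitude/longitude literal format is an assumption (datasets differ), and the WKT output assumes the longitude-latitude axis order that GeoSPARQL uses by default.

```python
def to_wkt_polygon(ring):
    """Serialise (lat, long) pairs as a WKT polygon; WKT writes x y, i.e. long lat."""
    points = list(ring)
    if points[0] != points[-1]:
        points.append(points[0])  # WKT rings must be explicitly closed
    coords = ", ".join(f"{lon} {lat}" for lat, lon in points)
    return f"POLYGON(({coords}))"


def to_latlong_literal(ring):
    """Serialise (lat, long) pairs as a space-separated 'lat,long' literal (assumed format)."""
    return " ".join(f"{lat},{lon}" for lat, lon in ring)


# An arbitrary triangle, not real data.
triangle = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
wkt = to_wkt_polygon(triangle)
plain = to_latlong_literal(triangle)
```

The axis-order swap between the two forms is the main practical pitfall when mapping between sources, which is one argument for giving geometries their own URI and negotiating the format.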

18 Mappings and Community Activities

19 Vocabulary Mappings
Matrix axes (TO / FROM): NeoGeo, DBpedia, LinkedGeoData, geo.linked-, Geonames
Cells per row, in order ("-" marks the diagonal):
NeoGeo: -; SC, SP
DBpedia: -
LinkedGeoData: wip; -
geo.linked-: wip; -
Geonames: -
SC: rdfs:subClassOf, SP: rdfs:subPropertyOf, SA: owl:sameAs

20 Instance Mappings
Matrix axes (TO / FROM): NeoGeo NUTS, NeoGeo GADM, DBpedia, LinkedGeoData, geo.linked-, Geonames
Cells per row, in order ("-" marks the diagonal):
NeoGeo NUTS: -; EQ; PPi
NeoGeo GADM: EQ; -; PPi; tbd
DBpedia: EQ, PP (wip); -
LinkedGeoData: PP (wip); -
geo.linked-: PP (wip); -
Geonames: -

21 VoCamp
Planning underway for VoCamp at UPM (jointly organised with KIT), weekend of Feb 4th and 5th
Co-located with VoCamp Santa Barbara.

22 Demo!

23 Summary
NeoGeo vocabularies and experiences on publishing geospatial data as Linked Data
NUTS and GADM datasets online
Integration vocabulary online, including vocabulary mappings
GADM mappings to NUTS, DBpedia, LinkedGeoData…
Linked Data Services for accessing/querying spatial indices (withinRegion, boundingBox)
Work on similarity metrics (with optimisations and evaluation) for geospatial regions
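What queries such as withinRegion and boundingBox compute can be sketched as follows. This is not the API of the Linked Data Services mentioned above, whose interface is not described here; the function names only echo the slide. The sketch treats coordinates as planar and ignores the antimeridian and the poles.

```python
def bounding_box(ring):
    """Return (min_lat, min_lon, max_lat, max_lon) for a ring of (lat, lon) pairs."""
    lats = [lat for lat, _ in ring]
    lons = [lon for _, lon in ring]
    return (min(lats), min(lons), max(lats), max(lons))


def within_region(point, ring):
    """Ray-casting point-in-polygon test on (lat, lon) pairs, treated as planar."""
    lat, lon = point
    inside = False
    n = len(ring)
    for i in range(n):
        lat1, lon1 = ring[i]
        lat2, lon2 = ring[(i + 1) % n]
        # Does this edge straddle the line of constant latitude through the point?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that line.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


# An arbitrary square region, not real data.
square = [(0, 0), (0, 10), (10, 10), (10, 0)]
```

A spatial index would typically answer the cheap bounding-box test first and run the exact polygon test only on the candidates that pass it.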

24 Future Work
Finalise NeoGeo vocabularies (VoCamp)
Improvement of precision of spatial similarity; publish similarity service online
More instance mappings to GADM
Earth and space science data
More experiments: querying of integrated data
Reasoning
Temporal context
First deliverable due in March 2012

