1
Approximate Queries by Relaxing Structural Constraints in GIS
Arianna D’Ulizia, Fernando Ferri, Patrizia Grifoni
IRPPS-CNR, Rome, Italy
First International Workshop on Semantic and Conceptual Issues in Geographic Information Systems (SeCoGIS 2007)
2
SeCoGIS 2007, Auckland, New Zealand, November 7-8, 2007
Outline:
1) Motivation
2) The Geographical Pictorial Query Language (GeoPQL)
3) The similarity model
4) Semantic similarity
5) Structural similarity
Motivation
3
Query on a geographical database: null answer to the search condition.
Relaxing constraints through a similarity model: answer with similar concepts.
EXAMPLE: “Find all the Towns which are PASSED THROUGH by a Railway” (concepts in the query: Railway, Town)
Most similar concepts: 1) Airline 2) Busline 3) Steamship
4
Spatial constraints: spatial relationships existing between geographical objects. Spatial similarity can be measured by a topological similarity graph that links all configurations between two geographical objects by the lowest topological distance.
Semantic constraints: concepts represented by the geographical objects. Semantic similarity can be calculated by providing a weighted taxonomy of the concepts represented by each geographical object and by evaluating their related degrees of similarity.
Structural constraints: internal features of geographical objects. Structural similarity can be measured by considering the similarity of attributes, types and values of each pair of geographical objects.
5
Semantic similarity: similarity between concept names expressed in the query and concept names in the database.
Two strategies:
1) Calculate the distance between the nodes corresponding to the items being compared; the items have to be represented as a network.
2) Evaluate the information content of the items; the items have to be represented as an Is-a taxonomy.
6
Structural similarity: similarity between the attributes of the concept expressed in the query and those of the concepts in the database.
Research fields:
1) Information retrieval: evaluation of distances between tree structures in terms of the hierarchical relationships of their nodes (similar XML documents).
2) Ontology matching: measure of distances among concepts in terms of their structures (attributes, types, values).
7
To analyze the problem of matching a geographical query with imprecise or missing data.
To propose a relaxation model that considers, with different weights:
1) the semantic similarity of geographical concepts, which is evaluated by adopting the information content approach, and
2) the structural similarity of the attributes and types of geographical objects, which is inspired by the maximum weighted matching problem in bipartite graphs.
8
The Geographical Pictorial Query Language (GeoPQL)
9
GeoPQL allows queries to be specified using Symbolic Graphical Objects (SGOs).
DEFINITION: An SGO is a 4-tuple ψ = ⟨id, objclass, Σ, Λ⟩ where:
id is the SGO identifier;
objclass is the concept name iconized by ψ;
Σ represents the set of attributes of ψ;
Λ is the ordered set of coordinate pairs (h, v), which defines the spatial extent and position of the SGO with respect to the coordinate reference system of the working area.
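The 4-tuple in the definition can be sketched as a small data structure. This is only an illustration: the class name `SGO`, the field names `attributes` and `extent`, and the coordinates of the sample object are assumptions made here, not part of GeoPQL.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class SGO:
    """Illustrative stand-in for the 4-tuple (id, objclass, Σ, Λ)."""
    id: str                                   # SGO identifier
    objclass: str                             # concept name iconized by the SGO
    attributes: Dict[str, str]                # Σ: attribute name -> type name
    extent: Tuple[Tuple[float, float], ...]   # Λ: ordered (h, v) coordinate pairs


# Hypothetical instance; the attribute schema is the Town schema from the
# slides, while the identifier and the coordinates are made up.
town = SGO(
    id="sgo-1",
    objclass="Town",
    attributes={"name": "string", "region": "string",
                "country": "string", "inhabitants": "int"},
    extent=((0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)),
)
print(town.objclass, sorted(town.attributes))
```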
10
GeoPQL algebra consists of 12 operators: Geo-union, Geo-difference, Geo-disjunction, Geo-touching, Geo-inclusion, Geo-crossing (cross between two polylines), Geo-pass-through (a polyline passes through a polygon), Geo-overlapping, Geo-equality, Geo-distance, Geo-any (any relationship is valid between two SGOs), Geo-alias (OR operator).
Operators are either explicitly expressed or automatically deduced from the query’s pictorial representation.
11
GeoPQL interfaces with ESRI’s GIS ArcView®. The geographical query drawn by the user is translated into an eXtended SQL language, called XSQL. At the end of the drawing phase, the XSQL query is translated into ArcView® and executed on ArcMap® (the geographical database of ArcView®).
12
“Find all the Towns which are PASSED THROUGH by a Railway AND which INCLUDE a Church”
13
Railway(number:int, local:boolean, goods:boolean, passengers:int, dep_railway_station:string, dest_railway_station:string)
Airline(number:int, company:string, travelers:int, dep_air_terminal:string, dest_air_terminal:string)
Busline(number:int, local:boolean, passengers:int, dep_bus_terminal:string, dest_bus_terminal:string)
Steamship(number:int, passengers:int, dep_port:string, dest_port:string)
Urban_area(region:string, country:string)
City(name:string, region:string, country:string, people:int)
Town(name:string, region:string, country:string, inhabitants:int)
Market_town(name:string, region:string, country:string, market_day:string)
14
The similarity model
15
The total similarity between two SGOs ψ1 and ψ2 is a weighted sum of two kinds of similarity:
1) the semantic similarity SemSIM, which performs a conceptual comparison between the two SGOs in terms of the concept names objclass1 and objclass2 they belong to, and
2) the structural similarity StrSIM, which performs a comparison between ψ1 and ψ2 in terms of their attributes Σ1 and Σ2 and the types of Σ1 and Σ2.
16
Semantic similarity
17
Starting assumption: geographical concepts are organized according to a taxonomy.
Each concept is associated with a weight, standing for the probability of encountering instances of that concept along the hierarchy.
Formally: given a concept c, the probability p(c) is defined as follows:
p(c) = freq(c) / M
where freq(c) is the frequency of the concept c, estimated using noun frequencies from large text corpora such as WordNet, and M is the total number of concepts.
18
Weighted taxonomy: association of probabilities with the concepts of the taxonomy.
Information content of a concept c: -log p(c)
Example: p(Line) = 131/88312 = 0.00148
Concepts of the taxonomy with their probabilities:
Entity (1)
Line (0.00148)
Airline (0.00013)
Railway (0.00084)
Transit Line (0.00064)
Bus line (0.00024)
Railway_station (0.000011)
Air_terminal (0.000011)
Bus_terminal (0.000011)
Steamship (0.0004)
Facility (0.0004)
Station (0.00026)
Terminal (0.000023)
Town (0.00382)
Market town (0.00088)
Urban area (0.00592)
City (0.00362)
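A minimal sketch of the two quantities defined above, using the figures given for Line. It assumes that the slide's "88.312" is the integer 88312 written with a thousands separator, and it uses a base-2 logarithm because that appears to match the information content values quoted later in the slides; the function names are illustrative.

```python
import math


def probability(freq: int, total: int) -> float:
    """p(c) = freq(c) / M."""
    return freq / total


def information_content(p: float) -> float:
    """Information content -log p(c); base 2 is an assumption made here."""
    return -math.log2(p)


M = 88312          # assumed reading of "88.312" on the slide
FREQ_LINE = 131    # freq(Line) from the slide

p_line = probability(FREQ_LINE, M)
print(round(p_line, 5))                          # 0.00148
print(round(information_content(0.00592), 2))    # Urban area: 7.4
```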
19
Information content similarity (ics): the maximum information content shared by the concepts divided by the information content of the compared concepts.
Semantic similarity between two SGOs ψ1 and ψ2 is:
SemSIM(ψ1, ψ2) = 2 log p(c) / (log p(objclass1) + log p(objclass2))
where c is the concept shared by objclass1 and objclass2 that has the maximum information content.
EXAMPLE: consider the two SGO concept names City and Town, with Urban area (0.00592), City (0.00362), Town (0.00382) and Market town (0.00088):
SemSIM(City, Town) = 2 log p(Urban_area) / (log p(City) + log p(Town)) = 2 * 7.39965 / (8.10839 + 8.03372) = 0.92
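The City/Town example can be checked with a few lines of Python. This is a sketch with an illustrative function name; the probabilities are the rounded values from the weighted taxonomy, so intermediate figures differ slightly from the slide. Because the formula is a ratio of logarithms, the result does not depend on the logarithm base.

```python
import math


def sem_sim(p_c1: float, p_c2: float, p_shared: float) -> float:
    """2 * log p(shared) / (log p(c1) + log p(c2)).

    p_shared is the probability of the shared concept with the maximum
    information content (Urban area for City and Town).
    """
    return 2 * math.log(p_shared) / (math.log(p_c1) + math.log(p_c2))


P_CITY, P_TOWN, P_URBAN_AREA = 0.00362, 0.00382, 0.00592

print(round(sem_sim(P_CITY, P_TOWN, P_URBAN_AREA), 2))   # 0.92
```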
20
Structural similarity
21
The evaluation of structural similarity between two SGOs ψ1 and ψ2 requires the comparison between:
1) Σ1 and Σ2, which are the sets of attributes of ψ1 and ψ2 respectively (attribute names similarity),
2) the cardinalities of Σ1 and Σ2 (attribute cardinality similarity), and finally
3) the types of Σ1 and Σ2 (attribute types similarity).
22
Given two SGOs ψ1 and ψ2 and their related sets of attributes Σ1 and Σ2, we consider the collection S = {S_a,b} of sets of attribute pairs. Then, for each set S_a,b, we consider the sum of the ics of the attributes of each pair of the set. The maximal sum corresponds to the attribute names similarity Sim(Σ1, Σ2).
23
Railway(number:int, local:boolean, goods:boolean, dep_railway_station:string, dest_railway_station:string, passengers:int)
Airline(number:int, company:string, travelers:int, dep_air_terminal:string, dest_air_terminal:string)
Pairs with the maximal sum of ics: {(number, number), (passengers, travelers), (dep_railway_station, dep_air_terminal), (dest_railway_station, dest_air_terminal)}
Sim(Railway, Airline) = 1/6 * [1 + 1 + 0.93 + 0.93] = 0.643
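The Railway/Airline figure can be reproduced with a brute-force search for the pairing with the maximal sum of ics. This is a sketch under stated assumptions, not the authors' implementation: only the four ics values shown on the slide are known, every other attribute pair is assumed to score 0, and exhaustive enumeration stands in for a proper maximum weighted bipartite matching algorithm (it is only practical for small attribute sets). The division by 6, the larger of the two cardinalities, follows the slide.

```python
from itertools import permutations

RAILWAY = ["number", "local", "goods", "dep_railway_station",
           "dest_railway_station", "passengers"]
AIRLINE = ["number", "company", "travelers", "dep_air_terminal",
           "dest_air_terminal"]

# ics values taken from the slide; all unlisted pairs are assumed to be 0.
ICS = {
    ("number", "number"): 1.0,
    ("passengers", "travelers"): 1.0,
    ("dep_railway_station", "dep_air_terminal"): 0.93,
    ("dest_railway_station", "dest_air_terminal"): 0.93,
}


def pair_score(a, b, ics):
    """Look up the ics of a pair regardless of the order of its members."""
    return max(ics.get((a, b), 0.0), ics.get((b, a), 0.0))


def attribute_name_similarity(attrs1, attrs2, ics):
    """Maximal sum of pairwise ics over one-to-one pairings, over max size."""
    small, large = sorted((attrs1, attrs2), key=len)
    if not large:
        return 0.0
    best = 0.0
    # Try every way of assigning distinct attributes of the larger set
    # to the attributes of the smaller set.
    for chosen in permutations(large, len(small)):
        total = sum(pair_score(a, b, ics) for a, b in zip(small, chosen))
        best = max(best, total)
    return best / len(large)


print(round(attribute_name_similarity(RAILWAY, AIRLINE, ICS), 3))   # 0.643
```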
24
To evaluate the similarity between the cardinalities of Σ1 and Σ2, as they are numeric values, we can consider the ratio between the minimal and the maximal cardinality of Σ1 and Σ2, which we indicate as ||Σmin|| and ||Σmax|| respectively.
EXAMPLE
Railway(number:int, local:boolean, goods:boolean, dep_railway_station:string, dest_railway_station:string, passengers:int)
Airline(number:int, company:string, travelers:int, dep_air_terminal:string, dest_air_terminal:string)
Sim(Railway, Airline) = ||Σmin|| / ||Σmax|| = 5/6 = 0.83
25
Given two SGOs ψ1 and ψ2, their related sets of attributes Σ1 and Σ2, and the sets of types T1 and T2 associated with Σ1 and Σ2, the set of pairs of attributes with maximal cardinality is considered. This maximal cardinality divided by the maximal cardinality between Σ1 and Σ2, which we indicate as ||Σmax||, corresponds to the similarity Sim(Σ1 type, Σ2 type).
26
Railway(number:int, local:boolean, goods:boolean, dep_railway_station:string, dest_railway_station:string, passengers:int)
Airline(number:int, company:string, travelers:int, dep_air_terminal:string, dest_air_terminal:string)
Sim(Railway, Airline) = 4/6 = 0.66
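The cardinality and type similarities for Railway and Airline can be sketched as follows. The function names are illustrative, and the type similarity rests on an assumption: that attributes are paired only when their types are equal, which is consistent with the 4/6 obtained in the example (two int pairs and two string pairs).

```python
from collections import Counter

RAILWAY = {"number": "int", "local": "boolean", "goods": "boolean",
           "dep_railway_station": "string", "dest_railway_station": "string",
           "passengers": "int"}
AIRLINE = {"number": "int", "company": "string", "travelers": "int",
           "dep_air_terminal": "string", "dest_air_terminal": "string"}


def cardinality_similarity(attrs1, attrs2):
    """Ratio between the minimal and the maximal number of attributes."""
    largest = max(len(attrs1), len(attrs2))
    return min(len(attrs1), len(attrs2)) / largest if largest else 0.0


def type_similarity(attrs1, attrs2):
    """Largest number of same-type attribute pairs over the maximal size.

    Assumption: a pair is admissible only if both attributes have the same
    type, so the largest pairing takes, per type, the smaller of the counts.
    """
    largest = max(len(attrs1), len(attrs2))
    if not largest:
        return 0.0
    shared = Counter(attrs1.values()) & Counter(attrs2.values())
    return sum(shared.values()) / largest


print(cardinality_similarity(RAILWAY, AIRLINE))   # 5/6, shown as 0.83
print(type_similarity(RAILWAY, AIRLINE))          # 4/6, shown as 0.66
```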
27
As we obtained:
SimA(Railway, Airline) = 1/6 * [1 + 1 + 0.93 + 0.93] = 0.643
SimC(Railway, Airline) = 5/6 = 0.83
SimT(Railway, Airline) = 4/6 = 0.66
the structural similarity between Railway and Airline is:
StrSIM(Railway, Airline) = 0.33 * 0.643 + 0.33 * 0.83 + 0.33 * 0.66 = 0.704
28
As we obtained:
StrSIM(Railway, Airline) = 0.704
SemSIM(Railway, Airline) = 2 * 2.83 / (3.89 + 3.07) = 5.66 / 6.96 = 0.81
the total similarity between Railway and Airline is:
TotSIM(Railway, Airline) = 0.5 * 0.81 + 0.5 * 0.704 = 0.757
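Both weighted sums can be reproduced from the rounded values shown in the slides. In this sketch the helper name `weighted_sum` is illustrative; the weights 0.33 and 0.5 are the ones used in the example, and nothing in the slides states that they are fixed.

```python
def weighted_sum(values, weights):
    """Sum of value * weight over paired values and weights."""
    return sum(v * w for v, w in zip(values, weights))


# Structural similarity from the three partial similarities (names,
# cardinality, types), each weighted 0.33 as in the example.
str_sim = round(weighted_sum([0.643, 0.83, 0.66], [0.33, 0.33, 0.33]), 3)

# Total similarity from SemSIM = 0.81 and StrSIM, each weighted 0.5.
tot_sim = round(weighted_sum([0.81, str_sim], [0.5, 0.5]), 3)

print(str_sim)   # 0.704
print(tot_sim)   # 0.757
```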
29
We proposed an approach for the evaluation of similarity among Symbolic Graphical Objects (SGOs) of GeoPQL, based on the semantic similarity of SGO concept names and the structural similarity of SGO attributes and types.
The similarity degree is used to provide approximate queries when the concept expressed in the user query is missing or has no instances in the database. The missing concept is replaced by the most similar one.