Source Description-based Approach for the Modeling of Spatial Information Integration Yoshiharu Ishikawa and Hiroyuki Kitagawa University of Tsukuba


1 Source Description-based Approach for the Modeling of Spatial Information Integration Yoshiharu Ishikawa and Hiroyuki Kitagawa University of Tsukuba {ishikawa,kitagawa}@is.tsukuba.ac.jp

2 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

3 Background: Spatial Information Sources (1) Spatial information sources: emerging new information sources on the Internet information sources that provide region- or location-oriented information some of them support mobile users with GPSs and hand-held devices

4 Background: Spatial Information Sources (2) Need for the technology to integrate spatial information sources description of spatial information sources by taking their contents into consideration efficient and effective query planning and processing Spatial Information Integration

5 Background: Spatial Information Sources (3) Standardization efforts for spatial technologies: OpenGIS [5]: standardization of GIS systems; POIX [6]: language for location-oriented information exchange; G-XML [7]: XML vocabulary for geographic information description; RWML [8]: road information description language. Spatial information services: Digital City [10], citysearch.com [11]: location-oriented information services; Ekimae Tanken Club [12]: provides local information near a specified rail station; MONET system [13]: provides information for car drivers.

6 Background: Heterogeneous Information Integration (1) Popular approach for information integration well-known wrapper-mediator approach Wrapper encapsulates the detail of each information source provides abstract uniform view of the source Mediator selects appropriate information sources for a given query query planning and processing

7 Background: Heterogeneous Information Integration (2) [Figure: a heterogeneous information integration system. A mediator provides unified access to the integrated information, and wrappers connect it to Information Sources A, B, C, and D.]

8 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

9 Our Objective Development of a spatial information integration framework for location-aware information services integration of heterogeneous spatial information sources heterogeneity of the contents of the sources heterogeneity of the capabilities of the sources provide useful location-oriented information service to mobile users selection of neighborhood geometric features

10 Our Approach Development of a description method to represent spatial information sources based on the source description framework: describes the contents and the service of the source introduction of spatial data types and spatial operators: based on OpenGIS standard Development of query planning and processing methods that effectively utilize source descriptions selection of appropriate information sources for a given query effective use of the query processing power of each information source

11 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

12 Motivating Example (1) [Figure: a heterogeneous information integration system in which a mediator accesses Information Sources A, B, C, and D through wrappers.]

13 Motivating Example (2) Global Schema: based on the relational model; represents a virtual database schema; each information source is (partially) mapped to the global schema.
relation Restaurant { name string; category string; address string; location point; };
relation Evaluation { name string; score real; };

14 Motivating Example (3) Query issued by the user: show the top-20 nearest restaurants that are within 1000 meters of the current position p and whose score is at least 2.5 stars. SQL representation:
SELECT r.name, r.address
FROM Restaurant AS r, Evaluation AS e
WHERE r.name = e.name AND e.score >= 2.5
  AND Distance(r.location, p) <= 1000
ORDER BY Distance(r.location, p)
STOP AFTER 20
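As an illustration only, the intended semantics of this query can be sketched in Python over in-memory data. The function name `nearest_restaurants`, the dictionary layout, and the use of planar Euclidean distance are assumptions made for this sketch; they are not part of the presentation.

```python
import math

def distance(a, b):
    """Planar Euclidean distance between two (x, y) points, in meters (assumption)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_restaurants(restaurants, evaluations, p, radius=1000.0, min_score=2.5, k=20):
    """Join Restaurant and Evaluation on name, filter, sort by distance, keep k rows."""
    # Assumes each restaurant name appears at most once in Evaluation.
    scores = {e["name"]: e["score"] for e in evaluations}
    hits = [
        r for r in restaurants
        if r["name"] in scores
        and scores[r["name"]] >= min_score
        and distance(r["location"], p) <= radius
    ]
    hits.sort(key=lambda r: distance(r["location"], p))  # ORDER BY Distance(...)
    return [(r["name"], r["address"]) for r in hits[:k]]  # STOP AFTER k

restaurants = [
    {"name": "Tokyo Sushi", "category": "Japanese", "address": "1-1", "location": (300.0, 400.0)},
    {"name": "Edo Sushi", "category": "Japanese", "address": "2-2", "location": (0.0, 100.0)},
    {"name": "Far Away", "category": "Italian", "address": "3-3", "location": (2000.0, 0.0)},
    {"name": "Low Score", "category": "Chinese", "address": "4-4", "location": (10.0, 0.0)},
]
evaluations = [
    {"name": "Tokyo Sushi", "score": 3.0},
    {"name": "Edo Sushi", "score": 2.7},
    {"name": "Far Away", "score": 4.0},
    {"name": "Low Score", "score": 1.0},
]
result = nearest_restaurants(restaurants, evaluations, p=(0.0, 0.0))
```

In the integration setting no single source can answer this query by itself, which is why the later slides split it across Sources A to D.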

15 Motivating Example (4) Information Source A: provides restaurant info for a specific area. Contents: contains information on restaurants within the rectangular area r. Capability: given a name or an address, it returns the matching restaurants.

16 Motivating Example (5) Information Source B: supports spatial conditions to query restaurant info Contents: contains information about restaurants Capability returns restaurants within the specified circle area receives additional condition on restaurant category category = “Chinese”

17 Motivating Example (6) Information Source C: supports spatial conditions to query restaurant info Contents: contains information about restaurants Capability returns restaurants that match the specified name if an optional polygon is given, it only returns restaurants within the specified polygon region name like “%Sushi”

18 Motivating Example (7) Information Source D: provides restaurant evaluation scores; given a restaurant name, it returns the evaluation score. Example: select * from Source-D where name like “%Sushi”
name          score
Tokyo Sushi   3.0
Edo Sushi     2.7

19 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

20 Data Model for Integration The relational model enhanced with spatial data types and spatial operations. Spatial data types and spatial operations are based on the OpenGIS proposal [5]. A wrapper for each spatial information source wraps the operations of the source and exposes them as OpenGIS-conformant operations. A wrapper provides only a subset of the OpenGIS operations, depending on the capability of its source.

21 Our Target Spatial Data Types Based on the OpenGIS proposal. To simplify the problem, we only consider the Point, LineString, and Polygon types. [Figure: the OpenGIS geometry type hierarchy. Geometry has the subtypes Point, Curve (with LineString), Surface (with Polygon), and GeometryCollection (with MultiPoint, MultiCurve, and MultiSurface).]

22 Spatial Operations (1) Spatial Predicates of OpenGIS
intersects(g1, g2)   g1 and g2 have intersections
disjoint(g1, g2)     g1 and g2 do not have any overlap
equals(g1, g2)       g1 and g2 are equal
overlaps(g1, g2)     g1 and g2 have one or more overlaps
contains(g1, g2)     g1 contains g2
within(g1, g2)       g1 is contained in g2
crosses(g1, g2)      g1 and g2 have intersections
touches(g1, g2)      g1 and g2 touch at one or more points

23 Spatial Operations (2) Spatial Functions of OpenGIS
name                  return type   semantics
isempty(g)            Integer       g is empty
distance(g1, g2)      Double        minimum distance between g1 and g2
envelope(g)           Geometry      minimum bounding box (MBB) of g
union(g1, g2)         Geometry      unified region of g1 and g2
intersection(g1, g2)  Geometry      intersection of g1 and g2
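To make a few of these operations concrete, here is a minimal Python sketch restricted to points and axis-aligned boxes. The tuple representations and the `box_`-prefixed function names are assumptions for illustration; a real wrapper would delegate to an OpenGIS-conformant library instead.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), hypothetical layout

def envelope(points: Iterable[Point]) -> Box:
    """Minimum bounding box (MBB) of a non-empty set of points."""
    pts = list(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))

def distance(a: Point, b: Point) -> float:
    """Minimum distance between two points (planar Euclidean)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def box_contains(box: Box, pt: Point) -> bool:
    """contains(g1, g2) for a box g1 and a point g2 (boundary included)."""
    return box[0] <= pt[0] <= box[2] and box[1] <= pt[1] <= box[3]

def box_intersects(a: Box, b: Box) -> bool:
    """intersects(g1, g2) for two boxes: they share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def box_disjoint(a: Box, b: Box) -> bool:
    """disjoint(g1, g2) is the negation of intersects(g1, g2)."""
    return not box_intersects(a, b)
```

Boxes are enough to express `envelope`, which the later slides use to approximate a circular query region.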

24 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

25 Source Description Framework Source Description Framework: a formal framework to specify meta information for an information source proposed by Information Manifold [3] A source description consists of: Contents Description: describes the contents of the source in terms of the global schema Capability Description: describes the types of queries which the source can support We extend the source description approach by considering OpenGIS data types and operations

26 Query Description (1) An extension of a conjunctive query: it can contain spatial predicates (e.g., intersects, contains), spatial functions (e.g., envelope, distance), and additional comparison operators (e.g., ≤). General form of a conjunctive query:
ans(u) ← R1(u1), …, Rn(un), c1, …, cm
where R1, …, Rn are global relations, u, u1, …, un are sequences of variables, and c1, …, cm (m ≥ 0) are conditions.

27 Query Description (2) Show restaurants that are within 1000 meters of the current position and whose scores are greater than or equal to 2.5 stars:
ans(n, a) ← Restaurant(n, c, a, l), Evaluation(e, s), n = e, s ≥ 2.5, distance(l, p) ≤ 1000
SQL representation:
SELECT r.name, r.address
FROM Restaurant AS r, Evaluation AS e
WHERE r.name = e.name AND e.score >= 2.5
  AND Distance(r.location, p) <= 1000

28 Spatial Query Conditions For spatial query conditions, we allow the following spatial range restriction predicates (g is a geometric constant): equals(g, g) and equals(g, g); within(g, g); contains(g, g). We also allow distance-based range restriction conditions (g is a Geometry object, d is a real constant, θ is < or ≤): distance(g, g) θ d

29 Source Descriptions (1) A source description consists of a contents description and a capability description.
contents: S(u) ← R(u), c1, …, cn
example: S(n, c, a, l) ← Restaurant(n, c, a, l), c = “Italian”
filters: pat → out
pat: mandatory input arguments (input pattern). out: the condition issued to the underlying source when the input arguments (pat) are given.
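One possible in-memory encoding of a source description is sketched below in Python. The class names, the field names, and the filter attached to the “Italian” example (input `n0`, output `n = n0`) are hypothetical and chosen only for illustration; the slides do not prescribe a concrete data structure.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

@dataclass(frozen=True)
class Filter:
    pat: Tuple[str, ...]  # mandatory input arguments (input pattern)
    out: str              # condition issued to the source once pat is bound

@dataclass(frozen=True)
class SourceDescription:
    name: str
    head: Tuple[str, ...]               # variables exported by the source
    relation: str                       # global relation the source maps to
    conditions: Tuple[str, ...] = ()    # constraints c1..cn on the contents
    filters: Tuple[Filter, ...] = ()    # capability description

    def usable_filters(self, bound: Iterable[str]) -> List[Filter]:
        """Return the filters whose whole input pattern is covered by `bound`."""
        available = set(bound)
        return [f for f in self.filters if set(f.pat) <= available]

# Contents from the slide's example; the filter is a hypothetical addition.
italian = SourceDescription(
    name="S",
    head=("n", "c", "a", "l"),
    relation="Restaurant",
    conditions=('c = "Italian"',),
    filters=(Filter(pat=("n0",), out="n = n0"),),
)
```

Keeping contents and filters separate mirrors the two roles described later: contents drive source selection, filters drive condition pushing.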

30 Information Source A Information Source A: provides restaurant info for a specific area. Contents: contains information on restaurants within the rectangular area r. Capability: given a name or an address, it returns the matching restaurants.

31 Source Description for A Source A provides restaurant information; provides information within r; also allows retrieval by restaurant name and address. Source A contents: S_A ← Restaurant(n, c, a, l), contains(r, l) filters:  n = n,  a = a

32 Information Source B Information Source B: supports spatial conditions to query restaurant info Contents: contains information about restaurants Capability returns restaurants within the specified circle area receives additional condition on restaurant category category = “Chinese”

33 Source Description for B Source B provides restaurant information; inputs are a query point (p) and a distance threshold (d); allows an additional filtering condition based on the restaurant category (c). Source B contents: S_B ← Restaurant(n, c, a, l) filters:  distance(l, p) ≤ d,  c = c

34 Information Source C Information Source C: supports spatial conditions to query restaurant info Contents: contains information about restaurants Capability returns restaurants that match the specified name if an optional polygon is given, it only returns restaurants within the specified polygon region name like “%Sushi”

35 Source Description for C Source C provides restaurant information; returns restaurants that match the specified name (n); allows an additional filtering condition based on a polygonal region (g). Source C contents: S_C ← Restaurant(n, c, a, l) filters:  n = n,  contains(g, l)

36 Information Source D Information Source D: provides restaurant evaluation scores; given a restaurant name, it returns the evaluation score. Example: select * from Source-D where name like “%Sushi”
name          score
Tokyo Sushi   3.0
Edo Sushi     2.7

37 Source Description for D Source D provides restaurant evaluation scores; allows retrieval by restaurant name and/or evaluation score. Source D contents: S_D ← Evaluation(n, s) filters:  n = n,  s θ s (θ in {=, ≠, <, >, ≤, ≥})

38 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

39 Overview of Query Processing (1) Query Plan Construction
1. Preprocessing: validation of the correctness of the given query according to the global schema; deletion of redundant variables; simplification of expressions
2. Selection of useful information sources based on contents descriptions
3. Pushing query conditions into the underlying information sources as far as possible
4. Generation of the integrated query plan
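The four steps can be sketched as a small Python skeleton. This is a heavily simplified illustration, not the authors' planner: conditions are `(relation, kind, text)` tuples, sources are plain dictionaries, source selection only matches relation names (the slides use contents descriptions for this), and join conditions are left out. All names here are assumptions.

```python
GLOBAL_SCHEMA = {
    "Restaurant": ("name", "category", "address", "location"),
    "Evaluation": ("name", "score"),
}

def build_plan(relations, conditions, sources):
    # Step 1: preprocessing. Validate relations and drop exact duplicate conditions.
    for rel in relations:
        if rel not in GLOBAL_SCHEMA:
            raise ValueError(f"unknown relation: {rel}")
    conditions = list(dict.fromkeys(conditions))

    plan = []
    for rel in relations:
        wanted = [c for c in conditions if c[0] == rel]
        # Step 2: source selection (simplified to a relation-name match).
        for src in sources:
            if src["relation"] != rel:
                continue
            # Step 3: push every condition whose kind the source can filter on.
            pushed = [c for c in wanted if c[1] in src["filters"]]
            residual = [c for c in wanted if c[1] not in src["filters"]]
            # Step 4: record the subquery; residual conditions stay with the mediator.
            plan.append({"source": src["name"], "pushed": pushed, "residual": residual})
    return plan

sources = [
    {"name": "A", "relation": "Restaurant", "filters": {"name", "address"}},
    {"name": "B", "relation": "Restaurant", "filters": {"distance", "category"}},
    {"name": "D", "relation": "Evaluation", "filters": {"name", "score"}},
]
near = ("Restaurant", "distance", "distance(l, p) <= 1000")
good = ("Evaluation", "score", "s >= 2.5")
plan = build_plan(["Restaurant", "Evaluation"], [near, good, near], sources)
by_source = {step["source"]: step for step in plan}
```

In this toy run Source B receives the distance condition directly, while for Source A the same condition remains a residual check at the mediator.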

40 Overview of Query Processing (2) [Figure: the mediator performs the query validity check, query simplification, and source selection based on contents descriptions; it pushes subqueries through the wrappers to Sources A, B, C, and D, receives partial results, integrates the subquery results, and returns the query result.]

41 Usage of Source Descriptions Contents Description: used to select useful information sources to process the given query; also used to eliminate redundant join conditions. Capability Description: used to decide whether a wrapper on a source can process the given query condition using its query processing capability; also used to generate a subquery to an information source.

42 Selection of Information Source (1) Unifies the given query condition and the contents description of an information source.
Query: ans(u) ← R1, …, Rn, c1, …, cm
Contents Description: S_R(v) ← Ri(v), e1, …, en
Possibility condition for an information source to fulfill the given query condition:
∃x1 … xn (c1 ∧ … ∧ cm ∧ e1 ∧ … ∧ en) = true

43 Selection of Information Source (2) Example: a query over the global schema:
ans(n) ← Restaurant(n, c, a, l), distance(l, p) ≤ 1000
Source Description for E:
S_E(n, c, a, l) ← Restaurant(n, c, a, l), c = “Italian”, contains(r, l)
Source E can possibly satisfy the subquery if:
∃c, l (c = “Italian” ∧ contains(r, l) ∧ distance(l, p) ≤ 1000) = true

44 Selection of Information Source (3) Simplification of the possibility condition:
∃l (contains(r, l) ∧ distance(l, p) ≤ 1000) = true
which reduces to
intersects(r, circle(p, 1000)) = true
[Figure: the query region, a circle of radius 1000 m around p, and the area r supported by source E.]
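The simplified possibility condition is a rectangle-circle intersection test. A minimal Python sketch follows, assuming r is an axis-aligned rectangle given as `(xmin, ymin, xmax, ymax)` and distances are planar Euclidean; the function name is hypothetical.

```python
import math

def rect_intersects_circle(rect, center, radius):
    """True if some point l satisfies contains(rect, l) and distance(l, center) <= radius."""
    xmin, ymin, xmax, ymax = rect
    cx, cy = center
    # The point of the rectangle nearest to the circle center is found by clamping.
    nearest_x = min(max(cx, xmin), xmax)
    nearest_y = min(max(cy, ymin), ymax)
    return math.hypot(cx - nearest_x, cy - nearest_y) <= radius

# Source E is worth querying only if its area r can reach the query circle.
r = (0.0, 0.0, 100.0, 100.0)
useful = rect_intersects_circle(r, center=(1100.0, 50.0), radius=1000.0)
```

If the test fails, no restaurant of the source can lie within the query distance, so the mediator can skip that source entirely.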

45 Elimination of Redundant Joins Example: a query over the global schema:
ans(n, m) ← Restaurant(n, c, a, l), BusStop(m, p), distance(l, p) ≤ 200
Contents Descriptions for Sources F and G:
S_F(n, c, a, l) ← Restaurant(n, c, a, l), contains(r, l)
S_G(m, p) ← BusStop(m, p), contains(s, p)
F and G may satisfy the query if distance(r, s) ≤ 200
[Figure: the regions r and s of the two sources and the 200 m distance threshold.]
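The test distance(r, s) ≤ 200 can be sketched in Python as follows, assuming both regions are axis-aligned rectangles `(xmin, ymin, xmax, ymax)` and distance is the planar minimum distance; the function names are hypothetical.

```python
import math

def rect_distance(a, b):
    """Minimum distance between two axis-aligned rectangles (0.0 if they overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)  # horizontal gap, if any
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)  # vertical gap, if any
    return math.hypot(dx, dy)

def join_may_produce_results(r, s, threshold=200.0):
    """The join of sources F and G is worth executing only if the regions are close enough."""
    return rect_distance(r, s) <= threshold

r = (0.0, 0.0, 100.0, 100.0)
s_near = (250.0, 0.0, 300.0, 100.0)
s_far = (400.0, 500.0, 450.0, 550.0)
```

A source pair that fails this test cannot contribute any restaurant and bus stop within 200 meters of each other, so that join can be dropped from the plan.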

46 Pushing Query Conditions (1) Check whether the given query condition can be processed by the source. If the query condition is equivalent to a filtering condition supported by the source: push it directly. If there is no equivalent condition but the source supports a more general one: transform the query condition into the more general condition, then push that to the source; an additional step is then needed to check that the retrieved results exactly satisfy the original query condition.

47 Pushing Query Conditions (2) Capability Description of the Source: Source C contents: S_C ← Restaurant(n, c, a, l) filters:  n = n,  contains(g, l)
Query: ans(n) ← Restaurant(n, c, a, l), contains(r, l)
Push contains(r, l) to Source C.

48 Pushing Query Conditions (3) Source Description for the Source: Source H contents: S_H ← Restaurant(n, c, a, l) filters:  n = n,  intersects(l, g)
Query: ans(n) ← Restaurant(n, c, a, l), distance(l, p) ≤ 1000
Push the more general condition intersects(l, envelope(circle(p, 1000))), then examine distance(l, p) ≤ 1000 for the retrieved data.
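This filter-and-refine pattern can be sketched in Python. The source is simulated by a local function that only understands a box condition; `source_h`, the row layout, and the planar Euclidean distance are assumptions for illustration.

```python
import math

def circle_envelope(p, radius):
    """envelope(circle(p, radius)): the bounding box (xmin, ymin, xmax, ymax) of the circle."""
    return (p[0] - radius, p[1] - radius, p[0] + radius, p[1] + radius)

def source_h(rows, box):
    """Simulated Source H: it can only evaluate intersects(l, g) for a box g."""
    xmin, ymin, xmax, ymax = box
    return [r for r in rows
            if xmin <= r["location"][0] <= xmax and ymin <= r["location"][1] <= ymax]

def within_distance(rows, p, radius):
    # Push the more general box condition to the source ...
    candidates = source_h(rows, circle_envelope(p, radius))
    # ... then check the exact distance condition at the mediator.
    return [r for r in candidates
            if math.hypot(r["location"][0] - p[0], r["location"][1] - p[1]) <= radius]

rows = [
    {"name": "A", "location": (600.0, 800.0)},   # exactly 1000 m from the origin
    {"name": "B", "location": (900.0, 900.0)},   # inside the box, outside the circle
    {"name": "C", "location": (1500.0, 0.0)},    # outside the box
]
p = (0.0, 0.0)
candidates = source_h(rows, circle_envelope(p, 1000.0))
answer = within_distance(rows, p, 1000.0)
```

Row B shows why the extra check is needed: it lies in the corner of the envelope but farther than 1000 meters from p.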

49 Outline Background Our Objective and Approach Motivating Example Data Model Query Specification and Source Description Query Processing Conclusions and Future Work

50 Conclusions and Future Work Conclusions: proposal of a framework for integrating heterogeneous spatial information sources, based on the source description framework (contents description, capability description); use of the data types and operations of the OpenGIS proposal; query processing strategies (source selection, pushing query conditions). Future work: investigation of source selection and query planning strategies; a more formal framework (e.g., a constraint-based approach).

