Source Description-based Approach for the Modeling of Spatial Information Integration Yoshiharu Ishikawa and Hiroyuki Kitagawa University of Tsukuba


Outline
- Background
- Our Objective and Approach
- Motivating Example
- Data Model
- Query Specification and Source Description
- Query Processing
- Conclusions and Future Work

Background: Spatial Information Sources (1)
Spatial information sources:
- emerging new information sources on the Internet
- information sources that provide region- or location-oriented information
- some of them support mobile users with GPS receivers and hand-held devices

Background: Spatial Information Sources (2)
Need for technology to integrate spatial information sources:
- description of spatial information sources that takes their contents into consideration
- efficient and effective query planning and processing
Goal: Spatial Information Integration

Background: Spatial Information Sources (3)
Standardization efforts for spatial technologies:
- OpenGIS [5]: standardization of GIS systems
- POIX [6]: language for location-oriented information exchange
- G-XML [7]: XML vocabulary for geographic information description
- RWML [8]: road information description language
Spatial information services:
- Digital City [10], citysearch.com [11]: location-oriented information services
- Ekimae Tanken Club [12]: provides local information near a specified rail station
- MONET system [13]: provides information for car drivers

Background: Heterogeneous Information Integration (1)
Popular approach for information integration: the well-known wrapper-mediator approach
Wrapper:
- encapsulates the details of each information source
- provides an abstract, uniform view of the source
Mediator:
- selects appropriate information sources for a given query
- performs query planning and processing
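The division of labor between wrappers and the mediator can be illustrated with a minimal Python sketch. The class and method names (Wrapper, Mediator, select_sources, query) are hypothetical and do not come from the slides; the fetch callable merely stands in for a call to a real source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class Wrapper:
    """Hides one source behind a uniform interface (hypothetical API)."""
    name: str
    relation: str                    # global relation the source maps to
    fetch: Callable[[], List[Row]]   # stand-in for the real source call


@dataclass
class Mediator:
    """Selects wrappers for a query and merges their results."""
    wrappers: List[Wrapper] = field(default_factory=list)

    def select_sources(self, relation: str) -> List[Wrapper]:
        # Keep only the wrappers whose source maps to the requested relation.
        return [w for w in self.wrappers if w.relation == relation]

    def query(self, relation: str, predicate: Callable[[Row], bool]) -> List[Row]:
        # Evaluate the predicate at the mediator; nothing is pushed down here.
        rows: List[Row] = []
        for wrapper in self.select_sources(relation):
            rows.extend(r for r in wrapper.fetch() if predicate(r))
        return rows
```

In this simplification every condition is evaluated at the mediator; the later slides on capability descriptions are about moving such conditions into the sources.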

Background: Heterogeneous Information Integration (2)
[Figure: heterogeneous information integration system. A mediator provides unified access to the integrated information; wrappers sit between the mediator and Information Sources A, B, C, and D.]

Our Objective
Development of a spatial information integration framework for location-aware information services:
- integration of heterogeneous spatial information sources
    - heterogeneity of the contents of the sources
    - heterogeneity of the capabilities of the sources
- provision of useful location-oriented information services to mobile users
    - selection of neighborhood geometric features

Our Approach
Development of a description method to represent spatial information sources:
- based on the source description framework: describes the contents and the service of the source
- introduction of spatial data types and spatial operators: based on the OpenGIS standard
Development of query planning and processing methods that effectively utilize source descriptions:
- selection of appropriate information sources for a given query
- effective use of the query processing power of each information source

Motivating Example (1)
[Figure: heterogeneous information integration system with a mediator, wrappers, and Information Sources A, B, C, and D.]

Motivating Example (2)
Global schema:
- based on the relational model
- represents a virtual database schema
- each information source is (partially) mapped to the global schema

    relation Restaurant {
        name string;
        category string;
        address string;
        location point;
    };

    relation Evaluation {
        name string;
        score real;
    };

Motivating Example (3)
Query issued by the user: show the top-20 nearest restaurants such that
- they are within 1000 meters of the current position p
- the score is greater than or equal to 2.5 stars

SQL representation:

    SELECT r.name, r.address
    FROM Restaurant AS r, Evaluation AS e
    WHERE r.name = e.name
      AND e.score >= 2.5
      AND Distance(r.location, p) <= 1000
    ORDER BY Distance(r.location, p)
    STOP AFTER 20

[Figure: circle of radius 1000 m centered on the current position p.]
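To make the intended semantics of this query concrete, here is a small Python sketch that evaluates it over in-memory rows. It assumes planar coordinates in meters and at most one evaluation per restaurant name; the function name nearest_restaurants and the dictionary-based row layout are illustrative choices, not part of the original system.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nearest_restaurants(
    restaurants: List[Dict],
    evaluations: List[Dict],
    p: Point,
    radius: float = 1000.0,
    min_score: float = 2.5,
    k: int = 20,
) -> List[Tuple[str, str]]:
    """Join on name, filter by score and distance, return the k nearest."""
    # Assumption: one score per name; a later duplicate would overwrite an earlier one.
    scores = {e["name"]: e["score"] for e in evaluations}
    hits = []
    for r in restaurants:
        d = math.dist(r["location"], p)   # Euclidean distance in the plane
        s = scores.get(r["name"])         # None when there is no join partner
        if s is not None and s >= min_score and d <= radius:
            hits.append((d, r["name"], r["address"]))
    hits.sort()                           # ORDER BY distance
    return [(name, address) for _, name, address in hits[:k]]  # STOP AFTER k
```

A real mediator would not fetch every row first; the point of the source descriptions introduced later is to push as much of this filtering as possible into the sources.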

Motivating Example (4)
Information Source A: provides restaurant information for a specific area
- Contents: contains information about restaurants within the rectangular area r
- Capability: given a name or an address, it returns the matching restaurants

Motivating Example (5)
Information Source B: supports spatial conditions for querying restaurant information
- Contents: contains information about restaurants
- Capability:
    - returns restaurants within the specified circular area
    - accepts an additional condition on the restaurant category (e.g., category = "Chinese")

Motivating Example (6)
Information Source C: supports spatial conditions for querying restaurant information
- Contents: contains information about restaurants
- Capability:
    - returns restaurants that match the specified name (e.g., name like "%Sushi")
    - if an optional polygon is given, it only returns restaurants within the specified polygonal region

Motivating Example (7)
Information Source D: provides restaurant evaluation scores
- given a restaurant name, it returns the evaluation score

    select * from Source-D where name like "%Sushi"

    name          score
    Tokyo Sushi   3.0
    Edo Sushi     2.7

Data Model for Integration
- The relational model enhanced with spatial data types and spatial operations
- Spatial data types and spatial operations are based on the OpenGIS proposal [5]
- A wrapper for each spatial information source wraps the operations of the source and exposes OpenGIS-conformant operations
- A wrapper for a source provides a subset of the OpenGIS operations, depending on the capability of the source

Our Target Spatial Data Types
Based on the OpenGIS proposal. To simplify the problem, we only consider the Point, LineString, and Polygon types.

OpenGIS geometry type hierarchy:

    Geometry
        Point
        Curve
            LineString
        Surface
            Polygon
        GeometryCollection
            MultiPoint
            MultiCurve
            MultiSurface

Spatial Operations (1)
Spatial Predicates of OpenGIS

    predicate           semantics
    intersects(g1, g2)  g1 and g2 have intersections
    disjoint(g1, g2)    g1 and g2 do not have any overlap
    equals(g1, g2)      g1 and g2 are equal
    overlaps(g1, g2)    g1 and g2 have one or more overlaps
    contains(g1, g2)    g1 contains g2
    within(g1, g2)      g1 is contained in g2
    crosses(g1, g2)     g1 and g2 have intersections
    touches(g1, g2)     g1 and g2 touch at one or more points

Spatial Operations (2)
Spatial Functions of OpenGIS

    name                  return type  semantics
    isempty(g)            Integer      g is empty
    distance(g1, g2)      Double       minimum distance between g1 and g2
    envelope(g)           Geometry     minimum bounding box (MBB) of g
    union(g1, g2)         Geometry     unified region of g1 and g2
    intersection(g1, g2)  Geometry     intersection of g1 and g2
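A few of these operations can be sketched in Python for a deliberately reduced setting. The sketch assumes only points and axis-aligned boxes (a stand-in for Polygon), so it is not an implementation of the OpenGIS operations; the type names Point and Box are hypothetical.

```python
import math
from typing import NamedTuple, Sequence


class Point(NamedTuple):
    x: float
    y: float


class Box(NamedTuple):
    """Axis-aligned rectangle, used here as a simplified Polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def distance(a: Point, b: Point) -> float:
    """Minimum distance between two points."""
    return math.hypot(a.x - b.x, a.y - b.y)


def envelope(points: Sequence[Point]) -> Box:
    """Minimum bounding box of a non-empty set of points."""
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    return Box(min(xs), min(ys), max(xs), max(ys))


def contains(box: Box, p: Point) -> bool:
    """True when p lies inside box or on its boundary."""
    return box.xmin <= p.x <= box.xmax and box.ymin <= p.y <= box.ymax


def within(p: Point, box: Box) -> bool:
    """within(p, box) is contains(box, p) with the arguments swapped."""
    return contains(box, p)


def intersects(a: Box, b: Box) -> bool:
    """True when the two boxes share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def disjoint(a: Box, b: Box) -> bool:
    """disjoint is the negation of intersects."""
    return not intersects(a, b)
```

Boundary points count as contained in this sketch; that is a choice made for simplicity here, not a statement about how the standard treats boundaries.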

Source Description Framework
Source description framework:
- a formal framework to specify meta-information for an information source
- proposed by Information Manifold [3]
A source description consists of:
- Contents description: describes the contents of the source in terms of the global schema
- Capability description: describes the types of queries that the source can support
We extend the source description approach by considering OpenGIS data types and operations.

Query Description (1)
Query description: an extension of a conjunctive query. It can contain
- spatial predicates (e.g., intersects, contains)
- spatial functions (e.g., envelope, distance)
- additional comparison operators (e.g., ≤)
General form of a conjunctive query:

    ans(u) ← R1(u1), …, Rn(un), c1, …, cm

- R1, …, Rn: global relations
- u, u1, …, un: sequences of variables
- c1, …, cm (m ≥ 0): conditions

Query Description (2)
Show restaurants that are within 1000 meters of the current position and whose scores are greater than or equal to 2.5 stars:

    ans(n, a) ← Restaurant(n, c, a, l), Evaluation(e, s), n = e, s ≥ 2.5, distance(l, p) ≤ 1000

    SELECT r.name, r.address
    FROM Restaurant AS r, Evaluation AS e
    WHERE r.name = e.name
      AND e.score >= 2.5
      AND Distance(r.location, p) <= 1000
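One way to hold such a query in memory is sketched below in Python. The classes Atom and ConjunctiveQuery, and the choice to store conditions as plain strings, are assumptions made for illustration; the slides do not prescribe a representation. The is_safe check (every head variable occurs in some relation atom) is a standard property of conjunctive queries rather than something stated in the slides.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Atom:
    """One relation atom of the body, e.g. Restaurant(n, c, a, l)."""
    relation: str
    args: Tuple[str, ...]


@dataclass(frozen=True)
class ConjunctiveQuery:
    head: Tuple[str, ...]              # variables of ans(...)
    atoms: Tuple[Atom, ...]            # global relations R1..Rn
    conditions: Tuple[str, ...] = ()   # c1..cm, kept as text in this sketch

    def body_variables(self) -> Set[str]:
        return {v for atom in self.atoms for v in atom.args}

    def is_safe(self) -> bool:
        # Every answer variable must be bound by some relation atom.
        return set(self.head) <= self.body_variables()

    def __str__(self) -> str:
        body = [f"{a.relation}({', '.join(a.args)})" for a in self.atoms]
        return f"ans({', '.join(self.head)}) <- " + ", ".join(body + list(self.conditions))


# The example query from the slide, written with ASCII operators.
q = ConjunctiveQuery(
    head=("n", "a"),
    atoms=(Atom("Restaurant", ("n", "c", "a", "l")), Atom("Evaluation", ("e", "s"))),
    conditions=("n = e", "s >= 2.5", "distance(l, p) <= 1000"),
)
```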

Spatial Query Conditions
For spatial query conditions, we allow the following spatial range restriction predicates (x is a Geometry-valued variable, g is a geometric constant):
- equals(x, g) and equals(g, x)
- within(x, g)
- contains(g, x)
We also allow distance-based range restriction conditions (x is a Geometry object, g is a geometric constant, d is a real constant, θ is < or ≤):
- distance(x, g) θ d

Source Descriptions (1)
A source description consists of a contents description and a capability description:

    contents: S(u) ← R(u), c1, …, cn
    filters:  pat → out

- example of contents: S(n, c, a, l) ← Restaurant(n, c, a, l), c = "Italian"
- pat: mandatory input arguments (input pattern)
- out: denotes the condition issued to the underlying source when the input arguments (pat) are given

Information Source A
Information Source A: provides restaurant information for a specific area
- Contents: contains information about restaurants within the rectangular area r
- Capability: given a name or an address, it returns the matching restaurants

Source Description for A
Source A provides restaurant information:
- provides information within r
- also allows retrieval by restaurant name and address

    Source A
    contents: S_A ← Restaurant(n, c, a, l), contains(r, l)
    filters:  n = n, a = a

Information Source B
Information Source B: supports spatial conditions for querying restaurant information
- Contents: contains information about restaurants
- Capability:
    - returns restaurants within the specified circular area
    - accepts an additional condition on the restaurant category (e.g., category = "Chinese")

Source Description for B
Source B provides restaurant information:
- inputs are a query point (p) and a distance threshold (d)
- allows an additional filtering condition based on the restaurant category (c)

    Source B
    contents: S_B ← Restaurant(n, c, a, l)
    filters:  distance(l, p) ≤ d, c = c

Information Source C
Information Source C: supports spatial conditions for querying restaurant information
- Contents: contains information about restaurants
- Capability:
    - returns restaurants that match the specified name (e.g., name like "%Sushi")
    - if an optional polygon is given, it only returns restaurants within the specified polygonal region

Source Description for C
Source C provides restaurant information:
- returns restaurants that match the specified name (n)
- allows an additional filtering condition based on a polygonal region (g)

    Source C
    contents: S_C ← Restaurant(n, c, a, l)
    filters:  n = n, contains(g, l)

Information Source D
Information Source D: provides restaurant evaluation scores
- given a restaurant name, it returns the evaluation score

    select * from Source-D where name like "%Sushi"

    name          score
    Tokyo Sushi   3.0
    Edo Sushi     2.7

Source Description for D
Source D provides restaurant evaluation scores:
- allows retrieval by restaurant name and/or evaluation score

    Source D
    contents: S_D ← Evaluation(n, s)
    filters:  n = n, s θ s   (θ in {=, ≠, <, >, ≤, ≥})
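The four source descriptions can be collected into a small registry, sketched here in Python. This is a simplification under stated assumptions: the class SourceDescription, the SOURCES list, and the short filter labels ("name", "distance", and so on) are hypothetical, and the sketch does not model which inputs are mandatory and which are optional.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class SourceDescription:
    name: str
    relation: str                  # global relation in the contents description
    contents: Tuple[str, ...]      # extra conditions of the contents description
    filters: FrozenSet[str]        # kinds of conditions the source accepts


# Hypothetical encoding of Sources A to D as described on the slides.
SOURCES: List[SourceDescription] = [
    SourceDescription("A", "Restaurant", ("contains(r, l)",), frozenset({"name", "address"})),
    SourceDescription("B", "Restaurant", (), frozenset({"distance", "category"})),
    SourceDescription("C", "Restaurant", (), frozenset({"name", "contains"})),
    SourceDescription("D", "Evaluation", (), frozenset({"name", "score"})),
]


def sources_supporting(relation: str, condition: str) -> List[str]:
    """Names of sources that map to `relation` and accept `condition`."""
    return [s.name for s in SOURCES
            if s.relation == relation and condition in s.filters]
```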

Overview of Query Processing (1)
Query plan construction:
1. Preprocessing
    - validation of the given query against the global schema
    - deletion of redundant variables
    - simplification of expressions
2. Selection of useful information sources based on contents descriptions
3. Pushing query conditions into the underlying information sources as far as possible
4. Generation of the integrated query plan

Overview of Query Processing (2)
[Figure: query processing flow. The mediator performs the query validity check and query simplification, selects sources based on contents descriptions, pushes subqueries through the wrappers to Sources A, B, C, and D, receives partial results, and integrates the subquery results into the query result.]

Usage of Source Descriptions
Contents description:
- used to select useful information sources to process the given query
- also used to eliminate redundant join conditions
Capability description:
- used to decide whether a wrapper on a source can process the given query condition using its query processing capability
- also used to generate a subquery to an information source

Selection of Information Source (1)
Unifies the given query condition and the contents description of an information source:

    Query:                ans(u) ← R1(u1), …, Rn(un), c1, …, cm
    Contents description: S_R(v) ← Ri(v), e1, …, en

Possibility condition for an information source to fulfill the given query condition:

    ∃x1 … xn (c1 ∧ … ∧ cm ∧ e1 ∧ … ∧ en) = true

Selection of Information Source (2)
Example: a query over the global schema:

    ans(n) ← Restaurant(n, c, a, l), distance(l, p) ≤ 1000

Source description for E:

    S_E(n, c, a, l) ← Restaurant(n, c, a, l), c = "Italian", contains(r, l)

Source E can possibly satisfy the subquery if:

    ∃c, l (c = "Italian" ∧ contains(r, l) ∧ distance(l, p) ≤ 1000) = true

Selection of Information Source (3)
Simplification of the possibility condition:

    ∃l (contains(r, l) ∧ distance(l, p) ≤ 1000) = true

which holds exactly when

    intersects(r, circle(p, 1000)) = true

[Figure: the query region, a circle of radius 1000 m around p, overlapping the area r supported by Source E.]
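The simplified test intersects(r, circle(p, 1000)) is cheap to evaluate. The Python sketch below assumes that r is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax) and that coordinates are planar; the function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)
Point = Tuple[float, float]


def rect_intersects_circle(rect: Rect, center: Point, radius: float) -> bool:
    """True when the rectangle and the closed disk share at least one point."""
    xmin, ymin, xmax, ymax = rect
    # Closest point of the rectangle to the circle center (clamping).
    cx = min(max(center[0], xmin), xmax)
    cy = min(max(center[1], ymin), ymax)
    return math.hypot(center[0] - cx, center[1] - cy) <= radius


def select_sources(coverage: Dict[str, Rect], p: Point, radius: float) -> List[str]:
    """Keep the sources whose coverage area can contain an answer."""
    return [name for name, rect in coverage.items()
            if rect_intersects_circle(rect, p, radius)]
```

A source whose rectangle fails this test cannot contribute any restaurant within the query radius, so the mediator can skip it without contacting it.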

Elimination of Redundant Joins
Example: a query over the global schema:

    ans(n, m) ← Restaurant(n, c, a, l), BusStop(m, p), distance(l, p) ≤ 200

Contents descriptions for Sources F and G:

    S_F(n, c, a, l) ← Restaurant(n, c, a, l), contains(r, l)
    S_G(m, p) ← BusStop(m, p), contains(s, p)

F and G may satisfy the query only if distance(r, s) ≤ 200; otherwise the join between them can be skipped.

[Figure: the coverage regions of the two sources and the 200 m distance between them.]
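The check distance(r, s) ≤ 200 between two coverage regions can be sketched as follows, assuming both regions are axis-aligned rectangles given as (xmin, ymin, xmax, ymax). The function names rect_distance and join_may_produce_results are hypothetical.

```python
import math
from typing import Tuple

Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)


def rect_distance(a: Rect, b: Rect) -> float:
    """Minimum distance between two axis-aligned rectangles (0 if they meet)."""
    # Gap along each axis; zero when the projections overlap.
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return math.hypot(dx, dy)


def join_may_produce_results(r: Rect, s: Rect, threshold: float) -> bool:
    """False means the join between the two sources can be dropped."""
    return rect_distance(r, s) <= threshold
```

If the two regions are farther apart than the join threshold, no pair of tuples from the two sources can satisfy the distance condition, so neither source needs to be queried for that join.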

Pushing Query Conditions (1)
Check whether the given query condition can be processed by the source:
- When the query condition and a filtering condition supported by the source are equivalent: direct push
- When there is no equivalent condition but the source supports a more general condition:
    - transform the query condition into the more general condition, then push it to the source
    - an additional step is needed to check that the retrieved results exactly satisfy the given query condition
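The decision between a direct push and a push of a more general condition can be sketched in Python. The table GENERALIZATIONS and the strategy labels are assumptions made for this illustration; the slides only state the two cases, not how the mediator looks them up.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical table: for each condition kind, more general kinds that
# always return a superset of the matching tuples.
GENERALIZATIONS: Dict[str, List[str]] = {
    "distance": ["intersects"],   # circle replaced by its bounding box
}


def push_strategy(condition: str, supported: Set[str]) -> Tuple[str, Optional[str]]:
    """Decide how a query condition is handled for one source."""
    if condition in supported:
        return ("direct", condition)          # equivalent filter exists
    for general in GENERALIZATIONS.get(condition, []):
        if general in supported:
            return ("generalize", general)    # push, then re-check the results
    return ("local", None)                    # evaluate at the mediator only
```

In the "generalize" case the mediator must still apply the original condition to the returned rows, which corresponds to the additional checking step mentioned above.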

Pushing Query Conditions (2)
Capability description of the source:

    Source C
    contents: S_C ← Restaurant(n, c, a, l)
    filters:  n = n, contains(g, l)

Query:

    ans(n) ← Restaurant(n, c, a, l), contains(r, l)

The condition contains(r, l) matches the filter contains(g, l), so it is pushed directly to Source C.

Pushing Query Conditions (3)
Source description for the source:

    Source H
    contents: S_H ← Restaurant(n, c, a, l)
    filters:  n = n, intersects(l, g)

Query:

    ans(n) ← Restaurant(n, c, a, l), distance(l, p) ≤ 1000

Push the more general condition intersects(l, envelope(circle(p, 1000))) to the source, then examine distance(l, p) ≤ 1000 for the retrieved data.
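This filter-and-refine step can be sketched in Python. The function source_h_intersects only simulates Source H over an in-memory list; it and the other names are hypothetical, and coordinates are assumed to be planar and in meters.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)


def circle_envelope(p: Point, d: float) -> Rect:
    """Bounding box of the circle with center p and radius d."""
    return (p[0] - d, p[1] - d, p[0] + d, p[1] + d)


def source_h_intersects(rows: List[Dict], box: Rect) -> List[Dict]:
    """Simulated Source H: rows whose point location intersects the box."""
    xmin, ymin, xmax, ymax = box
    return [r for r in rows
            if xmin <= r["location"][0] <= xmax and ymin <= r["location"][1] <= ymax]


def query_within_distance(rows: List[Dict], p: Point, d: float) -> List[Dict]:
    """Push the bounding-box filter, then refine with the exact distance."""
    candidates = source_h_intersects(rows, circle_envelope(p, d))   # pushed part
    return [r for r in candidates if math.dist(r["location"], p) <= d]  # refinement
```

The bounding box admits points near its corners that lie outside the circle, which is why the exact distance test has to be repeated on the retrieved rows.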

Conclusions and Future Work
Conclusions: proposal of a framework for heterogeneous spatial information sources
- based on the source description framework
    - contents description
    - capability description
- use of data types and operations of the OpenGIS proposal
- query processing strategies
    - source selection
    - pushing query conditions
Future work:
- investigation of source selection and query planning strategies
- a more formal framework (e.g., a constraint-based approach)