24. April 1998 Dutch Cadastre 1 Efficient Storage and Retrieval for Large Spatial Data Sets in a Relational DBMS. Andrew U. Frank, Dept. of Geoinformation.


24. April 1998 Dutch Cadastre 1 Efficient Storage and Retrieval for Large Spatial Data Sets in a Relational DBMS. Andrew U. Frank, Dept. of Geoinformation, Technical University Vienna

24. April 1998 Dutch Cadastre 2 Overview: Why a Database; Two Database Issues: Modeling and Implementation; Base assumptions about spatio-temporal databases; Implementation of spatial access; Modeling; Interoperability; Interaction: Multi-Agency Databases; Open GIS

24. April 1998 Dutch Cadastre 3 Why a Database? Within an agency, a database achieves: integration; consistency; sharing (reduction of redundancy, but not of storage).

24. April 1998 Dutch Cadastre 4 Two Issues: Modeling: what are the things to represent, and how do we logically structure them? Implementation: how is this solved with a computer? Usually only modeling is important.

24. April 1998 Dutch Cadastre 5 Implementation is crucial for a DBMS because performance is critical: GIS data sets are too large to be stored completely in main memory. Access to disk takes about 10 ms; access to main memory takes about 100 ns: like 3 sec compared to 1 year! We must therefore start with the performance of the most often used GIS operation.
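As a quick check of the figures on this slide, the ratio can be worked out directly. This is only a sketch: the 10 ms and 100 ns values come from the slide, while the variable names and the rescaling to a 3-second memory access are illustrative assumptions.

```python
# Latencies as stated on the slide, in nanoseconds (integers avoid rounding noise).
DISK_ACCESS_NS = 10_000_000   # 10 ms
MEMORY_ACCESS_NS = 100        # 100 ns

# How many times slower a disk access is than a main-memory access.
ratio = DISK_ACCESS_NS // MEMORY_ACCESS_NS
print(f"disk is {ratio:,} times slower than main memory")

# Human-scale illustration (an assumption for this sketch): if one memory
# access took 3 seconds, a disk access would take this many days.
scaled_days = 3 * ratio / 86_400
print(f"3 s in memory corresponds to about {scaled_days:.1f} days on disk")
```

With these two figures the gap is five orders of magnitude. The "3 sec to 1 year" comparison corresponds to roughly seven orders of magnitude, so it is probably best read as a rhetorical illustration rather than an exact conversion.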

24. April 1998 Dutch Cadastre 6 Base Assumptions about Spatio-Temporal Databases. Objects: exist independently, have some properties, and enter into some relations with other objects. Spatial objects: have a location and a spatial extent (expressed in a global coordinate system).

24. April 1998 Dutch Cadastre 7 Base Assumptions about Spatio-Temporal Databases. Temporal objects: change their properties over time; questions about past (or future) states can be asked (valid time). Administrative database: questions about when a change became known can be asked (transaction time).
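The distinction between valid time and transaction time can be illustrated with a small sketch. The record layout, the names `OwnerVersion` and `owner_as_of`, and the sample parcel history are all hypothetical; they are not taken from the talk.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class OwnerVersion:
    parcel_id: str
    owner: str
    valid_from: date    # valid time: when the fact became true in the world
    recorded_on: date   # transaction time: when the database learned of it


def owner_as_of(versions: List[OwnerVersion], parcel_id: str,
                valid_on: date, known_on: date) -> Optional[str]:
    """Owner of a parcel on `valid_on`, using only what was recorded by `known_on`."""
    candidates = [v for v in versions
                  if v.parcel_id == parcel_id
                  and v.valid_from <= valid_on
                  and v.recorded_on <= known_on]
    if not candidates:
        return None
    # The latest valid_from wins; ties are broken by the most recent recording.
    return max(candidates, key=lambda v: (v.valid_from, v.recorded_on)).owner


# Hypothetical history: the sale took effect in June 1997
# but was only registered in September 1997.
history = [
    OwnerVersion("P-17", "Jansen", date(1990, 1, 1), date(1990, 1, 5)),
    OwnerVersion("P-17", "de Vries", date(1997, 6, 1), date(1997, 9, 15)),
]

# What did the database believe in July 1997 about July 1997?
print(owner_as_of(history, "P-17", date(1997, 7, 1), date(1997, 7, 1)))
# What do we know in 1998 about July 1997?
print(owner_as_of(history, "P-17", date(1997, 7, 1), date(1998, 1, 1)))
```

The two queries ask about the same valid-time instant but differ in transaction time, which is why they can return different owners.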

24. April 1998 Dutch Cadastre 8 Most often used operation: access to spatial data. Spatial data must be retrieved quickly based on location; this is missing in commercial DBMSs. SQL can be used, but performance is insufficient. Spatial clustering is absolutely required.

24. April 1998 Dutch Cadastre 9 Databases for Spatial Data. Classical architecture: a DBMS for the administrative data, a specialized file system for the other data. Research goal: an integrated database for both attribute and geometric data with a spatial access method. (The field tree is a method for spatial access designed for cadastral applications.)

24. April 1998 Dutch Cadastre 10 Field tree explanations: Regular grid - could cluster point objects with equal density
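A regular grid can be sketched in a few lines: every point is assigned to the cell it falls in, and points of the same cell are stored together. The function names, the cell size of 100 units, and the sample points are assumptions made for this illustration.

```python
import math
from collections import defaultdict


def grid_cell(x, y, cell_size):
    """Index (column, row) of the grid cell containing the point."""
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def cluster_points(points, cell_size):
    """Group points by grid cell; each group would share a storage page."""
    buckets = defaultdict(list)
    for x, y in points:
        buckets[grid_cell(x, y, cell_size)].append((x, y))
    return dict(buckets)


# Hypothetical sample data.
points = [(5, 5), (7, 9), (105, 20), (110, 25), (250, 250)]
buckets = cluster_points(points, cell_size=100)
for cell, members in sorted(buckets.items()):
    print(cell, members)
```

This works well only when the points are spread evenly: with a fixed cell size, dense regions overfill their cells while sparse regions leave cells nearly empty.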

24. April 1998 Dutch Cadastre 11 Field tree explanations 2: Quadtree grid - could cluster point objects with irregular distribution

24. April 1998 Dutch Cadastre 12 Field tree explanations 3: But it cannot cluster extended objects
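Why a quadtree struggles with extended objects can be seen from a small sketch that looks for the smallest quadtree cell fully containing a bounding box. The function name, the 1024-unit root square, and the sample boxes are assumptions for illustration only.

```python
import math


def quadtree_cell_for_box(xmin, ymin, xmax, ymax, root_size=1024.0, max_level=10):
    """Smallest quadtree cell (level, ix, iy) that fully contains the box.

    Assumes the box lies inside the root square [0, root_size).
    """
    best = (0, 0, 0)  # the root cell always contains the box
    for level in range(1, max_level + 1):
        size = root_size / 2 ** level
        ix0, ix1 = math.floor(xmin / size), math.floor(xmax / size)
        iy0, iy1 = math.floor(ymin / size), math.floor(ymax / size)
        if ix0 != ix1 or iy0 != iy1:
            # The box crosses a cell boundary; finer levels nest inside
            # this one, so they cannot contain the box either.
            break
        best = (level, ix0, iy0)
    return best


# A 4 x 4 box well inside a quadrant descends to a small cell.
print(quadtree_cell_for_box(10, 10, 14, 14))
# The same-sized box straddling the centre lines stays in the root cell.
print(quadtree_cell_for_box(510, 510, 514, 514))
```

Two boxes of identical size end up in cells of very different size, purely because one of them happens to lie across a high-level subdivision line.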

24. April 1998 Dutch Cadastre 13 Field tree explanations 4: The field tree can do it:

24. April 1998 Dutch Cadastre 14 Field tree explanations 5: Add a next level, half as large and shifted:

24. April 1998 Dutch Cadastre 15 Field tree explanations 6: Add another level (blue):

24. April 1998 Dutch Cadastre 16 Field tree explanations 7: Add another level (green):

24. April 1998 Dutch Cadastre 17 Why Field Trees? Extended objects (represented by their minimal bounding box) are stored with a field. Fields cover the area multiple times. Guarantee: every object is on a page which is at most 4 times the size of the object. (A quadtree-based method cannot achieve this.) Access times depend on the amount of data retrieved, not on the amount of data stored.
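The effect of shifted levels can be sketched as follows. This is one possible reading of "half as large and shifted", not the exact layout used by the field tree: each level below the root is assumed to be offset by half of its own cell size. The function name, the offset rule, and the 1024-unit root square are assumptions, and the sketch does not attempt to prove the size guarantee stated on the slide.

```python
import math


def field_for_box(xmin, ymin, xmax, ymax, root_size=1024.0, max_level=10):
    """Finest field (level, ix, iy) that fully contains the box, or None.

    Assumption for this sketch: level k has square fields of size
    root_size / 2**k, and every level below the root is shifted by
    half of its own field size in x and y.
    """
    best = None
    for level in range(max_level + 1):
        size = root_size / 2 ** level
        shift = 0.0 if level == 0 else size / 2
        ix0 = math.floor((xmin - shift) / size)
        ix1 = math.floor((xmax - shift) / size)
        iy0 = math.floor((ymin - shift) / size)
        iy1 = math.floor((ymax - shift) / size)
        # Levels are not nested, so a finer level may fit even if a
        # coarser one does not; keep scanning and remember the finest fit.
        if ix0 == ix1 and iy0 == iy1:
            best = (level, ix0, iy0)
    return best


# A 4 x 4 box straddling the centre of the root square.
print(field_for_box(510, 510, 514, 514))
```

Under these assumptions the box spanning (510, 510) to (514, 514) lands in a level-7 field of 8 units, whereas an unshifted quadtree decomposition would have to keep it in the 1024-unit root cell because it crosses the first subdivision line.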

24. April 1998 Dutch Cadastre 18 Query in field tree: determine which fields overlap the query window; search all of these (but only these) for objects of interest.
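The first step of such a query, listing the fields that overlap a window, might look like the sketch below. The level geometry is an assumption (fields halve in size per level and each level below the root is shifted by half of its own field size); the function name and the 1024-unit root square are hypothetical as well.

```python
import math


def fields_overlapping(xmin, ymin, xmax, ymax, root_size=1024.0, max_level=3):
    """List every field (level, ix, iy) that overlaps the query window."""
    fields = []
    for level in range(max_level + 1):
        size = root_size / 2 ** level
        shift = 0.0 if level == 0 else size / 2   # assumed half-cell shift
        ix0 = math.floor((xmin - shift) / size)
        ix1 = math.floor((xmax - shift) / size)
        iy0 = math.floor((ymin - shift) / size)
        iy1 = math.floor((ymax - shift) / size)
        for ix in range(ix0, ix1 + 1):
            for iy in range(iy0, iy1 + 1):
                fields.append((level, ix, iy))
    return fields


# A small query window touches one field per level in this example.
window_fields = fields_overlapping(500, 500, 520, 520)
print(window_fields)
```

Only the objects stored with these fields need to be inspected, so the work grows with the size of the window rather than with the size of the whole data set.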

24. April 1998 Dutch Cadastre 19 Spatial access research: the research concentrated on the implementation of spatial access. Results: mostly complex methods that are difficult to implement.

24. April 1998 Dutch Cadastre 20 Conclusion from spatial access research: Samet: "use a spatial clustering, use any". The problem was that it was unclear which criteria to optimize for (nearest neighbor, range query) and what the properties of the data were. Identify the problem before you optimize. Identifying the details of the problem was not possible, for lack of spatial statistics methods. The field tree has previously been used for cadastral and similar problems in commercial environments and performed well.

24. April 1998 Dutch Cadastre 21 The Issue is Integration: The integration of spatial access with the other DBMS services (especially transaction management) is extremely difficult. The commercial DBMS vendors are neither willing nor able to provide spatial access built into the core of the DBMS engine; the difficulty is the integration with the transaction management system.

24. April 1998 Dutch Cadastre 22 Practical solution: Spatial access/spatial clustering must be built on top of standard DB functionality (e.g., a commercial relational DB). How to cluster if one cannot access the low-level storage subsystem? One must exploit the B-tree data structure, which uses physical clustering in most commercial DBMSs.

24. April 1998 Dutch Cadastre 23 Concept: 1. Assign to each object a single number based on a spatial encoding. 2. When storing the object, use this number to achieve physical clustering by spatial location in the DBMS. 3. When searching, determine all spatial codes which fall into the query window and search for these codes using the DBMS's built-in B-tree.
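The three steps can be sketched with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for a relational DBMS: a `WITHOUT ROWID` table is stored as a B-tree ordered by its primary key, so putting the spatial code first in the key clusters rows by location. The spatial code below is a plain row-major grid number, a placeholder for a real field code; the table layout, constants, and sample points are all assumptions, and coordinates are assumed to be non-negative.

```python
import sqlite3

CELL = 100   # hypothetical cell size
COLS = 100   # hypothetical number of cells per grid row


def spatial_code(x, y):
    """Step 1: a single number per object (placeholder: row-major grid cell)."""
    return (int(y) // CELL) * COLS + (int(x) // CELL)


def codes_in_window(xmin, ymin, xmax, ymax):
    """Step 3a: all spatial codes whose cells intersect the query window."""
    return [cy * COLS + cx
            for cy in range(int(ymin) // CELL, int(ymax) // CELL + 1)
            for cx in range(int(xmin) // CELL, int(xmax) // CELL + 1)]


conn = sqlite3.connect(":memory:")
# Step 2: the primary key starts with the spatial code, so the B-tree
# that stores the table keeps spatially close rows physically close.
conn.execute("""
    CREATE TABLE parcel (
        code INTEGER NOT NULL,
        id   INTEGER NOT NULL,
        x    REAL,
        y    REAL,
        PRIMARY KEY (code, id)
    ) WITHOUT ROWID
""")
parcels = [(1, 50, 50), (2, 150, 60), (3, 950, 950), (4, 120, 130)]
conn.executemany("INSERT INTO parcel VALUES (?, ?, ?, ?)",
                 [(spatial_code(x, y), pid, x, y) for pid, x, y in parcels])


def search(xmin, ymin, xmax, ymax):
    """Step 3b: look up each candidate code, then filter exactly."""
    found = []
    for code in codes_in_window(xmin, ymin, xmax, ymax):
        rows = conn.execute("SELECT id, x, y FROM parcel WHERE code = ?", (code,))
        found += [pid for pid, x, y in rows
                  if xmin <= x <= xmax and ymin <= y <= ymax]
    return sorted(found)


print(search(100, 0, 199, 199))
```

The sketch stores points only; extended objects would need a code that also reflects their extent.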

24. April 1998 Dutch Cadastre 24 How to encode spatial location: Cluster by Morton numbers (proposal by Abel, CSIRO, based on the quadtree). Disadvantage: small objects may end up in a very large cell (or multiple keys are necessary; multiple keys cannot be used for clustering in most B-trees).
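A Morton number is obtained by interleaving the bits of the cell's column and row indices. The sketch below shows the standard bit interleaving; the function name and the choice of 16 bits per coordinate are assumptions for illustration.

```python
def morton_code(ix, iy, bits=16):
    """Interleave the bits of ix (even positions) and iy (odd positions)."""
    code = 0
    for i in range(bits):
        code |= ((ix >> i) & 1) << (2 * i)
        code |= ((iy >> i) & 1) << (2 * i + 1)
    return code


# The four cells of one quadtree quadrant receive consecutive codes,
# which is what makes the numbering useful for clustering in a B-tree.
quadrant = [morton_code(ix, iy) for iy in (2, 3) for ix in (2, 3)]
print(quadrant)
print(morton_code(3, 5))
```

Because cells of the same quadrant get adjacent codes, a range of Morton numbers corresponds to a compact region, and sorting by the code keeps neighbouring cells near each other on disk.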

24. April 1998 Dutch Cadastre 25 Field Code: Field-tree numbers encode the spatial location and extent of an object (v. Oosterom's idea) as a single number, the Spatial Location Code. Store spatial objects with this code (exploiting physical clustering). Search: determine the fields which may contain the objects; search these.

24. April 1998 Dutch Cadastre 26 Practical results: Search is based on intervals of field codes; heuristics reduce the number of intervals submitted for range queries. Tests with real data demonstrate a speed-up of search by factors of 10 to 100.
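One simple way to turn a list of candidate codes into a small number of range queries is to merge consecutive codes into intervals and, optionally, to bridge small gaps. This is a generic heuristic shown for illustration, not necessarily the one used in the reported tests; the function name and the `max_gap` parameter are assumptions.

```python
def codes_to_intervals(codes, max_gap=0):
    """Merge sorted codes into (low, high) intervals.

    Codes that are consecutive, or separated by at most `max_gap`
    unused codes, end up in the same interval. A larger max_gap means
    fewer range queries but more irrelevant rows to filter out.
    """
    intervals = []
    for code in sorted(set(codes)):
        if intervals and code - intervals[-1][1] <= max_gap + 1:
            intervals[-1][1] = code          # extend the current interval
        else:
            intervals.append([code, code])   # start a new interval
    return [tuple(iv) for iv in intervals]


codes = [12, 13, 14, 15, 36, 37, 48]
print(codes_to_intervals(codes))              # exact intervals
print(codes_to_intervals(codes, max_gap=10))  # fewer, slightly wider intervals
```

Each resulting interval can be submitted as a single `BETWEEN low AND high` range scan on the clustered key.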

24. April 1998 Dutch Cadastre 27 Modeling Issues: Assumption: relational DBMS, today's standard for implementation. Data model: relations, consisting of tuples of attribute values; relational calculus; SQL as 'universal data speak' (not really useful as a user query language). This is a data (value) oriented concept.

24. April 1998 Dutch Cadastre 28 Object-orientation necessary: The OO concept is necessary for spatial and temporal databases, especially the cadastre: Objects have identity in time (the parcel id is the classical example). Objects have attribute values. Objects enter into relations.

24. April 1998 Dutch Cadastre 29 Object ID centered Object-Oriented data models create a data model clash, similar to the clash between Relational DBMS and sequential processing in conventional languages.

24. April 1998 Dutch Cadastre 30 Object ID centered: The relations between objects and attribute values are functions from ID to attribute value; relations between objects are functions from ID to ID (of the related object). A concept of a representation of an object as a contiguous data space is not necessary, but may be useful for clustering using Spatial Location Codes. This approach seems to solve most of the OO model problems discussed in the literature.
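The ID-centered view can be illustrated with plain mappings: each attribute is a function from object ID to value, and each relation is a function from object ID to the ID of the related object. All names, IDs, and values below are hypothetical sample data.

```python
from typing import Dict, List

# Attributes: functions from parcel ID to attribute value.
area_m2: Dict[int, float] = {101: 540.0, 102: 1210.5}
land_use: Dict[int, str] = {101: "residential", 102: "agricultural"}

# Relation: function from parcel ID to owner ID.
owned_by: Dict[int, int] = {101: 9001, 102: 9001}

# Attribute of the related object type.
owner_name: Dict[int, str] = {9001: "A. Jansen"}


def describe(parcel_id: int) -> dict:
    """Assemble an object view on demand; no contiguous record is stored."""
    return {
        "id": parcel_id,
        "area_m2": area_m2[parcel_id],
        "land_use": land_use[parcel_id],
        "owner": owner_name[owned_by[parcel_id]],
    }


def parcels_of(owner_id: int) -> List[int]:
    """Follow the ownership relation in the inverse direction."""
    return sorted(pid for pid, oid in owned_by.items() if oid == owner_id)


print(describe(101))
print(parcels_of(9001))
```

In a relational DBMS each mapping would correspond to a two-column table keyed by object ID, which is why the object never needs to exist as one contiguous record.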

24. April 1998 Dutch Cadastre 31 Future: What we can realize now is a spatio-temporal multi-user database for a single agency. How to deal with cooperating agencies? (Your achievements demonstrate the need for this.) What are the next questions?

24. April 1998 Dutch Cadastre 32 The multi-agency database: Data is shared. Responsibility for the data is clearly identified. Data is not centralized.

24. April 1998 Dutch Cadastre 33 The multi-agency database: This is more than a distributed DB, because it requires a new transaction concept: the classical discussion of the 'long transaction', including distributed responsibility for data change within a transaction. Concept: agencies send update proposals for data they cannot change themselves to the agency which is responsible.

24. April 1998 Dutch Cadastre 34 Interoperability: Agencies must cooperate. So far, we exchange data. Updates are not propagated! Future: interoperability, independent of the vendor of the software (the so-called Open GIS).

24. April 1998 Dutch Cadastre 35 Interoperability as a technical problem: Computer network: agreement on base cooperation (network standard). GIS cooperation: data model and related concepts.

24. April 1998 Dutch Cadastre 36 Interoperability as a semantics problem What does the data mean? How to describe the data? How to describe the meaning of data - in a formal language to be used in a computer?

24. April 1998 Dutch Cadastre 37 Formal Language: Describing natural language with formal tools is not likely to be achieved soon. Sufficient for GIS: definitions for restricted user communities, e.g., agencies within a town.

24. April 1998 Dutch Cadastre 38 Open GIS Standards: Development of industry-accepted standards in step with the rapid development of base technology. Cooperation of all GIS vendors. Goal: Open Systems.

24. April 1998 Dutch Cadastre 39 Open GIS: Interoperability independent of vendor: storage of data under one system, analysis tools from another system.

24. April 1998 Dutch Cadastre 40 Open GIS 2: Interoperability independent of agency. Needs cooperation of user communities. Major users are already working in the Open GIS Consortium to ensure that their application concepts are standardized.

24. April 1998 Dutch Cadastre 41 GIS user organizations gain from Open GIS: standardized environments to solve application problems; accumulation of knowledge of the application domain; cooperation of agencies in Europe (and export of knowledge). A cadastral special interest group is under discussion in the OGC.

24. April 1998 Dutch Cadastre 42 GIPSIE Project: EU project (DG III: Information Technology) to promote Open GIS within the GI industry and user community in Europe, to bring European issues into the OGC process, and to contribute research to the Open GIS standards. Participation by European companies and agencies is required. Contacts: Andrew Frank (TU Vienna), Werner Kuhn (U Muenster).