Spatial Databases: Lecture 1 DT211-4 DT228-4 DT249-4 Semester 2 2009-10 Pat Browne

1 Spatial Databases: Lecture 1 DT211-4 DT228-4 DT249-4 Semester 2 2009-10 Pat Browne http://www.comp.dit.ie/pbrowne/Spatial%20Databases%20SDEV4005/Spatial%20Databases%20SDEV4005.htm

2 Güting’s definition of a spatial database: (1) A spatial database system is a database system. (2) It offers spatial data types in its data model and query language. (3) It supports spatial data types in its implementation, providing at least spatial indexing and efficient algorithms for spatial join.

3 Spatial Join(*) A spatial join associates two tables based on a spatial relationship, rather than on a classic non-spatial relational attribute. A spatial join operation is used to combine two or more datasets with respect to a spatial predicate or spatial operation. Predicates can be a combination of directional, distance, and topological spatial relations (e.g. overlap, contains). In a non-spatial join the joining attributes must be of the same type, but in a spatial join they can be of different types.

4 Spatial Join Example(*) Query: For all the rivers listed in the River table, find the countries through which they pass. SELECT R.Name, C.Name FROM River R, Country C WHERE Cross(R.Shape, C.Shape) = 1 The spatial predicate “Cross” is used to join the River and Country tables.

5 Spatial Joins(*)  In practice, spatial join operations are divided into a filter step and a refinement step to efficiently process complex spatial data types such as point collections in a row instance. In the filter step, the spatial objects are represented by simpler approximations such as their Minimum Bounding Rectangle or Box (MBR or MBB).
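The filter and refinement steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not how any particular DBMS implements a spatial join: the function names (`mbr`, `mbr_contains_point`, `point_in_polygon`, `spatial_join`) and the sample data are invented for the example, and the refinement step uses a plain ray-casting point-in-polygon test.

```python
def mbr(polygon):
    """Return the minimum bounding rectangle (xmin, ymin, xmax, ymax) of a vertex list."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def mbr_contains_point(box, point):
    """Filter step: cheap test against the rectangle approximation."""
    xmin, ymin, xmax, ymax = box
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax


def point_in_polygon(point, polygon):
    """Refinement step: ray-casting test (boundary points get no special handling)."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, polygons):
    """Join named points to named polygons: MBR filter first, exact test second."""
    boxes = {name: mbr(poly) for name, poly in polygons.items()}
    candidates = [
        (p, g)
        for p, pt in points.items()
        for g in polygons
        if mbr_contains_point(boxes[g], pt)
    ]
    results = [(p, g) for p, g in candidates if point_in_polygon(points[p], polygons[g])]
    return candidates, results


# Hypothetical sample data: one triangle; points "a" and "b" fall inside its MBR.
polygons = {"triangle": [(0, 0), (10, 0), (0, 10)]}
points = {"a": (2, 2), "b": (8, 8), "c": (20, 20)}
candidates, results = spatial_join(points, polygons)
```

Here the filter step discards point "c" cheaply, and the refinement step then rejects "b", which lies inside the bounding rectangle but outside the triangle itself.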

6 Spatial Joins Example 2 A spatial join associates two tables based on a spatial relationship, rather than an attribute relationship. For example, the query "Summarize the 2001 provincial election results by municipality" could be answered using the following SQL:

7 select m.name, sum(v.ndp) as ndp, sum(v.lib) as liberal, sum(v.gp) as green, sum(v.upbc) as unity, sum(v.vtotal) as total from bc_voting_areas v, bc_municipality m where v.the_geom && m.the_geom and intersects(v.the_geom, m.the_geom) group by m.name order by m.name;

8 Why use a database for GIS? GIS are not database systems, but they can be connected to a DBMS. A GIS cannot efficiently manage large quantities of non-spatial data (e.g. at government department level). GIS lack ad hoc querying capability (they provide a restricted form of predefined queries). They lack indexing structures for fast external data access (they use in-memory techniques). They lack a 'logic' (e.g. the first-order logic of the relational calculus).

9 Why use a database for GIS?  Databases offer the following functions:  Reliability  Integrity: enforces consistency  Security  User views  User interface  Querying  Updating  Mathematical basis  Data independence  Data Abstraction  Self-describing  Concurrency  Distributed capabilities  High performance  Supports spatial data types using ADTs.  Alternative: files

10 Why use a database for GIS? Data abstraction allows users to ignore unimportant details. View level: a way of presenting data to a particular group of users. Logical level: how data is interpreted when writing queries. Physical level: how data is manipulated at storage level by a computer. Most users are not interested in the physical level.

11 Databases use high level declarative languages (SQL). Data Definition Language (DDL): create, alter and delete data structures (CREATE TABLE, CREATE INDEX). Data Manipulation Language (DML): retrieve and manipulate data (SELECT, UPDATE, DELETE, INSERT). Data Control Language (DCL): control security of data (GRANT, CREATE USER, DROP USER).

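12 Spatial Types – OGC Simple Features for SQL [Class diagram: Geometry, associated with SpatialReferenceSystem, has subtypes Point, Curve, Surface and GeometryCollection. Curve has subtype LineString, which has subtypes Line and LinearRing. Surface has subtype Polygon. GeometryCollection has subtypes MultiPoint, MultiCurve and MultiSurface; MultiCurve has subtype MultiLineString and MultiSurface has subtype MultiPolygon. The legend distinguishes 'composed' and 'sub type' relationships.]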

13 Spatial Types – OGC Simple Features for SQL (*)

14 Operations OGC Simple Feature Types

15 OGC Simple Features for SQL (*) The OGC SF (roughly equivalent to ISO 19125-1) describes 2-D geometry with linear interpolation between vertices. The simple feature model consists of a root class Geometry and its subclasses Point, Curve, Surface and GeometryCollection. The class GeometryCollection has the subclasses MultiPoint, MultiCurve and MultiSurface.

16 OGC Simple Features for SQL (*) The OGC SF does not include complexes, a third dimension, non-linear curves, networking or topology (i.e. connectivity information). Because of its relative simplicity and its support in both the commercial and open source communities, SFSQL is widely used in DBMS and is supported in many Web applications. It is expected that newer, more sophisticated standards such as ISO 19107 will gradually replace OGC SF.

17 OGC Simple Features for SQL (*)  Brief description  A simple feature is defined to have both spatial and non-spatial attributes. Spatial attributes are geometry valued, and simple features are based on 2D geometry with linear interpolation between vertices. Each feature is stored as a row in a database table. This course covers the OGC: GEOMETRY type with subtypes such as POINT, LINE, POLYLINE, POLYGON, and collections of these.

18 OGC Simple Features for SQL (*)  Functionality can be described under the following headings. 1. Basic Methods on Geometry 2. Methods for testing Spatial Relations between geometric objects 3. Methods that support Spatial Analysis 4. Geometry Collection

19 OGC Simple Features for SQL (*) Basic Methods on Geometry: These describe the dimension and spatial reference system (SRID) of the geometry. Operations include Dimension and GeometryType; the conversions AsText and AsBinary; the tests IsEmpty and IsSimple; and the operations that return geometry, Boundary and Envelope (which returns the bounding box). Methods for testing Spatial Relations between geometric objects: These polymorphic methods check relations on the generic or super class GEOMETRY and usually return a Boolean. The main methods are Equals, Disjoint, Intersects, Touches, Crosses, Within, Contains, Overlaps and Relate (which tests for intersections between the Interior, Boundary and Exterior of the two geometries). Methods that support Spatial Analysis: A set of geometric and ‘metric’ methods. They calculate distances and areas with respect to the spatial reference system of the geometry. Methods include Distance, Buffer, ConvexHull, Intersection, Union, Difference, SymDifference. Geometry Collection: A GeometryCollection is a geometry that is a collection of 1 or more geometries. All the elements in a GeometryCollection must be in the same Spatial Reference. Subclasses of GeometryCollection may restrict membership based on dimension and may also place other constraints on the degree of spatial overlap between elements. Methods: NumGeometries( ):Integer returns the number of geometries in this GeometryCollection. GeometryN(N:integer):Geometry returns the Nth

20 OGC Spatial Relations Equals – same geometries; Disjoint – geometries share no common point; Intersects – geometries share at least one point; Touches – geometries intersect only at their boundaries; Crosses – geometries share some, but not all, interior points (e.g. a line crossing a polygon); Within – geometry lies within the other; Contains – geometry completely contains the other; Overlaps – geometries of same dimension overlap; Relate – tests the intersections between interior, boundary and exterior
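As a rough illustration of how these relations differ, the sketch below evaluates most of them for the simplest possible case: two axis-aligned rectangles with positive area, each given as (xmin, ymin, xmax, ymax). The function name `relations` and the sample rectangles are assumptions made for this example; real implementations work on arbitrary geometries, and Crosses is left out because it does not apply to a pair of areas.

```python
def relations(a, b):
    """Classify two axis-aligned rectangles (xmin, ymin, xmax, ymax) with positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # The closed rectangles share at least one point.
    intersects = ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2
    # The open interiors share at least one point.
    interiors_meet = ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2
    within = bx1 <= ax1 and ax2 <= bx2 and by1 <= ay1 and ay2 <= by2
    contains = ax1 <= bx1 and bx2 <= ax2 and ay1 <= by1 and by2 <= ay2
    return {
        "equals": a == b,
        "disjoint": not intersects,
        "intersects": intersects,
        "touches": intersects and not interiors_meet,
        "within": within,
        "contains": contains,
        "overlaps": interiors_meet and not within and not contains,
    }


# Hypothetical rectangles compared against one base rectangle.
base = (0, 0, 4, 4)
neighbour = relations(base, (4, 0, 8, 4))    # shares only an edge with base
inner = relations((1, 1, 2, 2), base)        # lies inside base
partial = relations(base, (2, 2, 6, 6))      # partly covers base
far = relations(base, (10, 10, 11, 11))      # nowhere near base
```

Note that Disjoint is simply the negation of Intersects, and that Touches requires an intersection in which the interiors do not meet.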

21 Contains Relation For the base geometry to contain the comparison geometry it must be a superset of that geometry. Does the base geometry (small circles) contain the comparison geometry (big circles)? Source: Geographic Information Systems and Science, Longley, Goodchild, Maguire, Rhind.

22 Touches Relation Two geometries touch when their boundaries intersect. This raises deep mathematical issues, e.g. what is the boundary of a point? What about a tolerance of plus or minus a metre? Does the base geometry (small circles) touch the comparison geometry (big circles)? Source: Geographic Information Systems and Science, Longley, Goodchild, Maguire, Rhind.

23 Spatial Methods Distance – shortest distance between two geometries; Buffer – geometric buffer; ConvexHull – smallest convex polygon geometry; Intersection – points common to two geometries; Union – all points in either geometry; Difference – points in the first geometry that are not in the second; SymDifference – points in either, but not both, of the input geometries

24 Convex Hull The convex hull of a set of points is the intersection of all convex sets which contain the points. A set of points S is convex if and only if, for every pair of points p, q in S, the line segment pq is completely contained in S. [Figure: a convex set (left) and a non-convex set (right); convex hulls constructed around objects.]
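One common way to compute a convex hull for a finite set of 2-D points is Andrew's monotone chain algorithm, sketched below. The slides do not say which algorithm a given DBMS uses for ConvexHull, so this is only an illustration of the definition; the function names and sample points are assumptions.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical sample: four corners of a square plus two interior points.
sample = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(sample)
```

The two interior points are discarded and only the four corners of the square remain on the hull.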

25 Operations on themes (*)  Theme projection 1 : ‘selecting’ some attributes from the countries theme. Get the Population and Geometry of European countries  Theme selection: Name and population of European countries with a population of 50 million or more.  Theme union : European countries with population less than 10 million joined with those over 10 million.  Theme overlay: See example  Theme merge : See example

26 Operations on themes (*)  Theme overlay 1 : Generates a new theme and new geometry from the overlaid themes. We get the geometric intersection of spatial objects with the required themes. See European language example.  Theme merge : The merge operation performs the geometric union of the spatial part of n geographic objects that belong to the same theme under a constraint condition supplied by the user. See East/West Germany example.

27 Projection on Theme (*) Find the countries of western Europe with population greater than 50 million. This is a projection on the attribute population. Unlike a conventional database query we often want the query result and the original context, in this case Europe.

28 Theme Merge (*) Merging two geographic objects in a selected theme (say country) into a single object.

29 Theme Overlay (*) The lower map represents the overlay of European countries and languages. Latin languages Anglo-Saxon

30 Indexing  Indexing is used to speed up queries and locate rows quickly  Traditional RDBMS use 1-d indexing (B-tree)  Spatial DBMS need 2-d, hierarchical indexing  Grid  Quadtree  R-tree  Others  Multi-level queries often used for performance (MBR)
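The simplest of these structures, a fixed grid, can be sketched in a few lines of Python. The class below is a toy illustration, not the index used by any DBMS: the names `GridIndex` and `boxes_overlap`, the cell size and the sample objects are all assumptions. It also shows the multi-level idea: the grid returns candidates, and an MBR overlap test then filters them.

```python
from collections import defaultdict


def boxes_overlap(a, b):
    """True if two rectangles (xmin, ymin, xmax, ymax) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Toy fixed-grid index over minimum bounding rectangles."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (column, row) -> keys of objects in that cell
        self.boxes = {}                 # key -> MBR

    def _cells_for(self, box):
        xmin, ymin, xmax, ymax = box
        s = self.cell_size
        for cx in range(int(xmin // s), int(xmax // s) + 1):
            for cy in range(int(ymin // s), int(ymax // s) + 1):
                yield cx, cy

    def insert(self, key, box):
        self.boxes[key] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(key)

    def candidates(self, window):
        """First level: every object registered in a cell the window touches."""
        found = set()
        for cell in self._cells_for(window):
            found |= self.cells.get(cell, set())
        return sorted(found)

    def query(self, window):
        """Second level: keep only candidates whose MBR overlaps the window."""
        return [k for k in self.candidates(window) if boxes_overlap(self.boxes[k], window)]


# Hypothetical objects, indexed by their MBRs on a grid of 10 x 10 cells.
index = GridIndex(cell_size=10)
index.insert("park", (2, 2, 8, 8))
index.insert("lake", (12, 3, 18, 9))
index.insert("road", (0, 25, 40, 26))
window = (5, 5, 11, 6)
first_level = index.candidates(window)
second_level = index.query(window)
```

Only the cells touched by the query window are inspected, so "road" is never examined, and "lake" is dropped by the MBR test even though it shares a grid cell with the window.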

31 R-tree Examples of R – Tree Index of polygons

32 Study Area Minimum Bounding Rectangle Minimum Bounding Rectangles

33 Spatial enabled DB Summary Database – an integrated set of data on a particular subject. It can include spatial, non-spatial and possibly temporal data. Databases offer many advantages over files. Relational databases dominate for non-spatial use; object-relational databases (ORDBMS) are often used for spatial data. Databases address some limitations of GIS.

34 Choice of database for GIS? Choice of DBMS: commercial (Oracle, DB2) or open source (PostgreSQL, MySQL). We will use PostgreSQL with the PostGIS spatial extensions. PostgreSQL is an Object Relational Database System (ORDBMS).

35 Database Architecture for GIS(*) Pure Relational Approach: Spatial data can be stored in a pure RDBMS. The coordinates for the spatial data can be stored in tables. This uses existing technologies and requires no additional software (from the pure DBMS perspective). Drawbacks: It is difficult to represent and query complex spatial structures (such as a polygon with holes) or topological relationships (network connectivity, polygon adjacency). There are no ordered lists. It violates the data independence principle, because the user must know about data storage; a change of geometric representation requires deep reorganization of the database and of query formulation. Performance is poor, because a lot of processing of the relational tuples that represent the spatial information is required. It lacks user friendliness, because users have to manipulate tables of points. It is difficult to define new spatial types. It is impossible to express geometric computations such as adjacency tests, point queries, or window queries.
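A small sketch makes these drawbacks concrete. It uses Python's built-in sqlite3 module as a stand-in for a pure RDBMS; the table `polygon_vertex`, its columns and the sample polygon are hypothetical. Because the schema has no geometry type, the vertex order must be carried in an explicit `seq` column, and even a simple area calculation has to be done in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per vertex; the ring order has to be stored explicitly in seq.
conn.execute(
    """
    CREATE TABLE polygon_vertex (
        polygon_id INTEGER NOT NULL,
        seq        INTEGER NOT NULL,
        x          REAL NOT NULL,
        y          REAL NOT NULL,
        PRIMARY KEY (polygon_id, seq)
    )
    """
)
conn.executemany(
    "INSERT INTO polygon_vertex VALUES (?, ?, ?, ?)",
    [(1, 0, 0.0, 0.0), (1, 1, 4.0, 0.0), (1, 2, 4.0, 3.0), (1, 3, 0.0, 3.0), (1, 4, 0.0, 0.0)],
)


def load_ring(polygon_id):
    """Rebuild the ordered vertex list of one polygon from its rows."""
    return conn.execute(
        "SELECT x, y FROM polygon_vertex WHERE polygon_id = ? ORDER BY seq",
        (polygon_id,),
    ).fetchall()


def ring_area(ring):
    """Shoelace formula over a closed ring (first vertex repeated at the end)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


ring = load_ring(1)
area = ring_area(ring)
```

The user has to know that a polygon is stored as rows of points and in which order to read them, which is the data independence problem described above.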

36 Database Architecture for GIS(*) Loosely Coupled: Many current commercial GIS, such as ArcInfo, use this approach. It uses an RDBMS to store 'attribute' or descriptive information, e.g. the name of a road but not its geometry, and a specific module for spatial data management. Drawbacks: The coexistence of heterogeneous data models, which implies difficulties in modeling, use and integration. A partial loss of DBMS techniques, e.g. recovery, querying, optimization.

37 Database Architecture for GIS(*) Loosely Coupled Architecture [Diagram labels: Application Programs; Relational DBMS; Geometric Processing (GIS); Database; Files]

38 Database Architecture for GIS(*) Integrated: DBMS extensibility addresses many of the problems inherent in the relational model (RM) and loosely coupled (LC) approaches. Most databases that offer facilities to handle spatial data (PostgreSQL, Oracle, DB2) take this approach. The basic idea is to add new types and operations to the RM as follows: The query language is extended to manipulate spatial data as well as descriptive data. New spatial types (point, line, and polygon) are handled as basic types by the DBMS. Many other DBMS functions, such as query optimization, are adapted in order to handle geo-spatial data efficiently. Drawback: It does not provide full GIS functionality (cartography). We must use additional software such as Geoserver to make (or render) an attractive map from the raw vectors stored in the DBMS.

39 What can PostGIS do? Many PostGIS functions are available via SQL. It is compliant with the OGC Simple Features Specification. Functions include: Coordinate transformation, Identify, Buffer, Touches, Crosses, Within, Overlaps, Contains, Area, Length, Point on surface, Return geometry as SVG.

40 What can PostGIS do? PostGIS supports a geometry type which is compliant with the OGC standard for Simple Features.  POINT( 50 100 )  LINESTRING ( 10 10, 20 20 )  POLYGON ( ( 0 0, 5 5, 5 0, 0 0 ) )  MULTIPOINT ( ( 1 1 ), ( 0 0 ) )  MULTILINESTRING ( … )  MULTIPOLYGON ( … )

41 Spatial Database Features [Comparison table: columns Oracle, DB2, Informix, PostGIS, MySQL; rows Spatial Objects, R-Tree Index, Spatial Functions, OpenGIS, Coord Transform, Spatial Aggregates; the cell values are not preserved in this transcript.]

42 How Spatial Databases Fit into GIS [Diagram labels: Web Client; Internet; Other GIS; LAN; Editing; Loading; Analysis; GIS; Mapping; Features; Database] Image from Paul Ramsey, Refractions Research.

43 PostgreSQL PostgreSQL itself provides the main features of an RDBMS. It includes other advanced features such as: Inheritance, Functions, Constraints, Triggers, Rules, Transactional integrity. It permits an ‘OO like’ style of programming.

44 PostgreSQL/PostGIS
name | city | hrs | status | st_fed | the_geom
Brio Refining | Friendswood | 50.38 | active | Fed | SRID=32140;POINT(968024.87474318 4198600.9516049)
Crystal Chemical | Houston | 60.9 | active | Fed | SRID=32140;POINT(932279.183664999 4213955.37498466)
North Cavalcade | Houston | 37.08 | active | Fed | SRID=32140;POINT(952855.717021537 4223859.84524946)
Dixie Oil Processors | Friendswood | 34.21 | active | Fed | SRID=32140;POINT(967568.655313907 4198112.19404211)
Federated Metals | Houston | 21.28 | active | State | SRID=32140;POINT(961131.619598681 4220206.32109146)
The data is stored in a relatively simple format, with the attributes and geometry stored in a single table. In each the_geom value, the SRID is the spatial reference number, POINT is the data type and the two numbers are the coordinates; the remaining columns hold the attribute data.
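To make the structure of a the_geom value explicit, the sketch below splits one of the point strings above into its spatial reference number and coordinates. It is a minimal illustration that handles only the `SRID=...;POINT(x y)` form shown on this slide; the function name `parse_srid_point` is an assumption, and in practice PostGIS does this parsing itself.

```python
import re

# Matches only the "SRID=<int>;POINT(<x> <y>)" form shown in the table.
SRID_POINT = re.compile(
    r"^SRID=(\d+);\s*POINT\s*\(\s*([-+.\deE]+)\s+([-+.\deE]+)\s*\)$"
)


def parse_srid_point(text):
    """Return (srid, (x, y)) for a string such as 'SRID=32140;POINT(1 2)'."""
    match = SRID_POINT.match(text.strip())
    if match is None:
        raise ValueError(f"not an SRID-prefixed POINT: {text!r}")
    srid, x, y = match.groups()
    return int(srid), (float(x), float(y))


# Value copied from the first row of the table.
srid, coords = parse_srid_point("SRID=32140;POINT(968024.87474318 4198600.9516049)")
```

The integer before the semicolon is the key that identifies the row's spatial reference system, and the pair of numbers only has meaning relative to that system.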

45 How does it work? Spatial data is stored using the coordinate system of a particular projection. That projection is referenced with a Spatial Reference Identification Number (SRID). This number relates to another table (spatial_ref_sys) which holds all of the spatial reference systems available. This allows the database to know what projection each table is in and, if need be, to re-project from those tables for calculations or for joining with other tables.

46 Coordinate Projection SRID=3005;MULTILINESTRING((1004687.04355194 594291.053764096,1004729.74799931 594258.821943696)) SRID=4326;MULTILINESTRING((-125.9341 50.3640700000001,-125.9335 50.36378)) Coordinates of one table can be converted to those of another table. This permits the ‘geometry’ in each table to match. This is relatively easy to do in PostGIS.

47 Spatial Database Components The geometry metadata table:
table schema | table name | geometry column | coord dim | srid | type
brazos | texas_counties | the_geom | 2 | 32139 | MULTIPOLYGON
brazos | texas_rivers | the_geom | 2 | 32139 | MULTILINESTRING
brazos | texas_roads | the_geom | 2 | 32139 | MULTILINESTRING
brazos | tx_maj_aquifers | the_geom | 2 | 32139 | MULTIPOLYGON
brazos | tx_min_aquifers | the_geom | 2 | 32139 | MULTIPOLYGON
brazos | txzip_codes | the_geom | 2 | 32139 | MULTIPOLYGON
brazos | bz_landmarks | the_geom | 2 | 32139 | POINT

48 spatial_ref_sys
postgis=# \d spatial_ref_sys
      Table "public.spatial_ref_sys"
  Column   |          Type           | Modifiers
-----------+-------------------------+-----------
 srid      | integer                 | not null
 auth_name | character varying(256)  |
 auth_srid | integer                 |
 srtext    | character varying(2048) |
 proj4text | character varying(2048) |
Indexes:
    "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)

49 geometry_columns
postgis=# \d geometry_columns
         Table "public.geometry_columns"
      Column       |          Type          | Modifiers
-------------------+------------------------+-----------
 f_table_catalog   | character varying(256) | not null
 f_table_schema    | character varying(256) | not null
 f_table_name      | character varying(256) | not null
 f_geometry_column | character varying(256) | not null
 coord_dimension   | integer                | not null
 srid              | integer                | not null
 type              | character varying(30)  | not null
Indexes:
    "geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f_table_name, f_geometry_column)

50 Database Rules Rules help prevent human error when modifying a data set. Rules are user defined. Examples of rules: “A fire hydrant must be located on a water line”; “Rivers should flow downhill”.

51 Constraints  Constraints are similar to rules, but are less assertive.  Constraints are provided by the DBMS and are applied by the user  A Constraint would be “Parcel_ID Not Null” - meaning a number ID has to be provided when a parcel is created.

52 Constraints
Constraint | GIS example
Uniqueness | Two spatial objects cannot exist at the same point
Non-Null | All address points must have co-ordinates
Range | All heights in Ireland must be in range -100 to 2000 metres
Relationship | Every river must be connected to the sea, a lake or other river (Can rivers cross?)
Cardinality | Each side of a triangle has a 1:2 relation with the others
Inclusion | All counties are polygons
Covering | A boundary may be a townland and/or a barony
Disjointedness | All roads must be only a primary or a secondary or a regional
Referential Integrity | A county border must be represented by a ground feature
Geometrical | Triangles must have three sides
Orientation | Roads are usually to the front of houses
Topological | Inner walls must be "inside" buildings
General | Complex rules built from above constraints
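The first three kinds of constraint in this table (uniqueness, non-null and range) map directly onto standard SQL column constraints. The sketch below demonstrates them with Python's built-in sqlite3 module as a stand-in for PostgreSQL; the table `address_point`, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE address_point (
        parcel_id INTEGER NOT NULL,                              -- non-null
        x         REAL NOT NULL,
        y         REAL NOT NULL,
        height_m  REAL CHECK (height_m BETWEEN -100 AND 2000),   -- range
        UNIQUE (x, y)                                            -- uniqueness
    )
    """
)


def try_insert(row):
    """Insert one row and report whether the DBMS accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO address_point VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


outcomes = [
    try_insert((1, 315000.0, 234000.0, 85.0)),      # valid row
    try_insert((None, 315100.0, 234100.0, 90.0)),   # missing parcel_id
    try_insert((2, 315200.0, 234200.0, 5000.0)),    # height out of range
    try_insert((3, 315000.0, 234000.0, 70.0)),      # same point as the first row
]
row_count = conn.execute("SELECT COUNT(*) FROM address_point").fetchone()[0]
```

The later entries in the table, such as the topological and orientation constraints, cannot be written as simple column constraints like these.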

53 Constraints How can we define in front of?

54 Data integrity [Figure: examples of valid and invalid geometries] select count(*) from bc_voting_areas where not isvalid(the_geom);

55 Dynamic and Static Data Static non-spatial data is usually maintained in the table with the geometry (e.g. county name). In this case the geometry is considered immutable. Dynamic non-spatial data is usually maintained in a separate table. There can be more than one dynamic table for a geometry table. Dynamic spatial data can include moving objects or a changing world (temporal data requires different treatment).

56 Joins(*)  Dynamic tables can be joined with the geometry tables for querying purposes  A primary key is used to relate the 2 tables together  A primary key is a unique identifier for each row in a table Primary Key

57 Spatial Join A typical example of a spatial join is “Find all pairs of rivers and cities that intersect”. The result of the join between the set of rivers {R1, R2} and the set of cities {C1, C2, C3, C4, C5} is {(R1, C1), (R2, C5)}.

58 Temporal Queries Find where (x, y) and when (t) it will snow: Clouds(X, Y, T, humidity) Region(X, Y, T, temperature) (SELECT x, y, t FROM Clouds WHERE humidity >= 80) INTERSECT (SELECT x, y, t FROM Region WHERE temperature <= 32)
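The query can be tried out with Python's built-in sqlite3 module, as in the sketch below. The rows inserted into Clouds and Region are made up for the example. One deviation from the slide is that SQLite does not accept parentheses around the two SELECT statements of a compound query, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Clouds (x INTEGER, y INTEGER, t INTEGER, humidity REAL)")
conn.execute("CREATE TABLE Region (x INTEGER, y INTEGER, t INTEGER, temperature REAL)")

# Hypothetical observations: (x, y, t, value).
conn.executemany(
    "INSERT INTO Clouds VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 85.0), (1, 2, 1, 90.0), (2, 2, 2, 60.0)],
)
conn.executemany(
    "INSERT INTO Region VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 30.0), (1, 2, 1, 40.0), (2, 2, 2, 20.0)],
)

# Same query as on the slide, without the parentheses SQLite rejects.
snow = conn.execute(
    """
    SELECT x, y, t FROM Clouds WHERE humidity >= 80
    INTERSECT
    SELECT x, y, t FROM Region WHERE temperature <= 32
    ORDER BY x, y, t
    """
).fetchall()
```

Only the location and time that satisfy both conditions survive the INTERSECT: (1, 2, 1) is humid enough but too warm, and (2, 2, 2) is cold enough but too dry.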

59 Temporal Example: roads, buildings, and regions Consider a line. From the properties of metric spaces it has a length.

60 Temporal Example: roads, buildings, and regions Let's call it a road. From graph theory we have a path. [Figure labels: A, B]

61 Temporal Example: roads, buildings, and regions Let's add a field (F1) with an area and a topology. The purple line segment represents both a road and a fence. [Figure labels: F1, A, B]

62 Example: roads, buildings, and regions Let's add an administrative region (outer red rectangle) and some houses. [Figure labels: F1, A, B]

63 Example: roads, buildings, and regions Let's divide the field in two by inserting a new fence. We need to delete the old area and add two new areas. What about the adjacency relation between fields? [Figure labels: F2, F3, A, B]

64 Example: roads, buildings, and regions Imagine a picture of the world at Time1 and Time2. Not only have some objects changed but some spatial relationships have changed. An addition can induce a deletion and a deletion can induce an insertion. Time1 Time2 F3F2 F1 A A B B

65 Example of temporal queries (1) Is there a route from A to B? (now is assumed) (2) Was there a route from A to B in Time1? (3) Does the route in query 1 pass through the administrative region? (4) Does the route in query 1 touch the administrative region? (5) What fields were adjacent to F2 in Time2?
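Two of these queries, the route query and the adjacency query, can be sketched by keeping one snapshot of the network per time point and searching it. Everything in the snapshots below is hypothetical, since the figures are not part of this transcript: it is assumed that the road A–B exists at both times and that the fields F2 and F3, created by the new fence, are adjacent at Time2. The names `SNAPSHOTS`, `route_exists` and `adjacent_fields` are likewise invented for the example.

```python
from collections import deque

# Hypothetical world states; each pair is an undirected link.
SNAPSHOTS = {
    "Time1": {"roads": {("A", "B")}, "field_adjacency": set()},
    "Time2": {"roads": {("A", "B")}, "field_adjacency": {("F2", "F3")}},
}


def neighbours(pairs, node):
    """Nodes linked to `node` by any undirected pair."""
    linked = set()
    for u, v in pairs:
        if u == node:
            linked.add(v)
        elif v == node:
            linked.add(u)
    return linked


def route_exists(time, start, goal):
    """Breadth-first search over the road network as it was at `time`."""
    roads = SNAPSHOTS[time]["roads"]
    seen, queue = {start}, deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return True
        for nxt in neighbours(roads, current) - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False


def adjacent_fields(time, field):
    """Fields recorded as adjacent to `field` at `time`."""
    return sorted(neighbours(SNAPSHOTS[time]["field_adjacency"], field))


route_time1 = route_exists("Time1", "A", "B")
adjacent_time2 = adjacent_fields("Time2", "F2")
adjacent_time1 = adjacent_fields("Time1", "F2")
```

Storing a full snapshot per time point is the simplest possible design; it makes "as of Time1" queries trivial at the cost of duplicating everything that did not change.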

66 Geoserver with PostgreSQL/PostGIS If you want to use PostgreSQL/PostGIS data on the web, then special software is required (e.g. Geoserver, Mapserver, or Oracle’s Mapviewer). The Geoserver DATA statement can use arbitrary SQL to compose spatial result sets for mapping.

67 Mapserver with PostgreSQL/PostGIS “Map the % voter turnout.” DATA "the_geom from (select gid, the_geom, 100 * vtotal::real / vregist::real as percent from bc_voting_areas) as query using srid=3005 using unique gid"

68 Raster Image Data Not Covered

