Presentation on theme: "Spatial Databases: Lecture 3 DT249 Semester 2 2014-15 Pat Browne."— Presentation transcript:
Spatial Databases: Lecture 3 DT249 Semester Pat Browne
Outline We will look in more detail on what happens when a spatial table is constructed in PostgreSQL/PostGIS. We will describe how construct a table for a sub-set of the historical data set 1. We take a closer look at a range of OGC queries that can be used in PostGIS 2. A reminder of viewing and querying using OpenJump We will look at map accuracy.
PostgreSQL-PostGIS An object-relational DBMS with PostGIS spatial extensions. Is completely Open Source Compliant with OGS’s Simple Features for SQL Has a spaghetti-like spatial data model Spatial indexing Supports OGC types and PostgeSQL’s ‘native types’: point, line, box, path, polygon, and circle geometric types Topology is under development. Can perform overlay function Simple features are based on 2D geometry with linear interpolation between vertices.
PostGIS levels of representation
PostGIS S/W components PostGiS provides Open Database Connectivity ODBC connectivity. PostGIS includes extensions to the underlying PostgreSQL ODBC drivers which allow transparent access to GIS objects from PostGIS via the ODBC protocol. ODBC connectivity is part of the OGC standard. PostGIS also provides Java Database Connectivity (JDBC), which is not part of the OGC standard. GiST (Generalized Search Tree) provides high speed spatial indexing. PROJ.4 is an open source library that provides coordinate reprojection to convert between geographic coordinate systems. GEOS (Geometry Engine, Open Source) is a library used by PostGIS to perform all the operations in the OpenGIS Simple Features for SQL Specification. The GEOS library is used to provide geometry tests (ST_Touches(), ST_Contains(), ST_Intersects()) and operations (ST_Buffer(), ST_Union(),ST_Intersection() ST_Difference()) within PostGIS.
PostGIS and OGC standard 1. PostGIS implements and is compliant with the OGC’s Simple Features for SQL standard. 2. PostGIS supports all OGC types: Point, Line, Polygon, MultiPoint, MultiLine, MultiPolygon, GeometryCollection and operations on those types 3. PostGIS uses OGC well-known text format on the SQL command-line to represent GIS features.
Creating a table. The basic steps to create a new spatially enabled table are: Create a table with the desired non-spatial attributes. Add a spatial column with as PostGIS/OGC extension AddGeometryColumn Insert the geometry with a SQL insert & select statements.
Creating a spatial table We can also create a spatial table from an existing table. On the following slides we will describe how to make a table containing a subset of the historical data set from National Monuments Service 1. We will make a table with the historical information for Dublin. We assume that the county 2 table exists and that Dublin is a single region in the county table (Dublin consists of four regions). Note a system generated identified ( gid ) is used as the primary key. It is possible to include a geometry column at table creation time, but the system would not generate integrity constraints so we will stick with this method.
Creating a spatial table, step 1 PostgreSQL/PostGIS will respond: NOTICE: CREATE TABLE will create implicit sequence "dublin_historical_gid_seq" for serial column "dublin_historical.gid" NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "dublin_historical_pkey" for table "dublin_historical"
Creating a spatial table, step 1 Examine the table (\d): Table "public.dublin_historical" Table "public.dublin_historical" Column | Type | Modifiers Column | Type | Modifiers gid | integer | not null default gid | integer | not null default nextval('dublin_historical_gid_seq'::regclass) nextval('dublin_historical_gid_seq'::regclass) Reset of data Indexes: "dublin_historical_pkey"PRIMARY KEY,btree (gid)
Creating a spatial table, step 2 Here we add the geometry column, PostGIS will automatically generate integrity constraints. SELECT AddGeometryColumn('public','dublin_histori cal','the_geom','29900','POINT',2); Do not execute this command in the lab, this table already exists. We use an version of the Irish National Grid with SRID = The type of the geometry to be stored in the column must be included, in this case it is POINT. This command accesses geometry_columns system table (details later).
Creating a spatial table, step 2 First system generated constraint ALTER TABLE dublin_historical DROP CONSTRAINT enforce_dims_the_geom; ALTER TABLE dublin_historical ADD CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2); ADD CONSTRAINT enforce_dims_the_geom CHECK (ndims(the_geom) = 2);
Creating a spatial table, step 2 Second system generated constraint ALTER TABLE dublin_historical ADD CONSTRAINT enforce_geotype_the_geom CHECK (geometrytype(the_geom) = ‘POINT'::text OR the_geom IS NULL);
Creating a spatial table, step 2 Third system generated constraint ALTER TABLE dublin_historical ADD CONSTRAINT enforce_srid_the_geom CHECK (srid(the_geom) = 29900);
Creating a spatial table, step 2 The Primary Constraint was created in step1 CONSTRAINT dublin_historical _pkey PRIMARY KEY(gid);
Creating a spatial table, step 3 Next we insert the data from the all Ireland historical table into the newly created dublin_historical table. Only data is contained in Dublin is inserted into the new table. Check the OGC & PostGIS documentation on the contains predicate AKA ‘a spatial relationship function’.
Creating a spatial table, step 3 INSERT INTO dublin_historical (rmp_prop,map_symbol, entity_id, co_id,smr_val0, nat_grid_e“,class_desc,nat_grid_n,objectid, townlands,scope_n1, smrs, the_geom) SELECT rmp_prop, map_symbol, entity_id, co_id,smr_val0, nat_grid_e, class_desc, nat_grid_n, objectid, townlands, scope_n1, smrs, h.the_geom FROM county c, historical h WHERE contains(c.the_geom,h.the_geom) AND c.name = 'Dublin';
PostGIS system tables The next few slides describe the built-in PostGIS meta-tables that provide the spatial functionality. We only outline the main features. For further details, please see the PostGIS manual.
geometry_columns table Column | Type |Modifiers f_table_catalog | character varying(256) | not null f_table_catalog | character varying(256) | not null f_table_schema | character varying(256) | not null f_table_schema | character varying(256) | not null f_table_name | character varying(256) | not null f_table_name | character varying(256) | not null f_geometry_column | character varying(256) | not null f_geometry_column | character varying(256) | not null coord_dimension | integer | not null coord_dimension | integer | not null srid | integer | not null srid | integer | not null type | character varying(30) | not null type | character varying(30) | not nullIndexes: "geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f "geometry_columns_pk" PRIMARY KEY, btree (f_table_catalog, f_table_schema, f _table_name, f_geometry_column) This table allows PostgreSQL/PostGIS to keep track of actual user spatial tables.
spatial_ref_sys table Displaying a spherical earth on a flat surface requires a projection. This table uses a standard numbering, called the EPSG 1, to describe various projections. Using PostgreSQL’s expanded display we can examine the details for a particular projection representing the Irish National Grid: \x select * from spatial_ref_sys where srid=29900;
spatial_ref_sys table \d spatial_ref_sys Column | Type | Modifiers Column | Type | Modifiers srid | integer | not null srid | integer | not null auth_name | character varying(256) | auth_name | character varying(256) | auth_srid | integer | auth_srid | integer | srtext | character varying(2048) | srtext | character varying(2048) | proj4text | character varying(2048) | proj4text | character varying(2048) |Indexes: "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid) "spatial_ref_sys_pkey" PRIMARY KEY, btree (srid)
OGC – Metadata tables FOSS Relational Database and GeoDatabase Part III Marco Ciolli, Fabio Zottele :
Valid Geometry? select IsValid(the_geom) from lakes; select IsValid(the_geom) from lakes; NOTICE: Self-intersection at or near point NOTICE: Self-intersection at or near point NOTICE: Self-intersection at or near point isvalid isvalid t t f f f t t t
Finding the centre point of a county To find the centre point of each county: select name, asText(Centroid(the_geom)) from county;
Finding a bounding box To find the bounding box of each county: select name, asText(envelope(the_geom)) from county ; Or select name,extent(the_geom) from county group by name; Must use ‘ group by ’ which can handle multiple polygons, also extent returns the geometry as text, so no need for asText.
Finding the dimensions of a bounding box select name, extent(the_geom) from county where name= 'Dublin County Borough' group by name ; Returns Dublin County Borough | BOX( ,325443…) Dublin County Borough | BOX( ,325443…) Using the above, we can measure the dimensions of the bounding box: select distance( geomFromText(‘Point….)’),geomFromText(‘Point….)’));
Point in Bounding Box We can use a non-OGC function && to find objects. SELECT name FROM county where GeomFromText('POINT( )', 29900) && the_geom; See result on next slide. When constructing a query it is important to remember that only the bounding-box-based operators such as && can take advantage of the PostgreSQL’s GiST spatial index.
Result Point in Bounding Box SELECT name, asbinary(the_geom) FROM county WHERE GeomFromText( 'POINT( )', 29900) && the_geom;
Point in Bounding Box
SELECT asbinary(a.the_geom), a.name FROM county a, county b WHERE (b.the_geom && a.the_geom ) and b.name like 'Meath';
Find things near a point select townlands,class_desc,scope_n1, asbinary(the_geom) from dublin_historical WHERE distance(the_geom, GeomFromText('POINT( )', 29900)) < 100; When we submit such a query we should have some idea of the type of data in the result set. To find out the expected return type use PostGIS manual or the OGC standard for SQL.
Find things near a point SELECT c.name, h.townlands FROM county AS c,dublin_historical AS h WHERE distance(h.the_geom, GeomFromText('POINT( )', 29900)) < 1000 and c. Name = 'Dublin County Borough'; Do we know the type of data in the result set here?
Find things near a point SELECT c.name, h.townlands FROM county AS c, dublin_historical AS h st_dwithin WHERE st_dwithin(h.the_geom, PointFromText('POINT( )', 29900),1000) and c. Name = ' Dublin County Borough '; st_dwithin Similar to previous query but using st_dwithin.
Find things near a point SELECT gid,townlands, distance(the_geom, GeomFromText('POINT( )',29900)) FROM dublin_historical WHERE st_dwithin(the_geom, GeomFromText('POINT( )',29900),500) ORDER BY distance(the_geom, GeomFromText('POINT( )',29900)) LIMIT 100;
Finding the largest county in Ireland SELECT name, area(the_geom)/10000 AS hectares FROM county ORDER BY hectares DESC LIMIT 1; Leaving out Northern Ireland SELECT name, area(the_geom)/10000 AS hectares FROM county WHERE name != 'Northern Ireland‘ ORDER BY hectares DESC LIMIT 1;
Stored and calculated areas Area in square KMs SELECT name, area(the_geom)/ AS Calculated, area_km2 AreaStored FROM county ;
What is the length of roads fully contained within Dublin County Borough? SELECT c.name, sum(length(r.the_geom))/1000 as roads_km FROM roads AS r, county AS c WHERE r.the_geom && r.the_geom AND contains(c.the_geom,r.the_geom) AND c.name = 'Dublin County Borough' GROUP BY c.name ORDER BY roads_km;
What historical objects are near Dublin roads? SELECT townlands FROM dublin_historical h, roads r WHERE distance(h.the_geom,r.the_geom) < 200;
Overlays We should distinguish the overlay operation and tests for overlap. The OGCSFSQL use two similar keyword. Table-on-table overlays are possible with the ST_Intersection() function ST_Intersects(a,b) returns BOOLEAN ST_Intersection(a,b) returns GEOMETRY ST_Intersects(a,b)=TRUE | FALSE ST_Intersection()=GEOMETRY
Efficiency of Search 1 It is expensive to process the exact geometry of an object. Therefore approximations such as bounding boxes (BB) or a convex s (CH) are used to help to examine candidate objects and decided whether a candidate fulfils the query or not. The are used as ‘geometric filter’
Efficiency of Search 1
Network Queries In order to execute network queries, we need to augment the spatial information used in the OGC Simple Features for SQL standard. We will use pgRouting 1.
Network Queries 1 Shortest Path from Dublin to Waterford
What two houses within 500 meters of the Chester Beatty Library have the most residents? SELECT b1.residents FROM buildings_geodir b1, buildings_geodir b2 WHERE b2.name = 'CHESTER BEATTY LIBRARY' and ST_DWithin(b2.the_geom,b1.the_geom, 500) ORDER BY residents DESC LIMIT 2; Note residents column will need to be added to the buildings_geodir table
Counties that have exactly 1 neighbour SELECT c1.name FROM county c1, county c2 WHERE touches(c1.the_geom, c2.the_geom) = 'TRUE' GROUP BY c1.name HAVING count(c2.name) =1;
Proportions SELECT saps_label,primary_degree10_4,(male1_1 + female1_1) as Pop, ((cast (primary_degree10_4 as float)) / (cast ((male1_1 + female1_1) as float)) * 100) as Grad_percent FROM dublin_eds WHERE primary_degree10_4 IS NOT NULL ORDER BY Grad_percent DESC LIMIT 1; DESC LIMIT 1; saps_label | primary_degree10_4 | pop | grad_percent saps_label | primary_degree10_4 | pop | grad_percent Pembroke West A | 455 | 4262 | Pembroke West A | 455 | 4262 | (1 row) Note table & column names may differ. Check with \d tableName
Accuracy You should calculate the area of county Meath and compare the result to the area stored in the county table, you should have get two different figures for the area of Meath. Which area is correct? To answer this question we would need to know the accuracy of the area stored in the database and the accuracy the map that we used to calculate the area.
Accuracy Obviously no spatial database or GIS can increase the positional accuracy of a spatial dataset. The accuracy does not change as the viewing scale changes. So we need to know the scale of the original survey and the expected accuracy at that scale.
Accuracy The accuracy of a map is dependent on the differences between the true position of features and their representative position in the map. To find the true position requires highly accurate devices such as industrial grade Global Positioning Systems (GPS) and sophisticated mathematical software.
Accuracy A common of way of defining positional accuracy for maps is to place limits on the root mean square error (RMSE) for individual position components (the X,Y and possibly Z i.e. height) The RMSE is derived from the square root of the average of the squared discrepancies when compared to a higher level independent survey. The RMSE is normally defined in terms of ground scale errors (e.g. +- one metre).
Accuracy RMSE can be viewed as one criteria for positional accuracy. It is measured with respect to some more precise truth (often GPS). RMSE is a measure of absolute accuracy, with respect to a more precise framework. Together with an acceptable RMSE a map must be consistent within itself. That is the various components must fit together. This is a measure of relative accuracy.
Accuracy If data has a nominal scale of say 1:250,000, it still may be more precise than the rough guidelines on slide 51 would indicate. For example it could be derived from more accurate maps e.g. 1:50,000. Without meta-data it is difficult for humans or computers to know a map’s accuracy.
Accuracy The following table shows the expected absolute and relative accuracy values for well defined points within each accuracy category. The relative values apply up to the stated maximum measured distances quoted in the table.
Accuracy From The British Ordnance Survey (OSGB)