GIS Data Preparation and Integration


1 GIS Data Preparation and Integration
Digesting the Food

2 Data Preparation and Integration: the necessary steps
Geocoding: assigning geographic coordinates to points
  Perhaps the most basic form of spatial data entry
Data media conversion
  scanning
  digitizing
Data format conversion
  raster & vector
Data reduction
Topology, error detection and topological editing
Rectification and registration (one on top of the other)
  overlaying sheets and referencing to the real world
Edge matching & image adjustment (side by side)
  linking & balancing adjacent sheets
Interpolation
Conflation

3 Geocoding: assigning spatial coordinates to point data
Address Matching
  assigns spatial coordinates (explicit location) to addresses (implicit location)
  requires a street network file with street attribute information (street name and number range) for all street segments (block sides)
  “Zone” variable required if data spans multiple cities (to handle duplicated street names)
  precise matching of street names can be problematic
  completeness (esp. for ‘new’ streets) is important
  PO boxes, building names, and apartment complex names cause problems
  Implementation in ArcGIS is a 3-step process:
    In ArcToolbox (9.2), process the street network file to create a Geocoding Service
    In ArcMap, load the appropriate geocoding service via Tools/Geocoding/Services Manager
    In ArcMap, geocode a table of addresses using Tools/Geocoding/Geocode Addresses
Point Location
  Files containing lat/long or x,y coordinates (e.g. derived via GPS)
  Bring the table (e.g. in .csv or .dbf format) into ArcGIS using the Add Data icon
  Right-click the table name in the table of contents and select Display X,Y Data
  Displays as an “event layer”; export to a shapefile or gdb feature class for a spatial data set
  Input table must contain 3 variables at minimum: Feature ID, x, y
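As a rough illustration of how address matching can turn a house number into coordinates, the sketch below interpolates linearly along a street segment using its number range. It is not the ArcGIS geocoding engine; the dictionary keys (`street`, `zone`, `from_number`, `to_number`, `start`, `end`) and the sample network are hypothetical names invented for this example.

```python
def geocode_on_segment(house_number, segment):
    """Linearly interpolate a house number along one street segment."""
    lo, hi = segment["from_number"], segment["to_number"]
    if not min(lo, hi) <= house_number <= max(lo, hi):
        return None  # number falls outside this block's range
    # Fraction of the way along the segment (0 at start, 1 at end).
    t = 0.0 if hi == lo else (house_number - lo) / (hi - lo)
    (x0, y0), (x1, y1) = segment["start"], segment["end"]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def geocode(house_number, street, zone, network):
    """Find the matching segment by street name and zone, then interpolate."""
    for seg in network:
        # Exact match on an upper-cased name; real street names need fuzzier matching.
        if seg["street"] == street.upper() and seg["zone"] == zone:
            point = geocode_on_segment(house_number, seg)
            if point is not None:
                return point
    return None  # unmatched address


# Hypothetical network: the same street name occurs in two cities (zones).
network = [
    {"street": "MAIN ST", "zone": "DALLAS", "from_number": 100, "to_number": 198,
     "start": (0.0, 0.0), "end": (98.0, 0.0)},
    {"street": "MAIN ST", "zone": "PLANO", "from_number": 100, "to_number": 198,
     "start": (500.0, 500.0), "end": (500.0, 598.0)},
]
```

Calling `geocode(149, "Main St", "DALLAS", network)` and `geocode(149, "Main St", "PLANO", network)` gives two different locations, which is why the zone variable matters when street names are duplicated.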

4 Data Media Conversion--Scanning: automated recording of map or aerial
Produces “dumb” raster data
  vectorize using conversion software
  create “smart” image using digital image processing techniques
Electromechanical instruments, $100-$50,000
  drum or flatbed
  scan resolution depends on price! down to 20 microns (a micron is a millionth of a meter)
Scanners v. sensors
  Sensors collect data directly in digital form (e.g. digital cameras)
  Sensor resolution now (2005>) matches that of photos, so scanning photos is becoming old technology
  Still lots of paper maps around, e.g. property ownership records
Great if only a raster representation is needed
Automated creation of vector data from scanning is very problematic:
  docs must be clean
  complex line work adds error
  lines shouldn’t be broken with text
  text may be interpreted as lines
  automatic feature detection (road versus railroad) is difficult
  ESRI’s ArcScan for ArcGIS (included with ArcEditor) provides interactive, semi-automated raster to vector conversion
  Other vendors offer specialized conversion software
Digital image processing techniques are used to create “smart raster”
  identify feature type within each raster

5 Data Media Conversion--Digitizing: manually tracing a map or aerial
Applied to map or aerial photo
  Use hard copy map/photo on table/tablet, or scanned image on screen (heads-up digitizing)
  pen or cursor detects x, y coords
  coordinates are in inches/cms from lower left (0,0)
  control points (tic marks) relate digitized coordinates to real world lat/long coordinates
  coordinates captured in stream or point mode
  accuracy of table (but not user!) usually better than 0.1 mm
  all nodes and polygons should be marked and numbered first
  essentially a vector approach
Problems:
  paper maps unstable
    crease and fold
    stretch with humidity (up to 3%)
    photos more stable (0.2%)
  map errors transferred to GIS
    maps often prepared for display, not accuracy
  human hand very shaky
    often generates undershoots, overshoots, & double lines
    editing and clean-up essential

6 Data Format Conversion:
4 possibilities: vector to vector, raster to raster, vector to raster, raster to vector
Vector to vector
  e.g. whole polygon (e.g. SAS map data) to point/arc/polygon
  computationally intense
  no accuracy loss provided data is ‘clean’
  perfectly transitive
Raster to raster
  may involve resampling (see under data reduction)
  may involve conversion between different vendors’ raster formats (e.g. GRID to BIL)
Vector to raster: point
  node x,y assigned to closest raster cell
  locational shift almost inevitable; error depends on raster cell size
  two points in one cell indistinguishable
  not transitive; cannot retrieve original data without error
Vector to raster: line
  cells assigned if touched by line
  stair-step appearance of diagonal lines (called aliasing)
  can be visually improved through anti-aliasing: brightness of cells varied based on fraction of cell covered by the line
Raster to vector
  by far the most difficult
Transitive: the ability to reproduce the original data after conversion.
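The point case of vector-to-raster conversion can be sketched in a few lines. The example assumes a grid whose origin is its lower-left corner and whose rows count upward from there; the function names are made up for illustration.

```python
import math


def point_to_cell(x, y, origin_x, origin_y, cell_size):
    """Return the (row, col) of the raster cell containing point (x, y)."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return row, col


def cell_to_point(row, col, origin_x, origin_y, cell_size):
    """Return the cell centre, the only location recoverable from the raster."""
    return (origin_x + (col + 0.5) * cell_size,
            origin_y + (row + 0.5) * cell_size)


# Two distinct points that land in the same 10-unit cell.
cell_a = point_to_cell(12.0, 7.0, 0.0, 0.0, 10.0)
cell_b = point_to_cell(18.0, 3.0, 0.0, 0.0, 10.0)
recovered = cell_to_point(*cell_a, 0.0, 0.0, 10.0)
```

Both points map to the same cell, and converting back yields only the cell centre, so the original coordinates cannot be retrieved. With this scheme the shift is at most half the cell diagonal, which is why the error depends on cell size.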

7 Vector to Raster Conversion
Figure: a point, an orthogonal line, and a diagonal line (more problematic), each shown in vector form and as a raster. Note the use of anti-aliasing to improve the line’s visual appearance.

8 Raster to Vector Data Conversion: 3-step process
skeletonizing (or thinning): to reduce rasters to unit width
  peeling approach successively removes outer edges
  medial axis approach determines set of interior pixels farthest from outer edges
vector extraction: to identify lines
  4-connected reconstruction joins center points of 4-connected neighbors if present
    particularly bad for diagonal line reproduction
  8-connected reconstruction joins center points of 8-connected neighbors if present
    diagonal lines reproduced but adds extra lines
  8-connected reconstruction with redundancy elimination
    if 4-connected neighbor line exists, don’t draw diagonal
    reduces redundant lines
topological reconstruction: recreates topological structure
  create nodes at line junctions
  construct arcs
  define polygons (manual designation required)
Available via the ArcScan extension for ArcGIS, as well as via several specialized packages from other vendors

9 Raster to Vector Conversion Skeletonizing
For example, go to:

10 Raster to Vector Conversion: Vector Extraction 4-connect reconstruction
search the 4 surrounding cells and join center points if present

11 Raster to Vector Conversion: Vector Extraction 8-connect reconstruction
search the 8 surrounding cells and join center points if present.

12 Raster to Vector Conversion: Vector Extraction 8-connect reconstruction with redundancy elimination
8-connect with redundancy elimination: draw diagonal from 8-cell search only if not already connected by orthogonal from 4-cell search
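The three vector extraction variants can be compared with a small sketch. It assumes the skeletonized raster is given as a set of (row, col) cells, and it reads "redundancy elimination" as: skip a diagonal when the two cells are already linked through a shared orthogonal neighbor. That reading, and the `mode` names, are assumptions of this example rather than the behaviour of any particular package.

```python
def extract_segments(cells, mode="8-elim"):
    """Join centre points of neighbouring cells.

    mode: "4"      -> orthogonal neighbours only
          "8"      -> orthogonal and diagonal neighbours
          "8-elim" -> as "8", but drop diagonals made redundant by orthogonals
    """
    cells = set(cells)
    segments = []
    for r, c in sorted(cells):
        # Look only right and down so each link is recorded once.
        for dr, dc in ((0, 1), (1, 0)):
            if (r + dr, c + dc) in cells:
                segments.append(((r, c), (r + dr, c + dc)))
        if mode == "4":
            continue
        # Diagonals: down-right and down-left.
        for dr, dc in ((1, 1), (1, -1)):
            neighbour = (r + dr, c + dc)
            if neighbour not in cells:
                continue
            linked_orthogonally = (r + dr, c) in cells or (r, c + dc) in cells
            if mode == "8-elim" and linked_orthogonally:
                continue  # an orthogonal path already connects the two cells
            segments.append(((r, c), neighbour))
    return segments


staircase = {(0, 0), (0, 1), (1, 1)}   # an L-shaped corner
diagonal = {(0, 0), (1, 1), (2, 2)}    # a pure diagonal line
```

On `diagonal`, mode `"4"` finds no segments at all, while on `staircase` mode `"8"` adds an extra diagonal across the corner that `"8-elim"` removes.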

13 Data Format Conversion Implementation in ArcGIS 9
From Raster, To Raster
  ArcToolbox>Conversion Tools>To Raster>Raster To Other (multiple)
  Converts one or more raster dataset formats supported by ArcGIS to a GRID, IMAGINE, TIFF, or geodatabase raster dataset format
  Can also be accomplished through the ArcCatalog Export function
From Raster, To Vector
  ArcToolbox>Conversion Tools>From Raster>Raster to Point, Raster to Polygon, Raster to Polyline
  Converts raster datasets in GRID, IMAGINE, or TIFF formats to shapefiles or feature classes
  Results may not be what you expect!
From Vector, To Raster
  ArcToolbox>Conversion Tools>To Raster>Feature to Raster
  Converts any shapefile, coverage, or geodatabase feature class containing point, line, or polygon features to a raster dataset
  Can also be accomplished through the ArcCatalog Export function
From Vector, To Vector
  Use the ArcCatalog Export function for conversions between shapefiles, gdb feature classes, coverages and CAD
  ArcGIS Data Interoperability Extension for the most comprehensive set of conversions

14 Data Reduction
Why?
  conserve space
    disk in the past
    comm. bandwidth today
  conserve time
    reduce processing time (batch)
    speed response time (interactive)
Resampling (raster data)
  ‘average’ the 4 values in a 2-by-2 neighborhood
  use this 1 value in a single cell occupying the location of the 4 original cells
  use mean for interval data; rules required for ordinal or nominal data
  not transitive!
  Figure: a 2-by-2 block of cells holding 3, 2, 4 and 7 is replaced by a single cell holding 4 (byte-count labels on the figure: 16 bytes, 4 bytes, 1 byte)
Thinning (vector data)
  often applied to data digitized in stream mode
  tolerance elimination: remove nearest-neighbor points which are ‘too close’ (e.g. output device resolution insufficient to distinguish)
  topological elimination*: remove points unnecessary for topo structure
  model-based elimination: fit polynomial by least squares and record fewer points along its path
  Implement in ArcGIS via Advanced Editing toolbar, Generalize tool
*Normally uses the Douglas/Poiker (or Peucker) algorithm: David H. Douglas & Thomas K. Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature,” Canadian Cartographer, 1973
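Both reduction techniques can be sketched briefly. The first function averages 2-by-2 blocks of interval data and assumes a grid with even dimensions (a trailing odd row or column is ignored). The second is a plain recursive version of the Douglas/Peucker line-thinning algorithm cited above. The function names and the sample line are illustrative only, and this is not the code behind the ArcGIS Generalize tool.

```python
import math


def resample_2x2_mean(grid):
    """Replace each 2-by-2 block of cells with one cell holding their mean."""
    out = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[r]) - 1, 2):
            block = (grid[r][c], grid[r][c + 1],
                     grid[r + 1][c], grid[r + 1][c + 1])
            row.append(sum(block) / 4)
        out.append(row)
    return out


def _perp_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length


def douglas_peucker(points, tolerance):
    """Drop points lying within `tolerance` of the chord between kept points."""
    if len(points) < 3:
        return list(points)
    dists = [_perp_distance(p, points[0], points[-1]) for p in points[1:-1]]
    i_max = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[i_max - 1] <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest point and thin each half separately.
    left = douglas_peucker(points[:i_max + 1], tolerance)
    right = douglas_peucker(points[i_max:], tolerance)
    return left[:-1] + right


coarse = resample_2x2_mean([[3, 2], [4, 7]])
line = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
thinned = douglas_peucker(line, 0.1)
```

The resampled grid keeps only the mean, so the four original values cannot be recovered, which is the "not transitive" point on the slide. In `thinned`, the near-collinear point `(1, 0.05)` is dropped while the sharp corner at `(3, 3)` survives.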

15 Topology & Errors
Topology: knowledge about relative spatial positioning
  --spatial relationships between features and rules about these relationships
  --managing data cognizant of shared geometry
Implies knowledge of the three Cs:
  connectivity (linked)
  congruency (coincident/same as/on top of)
  contiguity (adjacent)
It is critical that spatial data be created and managed so that it is topologically clean--free from topological errors
  --editing must always aim to maintain topological structure
In topological editing, changes made to one feature (line, polygon, etc.) are also reflected in all other features to which it is connected, coincident, or adjacent
In the classic GIS data structure model (as discussed in GIS Data Structures lecture) this implies that, for example
  --all arcs have nodes at end points
  --there is a node wherever arcs intersect or connect
  --a single arc forms the border between contiguous polygons (e.g. Dallas and Tarrant counties)
  --a single arc represents a common boundary (e.g. state and county boundary)

16 Errors: detection and removal
GIS packages commonly use topological structure checking to detect errors
Editing based on node snapping is used to correct errors
  snapping: moving a feature so its coordinates correspond exactly with another’s
  snapping is conducted based on tolerances -- snap if within 1 foot, for example
Care must always be taken to ensure that topological “cleaning” does not itself introduce errors (e.g. snapping together nodes and lines which shouldn’t be)
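A minimal sketch of tolerance-based node snapping follows. It is a simple first-come approach in which each node snaps to the earliest node found within the tolerance; the function name is made up, and real packages use more careful clustering.

```python
import math


def snap_nodes(nodes, tolerance):
    """Move each node onto the first earlier node lying within `tolerance`."""
    anchors = []   # nodes that kept their own coordinates
    snapped = []
    for x, y in nodes:
        for ax, ay in anchors:
            if math.hypot(x - ax, y - ay) <= tolerance:
                snapped.append((ax, ay))  # coordinates now match exactly
                break
        else:
            anchors.append((x, y))
            snapped.append((x, y))
    return snapped


nodes = [(0.0, 0.0), (0.4, 0.3), (10.0, 10.0)]
careful = snap_nodes(nodes, 1.0)    # closes the small gap only
careless = snap_nodes(nodes, 20.0)  # also collapses a node that was genuinely separate
```

The second call shows how an overly generous tolerance makes the "cleaning" itself introduce an error: the node at (10, 10) is pulled onto (0, 0).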

17 Topological errors or real world occurrences? common problems
dangling arc (node missing at one end)
no node at arc intersection (overpass?)
overshoot (or missing node)?
undershoot?
pseudo node (but perhaps road surface changes)
pseudo arc (connects to itself)
open polygon
sliver polygon
gap
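Some of these conditions can be flagged by counting how many arc ends meet at each node. The sketch below assumes arcs are given as (start node, end node) pairs of coordinates and uses hypothetical labels. As the slide title suggests, a flagged node is only a candidate error: a dead-end street is a legitimate dangle.

```python
from collections import Counter


def classify_nodes(arcs):
    """Label each node by the number of arc ends that meet there."""
    degree = Counter()
    for start, end in arcs:
        degree[start] += 1
        degree[end] += 1
    labels = {}
    for node, count in degree.items():
        if count == 1:
            labels[node] = "dangle"   # candidate overshoot, undershoot, or dead end
        elif count == 2:
            labels[node] = "pseudo"   # node splits what could be one arc
        else:
            labels[node] = "ok"       # true junction of three or more arc ends
    return labels


arcs = [
    ((0, 0), (1, 0)),
    ((1, 0), (2, 0)),
    ((2, 0), (2, 1)),
    ((2, 0), (3, 0)),
    ((2, 0), (2, -1)),
]
labels = classify_nodes(arcs)
```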

18 How ArcGIS Handles Topology
The original Coverage data model, introduced with ArcInfo in 1981, incorporated topology as a part of the data
  The CLEAN command checked for, and automatically “fixed”, topological errors based on a set tolerance
    It could introduce errors into the data
  The BUILD command then rebuilt polygon structures
ArcGIS 8.3 introduced the concept of topological rules for geodatabases, in which the topological relationships are stored as a topology feature class separate from the data itself
  The user can generate an error report, review each error, and then fix it in the data if desired, or mark it as an “exception”

19 Georeferencing: Rectification and Registration providing true earth location/overlaying layers
rectification: rearrangement of location of objects to correspond to a specific reference system (usually geodetic)
registration: rearrangement of location of objects of one set so they correspond with those of another, without reference to a specific reference system
Despite the formal difference, the terms are often used interchangeably
Two methods
  homogeneous transformation via rotation, translation, scaling, skewing
    used for map projection and similar conversions
  differential transformation via rubber sheeting
    used to correctly position distorted images or scanned maps or documents
Most commonly used to relate images (e.g. scanned photo) to a vector layer, but can also be used to “fix” incorrect positioning of features in a vector layer
Implemented in ArcMap:
  via the Georeferencing toolbar for images
  via the Spatial Adjustment toolbar for vector layers

20 Transformation: (homogeneous conversion)
translation of origin
  from digitizer origin for sheet to ‘true’ origin of GIS file
rotation of axes
  e.g. to true north
scaling of axes
  homogeneous
  differential (ovals to circles)
skewing of axes
Changing map projections may involve all 4
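The four operations can be combined into a single six-parameter (affine) transformation. The sketch below applies them in one fixed order (scale, then skew, then rotate, then translate); that order and the parameter names are choices made for this example.

```python
import math


def make_transform(tx=0.0, ty=0.0, angle_deg=0.0, sx=1.0, sy=1.0, skew_x=0.0):
    """Return (a, b, c, d, e, f) where x' = a*x + b*y + c and y' = d*x + e*y + f."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Scale followed by a skew along x: [[sx, skew_x * sy], [0, sy]]
    m00, m01 = sx, skew_x * sy
    m10, m11 = 0.0, sy
    # Counter-clockwise rotation applied to the scaled and skewed axes.
    a = cos_t * m00 - sin_t * m10
    b = cos_t * m01 - sin_t * m11
    d = sin_t * m00 + cos_t * m10
    e = sin_t * m01 + cos_t * m11
    # Translation moves the origin last.
    return (a, b, tx, d, e, ty)


def apply_transform(t, x, y):
    """Apply the six-parameter transformation to one coordinate pair."""
    a, b, c, d, e, f = t
    return (a * x + b * y + c, d * x + e * y + f)


# Digitizer coordinates -> file coordinates: double the scale, rotate 90 degrees,
# then shift the origin to (1000, 2000).
to_file = make_transform(tx=1000.0, ty=2000.0, angle_deg=90.0, sx=2.0, sy=2.0)
moved = apply_transform(to_file, 1.0, 0.0)   # approximately (1000, 2002)
```

Setting `sx` and `sy` to different values gives differential scaling, which is what turns a circle into an oval (or an oval back into a circle).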

21 Rubber Sheeting (differential conversion)
GIS file is differentially ‘stretched’ so that tic points in file overlay corresponding ground control (tie) points on earth’s surface (or tic points in a second file)
  can’t do this with a paper map!
polynomial fitted by least squares between known ground control coords and tic point coords in GIS
  “Least squares” minimizes the sum of the squared distances between tic/tie pairs
derived parameters then applied to all coordinates in file
after conversion, tic points are on average closer to ground control points, but not identical
Control points:
  --the more the better
  --well distributed
  --known lat/long of ground control tie points (usually obtained from GPS) needed for rectification
  --common identifiable points in each file needed for registration
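The least-squares step can be sketched for the simplest case, a first-order polynomial (x' = a0 + a1·x + a2·y, and likewise for y'). Higher-order or piecewise fits used for true rubber sheeting follow the same idea with more terms. The function names are illustrative, and the small solver is written out so the example needs no external libraries.

```python
import math


def _solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [v] for row, v in zip(matrix, rhs)]
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(3):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [x - factor * y for x, y in zip(a[r], a[i])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_first_order(tics, controls):
    """Least-squares fit of x' and y' as first-order polynomials in (x, y)."""
    rows = [(1.0, x, y) for x, y in tics]
    # Normal equations: (A^T A) p = A^T b, solved once for x' and once for y'.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (0, 1):
        atb = [sum(r[i] * c[k] for r, c in zip(rows, controls)) for i in range(3)]
        coeffs.append(_solve3(ata, atb))
    return coeffs


def transform(coeffs, x, y):
    """Apply the fitted parameters to any coordinate in the file."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return (a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y)


def rms_error(coeffs, tics, controls):
    """Root-mean-square distance between transformed tics and control points."""
    total = 0.0
    for (x, y), (cx, cy) in zip(tics, controls):
        px, py = transform(coeffs, x, y)
        total += (px - cx) ** 2 + (py - cy) ** 2
    return math.sqrt(total / len(tics))


tics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
exact = [(100.0, 200.0), (120.0, 200.0), (100.0, 220.0), (120.0, 220.0)]
noisy = [(100.0, 200.0), (120.0, 200.0), (100.0, 220.0), (121.0, 219.0)]
fit_exact = fit_first_order(tics, exact)
fit_noisy = fit_first_order(tics, noisy)
```

With the `noisy` control points the residual is small but not zero, which matches the slide: after conversion the tic points are closer on average to the control points, but not identical. Collinear tic points make the system singular (the solver fails with a division by zero), one reason the control points should be well distributed.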

22 Edge Matching: Joining map sheets to create a seamless GIS
Process
  required for topo. consistency even if features line up visually
  snapping used to connect features
Issues
  acceptable tolerance before ‘further investigation’ of mismatch
  ‘how far back’ to go on sheet(s) with adjustments for mismatch
Causes of mismatch
  paper map shrinkage/expansion
  errors from digitizing/scanning
  georeferencing errors
  accuracy of equipment
  extrapolation or round-off errors
  overlapping map coverage
Implement in ArcGIS 9 by:
  ArcToolbox>Data Management>General>Append (replaces Geoprocessing Tools>Merge in AG 8)
    combines two (or more) files, but does not link features
  Spatial Adjustment toolbar, edge match tool
    links features (after links have been manually identified)
Figure: corresponding features fail to match on two sheets. Edge matching in this example would likely require ‘further research’

23 Image Adjustments raster/image data issues
Raster data is made from separate images (photos) or tiles which are mosaicked to produce a “seamless image”
Collars: must be removed for seamless image
  Overlap between adjacent images
  Borders of scanned maps
Image Balancing and Feathering: adjusting radiometry for consistent and/or desired image color, brightness, contrast
  Checkerboard appearance
  Abrupt line between adjacent images
  Brightness levels wash out detail in highly reflective areas, but enhance detail in low reflectance areas
  Inconsistent signature for same features, especially water, as a function of wind or sun relative to camera (and is it blue?)
Digital Ortho adjustments:
  Ground control (usually with GPS for visible points) to obtain ‘real world’ location
  Ground control for camera’s angle relative to ground
  Camera calibration data to remove lens distortion
  Digital terrain model (DTM) to remove elevation “distance” (5 mi. on map to mountain top, but 6 mi. walking or on photo if mountain is 5,280 feet high!)

24 Collar removal required.

25 Image Balancing/ feathering required

26

27 Tiles: before and after (2005 NCTCOG Digital Orthos)

28 Interpolation
Interpolation: to create regular spacings from irregular data (e.g. creating raster elevation surface from set of point height measurements)
estimating values for locations with no data based on:
  known values, and
  understanding of spatial behavior of phenomena
  generally, should assign more importance to closer known values than those further away
Methods for estimating values
  weighting functions
    average closest n (2?) points: ignores distance
    fit line between closest 2
    fit surface between closest 3
  trend surface approaches
    one high-order polynomial
    oscillation a problem
  finite element approach: fit separate polynomials for each local area
  kriging: uses correlations of values with distance
Implemented in ArcGIS 9 via ArcToolbox>Spatial Analyst Tools>Interpolation
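One common weighting function that gives closer known values more importance is inverse distance weighting (IDW). The slide does not name IDW specifically, so treat this as one illustrative choice; the function and parameter names are made up for the example.

```python
import math


def idw(x, y, samples, power=2.0, n_nearest=None):
    """Estimate a value at (x, y) from (sx, sy, value) samples.

    Each sample is weighted by 1 / distance**power, so nearer points dominate.
    """
    by_distance = []
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0:
            return value  # exactly on a measured point
        by_distance.append((d, value))
    by_distance.sort()
    if n_nearest is not None:
        by_distance = by_distance[:n_nearest]  # use only the closest n points
    weights = [1.0 / d ** power for d, _ in by_distance]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, by_distance))
    return weighted_sum / sum(weights)


# Irregular height measurements -> regular 3 x 3 raster with 5-unit spacing.
heights = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]
surface = [[idw(col * 5.0, row * 5.0, heights) for col in range(3)]
           for row in range(3)]
```

Setting `power=0` weights every point equally, which reproduces the "average closest n points, ignores distance" option on the slide.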

29 Conflation
create new master coverage from the best spatial and attribute qualities of two or more source coverages
  combine multiple coverages into one to simplify support
  updated data obtained (e.g. new TIGER file) but need to preserve enhancements made to earlier version
  two groups modify a single file, then need to recreate single version which preserves mods
create new master coverage from quality spatial data in one source and quality attribute data in another
  somewhat narrower definition
Depending on the situation, can require application of a variety of processing tools and can be labor intensive
Approaches available within ArcGIS 9 include
  Spatial Adjustment toolbar, specifically attribute transfer tool
  ArcToolbox>Analysis Tools>Overlay>Update
  other add-ins available, such as
    MapMerge from ESEA, Mountain View CA, for ArcGIS
    GIS/T-Conflate for transportation applications

30 NAVSTAR Global Positioning System (GPS)
NAVSTAR Satellite Program
  24 NAVSTAR (NAVigation Satellite Time and Ranging) satellites in 11,000-mile orbit provide 24-hour coverage worldwide
  first launched 1978; full system operational December 1993
  GPS receiver computes locations/elevations via signals from simultaneously visible satellites (minimum 3 for 2-D, 4 for 3-D)
  Selective Availability (SA) security system
    100m accuracy with single receiver, if active
    10-15m accuracy if inactive
    SA turned off May 1st, 2000
      Multiple ways to counteract SA
      Even USCG broadcast a correction signal!
      Europeans threatened to compete
      Regional denial of signal possible
  Russia’s 21-satellite GLONASS (Global Navigation Satellite System) also available
Types of Ground Collection and Correction
  Autonomous
    hand-held unit provides 10m accuracy (with SA off)
    $150-$1,500 per unit
  WAAS (wide area augmentation system)
    <3 meter accuracy in practice (spec. is 7m vert/horiz)
    Base stations (25 across US) monitor satellites
    2 master stations (E & W coast) calculate corrections and upload them to two geosynchronous satellites over the equator
    correction signal broadcast to GPS receivers (no special extra equipment needed, unlike DGPS)
    Began operation June, 1998
    To be expanded to cover Canada, Mexico, Panama
    European EGNOS, Asian MSAS under development
  Differential (DGPS, predecessor to WAAS)
    accuracy 1-5m depending on equipment/exact method
    equipment $1,500-$15,000 per receiver
    correct for SA and other errors via either
      real-time correction signals over FM radio
      post-processing with data from Internet
  Kinematic: high accuracy engineering (within cms)
    two receivers (base station and rover) must lock on to satellites
    equipment $15-30K per station
    use to collect ground control for imagery/orthos or for point/line data (manholes, roads, etc.)

31 Factors Affecting GPS Accuracy
Ionosphere
  worst in evening at low altitudes (but ephemeris best there)
Troposphere
  especially water vapor, which slows signal
Multipath
  reflected signals from buildings, cliffs, etc.
Ephemeris
  position and number of satellites in sky
  4 required for 3D (horiz. and vertical), 3 for 2D (no elevation)
  ideally, 3 spaced every 120° around the horizon at 20° elevation, plus 1 directly above
Blockage (of satellite signal)
  by foliage, buildings, cliffs, etc.
  WAAS signal especially subject to blocking by terrain & buildings because it comes from a geostationary equatorial satellite
Overall, accuracy better at night than during day.

32 Conclusion
Most of the effort in most GIS projects involves data preparation and integration!

