Lecture 23: Brief Introduction to Data Quality. By Austin Troy. Using GIS: Introduction to GIS.

Presentation transcript:

Lecture 23: Brief Introduction to Data Quality. By Austin Troy. Using GIS: Introduction to GIS

©2005 Austin Troy
Data Quality
Two key components of quality in data are accuracy and precision. Error is a result of both inaccuracy and imprecision in the data; it is a general term encompassing lack of reliability. GIS data quality is, in theory, a compromise between needs and costs. In practice it is usually about what is available.

Data Quality
The cost of data reflects its precision. Because lower-quality data tend to be cheaper and more available, a very common problem in GIS is the inappropriate use of data. A critical step in developing a GIS is deciding “what is accurate enough?” This is a function of needs, cost, accessibility and time. User needs determine accuracy and, in general, accuracy determines price.

Accuracy
What is accuracy? “The degree to which information on a map or in a digital database matches true or accepted values” (from Kenneth E. Foote and Donald J. Huebner). It is also a reflection of how closely a measurement represents the actual quantity measured. Accuracy is a reflection of the number and severity of errors in a dataset or map.

Precision
Quality is also a function of “precision.” Precision is the intensity or level of preciseness, or exactitude, in measurements. The more precise a measurement is, the smaller the unit which you intend to measure. Hence, a measurement down to a fraction of a cm is more precise than a measurement to a cm. However, data with a high level of precision can still be inaccurate; this is due to errors. Each application requires a different level of precision.
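The distinction between precision and accuracy can be sketched with two made-up readings of the same length. Every number and name below is hypothetical and chosen only for illustration: one reading is recorded to a thousandth of a cm but is far from the true value, the other is recorded only to the nearest cm but is on target.

```python
# Hypothetical values for illustration only; none come from the lecture.
TRUE_LENGTH_CM = 12.0


def error_cm(measured_cm: float, true_cm: float = TRUE_LENGTH_CM) -> float:
    """Absolute difference between a measurement and the true value."""
    return abs(measured_cm - true_cm)


# Recorded to a thousandth of a cm (high precision) but well off the true value.
precise_but_inaccurate = 12.873
# Recorded only to the nearest cm (low precision) but equal to the true value.
coarse_but_accurate = 12.0

print("precise reading error:", error_cm(precise_but_inaccurate))
print("coarse reading error: ", error_cm(coarse_but_accurate))
```

The number of recorded digits says nothing about how close the reading is to the truth, which is the point the slide makes.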

Precision
Each application requires a different level of precision. Engineering and surveying applications typically require high levels of precision; they may be measuring to a millimeter. On the other end of the spectrum, studies of weather patterns or crop cover require much less precision. Precise data are costly: for example, carefully surveyed point locations needed by utilities to record the locations of pumps, wires, pipes and transformers cost $5-20 per point to collect.

Positional Accuracy and Precision
One of the primary types of error in GIS is positional error, that is, errors in 2D (x, y) and in the third dimension (height). Positional accuracy and precision are functions of the scale at which the digital layer was created. If the layer was created by digitizing a paper map, the minimum usable scale of the digital layer is considered to be the scale of that map. Scale is a function of the map’s resolution.

Positional Accuracy
Positional accuracy standards specify that acceptable positional error varies with scale. Data can have a high level of precision but still be positionally inaccurate. Positional error is inversely related to precision and to amount of processing.

Measurement of Accuracy
Accuracy is often stated as a confidence interval: e.g. a value in cm given as +/- .01 means the true value lies within .01 cm on either side of the stated value. One of the key measurements of positional accuracy is root mean squared error (RMSE): the square root of the sum, across all observations i, of the squared difference between the observed and expected value for observation i, divided by the total number of observations n. That is, $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(o_i - e_i)^2}$, where $o_i$ is the observed value and $e_i$ the expected value. This is just a standardized measure of error: how close the predicted measure is to the observed one.
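A minimal sketch of the root mean squared error calculation described above. The function names and the choice to offer a separate 2D variant (using straight-line distance per point) are assumptions made for illustration, not something the lecture specifies.

```python
import math


def rmse(observed, expected):
    """Root mean squared error between paired observed and expected values."""
    if len(observed) != len(expected) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - e) ** 2 for o, e in zip(observed, expected)]
    return math.sqrt(sum(squared) / len(squared))


def positional_rmse(observed_xy, expected_xy):
    """RMSE of 2D positional error, using the squared straight-line distance per point."""
    if len(observed_xy) != len(expected_xy) or len(observed_xy) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [
        (ox - ex) ** 2 + (oy - ey) ** 2
        for (ox, oy), (ex, ey) in zip(observed_xy, expected_xy)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

For example, `positional_rmse([(3, 4)], [(0, 0)])` measures a single point displaced by 3 units in x and 4 units in y from its expected position.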

Positional Error
Different agencies have different standards for positional error. Example: USGS horizontal positional requirements state that 90% of all points tested must be within 1/30th of an inch for maps at scales larger than 1:20,000, and within 1/50th of an inch for maps at scales of 1:20,000 or smaller.

Positional Error
USGS accuracy standards on the ground are given, as a ± distance in feet, for the scales 1:4,800, 1:10,000, 1:12,000, 1:24,000, 1:63,360 and 1:100,000. See the image from the University of Colorado showing accuracy standards visually. Hence, a point on a map represents the center of a spatial probability distribution of its possible locations.
Thanks to Kenneth E. Foote and Donald J. Huebner, The Geographer's Craft Project, Department of Geography, The University of Colorado at Boulder, for links.
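The on-the-ground distances can be worked out from the map-distance rule (1/30 or 1/50 of an inch) by multiplying by the scale denominator and converting inches to feet. The sketch below does that arithmetic; the function name is hypothetical, and how a scale of exactly 1:20,000 is treated is an assumption (here it gets the 1/50 inch tolerance).

```python
def nmas_ground_tolerance_feet(scale_denominator: int) -> float:
    """Ground distance in feet that matches the map-distance tolerance at a scale.

    Assumption: 1/30 inch applies to scales larger than 1:20,000 (smaller
    denominators) and 1/50 inch to scales of 1:20,000 or smaller.
    """
    map_tolerance_inches = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    ground_inches = map_tolerance_inches * scale_denominator
    return ground_inches / 12  # 12 inches per foot


# The scales listed on the slide.
for denom in (4_800, 10_000, 12_000, 24_000, 63_360, 100_000):
    print(f"1:{denom:,}  +/- {nmas_ground_tolerance_feet(denom):.2f} feet")
```

These are computed values under the stated assumption, not figures quoted from the slide.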

Positional Error
A critical point to remember is that “zooming” in on a digital map does not increase the level of accuracy. The accuracy and precision are based on the scale of the digital layer’s original parent source. To see this, let’s look at river data derived from sources at three scales and three levels of precision: 1:2,000,000 (small scale), 1:100,000 (medium scale) and 1:24,000 (large scale).

[Figure: Positional Error, some examples]

Attribute Precision
Attribute accuracy and precision refer to the quality of non-spatial, attribute data. Precision for numeric data means lots of digits. Example: recording income down to cents, rather than just dollars. Precision for categorical data means lots of categories. Example: Anderson land use (LU) level 3 versus level 1.

Conceptual Accuracy
Misclassifications result from differences in judgment or in the automated classification tools. The accuracy of classifications will depend on the precision: the less precise your classifications, the less likely there will be errors. Classifying only as “land” and “water” is not very precise, and so is not likely to result in an error.

Other measures of data quality
Logical consistency, completeness, data currency/timeliness, and accessibility. These apply to both attribute and positional data.

Logical Consistency
Do data follow rules of logic? Attribute example: is something classified as both water and as commercially zoned land? Geospatial examples: do lines intersect when they should not (e.g. with power lines)? Do polygons fail to close on themselves?
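The three checks named above can be sketched in plain Python. All function names and data layouts here are hypothetical: a polygon ring is a list of (x, y) tuples, a record is a dict with a "classes" list, and line segments are pairs of points. A real GIS would use its own topology tools; this only illustrates the logic.

```python
def ring_is_closed(ring):
    """A polygon ring is closed when its first and last vertices coincide."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def has_conflicting_classes(record, exclusive_pairs):
    """True if a record carries two classes that cannot both apply."""
    classes = set(record["classes"])
    return any(a in classes and b in classes for a, b in exclusive_pairs)


def segments_cross(p1, p2, p3, p4):
    """True if segment p1-p2 properly crosses segment p3-p4.

    Touching endpoints and collinear overlaps are not counted as crossings.
    """
    def orient(a, b, c):
        # Sign of the cross product (b - a) x (c - a).
        return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

    d1 = orient(p3, p4, p1)
    d2 = orient(p3, p4, p2)
    d3 = orient(p1, p2, p3)
    d4 = orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0


# Hypothetical rule: nothing may be both water and commercially zoned land.
EXCLUSIVE = [("water", "commercial")]
print(has_conflicting_classes({"classes": ["water", "commercial"]}, EXCLUSIVE))
print(ring_is_closed([(0, 0), (1, 0), (1, 1), (0, 0)]))
print(segments_cross((0, 0), (2, 2), (0, 2), (2, 0)))
```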

Completeness
Is a data layer complete or lacking in coverage? Examples: does a layer on roads leave out some roads? If so, does it do so systematically or randomly? Does a database of buildings in a city leave out some buildings? An example where completeness is crucial: a database of houses used to notify neighbors when a noxious facility is proposed. Imagine if a bunch of people were left out.

Currency and Timeliness
Since some things change faster than others, the importance of timeliness in data depends on what is being displayed. By the time some data sets have been digitized, they are often out of date, e.g. tax parcels. Updates are key, but the frequency of updates should depend on what is being displayed. Temporal validity must be stated: this tells someone using a map how long the data are considered valid.
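A stated temporal validity can be checked mechanically. The sketch below assumes each layer records the date it was last updated and a validity period in days; the function name, layer names, dates, and validity periods are all hypothetical.

```python
from datetime import date, timedelta


def is_still_valid(last_updated: date, valid_for_days: int, today: date) -> bool:
    """True if `today` falls within the layer's stated validity window."""
    return today <= last_updated + timedelta(days=valid_for_days)


# Hypothetical layers: fast-changing themes get shorter validity periods.
layers = {
    "tax_parcels": (date(2005, 1, 1), 180),
    "bedrock_geology": (date(1995, 1, 1), 36_500),
}

check_date = date(2005, 9, 1)
for name, (updated, valid_days) in layers.items():
    status = "valid" if is_still_valid(updated, valid_days, check_date) else "out of date"
    print(f"{name}: {status}")
```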

[Figure: Currency and Timeliness]

Currency and Timeliness
Streets are another data set where currency is important; in the figure, blue represents all the additional streets built between 1990 and 2000.

Conflation
Conflation is used when one layer is better in one way, another is better in another way, and you wish to get the best of both. It is a way of reconciling the best geometric and attribute features from two layers into a new one. It is very commonly used where one layer has better attribute accuracy or completeness and another has better geometric accuracy or resolution. It is also used where a newer layer is produced for some theme but it has lower resolution than an older one.

Two general types of Conflation
Attribute conflation: transferring attributes from an attribute-rich layer to features in an attribute-poor layer.
Feature conflation: improvement of features in one layer based on coordinates and shapes in another, often called rubber sheeting. The user either transforms all features or specifies certain features to be kept fixed.

Attribute conflation
The more spatially accurate layer is referred to as the base, coordinate or target layer. The layer with more accurate attribution is referred to as the reference, or non-base, layer. TIGER line files have good attribution but poor accuracy; USGS DLGs are the opposite. Attribute conflation is frequently used by third-party vendors to assign the rich attribute data of TIGER to the positionally accurate DLGs. Nodes are matched by iteratively rubber sheeting the reference layer to the base layer until matching nodes fall within a certain tolerance. Then line features are matched up.
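The final matching step can be sketched as a nearest-neighbor search within a tolerance. This is a deliberate simplification and not the vendors' method: it skips the iterative rubber sheeting entirely and only shows how attributes from reference-layer nodes might be copied onto base-layer nodes once they fall within tolerance. The function name and data layout are hypothetical.

```python
import math


def transfer_attributes(base_points, reference_points, tolerance):
    """Copy attributes from the nearest reference point within `tolerance`.

    base_points: list of (x, y) from the spatially accurate layer.
    reference_points: list of ((x, y), attrs) from the attribute-rich layer.
    Returns a list of ((x, y), attrs_or_None) that keeps the base coordinates.
    """
    conflated = []
    for bx, by in base_points:
        best_attrs, best_dist = None, tolerance
        for (rx, ry), attrs in reference_points:
            dist = math.hypot(bx - rx, by - ry)
            if dist <= best_dist:
                best_attrs, best_dist = attrs, dist
        conflated.append(((bx, by), best_attrs))
    return conflated


# Hypothetical nodes: the reference node is slightly displaced from the base node.
base = [(0.0, 0.0), (10.0, 10.0)]
reference = [((0.5, 0.0), {"name": "Main St"}), ((50.0, 50.0), {"name": "Far Rd"})]
for point, attrs in transfer_attributes(base, reference, tolerance=1.0):
    print(point, attrs)
```

Base nodes with no reference node inside the tolerance are left without attributes rather than matched to a distant feature.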

[Figure: Conflation examples. Source: Stanley Dalal, GIS cafe]