GIS Data Quality.


Lecture Outline
- Accuracy
- Precision
- Error Sources
- Using Multiple Data Sets Together: Completeness, Compatibility, Consistency, Applicability
- Error Propagation
- Recognizing and Avoiding Error
- Metadata

Here's the outline of what we'll be talking about today. We'll start by discussing error in individual data sets, and then expand that to multiple data sets. I'll briefly give you some suggestions on how to recognize and attempt to avoid error. Finally, I'll talk about documenting your data with metadata.

So, to start, what is error? Anybody? Is it good or bad? Do you think it's possible to be 100% error-free in a GIS data set? Why or why not?

Accuracy and Precision
- Accuracy: the degree to which information on a map or in a digital database matches true or accepted values
- Precision: the level of measurement and exactness of description in a GIS database
- Error: inaccuracy and imprecision of data
- Accuracy and precision can be applied to both spatial and non-spatial data
  - Spatial: often refers to the scale of the map from which the data is derived; can also refer to GPS data
  - Non-spatial: describes the level of detail in the attribute data
- "Garbage in, garbage out"

We describe error using two measures of quality: accuracy and precision. Error is the amount of inaccuracy and imprecision. There is also data quality, which indicates how good the data is; in other words, it is a combination of accuracy and precision. Error can apply to both spatial and non-spatial data.

Accuracy and Precision
[Figure panels: Accurate, Imprecise, Inaccurate, Precise]

Here are some graphical examples of what is meant by accuracy and precision in GIS. These are spatial examples.
- Non-spatial example of accuracy: population counts that are close to the true number of people
- Non-spatial example of precision: the more information, the more precise (i.e., a better description); precision can also be the number of decimal places

A good rule of thumb for precision when creating derivatives of data: add one decimal place. Example: if your population data is rounded to a whole number, then any data created from it (like population density) should only be taken out to one decimal place.
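To make the rule of thumb concrete, here is a minimal Python sketch. The function name and the sample numbers are made up for illustration; the only point is that a value derived from whole-number counts is reported to one decimal place and no further.

```python
def population_density(population: int, area_km2: float) -> float:
    """Density derived from a whole-number count, kept to one decimal place."""
    # The source data has zero decimal places, so the derived value gets one.
    return round(population / area_km2, 1)


# Hypothetical values: 12,345 people in 87.6 square kilometers.
density = population_density(12345, 87.6)
print(density)  # 140.9
```

Reporting 140.92466 here would suggest a level of precision that the original whole-number count cannot support.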

Sources of Inaccuracy and Imprecision Obvious Sources of Error Age of Data Areal Coverage Map Scale Density of Observations Relevance (use of “surrogate” data) Data Format Accessibility Cost Error from Natural Variation or Original Measurements Positional Accuracy Accuracy of Content Variation in the Data Processing Errors Numerical Errors Topological Errors Classification and Generalization Digitizing and Geocoding Age - may be too old to be relevant Areal Coverage - may not cover full area completely; lack at borders; cloud cover on imagery Scale – think about points between contour lines. Larger scale maps would have more detail (more contour lines, smaller contour intervals) and allow for better estimates of the points Density of Obs - are there enough obs. to justify your level of detail? Relevance - if you’re using data to indirectly measure something, is it appropriate? Format - has the data been transformed from one format to another many times? Think about raster compression formats like MrSid, these save a lot of space, but also lose detail. Accessibility - can you get the data that you need? If not, then less accurate data may have to be used. Cost - can you afford data of high accuracy? If not, then less accurate data may have to be used Positional - fuzzy boundaries: soil, vegetation, biomes Content - correct attribution? Caused by sloppiness or bad calibration of equipment Data Variation - is the nature of the data constant throughout time or does it vary? Numerical - errors in computation by the computer Topological - overshoots, slivers, dangles from Overlay Analysis Class & Gen - any analysis done on classes is affected by classification scheme; can incur error Dig & Geocode - errors in line digitizing

Scale Effects on Position

Map scale      Horizontal accuracy
1:12,500       9.5 m
1:25,000       12.7 m
1:50,000       25.4 m
1:100,000      50.8 m
1:250,000      126.9 m
1:1,000,000    507.9 m

From: US National Map Accuracy Standards
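The ground distances above come from multiplying an allowed error on the paper map by the scale denominator. The sketch below is an illustration, not an official tool: it assumes the National Map Accuracy Standards tolerances of 1/30 inch for scales larger than 1:20,000 and 1/50 inch for 1:20,000 or smaller, and the function name is hypothetical. It reproduces the 1:25,000 through 1:100,000 rows.

```python
INCH_TO_M = 0.0254  # exact conversion from inches to meters


def nmas_horizontal_tolerance_m(scale_denominator: int) -> float:
    """Ground distance (meters) matching the NMAS map tolerance at a given scale."""
    # Larger scale means a smaller denominator (1:12,000 is larger than 1:24,000).
    map_tolerance_in = 1 / 30 if scale_denominator < 20_000 else 1 / 50
    return map_tolerance_in * scale_denominator * INCH_TO_M


for denom in (25_000, 50_000, 100_000):
    print(f"1:{denom:,} -> {nmas_horizontal_tolerance_m(denom):.1f} m")
# 1:25,000 -> 12.7 m
# 1:50,000 -> 25.4 m
# 1:100,000 -> 50.8 m
```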

Error Sources Associated With Digitizing

Spatial Data Error
- Location errors
  - A schoolhouse is located 30 feet away from its marked location on a map
  - A 300 meter contour line is offset 5 meters to the northwest
  - A satellite image pixel is located 2.4 meters away from its actual location on the ground
- Attribute errors
  - A schoolhouse is incorrectly labeled as a church
  - A 300 meter contour line is actually supposed to be a 310 meter contour line
  - A 300 meter contour line actually represents an elevation of 302 meters
  - A classified satellite image pixel is labeled forest when it is actually a field

Spatial Data Error
- One data point: error/accuracy can be easily defined.
- Data sets/maps: error/accuracy must be summarized.
- How is accuracy determined and summarized?
  - Very accurate data must be collected (sampled) for a subset of the full data set/map.
  - This accurate sample is then compared with the same measurements taken from the original data.
  - A summary of the differences between the two is created.
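For attribute data, the simplest such summary is the share of sampled locations where the original data agrees with the accurate reference sample. This is only a sketch under that assumption; the function name and the land-cover labels are hypothetical.

```python
def attribute_agreement(original, reference):
    """Fraction of sampled locations where the original label matches the reference."""
    if len(original) != len(reference):
        raise ValueError("original and reference must cover the same sample points")
    matches = sum(o == r for o, r in zip(original, reference))
    return matches / len(reference)


# Hypothetical labels at five sample points.
original = ["forest", "forest", "field", "water", "forest"]   # from the map being tested
reference = ["forest", "field", "field", "water", "forest"]   # from accurate field checks
print(attribute_agreement(original, reference))  # 0.8
```

Here the second point is labeled forest when it is actually a field, so four of the five labels agree.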

Spatial Data Error
- Locational data accuracy can be summarized with Root Mean Square Error (RMSE): the square root of the mean squared distance between where points/pixels are represented and their actual locations on the ground.
- Locational accuracy can also be summarized in other ways. For example, for horizontal data the USGS uses the US National Map Accuracy Standards: 90% of all measurable points must be within 1/50 of an inch (measured on the map) for maps at scales of 1:20,000 or smaller, and within 1/30 of an inch for maps at scales larger than 1:20,000.
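As a worked illustration of RMSE, the sketch below compares mapped coordinates with accurately surveyed reference coordinates for the same points. The function name and the coordinates are made up, and units are assumed to be meters in a projected coordinate system.

```python
import math


def horizontal_rmse(mapped, reference):
    """Root mean square of the straight-line distances between paired points."""
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(mapped, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points: positions on the map vs. surveyed positions.
mapped = [(103.0, 204.0), (300.0, 400.0), (506.0, 592.0)]
reference = [(100.0, 200.0), (300.0, 400.0), (500.0, 600.0)]
print(round(horizontal_rmse(mapped, reference), 2))  # 6.45
```

The three points are off by 5 m, 0 m and 10 m. Squaring before averaging means the 10 m miss counts for more than it would in a plain average of the distances (5.0 m).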

Different scales can lead to different boundaries, even though the boundaries may be fuzzy and inexact

Error
- Error is unbiased when the error is in 'random' directions
  - GPS data
  - Human error in surveying points
- Error is biased when there is systematic variation in accuracy within a geographic data set
  - Example: a GIS tech mistypes coordinate values when entering control points to register a map to a digitizing tablet, so all coordinate data from this map is systematically offset (biased)
  - Example: the wrong datum is being used
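One rough way to tell the two cases apart is to average the error vectors: random errors tend to cancel out, while a systematic offset does not. The sketch below assumes paired mapped and reference coordinates; the function name and the numbers are hypothetical.

```python
def mean_offset(mapped, reference):
    """Average (dx, dy) between mapped points and their reference positions."""
    n = len(mapped)
    dx = sum(m[0] - r[0] for m, r in zip(mapped, reference)) / n
    dy = sum(m[1] - r[1] for m, r in zip(mapped, reference)) / n
    return dx, dy


reference = [(0, 0), (10, 0), (0, 10), (10, 10)]

# Errors in 'random' directions: each point is off by 1 unit, in different directions.
scattered = [(1, 0), (9, 0), (0, 11), (10, 9)]
print(mean_offset(scattered, reference))  # (0.0, 0.0)

# Systematic error: every point is shifted by the same amount (e.g. a bad control point).
shifted = [(3, -2), (13, -2), (3, 8), (13, 8)]
print(mean_offset(shifted, reference))  # (3.0, -2.0)
```

A mean offset well away from zero suggests bias. A mean offset near zero does not mean the data is accurate, only that the errors have no preferred direction.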

Error when Using Multiple Data Sets
- Error Propagation: one error leads to another
  - Using a mis-registered point to register another layer
  - Additive effect
  - E.g., what happens if a layer digitized with a spatial bias problem is used as the spatial reference to create another, new layer?
- Error Cascading: erroneous, imprecise and inaccurate information will skew a GIS solution when information is combined selectively into new layers
  - Errors propagate from layer to layer repeatedly
  - Effect can be additive or multiplicative

When using multiple data sets, any error in one data set may influence another data set, and it may be compounded through your analyses. When error in one data set leads to error in a second data set, we call that error propagation. This often has an additive effect. When the error propagates from data set to data set repeatedly and ultimately skews the results, this is error cascading. The effect can be additive or multiplicative, but it is often very difficult to tell which.
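To get a feel for the additive case, here is a small sketch that combines the positional errors of a chain of layers, each registered to the one before it. The worst-case sum follows the additive effect described above. The root-sum-square figure is not from the lecture: it is a common assumption for independent random errors, shown only for comparison. The function names and the error values are hypothetical.

```python
import math


def worst_case_error(errors_m):
    """Additive bound: every layer's error points the same way."""
    return sum(errors_m)


def root_sum_square_error(errors_m):
    """Typical combined error if the layers' errors are independent and random (assumption)."""
    return math.sqrt(sum(e ** 2 for e in errors_m))


# Hypothetical positional errors (meters) for three layers registered in sequence.
layer_errors_m = [5.0, 3.0, 4.0]
print(worst_case_error(layer_errors_m))                 # 12.0
print(round(root_sum_square_error(layer_errors_m), 2))  # 7.07
```

Either way the combined error is larger than the error of any single layer, which is why a biased base layer is so damaging.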

Propagation & Cascading

Using Multiple Data Sets
Four Data Quality Considerations:
- Completeness
  - A complete data set will cover the study area and time period in its entirety
  - No data set is 100% complete
- Compatibility
  - Data sets must be compatible with one another
  - Scale, data capture methods, etc.
- Consistency
  - There must be consistency between and within data sets
  - Data development, data capture methods
- Applicability
  - Data must be appropriate for your intended use

When using multiple data sets together, there are four quality considerations that you must keep in mind.
- Completeness: the data covers the area and time period completely. It is never 100% complete: sample data is incomplete by its nature, and time-series data is never complete. You must determine an acceptable level of completeness for all data sets.
- Compatibility: use data sets together in a sensible manner (similar scales, similar data capture methods, etc.).
- Consistency: there must be consistency between and within data sets (data development, data capture, etc.). If different parts of a data set were done by more than one person, were the same standards kept (e.g., for contours)? This often goes hand-in-hand with compatibility.
- Applicability: is your data appropriate for your uses? Don't use a DEM to help you determine the spread of HIV!

Documenting Your Data – Metadata
- Metadata: data about data
- Used to document all aspects of a data set
- Allows the user to determine the usefulness of a data set
- Organizations want to maintain their investment
- To share information about available data
  - Data catalogs & clearinghouses
- To aid data transfer & appropriate use
- Metadata standards are set by the Federal Geographic Data Committee (FGDC): http://www.fgdc.gov
- All data distributed on the web and by sanctioned data distributors should have FGDC-compliant metadata

Metadata is data about data. This documentation allows the user to understand all aspects of the data set, from its coordinate system and projection, to its attributes, to its processing history, and so on. It also allows the user to determine if the data is useful for his/her purposes. The official metadata standards were originally set down in 1994 by the FGDC. Here are a couple of web sites that you can look at if you're interested. There are strict regulations on the construction of metadata that must be followed for it to be FGDC compliant. Compliance sets a common standard for everyone to use, and makes the exchange of data much easier.
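To give a sense of what such a record looks like, the sketch below builds a tiny XML stub with Python's standard library. The element names (metadata, idinfo, citation, citeinfo, origin, title, descript, abstract, dataqual, lineage, procstep, procdesc) are short tags from the FGDC content standard, but this stub is far from a complete or compliant record, and the function name and all values are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_metadata_stub(title, originator, abstract, process_step):
    """Build a minimal, NOT FGDC-compliant, metadata skeleton for illustration."""
    root = ET.Element("metadata")

    # Identification: who made the data set and what it is.
    idinfo = ET.SubElement(root, "idinfo")
    citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
    ET.SubElement(citeinfo, "origin").text = originator
    ET.SubElement(citeinfo, "title").text = title
    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract

    # Data quality: processing history (lineage) of the data set.
    dataqual = ET.SubElement(root, "dataqual")
    procstep = ET.SubElement(ET.SubElement(dataqual, "lineage"), "procstep")
    ET.SubElement(procstep, "procdesc").text = process_step
    return root


# Hypothetical data set description.
stub = build_metadata_stub(
    title="County Roads",
    originator="Example County GIS Office",
    abstract="Road centerlines digitized from 1:24,000 quadrangles.",
    process_step="Digitized on a tablet and registered to four control points.",
)
print(ET.tostring(stub, encoding="unicode"))
```

Even a stub like this records the scale of the source map and how the data was captured, which is what a later user needs to judge accuracy and applicability.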

Metadata in ArcGIS
- Visible in ArcCatalog
- Contained in the .xml part of a shapefile
- Maintain investment by cataloguing & noting appropriate use of data

Reminders
- Case study #7 will be on Friday (Oct. 5th)
- Mid-term study guide will be posted online on Friday (Oct. 5th)
- Mid-term review will be on Monday (Oct. 8th); come with questions
- Mid-term exam will be next Wednesday (Oct. 10th)
- Lab 3 is due next Friday (Oct. 12th); this was written incorrectly in the Lab 3 document