GG3019/GG4027/GG5019 An Introduction to

Presentation transcript:

GG3019/GG4027/GG5019: An Introduction to Geographical Information Technology and GIS Systems and Geospatial Data Analysis
David R. Green, G12 – 2324, d.r.green@abdn.ac.uk, www.abdn.ac.uk/geospatial

Error and Uncertainty in GIS

* Error is present at all stages in GIS, e.g. data capture and data analysis
* Error is one form of uncertainty
* Missing data, incompleteness, mistakes, and quality

* Real world
* Conception of spatial phenomena
* Measurement and representation of spatial phenomena
* Analysis of spatial phenomena

CONCEPTION
* Spatial uncertainty
* Vagueness
* Ambiguity
* Scale of geographic individuals (zones/units)

MEASUREMENT AND REPRESENTATION
* Accuracy and error
* Measurement error
* Data integration and shared lineage

ANALYSIS
* Spatial analysis and uncertainty
* Aggregation and analysis (ecological fallacy)
* Scale and aggregation: the Modifiable Areal Unit Problem (different zonings give different results)
* Visualisation helps to study this problem

The Ecological Fallacy is a situation that can occur when a researcher or analyst makes an inference about an individual based on aggregate data for a group. For example, a researcher might examine the aggregate data on income for a neighbourhood of a city, and discover that the average household income for the residents of that area is $30,000. To state that the average income for residents of that area is $30,000 is true and accurate. No problem there. The ecological fallacy can occur when the researcher then states, based on this data, that people living in the area earn about $30,000. This may not be true at all, and may be an ecological fallacy.

Close examination of the neighbourhood might discover that it is actually composed of two housing estates, one of a lower socio-economic group of residents and one of a higher socio-economic group. Residents of the poorer estate earn on average $10,000, while the more affluent residents average $50,000. When the researcher states that individuals who live in the area earn $30,000 (the mean), this does not account for the fact that the average in this example is constructed from two disparate groups, and it is likely that not one person earns $30,000.

Assumptions made about individuals based on aggregate data are vulnerable to the ecological fallacy. This does not mean that identifying associations between aggregate figures is necessarily defective, nor that any inferences drawn about associations between the characteristics of an aggregate population and the characteristics of sub-units within the population are absolutely wrong. What it does say is that the process of aggregating or disaggregating data may conceal variations that are not visible at the larger aggregate level, and researchers, analysts and crime mappers should be careful.

Source: http://www.jratcliffe.net/research/ecolfallacy.htm
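
The arithmetic behind the two-estate income example can be checked with a minimal sketch. The household counts (50 per estate) and the variable names are assumptions made for illustration; only the $10,000, $50,000 and $30,000 figures come from the example.

```python
from statistics import mean

# Hypothetical households: every household in an estate earns that estate's
# average, and both estates are assumed to be the same size.
lower_estate = [10_000] * 50
higher_estate = [50_000] * 50

neighbourhood = lower_estate + higher_estate
neighbourhood_mean = mean(neighbourhood)

# Households whose income lies within 10% of the neighbourhood mean.
near_mean = [
    income for income in neighbourhood
    if abs(income - neighbourhood_mean) <= 0.1 * neighbourhood_mean
]

print("Neighbourhood mean income:", neighbourhood_mean)
print("Households earning close to the mean:", len(near_mean))
```

The aggregate figure is correct as a summary of the area, yet under these assumptions it describes no individual household.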

The Modifiable Areal Unit Problem (MAUP) is a potential source of error that can affect spatial studies which utilise aggregate data sources (Unwin, 1996). Geographical data are often aggregated in order to present the results of a study in a more useful context, and spatial objects such as enumeration districts or police beat boundaries are examples of the type of aggregating zones used to show results of some spatial phenomena. These zones are often arbitrary in nature, and different areal units can be just as meaningful in displaying the same base-level data. For example, it could be argued that enumeration districts containing comparable numbers of houses are better sources of aggregation than police beats (which are often based on ancient parish boundaries in the UK) when displaying burglary rates.

Large amounts of source data require a careful choice of aggregating zones to display the spatial variation of the data in a comprehensible manner. It is this variation in acceptable areal solutions that generates the term 'modifiable'. Only recently (well, the last 20 years!) has this problem been addressed in the area of spatial crime analysis, where 'the areal units (zonal objects) used in many geographical studies are arbitrary, modifiable, and subject to the whims and fancies of whoever is doing, or did, the aggregating.' (Openshaw, 1984 p.3).

The MAUP consists of both a scale and an aggregation problem, and the concept of the ecological fallacy should also be considered (Bailey and Gatrell, 1995).

The scale problem is relatively well known. It is the variation which can occur when data from one scale of areal units are aggregated into more or fewer areal units. For example, much of the variation between enumeration districts changes or is lost when the data are aggregated to the ward or county level.

The aggregation problem is less well known and becomes apparent when faced with the variety of different possible areal units for aggregation. Although geographical studies tend towards aggregating units which share a geographical boundary, it is possible to aggregate spatial units which are spatially distinct. Restricting aggregation to neighbouring units eases the problem to a small degree, but does not get round the large number of possible zonings which remains.
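
Both effects can be reproduced on a toy transect. The sketch below is only an illustration: the twelve base-unit counts, the zone definitions and the helper `zone_means` are all hypothetical, chosen so that the effects are easy to see.

```python
from statistics import mean, pvariance

# Hypothetical burglary counts for 12 base units along a transect.
counts = [1, 9, 2, 8, 1, 9, 7, 3, 8, 2, 9, 1]

def zone_means(values, zones):
    """Mean count per zone; each zone is a (start, end) index pair, end exclusive."""
    return [mean(values[start:end]) for start, end in zones]

# Scale problem: merging base units into pairs hides the base-level variation.
pairs = [(i, i + 2) for i in range(0, len(counts), 2)]

# Aggregation problem: the same number of zones, with different boundaries.
zoning_a = [(0, 4), (4, 8), (8, 12)]
zoning_b = [(0, 3), (3, 7), (7, 12)]

print("Variance of base units:", pvariance(counts))
print("Zone means, pairs:     ", zone_means(counts, pairs))
print("Zone means, zoning A:  ", zone_means(counts, zoning_a))
print("Zone means, zoning B:  ", zone_means(counts, zoning_b))
```

With these made-up counts, the pairs and zoning A both suggest a uniform pattern, while zoning B shows differences between zones, even though the base data never change.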

Data Accuracy and Quality
* The quality of data sources for GIS processing is becoming an ever-increasing concern among GIS application specialists.
* With many GIS software packages on the commercial market, and the accelerating application of GIS technology to problem-solving and decision-making roles, the quality and reliability of GIS products is coming under closer scrutiny.
* Much concern has been raised as to the relative error that may be inherent in GIS processing methodologies.
* While research is ongoing, and no definitive standards have yet been adopted in the commercial GIS marketplace, several practical recommendations have been identified which help to locate possible error sources and define the quality of data.

Three distinct components: data accuracy, quality, and error.

Accuracy
* The fundamental issue with respect to data is accuracy.
* Accuracy is the closeness of results of observations to the true values, or to values accepted as being true (estimates of the true value).
* The difference between observed and true (or accepted as being true) values indicates the accuracy of the observations.
* Basically two types of accuracy exist: positional accuracy and attribute accuracy.

* Positional accuracy is the expected deviation in the geographic location of an object from its true ground position. This is what we commonly think of when the term accuracy is discussed.
* There are two components to positional accuracy: relative accuracy and absolute accuracy.
* Absolute accuracy concerns the accuracy of data elements with respect to a coordinate scheme, e.g. UTM.
* Relative accuracy concerns the positioning of map features relative to one another.
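
The difference between the two components can be sketched numerically. The example assumes root-mean-square error (RMSE) as the summary measure, which the slides do not prescribe; the coordinates and the function names `absolute_rmse` and `relative_rmse` are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical reference ("true") positions in a projected coordinate system, in metres.
reference = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]

# Mapped positions: every feature is shifted by the same 3 m east, 4 m north.
mapped = [(x + 3.0, y + 4.0) for x, y in reference]

def absolute_rmse(mapped_pts, reference_pts):
    """RMSE of each feature's displacement from its reference position."""
    squared = [math.dist(m, r) ** 2 for m, r in zip(mapped_pts, reference_pts)]
    return math.sqrt(sum(squared) / len(squared))

def relative_rmse(mapped_pts, reference_pts):
    """RMSE of the differences in distance between every pair of features."""
    squared = []
    for i, j in combinations(range(len(reference_pts)), 2):
        d_mapped = math.dist(mapped_pts[i], mapped_pts[j])
        d_reference = math.dist(reference_pts[i], reference_pts[j])
        squared.append((d_mapped - d_reference) ** 2)
    return math.sqrt(sum(squared) / len(squared))

print("Absolute RMSE (m):", absolute_rmse(mapped, reference))
print("Relative RMSE (m):", relative_rmse(mapped, reference))
```

A uniform shift gives a non-zero absolute error but leaves the features correctly placed relative to one another, which is the situation many users can tolerate.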

* Relative accuracy is often of much greater concern than absolute accuracy. For example, most GIS users can live with the fact that their survey coordinates do not coincide exactly with the real world, but the absence of one or two units from, e.g., a map may have costly consequences.
* Attribute accuracy is as important as positional accuracy. It also reflects estimates of the truth.
* Interpreting and depicting boundaries and characteristics for forest stands or soil polygons can be exceedingly difficult and subjective.
* Also, the degree of homogeneity found within such mapped boundaries is not nearly as high in reality as it would appear to be on most maps!

Quality
* Quality can simply be defined as the fitness for use of a specific data set.
* Data that are appropriate for use with one application may not be fit for use with another.
* Fitness for use is fully dependent on the scale, accuracy, and extent of the data set, as well as the quality of other data sets to be used.
* Spatial Data Transfer Standards (SDTS) often identify the following components of data quality definitions: Lineage - Positional Accuracy - Attribute Accuracy - Logical Consistency - Completeness

* Lineage - historical and compilation aspects of the data, such as the source of the data; content of the data; data capture specifications; geographic coverage of the data; compilation method of the data, e.g. digitizing versus scanning; transformation methods applied to the data; and the use of any pertinent algorithms during compilation, e.g. linear simplification, feature generalization.
* Positional Accuracy - this includes consideration of inherent error (source error) and operational error (introduced error).
* Attribute Accuracy - this quality component concerns the identification of the reliability, or level of purity (homogeneity), in a data set.

* Logical Consistency - this component is concerned with determining the faithfulness of the data structure for a data set. This typically involves spatial data inconsistencies such as incorrect line intersections, duplicate lines or boundaries, or gaps in lines. These are referred to as spatial or topological errors.
* Completeness - the final quality component involves a statement about the completeness of the data set. This includes consideration of holes in the data, unclassified areas, and any compilation procedures that may have caused data to be eliminated.
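
Two of the topological errors named above, duplicate lines and gaps in lines, can be detected with very simple checks. The following is a minimal sketch on hypothetical line segments, not a description of how any particular GIS package validates topology; the function names are assumptions.

```python
from collections import Counter

# Hypothetical boundary digitised as straight segments (start point, end point).
segments = [
    ((0, 0), (10, 0)),
    ((10, 0), (10, 10)),
    ((10, 10), (0, 10)),
    ((0, 10), (0, 1)),    # stops short of (0, 0): leaves a gap
    ((10, 0), (0, 0)),    # same line as the first segment, digitised in reverse
]

def find_duplicate_segments(segs):
    """Segments that occur more than once, ignoring digitising direction."""
    occurrences = Counter(frozenset(seg) for seg in segs)
    return [tuple(sorted(key)) for key, n in occurrences.items() if n > 1]

def find_dangling_nodes(segs):
    """End points used by only one distinct segment, which suggests a gap."""
    distinct = {frozenset(seg) for seg in segs}
    degree = Counter(point for seg in distinct for point in seg)
    return sorted(point for point, n in degree.items() if n == 1)

print("Duplicate segments:", find_duplicate_segments(segments))
print("Dangling nodes:    ", find_dangling_nodes(segments))
```

In a closed boundary every node should be shared by at least two segments, so the two dangling nodes mark the ends of the gap.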

* The ease with which geographic data in a GIS can be used at any scale highlights the importance of detailed data quality information.
* Although a data set may not have a specific scale once it is loaded into the GIS database, it was produced with levels of accuracy and resolution that make it appropriate for use only at certain scales, and in combination with data of similar scales.

Error
* Two sources of error: inherent and operational.
* Both contribute to the reduction in quality of the products that are generated by geographic information systems.

Inherent error is the error present in source documents and data.

Operational error is the amount of error produced through the data capture and manipulation functions of a GIS. Possible sources of operational error include:
* Mislabelling of areas on thematic maps
* Misplacement of horizontal (positional) boundaries
* Human error in digitizing
* Classification error
* GIS algorithm inaccuracies
* Human bias

While error will always exist in any scientific process, the aim within GIS processing should be to identify existing error in data sources and minimize the amount of error added during processing.

* Because of cost constraints it is often more appropriate to manage error than to attempt to eliminate it!
* There is a trade-off between reducing the level of error in a database and the cost of creating and maintaining the database.
* An awareness of the error status of different data sets will allow the user to make a subjective statement on the quality and reliability of a product derived from GIS processing.
* The validity of any decisions based on a GIS product is directly related to the quality and reliability rating of the product.
* Depending upon the level of error inherent in the source data, and the error operationally produced through data capture and manipulation, GIS products may possess significant amounts of error.

* One of the major problems currently existing within GIS is the aura of accuracy surrounding digital geographic data.
* Often hardcopy map sources include a map reliability rating or confidence rating in the map legend. This rating helps the user in determining the fitness for use of the map.
* However, this information is rarely encoded in the digital conversion process.
* Often, because GIS data are in digital form and can be represented with high precision, they are considered to be totally accurate.

* In reality, a buffer exists around each feature which represents the actual positional location of the feature.
* For example, data captured at the 1:20,000 scale commonly have a positional accuracy of +/- 20 metres. This means the actual location of features may vary 20 metres in either direction from the identified position of the feature on the map.
* Considering that the use of GIS commonly involves the integration of several data sets, usually at different scales and quality, one can easily see how errors can be propagated during processing.
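
How the buffers combine when two layers are overlaid can be sketched as follows. The +/- 20 m figure is from the slide; the second layer's +/- 50 m tolerance, the coordinates and the function names are hypothetical. The root-sum-square combination additionally assumes that the two layers' errors are independent, which is an assumption and not a claim made in the slides.

```python
import math

def worst_case_tolerance(*tolerances):
    """Largest possible combined displacement: the tolerances simply add up."""
    return sum(tolerances)

def rss_tolerance(*tolerances):
    """Root-sum-square combination, valid only if the errors are independent."""
    return math.sqrt(sum(t * t for t in tolerances))

def may_coincide(p, q, tol_p, tol_q):
    """True if the uncertainty buffers around two mapped points overlap."""
    return math.dist(p, q) <= tol_p + tol_q

layer_a_tol = 20.0   # metres, data captured at 1:20,000 (from the slide)
layer_b_tol = 50.0   # metres, hypothetical coarser layer

print("Worst-case combined tolerance (m):", worst_case_tolerance(layer_a_tol, layer_b_tol))
print("RSS combined tolerance (m):       ", rss_tolerance(layer_a_tol, layer_b_tol))

# Two features mapped 50 m apart could still be the same place on the ground.
print("Buffers overlap:", may_coincide((0.0, 0.0), (30.0, 40.0), layer_a_tol, layer_b_tol))
```

Either way the combined uncertainty is larger than that of the better layer, which is why overlay results inherit the weaknesses of every input.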

Figure: Example of areas of uncertainty for overlaying data

Several comments and guidelines on the recognition and assessment of error in GIS processing have been promoted in papers on the subject:
* There is a need for developing error statements for data contained within geographic information systems (Vitek et al, 1984).
* The integration of data from different sources and in different original formats (e.g. points, lines, and areas), at different original scales, and possessing inherent errors can yield a product of questionable accuracy (Vitek et al, 1984).
* The accuracy of a GIS-derived product is dependent on characteristics inherent in the source products, and on user requirements, such as scale of the desired output products and the method and resolution of data encoding (Marble, Peuquet, 1983).

* Any GIS output product can only be as accurate as the least accurate data theme involved in the analysis (Newcomer, Szajgin, 1984).
* Accuracy of the data decreases as spatial resolution becomes more coarse (Walsh et al, 1987).
* As the number of layers in an analysis increases, the number of possible opportunities for error increases.
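
The first and third points can be made concrete with a small calculation. Here each layer's accuracy is treated as the probability that a location is correct in that layer; the three accuracy values and the function names are hypothetical. The upper bound is the "least accurate layer" rule quoted above. The lower bound follows from the Bonferroni inequality, and the product rule holds only if layer errors are independent; both are added here as assumptions for illustration.

```python
import math

def composite_accuracy_bounds(layer_accuracies):
    """Bounds on the probability that a location is correct in every layer."""
    if not layer_accuracies:
        raise ValueError("at least one layer accuracy is required")
    upper = min(layer_accuracies)  # no better than the least accurate layer
    lower = max(0.0, 1.0 - sum(1.0 - p for p in layer_accuracies))
    return lower, upper

def independent_accuracy(layer_accuracies):
    """Composite accuracy if errors in different layers are independent."""
    return math.prod(layer_accuracies)

layers = [0.90, 0.80, 0.95]  # hypothetical per-layer accuracies
lower, upper = composite_accuracy_bounds(layers)

print("Upper bound:", upper)
print("Lower bound:", lower)
print("If independent:", independent_accuracy(layers))
```

Adding a layer can never raise the upper bound and always lowers (or leaves unchanged) the lower bound, which matches the point that more layers mean more opportunities for error.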

Tools to get a handle on uncertainty
* Models of uncertainty: methods for assessing and describing error
* Error propagation (during analysis)
* Fuzzy approaches (membership of classes)
* Sensitivity analysis (effect of errors)
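
One common way to study error propagation and sensitivity is Monte Carlo simulation: perturb the inputs according to an assumed error model, repeat the analysis many times, and look at the spread of the results. The sketch below does this for the area of a polygon. The square parcel, the Gaussian error model with a 20 m standard deviation, the number of runs and the function names are all assumptions chosen for illustration.

```python
import random
import statistics

def polygon_area(vertices):
    """Planar polygon area from the shoelace formula."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def simulate_areas(vertices, sigma, runs, seed=0):
    """Areas of the polygon after adding Gaussian noise to every coordinate."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    areas = []
    for _ in range(runs):
        perturbed = [
            (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in vertices
        ]
        areas.append(polygon_area(perturbed))
    return areas

# Hypothetical 1 km x 1 km parcel, coordinates in metres.
square = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]

areas = simulate_areas(square, sigma=20.0, runs=500)
print("Nominal area (m^2):      ", polygon_area(square))
print("Mean simulated area:     ", statistics.mean(areas))
print("Std. dev. of area (m^2): ", statistics.stdev(areas))
```

The standard deviation of the simulated areas gives a rough confidence statement for the derived quantity; rerunning with a different `sigma` shows how sensitive the result is to the assumed positional error.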

* Error assessment, reporting, interpretation - more difficult
* Quality of data: standards and metadata
* But: no professional GIS currently in use can present the user with information about the confidence limits that should be associated with the results of an analysis.

Reading
* Chapter 6 - Longley et al.
* Chapter 15 - Longley et al.

Useful Links
* http://www.geog.ucsb.edu/~good/176b/m14.html
* http://www.colorado.edu/geography/gcraft/notes/error/error.html
* http://images.google.com/imgres?imgurl=http://www.geog.ubc.ca/courses/geog470/notes/images/sliver_polygon.gif&imgrefurl=http://www.geog.ubc.ca/courses/geog470/notes/error_accuracy.html&h=283&w=476&sz=4&tbnid=QH_DMozL2M4J:&tbnh=74&tbnw=126&hl=en&start=4&prev=/images%3Fq%3Dslivers%2Bin%2Ba%2BGIS%26svnum%3D10%26hl%3Den%26lr%3D%26rls%3DGGLD,GGLD:2004-07,GGLD:en
* www.sfu.ca/gis/geog_x55/web_354_new/icons/lec_11_error.pdf
* http://www.yogibob.com/303_403_f_04/303_lecture5.html