Data Quality Issues-Chapter 10


Presentation on theme: "Data Quality Issues-Chapter 10"— Presentation transcript:

1 Data Quality Issues-Chapter 10
GIGO: garbage in, garbage out. Quality issues: terminology; sources, propagation, and management. What is data quality? The overall fitness or suitability of data for a specific purpose.

2 Errors, Accuracy, Precision, & Bias
Error: the difference between the real world and its representation in the GIS. It could be a single error, or the whole data set could be off. Accuracy: the extent to which an estimated value approaches the true value; 100% accuracy can never be achieved. Precision: the recorded level of detail.

3 Errors, Accuracy, Precision, & Bias
Bias: a consistent error throughout the data set. Sources: human, equipment. Difficult to spot.
The usefulness of a measurement is enhanced by knowledge of its level of certainty. Multiple measurements of the same property are like multiple shots at the same target. The pattern of the shots tells you something about the measurement and its ability to describe the 'true' value of the property being sought. The target patterns depict possible outcomes of different experiments to measure the same property. Experiment IV is the best, because it gives very reproducible results (precise) that are also very close to the true value, or bull's-eye (accurate). Experiment III is precise but not accurate. It exhibits systematic error, which is at times insidiously difficult to estimate.
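As a rough numerical illustration of the target analogy, the sketch below summarizes repeated measurements of one property by their bias (mean offset from the true value) and their spread (sample standard deviation). The measurement values, the true value of 10.0, and the function name `summarize` are invented for illustration and do not come from the slides.

```python
from statistics import mean, stdev


def summarize(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one property.

    bias   = mean offset from the true value (systematic error; small = accurate)
    spread = sample standard deviation (random scatter; small = precise)
    """
    bias = mean(measurements) - true_value
    spread = stdev(measurements)
    return bias, spread


TRUE_VALUE = 10.0  # hypothetical bull's-eye

# Hypothetical experiments: one tight but offset, one tight and centered.
precise_but_biased = [12.0, 12.1, 11.9, 12.0]
precise_and_accurate = [10.0, 10.1, 9.9, 10.0]

for name, data in [("precise but biased", precise_but_biased),
                   ("precise and accurate", precise_and_accurate)]:
    bias, spread = summarize(data, TRUE_VALUE)
    print(f"{name}: bias={bias:+.2f}, spread={spread:.2f}")
```

Both hypothetical experiments have the same small spread; only the bias separates them, which is why a systematic error cannot be detected from reproducibility alone.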

4 Resolution
The smallest feature or data element that can be displayed. Raster: cell size. Vector: point size, line widths.

5 Generalization
The process of simplifying.
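One common way to simplify a digitized line is the Douglas-Peucker algorithm, which drops vertices that lie within a tolerance of the simplified line. The slides do not name a method, so this is offered only as an example of generalization; the function names and the sample coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker line simplification on a list of (x, y) vertices."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the chord joining the endpoints.
    distances = [point_segment_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    max_dist = max(distances)
    index = distances.index(max_dist) + 1
    if max_dist <= tolerance:
        # Every interior vertex is within tolerance: keep only the endpoints.
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and simplify both halves.
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical digitized line with small wiggles around y = 0.
line = [(0, 0), (1, 0.05), (2, 0), (3, 0.05), (4, 0)]
print(simplify(line, 0.1))
print(simplify(line, 0.01))
```

A larger tolerance removes more detail, so the choice of tolerance is itself a data quality decision: it trades storage and legibility against positional fidelity.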

6 Completeness & Consistency
Completeness: are all instances of a feature that the GIS/map claims to include in fact there? Simply put, how much data is missing? Logical consistency: the presence of contradictory relationships in the database. Examples: some crimes are recorded at the place of occurrence, others at the place where the report was taken; data for one country is for 2000, for another it is for 2001; annual data series are not taken on the same day or month (sometimes called lineage error); data uses a different source or estimation technique for different years (again, lineage).
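Both ideas can be checked mechanically once there is a reference list to compare against. The sketch below is a minimal illustration: the parcel identifiers, the country records, and the function names `completeness` and `inconsistent_years` are all hypothetical.

```python
def completeness(expected_ids, recorded_ids):
    """Fraction of the expected features that are actually present."""
    expected = set(expected_ids)
    if not expected:
        return 1.0
    return len(expected & set(recorded_ids)) / len(expected)


def inconsistent_years(records):
    """Return the set of reference years if the records disagree, else an empty set."""
    years = {record["year"] for record in records}
    return years if len(years) > 1 else set()


# Hypothetical data: five parcels should exist, four were captured.
expected = ["P1", "P2", "P3", "P4", "P5"]
recorded = ["P1", "P2", "P4", "P5"]
print(f"completeness: {completeness(expected, recorded):.0%}")

# Hypothetical country table that mixes reference years.
countries = [{"name": "A", "year": 2000}, {"name": "B", "year": 2001}]
print("mixed reference years:", sorted(inconsistent_years(countries)))
```

The completeness check only works when an independent list of what should be present is available; without one, missing features cannot be counted.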

7 Compatibility
Slope overlay: maps at different scales cannot be combined. Combining nominal and ratio data. Nominal scales distinguish one item from another, but they do not rank or quantify data. Examples: soil name, city name, polygon identification number. Ordinal scales identify relative magnitudes, but they do not quantify exact differences between values. Examples: Income = (low, medium, or high); Slope = (A, B), where A = 0-4% and B = 5-9%. Crop

8 Applicability
The suitability of data for particular commands, operations, or analysis. Example: using the points you collected as GIS data for a parcel fabric.

9 Sources of Error in GIS
Survey data: surveyor or instrument error; choice of spheroid and datum. Data encoding and entry: e.g., keying or digitizing errors. Remotely sensed data or aerial photography: mistakes in classification; change over time.

10 Manual Digitizing Errors
Cleaning and editing are always required.

11 Vector to Raster or Raster to Vector

12 Errors in Data Processing and Analysis
Is this data suitable for analysis? Is it in a suitable format? Different datums? Are the data sets compatible? Incompatible units? Widely different scales? Will the output mean anything?
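Several of these questions can be asked of layer metadata before any overlay is run. The sketch below assumes each layer is described by a small dictionary with hypothetical keys `datum`, `units`, and `scale_denominator`; the threshold `max_scale_ratio=4.0` is an arbitrary illustration, not a standard.

```python
def compatibility_problems(layer_a, layer_b, max_scale_ratio=4.0):
    """List the reasons two layers should not be combined as they stand."""
    problems = []
    if layer_a["datum"] != layer_b["datum"]:
        problems.append("different datums")
    if layer_a["units"] != layer_b["units"]:
        problems.append("incompatible units")
    # Compare the larger scale denominator with the smaller one.
    small, large = sorted([layer_a["scale_denominator"],
                           layer_b["scale_denominator"]])
    if large / small > max_scale_ratio:
        problems.append("widely different scales")
    return problems


# Hypothetical layer metadata.
roads = {"datum": "NAD27", "units": "feet", "scale_denominator": 24000}
soils = {"datum": "NAD83", "units": "meters", "scale_denominator": 250000}

for problem in compatibility_problems(roads, soils):
    print("warning:", problem)
```

An empty result means only that these three checks passed; whether the output will mean anything still depends on the analyst's judgment.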

13 Classification Errors

14 EVALUATING CURRENT DATA
Most of the information captured in a GIS generally exists somewhere in the office that requires the application. Some additional data may be purchased or obtained by data sharing with other agencies. The source, accuracy, reliability, condition and scale for each document or record must be evaluated.

15 SOURCE
The data may be in paper or map form, or it may exist in computer files on another system. Where did that information come from? What is the source of the source? Do you know how the map was compiled? Do you know who compiled the map or record? Have you spoken with the author to learn as much as possible about the data? What are the strong and weak points of the data?

16 Data Accuracy & Reliability
There are different types of accuracy. Absolute positional accuracy refers to how closely a map location matches the corresponding real-world location (for example, a GPS coordinate point). Relative positional accuracy is a measure of the relationships between the different features on the map: it compares the scaled distance between features measured from the map data with the distance measured between the same features on the ground. The other type of accuracy deals with the content of the information in the GIS database. Are there errors or missing data? A road may be positionally accurate but have the wrong road name associated with the feature. We think of this as reliability. Another very important aspect of reliability is how current the data sources are. If the map or record has not been properly maintained, some method of bringing the document up to date must be instituted.
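The difference between the two positional measures shows up clearly when a whole map is shifted by a constant offset. The sketch below uses invented coordinates and the hypothetical function names `absolute_rmse` and `relative_error`; root-mean-square error is one common way to summarize absolute positional accuracy, used here as an assumption rather than as the slides' own method.

```python
import math


def absolute_rmse(map_points, ground_points):
    """Root-mean-square distance between mapped and true positions."""
    squared = [(mx - gx) ** 2 + (my - gy) ** 2
               for (mx, my), (gx, gy) in zip(map_points, ground_points)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error(map_a, map_b, ground_a, ground_b):
    """Map distance between two features minus their distance on the ground."""
    return math.dist(map_a, map_b) - math.dist(ground_a, ground_b)


# Hypothetical ground-truth positions and a map shifted by (3, 4) units.
ground = [(0.0, 0.0), (100.0, 0.0), (100.0, 50.0)]
mapped = [(x + 3.0, y + 4.0) for x, y in ground]

print("absolute RMSE:", absolute_rmse(mapped, ground))
print("relative error:", relative_error(mapped[0], mapped[1], ground[0], ground[1]))
```

In this made-up case every point is displaced by the same amount, so the distances between features are preserved: the map has poor absolute accuracy but good relative accuracy.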

17 Data Accuracy & Reliability

18 MAINTENANCE OF DATA
Many of the answers needed to ensure proper data maintenance are fleshed out in a preliminary needs and data analysis. Specifically, maintaining data involves knowing the frequency of change, the quantity of change, and the sources of change. It must be reiterated: if data is not going to be maintained, DO NOT PUT IT IN YOUR GIS.

19 Condition
The condition of the source documents, especially maps, will determine how difficult the conversion will be. Clear mylar and ink drawings will be easier to digitize (no matter what the method) than maps of poor legibility.
