Reclassification Methods: From an important research topic to trivial computer functions. Is it too easy?



In the Past
Reclassification was an important cartographic process, and map-makers had only one chance to get it right.
It is an R/I → N/O transformation (ratio/interval data into nominal/ordinal classes).
The goal was to place attribute information into categories that best preserved the distribution of the data and conveyed its meaning, based on the objective of the map composition.
Researchers developed many reclassification techniques with different advantages. Each gives a different representation.
Today, within a GIS, it is easy to classify data, and it is sometimes done with little thought (e.g. by accepting the default).
The importance of creating meaningful visualizations that convey information to stakeholders has not changed.

Factors to Consider
Distribution of the data (uniform, Gaussian, gamma, etc.)
Audience (e.g. scientific vs. lay)
Goals and objectives
–Highlight rare values
–Highlight common values
–Highlight areas of importance
–Best preserve the distribution

Data Distribution
Always look at the distribution of your data. Histograms are useful.
You can always change the representation by changing the number of classes. However, most people cannot distinguish more than about 10 categories.
Census Tract Data, Tucson
–Proportion of population between

Data Distribution 2000 Census Data, Tucson –Average Household Size

Manual
User defined: you create the class breaks.
If there is a logical way to reclassify the data based on original research, literature, prior work, traditional values, or common sense, DO IT.
Importantly, you should be able to write a justification of your procedure.

Equal Interval
The scheme divides the range of attribute values into equal-sized subranges:
–Class Interval = Data Range (high - low) / # of intervals
This method emphasizes the amount of an attribute value relative to other values. For example, you can show that a store belongs to the group of stores whose sales fall in the upper third of the sales range.
Best applied to familiar data ranges such as percentages or temperature.
Advantages: The concept is easy to understand, the breaks are easy to compute, and the legend is easy to read.
Disadvantages: Does not consider the data distribution; not acceptable for ordinal data.
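The class-interval formula above can be sketched in a few lines of plain Python. This is only an illustration: the function names `equal_interval_breaks` and `classify` and the sales figures are made up for the example and are not taken from ArcGIS.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class for an equal-interval scheme."""
    low, high = min(values), max(values)
    # Class interval = data range (high - low) / number of intervals.
    width = (high - low) / n_classes
    # The last bound is set to the maximum directly to avoid floating-point drift.
    return [low + width * k for k in range(1, n_classes)] + [high]


def classify(value, breaks):
    """Return the 0-based index of the first class whose upper bound is >= value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1


sales = [12, 18, 25, 31, 40, 47, 55, 63, 90]  # hypothetical sales figures
breaks = equal_interval_breaks(sales, 3)
print(breaks)                               # [38.0, 64.0, 90]
print([classify(v, breaks) for v in sales]) # [0, 0, 0, 0, 1, 1, 1, 1, 2]
```

Note that the classes are equally wide but not equally full: only one store lands in the top class, which is the behaviour the "does not consider data distribution" disadvantage refers to.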

Prop_18-29

Ave_HH_SZ

Defined Interval
You specify an interval to divide the range of cell values, and ArcMap determines the number of classes.
Its characteristics are similar to those of Equal Interval.

Quantile
Each class contains an equal number of features (or cells in a raster).
–# Observations per class = Total Obs. / # of Classes
With raster data, quantile and equal area are the same.
Rules must be applied to keep like values together, so classes may not hold equal counts, and in some cases a class may be missing.
Maps may be misleading, since similar features may be placed in different classes.
Better for uniformly or normally distributed data.
Advantages: The concept is easy to understand and the breaks are easy to compute. Acceptable for ordinal data.
Disadvantages: Does not consider the data distribution; the legend is hard to understand.
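A minimal sketch of the quantile rule, assuming a simple "index of the last observation in each class" approach. The function name `quantile_breaks` and the sample data are hypothetical, and the tie handling shown here is one possible rule for keeping like values together, not necessarily the one a GIS package applies.

```python
def quantile_breaks(values, n_classes):
    """Return upper class bounds so each class holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for k in range(1, n_classes + 1):
        # Observations per class = total observations / number of classes,
        # so class k ends at (roughly) observation k * n / n_classes.
        last = max(0, round(k * n / n_classes) - 1)
        upper = ordered[last]
        # Skip a bound that repeats the previous one: tied values stay in one
        # class, so heavily tied data can yield fewer classes than requested.
        if not breaks or upper > breaks[-1]:
            breaks.append(upper)
    return breaks


evenly_spread = list(range(1, 13))
heavily_tied = [1, 2, 2, 2, 2, 2, 3, 4, 5, 6, 7, 8]

print(quantile_breaks(evenly_spread, 4))  # [3, 6, 9, 12]
print(quantile_breaks(heavily_tied, 4))   # [2, 5, 8]
```

The second call shows both caveats from the slide: one of the four requested classes is missing, and the first class holds six values while the others hold three each.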

Prop_18-29

Ave_HH_SZ

Standard Deviation
Shows how much a cell’s value varies from the mean.
In this method you compute the mean value and then generate class breaks by successively adding and subtracting the standard deviation from the mean.
Advantages: Considers the data distribution; the concept is easy to understand, the breaks are easy to compute, and the legend is easy to read; highlights outliers.
Disadvantages: Works best with a Gaussian distribution; the results require some understanding of statistics, so the method may not suit lay audiences.
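The "add and subtract the standard deviation from the mean" step could look like the sketch below. Several choices here are assumptions made for illustration: the function name `std_dev_breaks`, the use of the population standard deviation, a step of one full standard deviation, and leaving the mean itself out of the list of breaks.

```python
import statistics


def std_dev_breaks(values, n_each_side=2):
    """Return class breaks at mean - k*SD and mean + k*SD for k = 1..n_each_side."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    below = [mean - k * sd for k in range(n_each_side, 0, -1)]
    above = [mean + k * sd for k in range(1, n_each_side + 1)]
    return below + above


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, SD 2
print(std_dev_breaks(values))      # [1.0, 3.0, 7.0, 9.0]
```

With these breaks, a value of 9 sits two standard deviations above the mean, which is how the method draws attention to outliers.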

Prop_18-29

Ave_HH_SZ

Jenks Natural Breaks (Optimal)
Determines the best arrangement of values into classes by minimizing the within-class sum of squared differences of values from the means of their class.
The “optimal” arrangement is determined through an iterative process that compares different sets of breaks in the data.
$\mathrm{SSD}_{i..j} = \sum_{n=i}^{j} \left(A[n] - \mathrm{Mean}_{i..j}\right)^2$
Where:
A = the set of values, ordered from 1 to N
1 <= i < j <= N
Mean i..j = mean of the class bounded by i and j

Jenks Natural Breaks
Advantages: Considers the data distribution; can be used to determine the best number of classes; the concept is relatively easy to understand and the breaks are relatively easy to compute.
Disadvantages: The legend is hard to understand; cannot be used for ordinal data.
Current ESRI default.
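To make the criterion concrete, here is a small sketch that finds the partition with the smallest total within-class sum of squared differences. It uses exhaustive dynamic programming rather than the iterative search mentioned in the slides, so it should be read as an illustration of the objective, not as the procedure ESRI runs. The names `ssd` and `natural_breaks` and the sample values are hypothetical.

```python
def ssd(values):
    """Sum of squared differences of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, n_classes):
    """Return (upper class bounds, total within-class SSD) for the best partition."""
    a = sorted(values)
    n = len(a)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")
    inf = float("inf")
    # cost[k][j]: smallest total SSD when the first j values form k classes.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    # start[k][j]: index where the last of those k classes begins.
    start = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_classes + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):  # the last class is a[i:j]
                candidate = cost[k - 1][i] + ssd(a[i:j])
                if candidate < cost[k][j]:
                    cost[k][j] = candidate
                    start[k][j] = i
    # Walk back through the table to recover each class's upper bound.
    breaks, j = [], n
    for k in range(n_classes, 0, -1):
        breaks.append(a[j - 1])
        j = start[k][j]
    return breaks[::-1], cost[n_classes][n]


values = [1, 2, 3, 10, 11, 12, 50, 51, 52]  # three obvious clusters
print(natural_breaks(values, 3))            # ([3, 12, 52], 6.0)
```

Running the function for several class counts and comparing the returned totals is one way to explore the "best number of classes" advantage: the total drops sharply until the natural clusters are separated and only slowly afterwards.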

Prop_18-29

Ave_HH_SZ

Geometrical Interval
Class breaks are based on class intervals that form a geometric series.
The geometric coefficient in this classifier can change once (to its inverse) to optimize the class ranges.
The algorithm creates these geometrical intervals by minimizing the sum of squares of the number of elements in each class. This ensures that each class range has approximately the same number of values in each class and that the change between intervals is fairly consistent.
Advantages: Relatively easy to compute; the legend is relatively easy to understand; considers the data distribution.
Disadvantages: The concept is hard to understand; cannot be used for ordinal data.
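The sketch below shows only the "class intervals in a geometric series" idea: each class is `ratio` times as wide as the previous one. It does not reproduce the optimization step described above, and the function name `geometric_interval_breaks` and the fixed `ratio` parameter are assumptions made for illustration, not part of the ArcGIS classifier.

```python
def geometric_interval_breaks(low, high, n_classes, ratio=2.0):
    """Return upper class bounds whose class widths form a geometric series."""
    if ratio <= 0 or ratio == 1.0:
        raise ValueError("ratio must be positive and different from 1")
    # Widths w, w*r, w*r^2, ... must add up to the full data range:
    # w * (r^n - 1) / (r - 1) = high - low.
    width = (high - low) * (ratio - 1) / (ratio ** n_classes - 1)
    breaks, upper = [], low
    for _ in range(n_classes - 1):
        upper += width
        breaks.append(upper)
        width *= ratio
    return breaks + [high]  # last bound pinned to the maximum


print(geometric_interval_breaks(0, 70, 3, ratio=2.0))  # [10.0, 30.0, 70]
print(geometric_interval_breaks(0, 70, 3, ratio=0.5))  # [40.0, 60.0, 70]
```

The second call uses the inverse coefficient: the narrow classes move from the low end of the range to the high end, which is the kind of switch the slide describes.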

Prop_18-29

Ave_HH_SZ

Percentiles
Uses percentile values to determine the class breaks.
Order the data (low → high); each of the n values then represents 1/n of the total.
Must break on unique values.
Advantages: Relatively easy to compute; the legend is relatively easy to understand; considers the data distribution; highlights outliers.
Disadvantages: The concept is relatively hard to understand; not in ArcGIS.
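A rough sketch of percentile-based breaks using the nearest-rank rule. The slide does not say which percentiles are used, so the cut points below (1, 10, 50, 90, 99, 100) are an assumption chosen to emphasize the tails; the function names `percentile` and `percentile_breaks` are also hypothetical.

```python
import math


def percentile(ordered, p):
    """Nearest-rank percentile of an already sorted list, for integer 0 < p <= 100."""
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def percentile_breaks(values, cut_points=(1, 10, 50, 90, 99, 100)):
    """Return upper class bounds at the given percentiles (assumed cut points)."""
    ordered = sorted(values)  # order the data low to high
    breaks = []
    for p in cut_points:
        upper = percentile(ordered, p)
        # Breaks must fall on unique values, so repeated bounds are dropped.
        if not breaks or upper > breaks[-1]:
            breaks.append(upper)
    return breaks


print(percentile_breaks(list(range(1, 101))))  # [1, 10, 50, 90, 99, 100]
print(percentile_breaks([5] * 10))             # [5]
```

With tail-heavy cut points like these, the lowest and highest classes each hold about 1% of the observations, which is why the method is good at highlighting outliers.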

From GEODA

Box Map
Shows outliers as a function of quartiles.
IQR = Q75 - Q25
Lower Outlier = Q25 - Hinge * IQR
Upper Outlier = Q75 + Hinge * IQR
Hinge is commonly either 1.5 or 3.
Primarily used to highlight outliers.
Not in ArcGIS.
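The outlier fences can be computed directly from the formulas above. In this sketch the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is an assumption: GeoDa may interpolate quartiles differently, so fences on real data can differ slightly. The function names and the sample values are hypothetical.

```python
import statistics


def box_map_fences(values, hinge=1.5):
    """Return (lower fence, upper fence) = (Q25 - hinge*IQR, Q75 + hinge*IQR)."""
    q25, _q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q75 - q25
    return q25 - hinge * iqr, q75 + hinge * iqr


def find_outliers(values, hinge=1.5):
    """Return (lower outliers, upper outliers) relative to the fences."""
    lower, upper = box_map_fences(values, hinge)
    return [v for v in values if v < lower], [v for v in values if v > upper]


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical values with one extreme
print(box_map_fences(sample))             # (-3.5, 14.5)
print(find_outliers(sample))              # ([], [30])
print(box_map_fences(sample, hinge=3))    # (-10.25, 21.25)
```

A hinge of 3 widens the fences, so only the most extreme values are still flagged.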

From GEODA