Introduction to GIS: Lecture #7 (GIS Analysis) GIS Analysis Describing Attributes Statistical Analysis Spatial Description Spatial Analysis Searching for.

Slides:



Advertisements
Similar presentations
Richard M. Jacobs, OSA, Ph.D.
Advertisements

Random Sampling and Data Description
Introduction to Summary Statistics
Objectives (BPS chapter 24)
QUANTITATIVE DATA ANALYSIS
Why Is It There? Lecture 6 Introduction to Geographic Information Systems Geography 176A 2006 Summer, Session B Department of Geography University of California,
The Simple Regression Model
Why Is It There? Getting Started with Geographic Information Systems Chapter 6.
Topics: Regression Simple Linear Regression: one dependent variable and one independent variable Multiple Regression: one dependent variable and two or.
Social Research Methods
Business Statistics - QBM117 Statistical inference for regression.
Correlation and Regression Analysis
Relationships Among Variables
Math 116 Chapter 12.
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 12 Describing Data.
Describing distributions with numbers
Fall 2013 Lecture 5: Chapter 5 Statistical Analysis of Data …yes the “S” word.
6.1 What is Statistics? Definition: Statistics – science of collecting, analyzing, and interpreting data in such a way that the conclusions can be objectively.
ITEC6310 Research Methods in Information Technology Instructor: Prof. Z. Yang Course Website: c6310.htm Office:
Go to Index Analysis of Means Farrokh Alemi, Ph.D. Kashif Haqqi M.D.
Quantitative Skills: Data Analysis
Statistical Analysis Mean, Standard deviation, Standard deviation of the sample means, t-test.
Statistical Decision Making. Almost all problems in statistics can be formulated as a problem of making a decision. That is given some data observed from.
Why Is It There? Getting Started with Geographic Information Systems Chapter 6.
Analyzing and Interpreting Quantitative Data
Lecture 5: Chapter 5: Part I: pg Statistical Analysis of Data …yes the “S” word.
TYPES OF STATISTICAL METHODS USED IN PSYCHOLOGY Statistics.
Statistical analysis Outline that error bars are a graphical representation of the variability of data. The knowledge that any individual measurement.
Measures of central tendency are statistics that express the most typical or average scores in a distribution These measures are: The Mode The Median.
PCB 3043L - General Ecology Data Analysis. OUTLINE Organizing an ecological study Basic sampling terminology Statistical analysis of data –Why use statistics?
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
Statistics - methodology for collecting, analyzing, interpreting and drawing conclusions from collected data Anastasia Kadina GM presentation 6/15/2015.
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
Chapter 3 For Explaining Psychological Statistics, 4th ed. by B. Cohen 1 Chapter 3: Measures of Central Tendency and Variability Imagine that a researcher.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina.
Chapter Eight: Using Statistics to Answer Questions.
Unit 2 (F): Statistics in Psychological Research: Measures of Central Tendency Mr. Debes A.P. Psychology.
Data Analysis.
Chapter 6: Analyzing and Interpreting Quantitative Data
RESEARCH & DATA ANALYSIS
PCB 3043L - General Ecology Data Analysis.
IMPORTANCE OF STATISTICS MR.CHITHRAVEL.V ASST.PROFESSOR ACN.
LIS 570 Summarising and presenting data - Univariate analysis.
Measurements and Their Analysis. Introduction Note that in this chapter, we are talking about multiple measurements of the same quantity Numerical analysis.
Statistics Josée L. Jarry, Ph.D., C.Psych. Introduction to Psychology Department of Psychology University of Toronto June 9, 2003.
(Unit 6) Formulas and Definitions:. Association. A connection between data values.
Why Is It There? Chapter 6. Review: Dueker’s (1979) Definition “a geographic information system is a special case of information systems where the database.
NURS 306, Nursing Research Lisa Broughton, MSN, RN, CCRN RESEARCH STATISTICS.
Week 2 Normal Distributions, Scatter Plots, Regression and Random.
STATS DAY First a few review questions. Which of the following correlation coefficients would a statistician know, at first glance, is a mistake? A. 0.0.
Stats Methods at IC Lecture 3: Regression.
Outline Sampling Measurement Descriptive Statistics:
MATH-138 Elementary Statistics
Data Mining: Concepts and Techniques
PCB 3043L - General Ecology Data Analysis.
CHAPTER 12 More About Regression
Social Research Methods
STATS DAY First a few review questions.
Research Statistics Objective: Students will acquire knowledge related to research Statistics in order to identify how they are used to develop research.
Correlation and Regression
Introduction to Statistics
Chapter 10 Correlation and Regression
Data analysis and basic statistics
CHAPTER 12 More About Regression
15.1 The Role of Statistics in the Research Process
MBA 510 Lecture 2 Spring 2013 Dr. Tonya Balan 4/20/2019.
CHAPTER 12 More About Regression
Chapter Nine: Using Statistics to Answer Questions
Advanced Algebra Unit 1 Vocabulary
Presentation transcript:

Introduction to GIS: Lecture #7 (GIS Analysis) GIS Analysis Describing Attributes Statistical Analysis Spatial Description Spatial Analysis Searching for Spatial Relationships GIS and Spatial Analysis

Introduction to GIS: Lecture #7 (GIS Analysis) Dueker (1979) “a geographic information system is a special case of information systems where the database consists of observations on spatially distributed features, activities or events, which are definable in space as points, lines, or areas. A geographic information system manipulates data about these points, lines, and areas to retrieve data for ad hoc queries and analyses ".

Introduction to GIS: Lecture #7 (GIS Analysis) GIS is capable of data analysis Attribute Data –Describe with statistics –Analyze with hypothesis testing Spatial Data –Describe with maps –Analyze with spatial analysis

Introduction to GIS: Lecture #7 (GIS Analysis) Attribute Description The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute. A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable- length bar. For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.

Introduction to GIS: Lecture #7 (GIS Analysis) If the records are: Text –Semantics of text e.g. “Hampton” –word frequency e.g. “Creek”, “Kill” –address matching Example: Display all places called “State Street”

Introduction to GIS: Lecture #7 (GIS Analysis) If the records are: Classes –histogram by class –numbers in class –contiguity description, e.g. average neighbor (roads, commercial)

Introduction to GIS: Lecture #7 (GIS Analysis) Describing a classed raster grid P (blue) = 19/48

Introduction to GIS: Lecture #7 (GIS Analysis) If the records are: Numbers –statistical description –min, max, range –variance –standard deviation

Introduction to GIS: Lecture #7 (GIS Analysis) Statistical description Range : min, max, max-min Central tendency : mode, median (odd, even), mean Variation : variance, standard deviation

Introduction to GIS: Lecture #7 (GIS Analysis) Statistical description Range : outliers mode, median, mean Variation : variance, standard deviation

Introduction to GIS: Lecture #7 (GIS Analysis) Elevation (book example)

Introduction to GIS: Lecture #7 (GIS Analysis) GPS Example Data: Elevation Table 6.2: Sample GPS Readings Data ExtremeDateTimeDMSDMS Elev Minimum 6/14/9510:47am Maximum 6/15/9510:47pm Range 1 Day12 hours

Introduction to GIS: Lecture #7 (GIS Analysis) Mean Statistical average Sum of the values for one attribute divided by the number of records X i i1= n = X / n

Introduction to GIS: Lecture #7 (GIS Analysis) Computing the Mean Sum of attribute values across all records, divided by the number of records. Add all attribute values down a column, / by # records A representative value, and for measurements with normally distributed error, converges on the true reading. A value lacking sufficient data for computation is called a missing value. Does not get included in sum or n.

Introduction to GIS: Lecture #7 (GIS Analysis) Variance The total variance is the sum of each record with its mean subtracted and then multiplied by itself. The standard deviation is the square root of the variance divided by the number of records less one. For two values, there is only one variance.

Introduction to GIS: Lecture #7 (GIS Analysis) l Average difference from the mean l Sum of the mean subtracted from the value for each record, squared, divided by the number of records- 1, square rooted. st.dev. = (X - X ) 2 i  n - 1 Standard Deviation

Introduction to GIS: Lecture #7 (GIS Analysis) GPS Example Data: Elevation Standard deviation Same units as the values of the records, in this case meters. Average amount readings differ from the average Can be above of below the mean Elevation is the mean (459.2 meters) plus or minus the expected error of meters Elevation is most likely to lie between meters and meters. These limits are called the error band or margin of error.

Introduction to GIS: Lecture #7 (GIS Analysis) Mean % % The Bell Curve

Introduction to GIS: Lecture #7 (GIS Analysis) Samples and populations A sample is a set of measurements taken from a larger group or population. Sample means and variances can serve as estimates for their populations. Easier to measure with samples, then draw conclusions about entire population.

Introduction to GIS: Lecture #7 (GIS Analysis) Testing Means Mean elevation of meters standard deviation meters what is the chance of a GPS reading of meters? is 25.3 meters above the mean 0.31 standard deviations ( Z-score) of the curve lies between the mean and this value beyond it

Introduction to GIS: Lecture #7 (GIS Analysis) Hypothesis testing Set up NULL hypothesis (e.g. Values or Means are the same) as H 0 Set up ALTERNATIVE hypothesis. H 1 Test hypothesis. Try to reject NULL. If null hypothesis is rejected alternative is accepted with a calculable level of confidence.

Introduction to GIS: Lecture #7 (GIS Analysis) Testing the Mean Mathematical version of the normal distribution can be used to compute probabilities associated with measurements with known means and standard deviations. A test of means can establish whether two samples from a population are different from each other, or whether the different measures they have are the result of random variation.

Introduction to GIS: Lecture #7 (GIS Analysis) Alternative attribute histograms Skewed above meanAll measurements equally right No error in measurement Random with some Decrease away from zero Bimodal (two averages) systematic error Normal distribution Random distribution

Introduction to GIS: Lecture #7 (GIS Analysis) Accuracy Determined by testing measurements against an independent source of higher fidelity and reliability. Must pay attention to units and significant digits. Can be expressed as a number using statistics (e.g. expected error). Accuracy measures imply accuracy users.

Introduction to GIS: Lecture #7 (GIS Analysis) The difference is the map GIS data description answers the question: Where? GIS data analysis answers the question: Why is it there? GIS data description is different from statistics because the results can be placed onto a map for visual analysis.

Introduction to GIS: Lecture #7 (GIS Analysis) Spatial Statistical Description For coordinates, the means and standard deviations correspond to the mean center and the standard distance A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice. The standard distance for a set of point spatial measurements is the expected spatial error.

Introduction to GIS: Lecture #7 (GIS Analysis) Spatial Statistical Description For coordinates, data extremes define the two corners of a bounding rectangle.

Introduction to GIS: Lecture #7 (GIS Analysis) Mean Center mean y mean x

Introduction to GIS: Lecture #7 (GIS Analysis) Centroid: mean center of a feature

Introduction to GIS: Lecture #7 (GIS Analysis) Comparing spatial means

Introduction to GIS: Lecture #7 (GIS Analysis) GIS and Spatial Analysis Descriptions of geographic properties such as shape, pattern, and distribution are often verbal Quantitative measure can be devised, although few are computed by GIS. GIS statistical computations are most often done using retrieval options such as buffer and spread. Also by manipulating attributes with arithmetic commands (map algebra).

Introduction to GIS: Lecture #7 (GIS Analysis) Statistics and features Figure 6.8 Numbers that describe points, lines, and areas. Points Lines Areas HIGH VALUEMEDIUM VALUELOW VALUE Bounding rectangle Standard distance Number of points Length of line Length index Area in square units Boundary length Number of Holes Area/area of bounding rectangle Area of largest enclosed circle

Introduction to GIS: Lecture #7 (GIS Analysis) An example Lower 48 United States 1996 Data from the U.S. Census on gender Gender Ratio = # females per 100 males Range is What does the spatial distribution look like?

Introduction to GIS: Lecture #7 (GIS Analysis) Gender Ratio by State: 1996

Introduction to GIS: Lecture #7 (GIS Analysis) Searching for Spatial Pattern A linear relationship is a predictable straight- line link between the values of a dependent and an independent variable. (y = a + bx) It is a simple model of the relationship. A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained.

Introduction to GIS: Lecture #7 (GIS Analysis) Simple linear relationship dependent variable independent variable observation best fit regression line y = a + bx intercept gradient y=a+bx

Introduction to GIS: Lecture #7 (GIS Analysis) Testing the relationship gr = long.

Introduction to GIS: Lecture #7 (GIS Analysis) Patterns in Residual Mapping Differences between observed values of the dependent variable and those predicted by a model are called residuals. A GIS allows residuals to be mapped and examined for spatial patterns. A model helps explanation and prediction after the GIS analysis. A model should be simple, should explain what it represents, and should be examined in the limits before use. We should always examine the limits of the model’s applicability (e.g. Does the regression apply to Europe?)

Introduction to GIS: Lecture #7 (GIS Analysis) Mapping residuals from a model

Introduction to GIS: Lecture #7 (GIS Analysis) Unexplained variance More variables?More variables? Different extent?Different extent? More records?More records? More spatial dimensions?More spatial dimensions? More complexity?More complexity? Another model?Another model? Another approach?Another approach?

Introduction to GIS: Lecture #7 (GIS Analysis) Example: Spatial Analysis meters to water table ? resolution? extent? accuracy? precision? boundary effects? point spacing?

Introduction to GIS: Lecture #7 (GIS Analysis) GIS and Spatial Analysis Geographic inquiry examines the relationships between geographic features collectively to help describe and understand the real-world phenomena that the map represents. Spatial analysis compares maps, investigates variation over space, and predicts future or unknown maps. Many GIS systems have to be coaxed to generate a full set of spatial statistics.

Introduction to GIS: Lecture #7 (GIS Analysis) Analytic Tools and GIS Tools for searching out spatial relationships and for modeling are only lately being integrated into GIS. Statistical and spatial analytical tools are also only now being integrated into GIS, and many people use separate software systems outside the GIS. Real geographic phenomena are dynamic, but GISs have been mostly static. Time-slice and animation methods can help in visualizing and analyzing spatial trends. GIS places real-world data into an organizational framework that allows numerical description and allows the analyst to model, analyze, and predict with both the map and the attribute data.