Why Is It There? Getting Started with Geographic Information Systems Chapter 6.


1 Why Is It There? Getting Started with Geographic Information Systems Chapter 6

2 6 Why Is It There? l 6.1 Describing Attributes l 6.2 Statistical Analysis l 6.3 Spatial Description l 6.4 Spatial Analysis l 6.5 Searching for Spatial Relationships l 6.6 GIS and Spatial Analysis

3 Dueker (1979): "A geographic information system is a special case of information systems where the database consists of observations on spatially distributed features, activities or events, which are definable in space as points, lines, or areas. A geographic information system manipulates data about these points, lines, and areas to retrieve data for ad hoc queries and analyses."

4 GIS is capable of data analysis l Attribute Data –Describe with statistics –Analyze with hypothesis testing l Spatial Data –Describe with maps –Analyze with spatial analysis

5 Describing one attribute

6 Attribute Description l The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute. l A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar. l For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.
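The extremes, range, and histogram described above can be sketched in a few lines of Python. The elevation readings and the bin width of 25 are made-up values for illustration only; they do not come from the slides.

```python
from collections import Counter

# Hypothetical elevation readings in meters (illustrative values only).
values = [402, 415, 431, 447, 452, 459, 463, 470, 488, 515]

# Extremes and range, in the units of the attribute.
lowest, highest = min(values), max(values)
value_range = highest - lowest

# Group values by magnitude into fixed-width bins and count records per bin.
bin_width = 25  # assumed bin width
histogram = Counter((v - lowest) // bin_width for v in values)

# Draw each bin as a variable-length bar.
for bin_index in sorted(histogram):
    start = lowest + bin_index * bin_width
    print(f"{start}-{start + bin_width - 1}: {'#' * histogram[bin_index]}")
```

With enough records and purely random measurement error, the bars would be expected to approach the bell shape mentioned on the slide.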

7 If the records are: l Text –Length of text –word frequency –address matching l Example: Display all places called “State Street”

8 If the records are: l Classes –histogram by class –numbers in class –contiguity description

9 Describing a classed raster grid [Figure: classed raster grid; labels 5, 10, 15, 20] P(blue) = 19/48
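A class proportion such as P(blue) = 19/48 is just a count of cells in one class divided by the total number of cells. The sketch below assumes a 6 x 8 grid and invents the other class names and their counts; only the 19 blue cells out of 48 match the slide.

```python
from collections import Counter

# Hypothetical 6 x 8 classed raster (48 cells). The layout and the "green"
# and "red" classes are made up; only 19 blue cells out of 48 match the slide.
rows, cols = 6, 8
cells = ["blue"] * 19 + ["green"] * 17 + ["red"] * 12
grid = [cells[r * cols:(r + 1) * cols] for r in range(rows)]

# Histogram by class: number of cells in each class.
counts = Counter(cell for row in grid for cell in row)
total = rows * cols

# Proportion of the grid occupied by each class.
proportions = {name: n / total for name, n in counts.items()}
print(f"P(blue) = {counts['blue']}/{total} = {proportions['blue']:.3f}")
```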

10 If the records are: l Numbers –statistical description –min, max, range –variance and standard deviation

11 Statistical description l Range (min, max, max-min) l Central tendency (mode, median, mean) l Variation (variance, standard deviation)
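As a rough illustration, the three groups of descriptors listed above map onto Python's standard `statistics` module. The readings are hypothetical, and `statistics.variance` and `statistics.stdev` are the sample versions (dividing by the number of records less one).

```python
import statistics

# Hypothetical numeric attribute values (illustrative only).
readings = [450, 455, 455, 460, 470, 480, 520]

summary = {
    # Range
    "min": min(readings),
    "max": max(readings),
    "range": max(readings) - min(readings),
    # Central tendency
    "mode": statistics.mode(readings),
    "median": statistics.median(readings),
    "mean": statistics.mean(readings),
    # Variation (sample statistics, n - 1 in the denominator)
    "variance": statistics.variance(readings),
    "stdev": statistics.stdev(readings),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```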

12 Elevation (book example)

13 Mean l Statistical average l Sum of the values for one attribute divided by the number of records: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

14 Computing the Mean l Sum of attribute values across all records, divided by the number of records. l The mean is a representative value; for measurements with normally distributed error, it converges on the true reading as the number of records grows. l A value lacking sufficient data for computation is called a missing value.

15 Variance l The total variance is the sum, over all records, of the squared difference between each value and the mean. l Dividing the total variance by the number of records less one and taking the square root gives the standard deviation.

16 Standard Deviation l Average difference from the mean l Subtract the mean from the value for each record, square each difference, sum the squares, divide by the number of records minus 1, and take the square root. $\text{st.dev.} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
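The standard deviation can be computed step by step as follows. This is a minimal sketch; the function name `sample_std_dev` and the elevation values are hypothetical, and the result is compared against the standard library's `statistics.stdev`.

```python
import math
import statistics


def sample_std_dev(values):
    """Standard deviation with the number of records less one in the denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two records are needed")
    mean = sum(values) / n
    # Total variance: sum of squared differences from the mean.
    total_variance = sum((x - mean) ** 2 for x in values)
    return math.sqrt(total_variance / (n - 1))


# Hypothetical elevation readings in meters.
elevations = [350.0, 420.5, 459.0, 498.5, 568.0]
print(sample_std_dev(elevations))
print(statistics.stdev(elevations))  # should agree with the line above
```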

17 GPS Example Data: Elevation Standard deviation l Same units as the values of the records, in this case meters. l The average amount by which the readings differ from the average l Can be above or below the mean l Elevation is the mean (459.2 meters), plus or minus the expected error of 82.92 meters l Elevation is most likely to lie between 376.28 meters and 542.12 meters. l These limits are called the error band or margin of error.
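The error band is the mean plus or minus one standard deviation. The short sketch below reproduces the slide's limits from its two figures (459.2 m and 82.92 m); the function name `error_band` is an illustrative choice.

```python
def error_band(mean, std_dev, digits=2):
    """Return (lower, upper) limits one standard deviation either side of the mean."""
    return round(mean - std_dev, digits), round(mean + std_dev, digits)


# Values taken from the slide, in meters.
lower, upper = error_band(459.2, 82.92)
print(f"Elevation most likely lies between {lower} m and {upper} m")
```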

18 Hypothesis testing l Establish NULL hypothesis (e.g. Values or Means are the same) l Establish ALTERNATIVE hypothesis, based on some expectation. l Test hypothesis. Try to reject NULL. l If null hypothesis is rejected, there is some support for the alternative (theory-based) hypothesis.

19 Uses of the standard deviation l Shorthand description: given the mean and s.d., we know where about 68% of normally distributed values lie (within one s.d. of the mean). l A standardized measure: –a score of 80% can be good or bad, depending on the mean and s.d.
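One common way to use the standard deviation as a standardized measure is the z-score, the number of standard deviations a value lies above or below the mean. The slide does not name the z-score, so treat this as one possible reading; the class means and standard deviations are invented.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations by which value differs from the mean."""
    return (value - mean) / std_dev


# The same score of 80 in two hypothetical classes.
good = z_score(80, mean=70, std_dev=5)  # well above the class mean
bad = z_score(80, mean=90, std_dev=5)   # well below the class mean
print(good, bad)
```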

20 Testing the Mean l A test of means can establish whether two samples from a population are different from each other, or whether the different measures they have are the result of random variation.
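A test of means can be sketched with a two-sample t statistic. The slides do not say which test is used; the version below is Welch's form, chosen here as an assumption because it does not require equal variances. The two samples are hypothetical, and only the statistic is computed: judging significance still needs a t table or a statistics package.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical elevation readings (meters) from two receivers.
sample_a = [455, 460, 462, 458, 465]
sample_b = [470, 475, 468, 480, 472]

t = welch_t(sample_a, sample_b)
print(f"t = {t:.3f}")
```

A t value far from zero argues against the null hypothesis that the two means are the same; a value near zero is consistent with random variation.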

21 Samples and populations l A sample is a set of measurements taken from a larger group or population. l Sample means and variances can serve as estimates for their populations.

22 Spatial analysis with GIS l GIS data description answers the question: Where? l GIS data analysis answers the question: Why is it there? l GIS data description is different from statistics because the results can be placed onto a map for visual analysis.

23 Spatial Statistical Description l For coordinates, the means and standard deviations correspond to the mean center and the standard distance l A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice. l The standard distance for a set of point spatial measurements is the expected spatial error.

24 Spatial Statistical Description l For coordinates, data extremes define the two corners of a bounding rectangle.
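The mean center, standard distance, and bounding rectangle can be sketched for a small set of points. The coordinates are hypothetical, and the standard distance here divides by the number of points, which is one common convention; other texts divide by a slightly smaller number.

```python
import math


def mean_center(points):
    """Mean x and mean y of a list of (x, y) points."""
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def standard_distance(points):
    """Root mean squared distance of the points from their mean center."""
    cx, cy = mean_center(points)
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / len(points))  # assumption: divide by n


def bounding_rectangle(points):
    """Lower-left and upper-right corners defined by the coordinate extremes."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys)), (max(xs), max(ys))


# Hypothetical point measurements in map units.
points = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(mean_center(points))
print(standard_distance(points))
print(bounding_rectangle(points))
```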

25 Geographic extremes l Southernmost point in the continental United States. l Range: e.g. elevation difference; map extent

26 Mean Center [Figure: point distribution with the mean center marked at (mean x, mean y)]

27 Centroid: mean center of a feature

28 GIS and Spatial Analysis l Descriptions of geographic properties such as shape, pattern, and distribution are often verbal. l Quantitative measures can be devised, although few are computed by GIS. l GIS statistical computations are most often done using retrieval options such as buffer and spread. l Also by manipulating attributes with arithmetic commands (map algebra).
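Map algebra, in its simplest form, applies an arithmetic command cell by cell to grids that cover the same area. The sketch below is a plain-Python illustration, not any particular GIS package's API; the function name `local_operation` and both grids are hypothetical.

```python
def local_operation(grid_a, grid_b, operation):
    """Apply a cell-by-cell operation to two grids of the same shape."""
    same_shape = len(grid_a) == len(grid_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(grid_a, grid_b)
    )
    if not same_shape:
        raise ValueError("grids must have the same number of rows and columns")
    return [
        [operation(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(grid_a, grid_b)
    ]


# Hypothetical 2 x 2 rasters: ground elevation and snow depth, both in meters.
elevation_m = [[100, 120], [140, 160]]
snow_depth_m = [[1, 2], [0, 3]]

# Surface height = elevation + snow depth, computed per cell.
surface_m = local_operation(elevation_m, snow_depth_m, lambda a, b: a + b)
print(surface_m)
```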

29 An example l Lower 48 United States l 1994 Data from the U.S. Census on gender l Gender Ratio = # females per 100 males l Range is 97 - 108 l What does the spatial distribution look like?

30 Gender Ratio by State: 1994

31 Searching for Spatial Pattern l A linear relationship is a predictable straight-line link between the values of a dependent and an independent variable. It is a simple model of the relationship. l A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained.
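A least squares fit of a straight line y = a + bx, together with the coefficient of determination, can be sketched as below. The function names and the data points are hypothetical; the formulas are the standard ordinary least squares ones.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx           # gradient
    a = mean_y - b * mean_x  # intercept
    return a, b


def r_squared(xs, ys, a, b):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


# Hypothetical observations of an independent (x) and a dependent (y) variable.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

a, b = fit_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x, r-squared = {r_squared(xs, ys, a, b):.4f}")
```

An r-squared close to 1 means the line accounts for most of the variance; a value near 0 means it explains very little.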

32 Simple linear relationship [Figure: scatter of observations with the best-fit regression line y = a + bx; horizontal axis: independent variable; vertical axis: dependent variable; a is the intercept and b is the gradient]

33 Testing the relationship gr = 117.46 + 0.138 × long. (gr: gender ratio; long.: longitude)

34 Patterns in Residual Mapping l Differences between observed values of the dependent variable and those predicted by a model are called residuals. l A GIS allows residuals to be mapped and examined for spatial patterns. l A model helps explanation and prediction after the GIS analysis. l A model should be simple, should explain what it represents, and should be examined at its limits before use.
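Residuals can be computed by subtracting the model's prediction from each observed value. The sketch below uses the fitted gender ratio model from the earlier slide (gr = 117.46 + 0.138 × longitude) and assumes longitude is in signed decimal degrees with west negative, which the slides do not state. The state labels and observed ratios are invented.

```python
def predicted_gender_ratio(longitude):
    """Model from the slides; longitude assumed in signed degrees (west negative)."""
    return 117.46 + 0.138 * longitude


# Hypothetical records: (label, longitude, observed gender ratio).
records = [
    ("state A", -120.0, 99.0),
    ("state B", -75.0, 107.5),
]

# Residual = observed value minus the value predicted by the model.
residuals = {
    label: observed - predicted_gender_ratio(longitude)
    for label, longitude, observed in records
}

for label, residual in residuals.items():
    print(f"{label}: residual = {residual:+.2f}")
```

Joining these residuals back to the state polygons would let them be mapped and inspected for spatial clustering.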

35 Mapping residuals from a model

36 Unexplained variance l More variables? l Different extent? l More records? l More spatial dimensions? l More complexity? l Another model? l Another approach?

37 GIS and Spatial Analysis l Many GISs have to be coaxed to generate a full set of spatial statistics.

38 Analytic Tools and GIS l Tools for searching out spatial relationships and for modeling are only lately being integrated into GIS. l Statistical and spatial analytical tools are also only now being integrated into GIS, and many people use separate software systems outside the GIS: “loosely coupled” analyses.

39 Analytic Tools and GIS l Real geographic phenomena are dynamic, but GISs have been mostly static. Time-slice and animation methods can help in visualizing and analyzing spatial trends. l GIS organizes real-world data to allow numerical description and allows the analyst to model, analyze, and predict with both the map and the attribute data.

40 You can lie with... l Maps l Statistics l Correlation is not causation!

