Geographic Information Systems (GIS): Spatial Analysis November 1, 2005.


1 Geographic Information Systems (GIS): Spatial Analysis November 1, 2005

2 Notes: Oslo project groups. Assignment due date: December 15, 2005. Mid-term quiz 2: November 8. Progress in GI Science eSeminar Series.

3 Existing Groups
1. Marita Sanni, Julie Aaraas, Kristin I. Dankel, Solveig Melå (4)
2. Åslaug Enger Olsen, Maria Lyngstad, Guro Bakke Håndlykken and Jorunn Randby (M3)
3. Nina Ambro Knutsen, Ellen Winje and Leif Ingholm (3)
4. Birte Mobraaten, Hans Petter Wiken, Silje Hernes and Bente Lise Stubberud (4)
5. Daniel Molin, Ida Sjølander, Anne-Lise Folland and Nicolai Steineger (4)
6. Hæge Skjæveland, Marie Aaberge, Cecilie Hirsch, Kaja Korsnes Kristensen
7. Urs Dippon, Steven huiching Yip, Harald Kvifte and Eirik Waag
8. Marthe Stiansen, Marielle Stigum, Tomas Nesset, Andreas Skjetne
9. Gjermund Steinskog (Archaeology - M16-18)
10. Solveig Lyby (Archaeology - M10-12)
11. Andreas Dyken, Håkon Grevbo, Terje-Andre Gudmundsen (3)

4 Project Examples from 2004: Accessibility of medical centres in the Grorud district. Immigrants' settlement patterns. Distinctions in Oslo: a Bourdieusian analysis of inequality using geographic information systems. Social divides in Oslo. Social inequalities in Oslo. Income and housing structure in Oslo, with a focus on the Gamle Oslo district. Privatization and income level in the Vestre Aker district.

5 GI Science eSeminar Series

6 Outline for Today’s Lecture: What is spatial analysis? Queries and reasoning; measurements; spatial interpolation; descriptive summaries; optimization; hypothesis testing.

7 Spatial Analysis: Turns raw data into useful information by adding greater informative content and value. Reveals patterns, trends, and anomalies that might otherwise be missed. Provides a check on human intuition by helping in situations where the eye might deceive.

8 Definitions: A method of analysis is spatial if the results depend on the locations of the objects being analyzed: move the objects and the results change, i.e., the results are not invariant under relocation. Spatial analysis requires both the attributes and the locations of objects, and a GIS has been designed to store both.

9 The Snow Map (cholera outbreaks in the 1850s): Provides a classic example of the use of location to draw inferences. But the same pattern could arise from contagion (cholera spread through the air) if the original carrier lived in the center of the outbreak, and contagion was the hypothesis Snow was trying to refute. Today, a GIS could be used to show a sequence of maps as the outbreak developed: contagion would produce a concentric sequence, drinking water a random sequence.

10 Types of Spatial Analysis: There are literally thousands of techniques. Six categories are used in this course, each having a distinct conceptual basis: queries and reasoning; measurements; transformations; descriptive summaries; optimization; hypothesis testing.

11 Queries and Reasoning: A GIS can respond to queries by presenting data in appropriate views and allowing the user to interact with each view. It is often useful to be able to display two or more views at once and to link them together; linking views is one important technique of exploratory spatial data analysis (ESDA).

12 The Catalog View Shows folders, databases, and files on the left, and a preview of the contents of a selected data set on the right. The preview can be used to query the data set’s metadata, or to look at a thumbnail map, or at a table of attributes. This example shows ESRI’s ArcCatalog.

13 The Map View A user can interact with a map view to identify objects and query their attributes, to search for objects meeting specified criteria, or to find the coordinates of objects. This illustration uses ESRI’s ArcMap.

14 The Table View Here attributes are displayed in the form of a table, linked to a map view. When objects are selected in the table, they are automatically highlighted in the map view, and vice versa. The table view can be used to answer simple queries about objects and their attributes.

15 Measurements: Many tasks require measurement from maps, e.g., measurement of the distance between two points, or of area, such as the area of a parcel of land. Such measurements are tedious and inaccurate if made by hand; measurement using GIS tools and digital databases is fast, reliable, and accurate.

16 Measurement of Length: A metric is a rule for determining distance from coordinates. The Pythagorean metric gives the straight-line distance between two points on a flat plane. The Great Circle metric gives the shortest distance between two points on a spherical globe, given their latitudes and longitudes.
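
The two metrics on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the great-circle distance is computed with the haversine formula (one common way to evaluate it), and the mean Earth radius of 6371 km is an assumed constant that does not come from the slides.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the slides


def pythagorean_distance(x1, y1, x2, y2):
    """Straight-line distance between two points on a flat plane."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Shortest distance over a sphere (haversine formula); inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))
```

The Pythagorean version is only appropriate for projected (planar) coordinates; applying it directly to latitudes and longitudes gives misleading results away from the equator.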

17 Issues with Length Measurement The length of a true curve is almost always longer than the length of its polyline or polygon representation

18 Issues with Length Measurement: Measurements in GIS are often made on horizontal projections of objects, so length and area may be substantially lower than on a true three-dimensional surface.

19 Measurement of Area: Calculate and sum the areas of a series of polygons, formed by dropping perpendiculars to the x axis. Subtract the area of the extended trapezium (in this case, a rectangle). The area for each polygon is calculated as the difference in x times the average of y. (Figure labels: x1, x2, y1, y2.)
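
A minimal sketch of this area calculation, assuming the polygon is simple (non-self-intersecting) and given as an ordered list of (x, y) vertices. The function name is hypothetical. Each edge contributes the signed area of the trapezium beneath it, so the "subtract" step on the slide happens automatically through the sign of the difference in x.

```python
def polygon_area(vertices):
    """Area of a simple polygon given as [(x, y), ...] in boundary order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        # Signed area of the trapezium under this edge:
        # difference in x times the average of y.
        total += (x2 - x1) * (y1 + y2) / 2.0
    # The sign depends on whether vertices run clockwise or anticlockwise.
    return abs(total)
```

For example, `polygon_area([(0, 0), (2, 0), (2, 2), (0, 2)])` evaluates the four edges of a 2 by 2 square and returns 4.0.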

20 Measurement of Shape: Shape measures capture the degree of contortedness of areas, relative to the most compact circular shape, by comparing perimeter to the square root of area. The measure is normalized so that the shape of a circle is 1; the more contorted the area, the higher the shape measure.
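
One shape measure that fits this description is perimeter divided by 2 times the square root of pi times area. The slide does not give the exact formula, so treat the normalizing constant and the function name below as assumptions chosen so that a circle scores exactly 1.

```python
import math


def shape_index(perimeter, area):
    """Perimeter relative to that of a circle with the same area (circle = 1)."""
    return perimeter / (2.0 * math.sqrt(math.pi * area))
```

A unit square has perimeter 4 and area 1, giving a value of about 1.13; long, thin, or convoluted districts score considerably higher.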

21 Shape as an indicator of gerrymandering in elections. The 12th Congressional District of North Carolina was drawn in 1992 using a GIS, and designed to be a majority-minority district: with a majority of African American voters, it could be expected to return an African American to Congress. This objective was achieved at the cost of a very contorted shape. The U.S. Supreme Court eventually rejected the design.

22 Slope and Aspect: Calculated from a grid of elevations (a digital elevation model). Slope and aspect are calculated at each point in the grid by comparing the point’s elevation to that of its neighbors, usually its eight neighbors, but the exact method varies. In a scientific study, it is important to know exactly what method is used when calculating slope, and exactly how slope is defined.

23 Alternative Definitions of Slope: (1) the angle between the surface and the horizontal, range 0 to 90 degrees; (2) the ratio of the change in elevation to the actual distance traveled, range 0 to 1; (3) the ratio of the change in elevation to the horizontal distance traveled, range 0 to infinity.
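
To see how the three definitions relate, the sketch below computes all of them from the same rise (change in elevation) and run (horizontal distance). The function and argument names are hypothetical, and the code assumes a non-negative rise and a positive run in the same units.

```python
import math


def slope_measures(rise, run):
    """Return (angle in degrees, rise / actual distance, rise / horizontal distance)."""
    angle_deg = math.degrees(math.atan2(rise, run))  # 0 to 90 degrees
    ratio_actual = rise / math.hypot(rise, run)      # sine of the angle, 0 to 1
    ratio_horizontal = rise / run                    # tangent of the angle, 0 to infinity
    return angle_deg, ratio_actual, ratio_horizontal
```

A rise of 1 over a run of 1 is a 45 degree slope, about 0.71 by the second definition and exactly 1 (often quoted as 100 percent) by the third, which is why a study needs to state which definition it uses.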

24 Transformations: Create new objects and attributes, based on simple rules involving geometric construction or calculation. May also create new fields, from existing fields or from discrete objects.

25 Buffering (Dilation): Create a new object consisting of areas within a user-defined distance of an existing object, e.g., to determine areas impacted by a proposed highway or the service area of a proposed hospital. Feasible in either raster or vector mode.

26 Buffering: point, line, polygon.

27 Raster Buffering Generalized: Vary the distance buffered according to values in a friction layer. (Figure legend: city limits; areas reachable in 5 minutes; areas reachable in 10 minutes; other areas.)

28 Point in Polygon Transformation: Determine whether a point lies inside or outside a polygon (enclosure). This is the basis for answering many simple queries, and is used to assign crimes to police precincts, voters to voting districts, and accidents to reporting counties.

29 The Point in Polygon Algorithm Draw a line from the point to infinity in any direction, and count the number of intersections between this line and each polygon’s boundary. The polygon with an odd number of intersections is the containing polygon: all other polygons have an even number of intersections.
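
The odd/even counting rule can be sketched as follows, assuming the polygon is a list of (x, y) vertices and the ray is drawn horizontally towards positive x. The function name is hypothetical, and points lying exactly on the boundary are not treated specially in this sketch.

```python
def point_in_polygon(px, py, polygon):
    """True if (px, py) is inside the polygon [(x, y), ...], by ray casting."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate at which the edge crosses that horizontal line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                # The ray towards +x crosses this edge: flip odd/even.
                inside = not inside
    return inside
```

To assign a point to one of many polygons (for example, a crime to a precinct), the same test is applied to each candidate polygon until one reports an odd number of crossings.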

30 Polygon Overlay: Two cases: for discrete objects and for fields. Discrete object case: find the polygons formed by the intersection of two polygons. There are many related questions, e.g.: Do two polygons intersect? Which areas fall in Polygon A but not in Polygon B? The complexity of computing polygon overlays was one of the greatest barriers to the development of vector GIS.

31 Polygon Overlay, Discrete Object Case: In this example, two polygons are intersected to form 9 new polygons. One is formed from both input polygons; four are formed by Polygon A and not Polygon B; and four are formed by Polygon B and not Polygon A. (Figure labels: A, B.)

32 Polygon Overlay, Field Case: Two complete layers of polygons are input, representing two classifications of the same area, e.g., soil type and land ownership. The layers are overlaid and all intersections are computed, creating a new layer. Each polygon in the new layer has both a soil type and a land ownership; the attributes are said to be concatenated. The task is often performed in raster.

33 Polygon overlay, field case. A layer representing a field of land ownership (colors) is overlaid on a layer of soil type (layers offset for emphasis). The result after overlay will be a single layer with 5 polygons, each with a land ownership value and a soil type. (Figure legend: Owner X, Owner Y, Public.)

34 Spurious or Sliver Polygons: In any two such layers there will almost certainly be boundaries that are common to both layers, e.g., following rivers. The two versions of such boundaries will not be coincident. As a result, large numbers of small sliver polygons will be created; these must somehow be removed, which is normally done using a user-defined tolerance.

35 Overlay of fields represented as rasters. The two input data sets are maps of (A) travel time from the urban area shown in black, and (B) county (red indicates County X, white indicates County Y). The output map identifies travel time to areas in County Y only, and might be used to compute average travel time to points in that county in a subsequent step.

36 Spatial Interpolation: Values of a field have been measured at a number of sample points. There is a need to estimate the complete field: to estimate values at points where the field was not measured, or to create a contour map by drawing isolines between the data points. Methods of spatial interpolation are designed to solve this problem.

37 Spatial Interpolation: Thiessen polygons define individual areas of influence around each of a set of points. They are polygons whose boundaries define the area that is closest to each point relative to all other points, defined by the perpendicular bisectors of the lines between all points.
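
Because a Thiessen polygon is simply the set of locations closer to one sample point than to any other, membership can be decided without constructing the polygon boundaries at all. The sketch below (hypothetical function name, planar coordinates assumed) returns the index of the sample point whose Thiessen polygon contains a given location.

```python
import math


def nearest_site(point, sites):
    """Index of the site whose Thiessen polygon contains `point`."""
    # The containing polygon belongs to the closest site by straight-line distance.
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))
```

Used as an interpolator, this assigns every location the value measured at its nearest sample point, which produces a surface with abrupt steps at the polygon boundaries.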

39 Inverse Distance Weighting (IDW): The unknown value of a field at a point is estimated by taking an average over the known values, weighting each known value by a decreasing function of its distance from the point, so that the nearest points receive the greatest weight. This is an implementation of Tobler’s Law.

40 Point i has known value z_i, location x_i, weight w_i, and distance d_i from the location x of the unknown value (to be interpolated). The estimate is a weighted average. Weights decline with distance.
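
A minimal IDW sketch follows. The estimate is the weighted average, i.e. the sum of w_i z_i divided by the sum of w_i. The slide text only says that weights decline with distance, so the specific choice of weight used here, one over distance raised to a power (default 2), is an assumption, as are the function and parameter names.

```python
import math


def idw(x, samples, power=2.0):
    """Estimate the field at location x from samples [((xi, yi), zi), ...]."""
    weighted_sum = 0.0
    weight_total = 0.0
    for location, z in samples:
        d = math.dist(x, location)
        if d == 0.0:
            return z  # exactly at a sample point: return the measured value
        w = 1.0 / d ** power  # assumed weight function: declines with distance
        weighted_sum += w * z
        weight_total += w
    return weighted_sum / weight_total
```

Because the result is an average with positive weights, it can never fall outside the range of the sample values, and far from all samples the weights become nearly equal so the estimate drifts towards the overall mean.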

41 Issues with IDW: The range of interpolated values cannot exceed the range of observed values, so it is important to position sample points to include the extremes of the field. This can be very difficult.

42 A Potentially Undesirable Characteristic of IDW interpolation This set of six data points clearly suggests a hill profile (dashed line). But in areas where there is little or no data the interpolator will move towards the overall mean (solid line).

43 Kriging: A technique of spatial interpolation firmly grounded in geostatistical theory. Kriging is based on the assumption that the parameter being interpolated can be treated as a regionalized variable (intermediate between a truly random and a completely deterministic variable). Points near each other have a certain degree of spatial autocorrelation, and points that are widely separated are statistically independent. Kriging is a set of linear regression routines which minimize estimation variance from a predefined covariance model.

44 A semivariogram. Each cross represents a pair of points. The solid circles are obtained by averaging within the ranges or bins of the distance axis. The solid line represents the best fit to these five points, using one of a small number of standard mathematical functions.
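
The binned points of such a plot can be computed as sketched below. This assumes the usual definition of semivariance for a pair of points, half the squared difference of their values, and uses equal-width distance bins; the function name, the bin layout, and the sample format are illustrative assumptions. Fitting the smooth curve through the binned values is a separate step not shown here.

```python
import math
from itertools import combinations


def empirical_semivariogram(samples, bin_width, n_bins):
    """Mean semivariance per distance bin for samples [((x, y), z), ...]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for (p, zp), (q, zq) in combinations(samples, 2):
        # Each pair of points is one "cross" on the plot.
        b = int(math.dist(p, q) // bin_width)
        if b < n_bins:
            sums[b] += 0.5 * (zp - zq) ** 2
            counts[b] += 1
    # Averaging within each bin gives the "solid circles"; empty bins are None.
    return [s / c if c else None for s, c in zip(sums, counts)]
```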

45 Stages of Kriging: Analyze the observed data to estimate a semivariogram. Estimate values at unknown points as weighted averages, obtaining the weights from the semivariogram. The interpolated surface replicates statistical properties of the semivariogram.

46 Density Estimation and Potential: Spatial interpolation is used to fill the gaps in a field. Density estimation creates a field from discrete objects: the field’s value at any point is an estimate of the density of discrete objects at that point, e.g., estimating a map of population density (a field) from a map of individual people (discrete objects).

47 The Kernel Function: Each discrete object is replaced by a mathematical function known as a kernel. Kernels are summed to obtain a composite surface of density. The smoothness of the resulting field depends on the width of the kernel: narrow kernels produce bumpy surfaces, wide kernels produce smooth surfaces.
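
The summing of kernels can be sketched as below. The slides do not say which kernel function is used, so a two-dimensional Gaussian is assumed here purely for illustration, with `bandwidth` (a hypothetical parameter name) playing the role of the kernel width.

```python
import math


def kernel_density(x, y, points, bandwidth):
    """Density at (x, y): sum of Gaussian kernels centred on each point."""
    # Normalizing constant so each kernel integrates to 1 (one object each).
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total
```

Evaluating this function at every cell of a raster yields the density surface. With a small bandwidth each object produces its own sharp peak; with a large one the peaks merge into a smooth surface.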

48 A typical kernel function. The result of applying a 150 km wide kernel to points distributed over California.

49 When the kernel width is too small (in this case 16 km, using only the southern California part of the database) the surface is too rugged, and each point generates its own peak.

50 Other types of spatial analysis Data mining Descriptive summaries Optimization Hypothesis testing

51 Data Mining: Analysis of massive data sets in search of patterns, anomalies, and trends. It is spatial analysis applied on a large scale, and must be semi-automated because of data volumes. Widely used in practice, e.g., to detect unusual patterns in credit card use.

52 Descriptive Summaries: Attempt to summarize useful properties of data sets in one or two statistics. The mean or average is widely used to summarize data; centers are the spatial equivalent, and there are several ways of defining centers.

53 The Centroid: Found for a point set by taking the weighted average of coordinates. The balance point.
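
As a minimal sketch, the centroid of a point set is the weighted average of the x coordinates paired with the weighted average of the y coordinates. The function name is hypothetical, and equal weights are assumed when none are supplied.

```python
def weighted_centroid(points, weights=None):
    """Weighted mean centre of points [(x, y), ...]."""
    if weights is None:
        weights = [1.0] * len(points)  # unweighted case: every point counts equally
    total = sum(weights)
    cx = sum(w * x for (x, _), w in zip(points, weights)) / total
    cy = sum(w * y for (_, y), w in zip(points, weights)) / total
    return cx, cy
```

With population counts as weights, for instance, the result is the balance point of the population rather than of the point locations alone.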

54 The Histogram: A useful summary of the values of an attribute, showing the relative frequencies of different values. A histogram view can be linked to other views, e.g., click on a bar in the histogram view and objects with attributes in that range are highlighted in a linked map view.

55 A histogram or bar graph, showing the relative frequencies of values of a selected attribute. The attribute is the length of street between intersections. Lengths of around 100 m are the most common.

56 Spatial Dependence: There are many ways of measuring this very important summary property. Most methods have been developed for points. Patterns can be random, clustered, or dispersed. Measures differ for unlabeled and labeled features (e.g., individual house locations versus housing types).

57 Dispersion: A measure of the spread of points around a center (“standard deviation”). Related to the width of the kernel used in density estimation.
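
One common way to express this spread is the root-mean-square distance of the points from their mean centre, often called the standard distance. The slide does not name a specific formula, so the choice below (unweighted, dividing by n) and the function name are assumptions.

```python
import math


def standard_distance(points):
    """Root-mean-square distance of points [(x, y), ...] from their mean centre."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
    return math.sqrt(squared / n)
```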

58 Fragmentation Statistics: Measure the patchiness of data sets, e.g., of vegetation cover in an area. Useful in landscape ecology, because of the importance of habitat fragmentation in determining the success of animal and bird populations: populations are less likely to survive in highly fragmented landscapes.

59 Three images of part of the state of Rondonia in Brazil, for 1975, 1986, and 1992. Note the increasing fragmentation of the natural habitat as a result of settlement. Such fragmentation can adversely affect the success of wildlife populations.

60 Optimization: Spatial analysis can be used to solve many design problems or to create improved designs (minimizing distance traveled or construction costs, maximizing profit). A spatial decision support system (SDSS) is an adaptation of GIS aimed at solving a particular design problem.

61 Optimizing Point Locations: The minimum aggregate travel (MAT) problem is a simple case: one location, with the goal of minimizing total distance traveled to get there. The operator of a chain of convenience stores (e.g., Seven Eleven) might want to solve for many locations at once: Where are the best locations to add new stores? Which existing stores should be dropped?
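
For straight-line travel on a plane, the single-location MAT point has no closed-form solution, but it can be approximated iteratively. The sketch below uses Weiszfeld's iteration, a technique the slides do not mention and which is swapped in here only as an illustration; the function names are hypothetical, and skipping points that coincide with the current estimate is a simplification.

```python
import math


def total_distance(x, y, points):
    """Aggregate straight-line travel from all points to (x, y)."""
    return sum(math.hypot(px - x, py - y) for px, py in points)


def mat_point(points, iterations=200, tol=1e-9):
    """Approximate the minimum aggregate travel point by Weiszfeld's iteration."""
    # Start from the centroid, then repeatedly re-average with 1/distance weights.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = den = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < tol:
                continue  # simplification: skip a point coinciding with the estimate
            num_x += px / d
            num_y += py / d
            den += 1.0 / d
        if den == 0.0:
            break
        new_x, new_y = num_x / den, num_y / den
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return x, y
```

The MAT point is generally not the centroid: an outlying demand point pulls the centroid towards it much more strongly than it pulls the MAT point.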

62 Routing Problems: Search for optimum routes among several destinations. The traveling salesperson problem: find the shortest tour from an origin, through a set of destinations, and back to the origin.
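
For a handful of destinations the traveling salesperson problem can be solved exactly by trying every visiting order, as in the sketch below. The function name is hypothetical and straight-line distances are assumed; the number of orders grows factorially, so practical routing software relies on heuristics rather than this brute-force search.

```python
import math
from itertools import permutations


def shortest_tour(origin, destinations):
    """Exhaustively find the shortest round trip; returns (order, length)."""
    best_order, best_length = None, math.inf
    for order in permutations(destinations):
        route = [origin, *order, origin]  # leave the origin and return to it
        length = sum(math.dist(a, b) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_order, best_length = list(order), length
    return best_order, best_length
```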

63 Routing service technicians for Schindler Elevator. Every day this company’s service crews must visit a different set of locations in Los Angeles. GIS is used to partition the day’s workload among the crews and trucks (color coding) and to optimize the route to minimize time and cost.

64 Optimum Paths: Find the best path across a continuous cost surface between a defined origin and destination, to minimize total cost. Cost may combine construction, environmental impact, land acquisition, and operating cost. Used to locate highways, power lines, and pipelines. Requires a raster representation.
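
On a raster, a least-cost path can be found with Dijkstra's shortest-path algorithm, treating each cell as a node. The sketch below is one simple formulation under stated assumptions: moves are allowed only between the four edge-adjacent cells, the cost of a path is the sum of the cell values it passes through (including the start cell), and all costs are non-negative. The function name is hypothetical, and GIS packages typically use more refined rules, such as diagonal moves.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Dijkstra on a 2D cost grid; start and goal are (row, col) tuples."""
    rows, cols = len(cost), len(cost[0])
    best = {start: cost[start[0]][start[1]]}  # cheapest known total per cell
    previous = {}                             # back-pointers for path recovery
    queue = [(best[start], start)]
    while queue:
        total, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if total > best[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_total = total + cost[nr][nc]
                if new_total < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_total
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_total, (nr, nc)))
    # Walk the back-pointers from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]
```

In a grid such as `[[1, 9, 1], [1, 2, 1], [1, 9, 1]]`, the cheapest route from the top-left to the top-right cell detours through the low-cost middle cell, a small analogue of a route using a mountain pass.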

65 Solution of a least-cost path problem. The white line represents the optimum solution, or path of least total cost, across a friction surface represented as a raster. The area is dominated by a mountain range, and cost is determined by elevation and slope. The best route uses a narrow pass through the range. The blue line results from solving the same problem using a coarser raster.

66 Hypothesis Testing: Hypothesis testing is a recognized branch of statistics. A sample is analyzed, and inferences are made about the population from which the sample was drawn. The sample must normally be drawn randomly and independently from the population.

67 Hypothesis Testing with Spatial Data: Frequently the data represent all that are available, e.g., all of the census tracts of Los Angeles. It is consequently difficult to think of such data as a random sample of anything; they are not a random sample of all census tracts. Tobler’s Law guarantees that independence is problematic unless samples are drawn very far apart.

68 Possible Approaches to Inference: (1) Treat the data as one of a very large number of possible spatial arrangements; this is useful for testing for significant spatial patterns. (2) Discard data until cases are independent, although no one likes to discard data. (3) Use models that account directly for spatial dependence. (4) Be content with descriptions and avoid inference.

69 Summary: All methods of spatial analysis work best in the context of a collaboration between human and machine. One benefit of the machine is that it sometimes serves to correct any misleading aspects of human intuition. (Humans can be poor at guessing the answers to optimization problems in space.)
