www.spatialanalysisonline.com Chapter 5 Part A: Spatial data exploration

Chapter 5 Part A: Spatial data exploration

3rd edition, www.spatialanalysisonline.com

Spatial analysis and data models (Anselin, 2002)

                      Object                    Field
GIS                   vector                    raster
Spatial data          points, lines, polygons   surfaces
Location              discrete                  continuous
Observations          process realisation       sample
Spatial arrangement   spatial weights           distance function
Statistical analysis  lattice                   geostatistics
Prediction            extrapolation             interpolation
Models                lag and error             error
Asymptotics           expanding domain          infill

Sampling frameworks
- Pure random sampling
- Stratified random, by class/strata (proportionate or disproportionate)
- Randomised within defined grids
- Uniform
- Uniform with randomised offsets
- Sampling and declustering
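Three of the frameworks listed above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the rectangular study area and the `max_offset` parameter are assumptions made for the example, not part of the slides.

```python
import random

def pure_random(n, xmin, ymin, xmax, ymax, rng):
    """n points drawn uniformly and independently over the rectangle."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def grid_with_offsets(nx, ny, xmin, ymin, xmax, ymax, rng, max_offset=0.5):
    """One point per grid cell: the cell centre plus a random offset.

    max_offset is the largest shift as a fraction of the cell size:
    0 gives a uniform grid, 0.5 lets the point fall anywhere in its cell.
    """
    dx, dy = (xmax - xmin) / nx, (ymax - ymin) / ny
    pts = []
    for i in range(nx):
        for j in range(ny):
            cx, cy = xmin + (i + 0.5) * dx, ymin + (j + 0.5) * dy
            pts.append((cx + rng.uniform(-max_offset, max_offset) * dx,
                        cy + rng.uniform(-max_offset, max_offset) * dy))
    return pts

def stratified_sample(items, stratum_of, fraction, rng):
    """Proportionate stratified sample: the same fraction from every stratum."""
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(42)
random_pts = pure_random(80, 0, 0, 100, 100, rng)
grid_pts = grid_with_offsets(10, 8, 0, 0, 100, 100, rng)

# Hypothetical monitoring sites: (id, stratum label)
sites = [(i, "high" if i % 4 == 0 else "low") for i in range(200)]
strat = stratified_sample(sites, lambda s: s[1], 0.30, rng)
```

A disproportionate design would pass a different fraction per stratum instead of a single `fraction`.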

Sampling frameworks – point sampling

Sampling frameworks – within zones
- Selection of 5 random points per zone
- Grid generation: square grid within field boundaries
- Grid generation (hexagonal): selection of 1 point per cell, random offset from centre

A. 10% random sample from existing point set: 800 radioactivity monitoring sites in Germany. Random sample of 80 (red/large dots)
B. Stratified random selection, 30% of each stratum: 200 radioactivity monitoring sites in Germany. Random sample of 30 (red/large dots)
Legend: symbol = 100 units of radiation

Random points on a network

EDA, ESDA and ESTDA
EDA – basic aims (after NIST):
- maximize insight into a data set
- uncover underlying structure
- extract important variables
- detect outliers and anomalies
- test underlying assumptions
- develop parsimonious models
- determine optimal factor settings

ESDA (see GeoDa and STARS)
Extending EDA ideas to the spatial domain (lattice/zone models):
- Brushing
- Linking
- Mapped histograms
- Outlier mapping
- Box plots
- Conditional choropleth plots
- Rate mapping

ESDA: Brushing & linking

ESDA: Histogram linkage

ESDA: Parallel coordinate plot & star plot

ESDA: Mapped box plots

ESDA: Conditional choropleth mapping

ESDA: Mapped point data
A. Variable point size
B. Variable colour
C. Semivariogram pairs
D. Voronoi analysis

ESDA: Trend analysis (continuous spatial data)

ESDA: Cluster hunting – GAM/K (steps)
1. Read data for the population at risk
2. Identify the MBR (minimum bounding rectangle) containing the data, identify starting circle radius, and degree of overlap
3. Generate a grid covering the MBR
4. For each grid intersection generate a circle of radius r
5. Retrieve two counts for the population at risk and the variable of interest
6. Apply some significance test procedure
7. Keep the result if significant
8. Repeat Steps 5 to 7 until all circles have been processed
9. Increase circle radius by dr and return to Step 3, else go to Step 10
10. Create a smoothed density surface of excess incidence for the significant circles
11. Map this surface and inspect the results
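Steps 2 to 9 can be sketched as below. This is a minimal sketch under stated assumptions, not the GAM/K implementation itself: each population point stands for one individual at risk, the significance test is a one-sided Poisson test against the overall incidence rate (the slide leaves the test open), and `gam_scan`, `overlap` and `alpha` are hypothetical names. The smoothing and mapping of Steps 10 and 11 are omitted.

```python
import math

def poisson_upper_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)          # P(X = 0)
    cdf = term
    for i in range(1, k):         # accumulate P(X <= k - 1)
        term *= mu / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def count_within(points, cx, cy, r):
    """Number of points inside the circle of radius r centred on (cx, cy)."""
    return sum(1 for (x, y) in points if math.hypot(x - cx, y - cy) <= r)

def gam_scan(population, cases, r0, dr, r_max, overlap=0.5, alpha=0.01):
    """Return significant circles as (cx, cy, r, excess_cases)."""
    xs = [x for x, _ in population]
    ys = [y for _, y in population]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)   # Step 2: MBR
    rate = len(cases) / len(population)       # overall incidence (assumed null model)
    significant = []
    r = r0
    while r <= r_max:                         # Step 9: loop over radii
        step = 2 * r * (1 - overlap)          # Step 3: grid spacing from overlap
        nx = int((xmax - xmin) / step) + 1
        ny = int((ymax - ymin) / step) + 1
        for i in range(nx + 1):
            for j in range(ny + 1):
                cx, cy = xmin + i * step, ymin + j * step         # Step 4
                n_pop = count_within(population, cx, cy, r)       # Step 5
                if n_pop == 0:
                    continue
                n_case = count_within(cases, cx, cy, r)
                expected = rate * n_pop
                # Steps 6 and 7: keep circles with a significant excess
                if n_case > expected and poisson_upper_tail(n_case, expected) < alpha:
                    significant.append((cx, cy, r, n_case - expected))
        r += dr
    return significant

# Synthetic data: a 20 x 20 population with a tight cluster of cases near (5, 5)
population = [(float(x), float(y)) for x in range(20) for y in range(20)]
cluster = [p for p in population if math.hypot(p[0] - 5, p[1] - 5) <= 1.5]
cases = cluster + [(15.0, 3.0), (12.0, 17.0), (2.0, 14.0)]
hits = gam_scan(population, cases, r0=2.0, dr=1.0, r_max=3.0)
```

Because many overlapping circles are tested, the retained circles are best read as an exploratory indication of where excess incidence lies rather than as formal inference.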

Grid-based statistics
- Univariate analysis of attribute data (non-spatial metrics)
- Cross-classification and cross-tab analyses
- Spatial pattern analysis for grid data (including landscape metrics): patch metrics; class-level metrics; landscape-level metrics
- Quadrat analysis
- Multi-grid regression analysis

Grid-based statistics: Landscape metrics
- Non-spatial: proportional abundance; richness; evenness; diversity
- Spatial: patch size distribution and density; patch shape complexity; core area; isolation/proximity; contrast; dispersion; contagion and interspersion; subdivision; connectivity
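The non-spatial metrics depend only on class counts, so they can be computed directly from a classified grid. The sketch below assumes the Shannon index as the diversity measure and Shannon evenness (diversity divided by its maximum, ln of richness); the slides do not say which indices are intended, and `composition_metrics` is a hypothetical name.

```python
import math
from collections import Counter

def composition_metrics(grid):
    """Non-spatial landscape metrics for a grid of class labels."""
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    abundance = {c: k / total for c, k in counts.items()}   # proportional abundance
    richness = len(counts)                                   # number of classes present
    shannon = -sum(p * math.log(p) for p in abundance.values())
    # Evenness is set to 0 for a single-class grid (an assumed convention)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return abundance, richness, shannon, evenness

grid = [["forest", "forest", "water"],
        ["forest", "urban", "water"],
        ["urban", "urban", "water"]]
abundance, richness, shannon, evenness = composition_metrics(grid)
```

Rearranging the cells of `grid` leaves all four values unchanged, which is what distinguishes these metrics from the spatial ones listed above.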

Point (event) based statistics
- Typically analysis of point-pair distances
- Points vs events
- Distance metrics: Euclidean, spherical, L_p or network
- Weighted or unweighted events
- Events, NOT computed points (e.g. centroids)
- Classical statistical models vs Monte Carlo and other computational methods
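As a rough illustration of the distance metrics named above, the sketch below implements the L_p (Minkowski) family, of which Euclidean distance is the p = 2 case, and a spherical distance using the haversine formula. The function names and the mean Earth radius of 6371 km are assumptions; network distance needs a graph and is not shown.

```python
import math

def minkowski(a, b, p=2.0):
    """L_p distance between planar points a and b (p=1 Manhattan, p=2 Euclidean)."""
    return (abs(a[0] - b[0]) ** p + abs(a[1] - b[1]) ** p) ** (1.0 / p)

def great_circle(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Spherical (great-circle) distance in km via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))

euclid = minkowski((0, 0), (3, 4))          # p = 2
manhattan = minkowski((0, 0), (3, 4), p=1)  # p = 1
quarter = great_circle(0, 0, 0, 90)         # a quarter of the equator
```

The choice of metric changes every point-pair distance, so it should be fixed before any of the nearest neighbour or K function calculations are run.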

Point (event) based statistics: Basic nearest neighbour (NN) model
- Input coordinates of all points
- Compute the (symmetric) distance matrix D
- Sort the distances to identify the 1st, 2nd, ..., kth nearest values
- Compute the mean of the observed 1st, 2nd, ..., kth nearest values
- Compare this mean with the expected mean under the Complete Spatial Randomness (CSR or Poisson) model
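The steps above can be sketched as follows. This is a minimal brute-force version using Euclidean distance and no edge correction; the expected first-order mean under CSR is taken as 1/(2*sqrt(m)) with m the point density, which is the standard Clark and Evans result and is assumed to be the comparison the slides intend. Function names are hypothetical.

```python
import math

def mean_nn_distances(points, k_max=1):
    """Mean observed distance to the 1st, 2nd, ..., k_max-th nearest neighbour."""
    n = len(points)
    sums = [0.0] * k_max
    for i, (xi, yi) in enumerate(points):
        # One row of the distance matrix D, sorted ascending
        row = sorted(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(points) if j != i)
        for k in range(k_max):
            sums[k] += row[k]
    return [s / n for s in sums]

def expected_nn_csr(n, area):
    """Expected mean first-order NN distance under CSR for n points in the given area."""
    density = n / area
    return 1.0 / (2.0 * math.sqrt(density))

# A perfectly regular 4 x 4 lattice inside a 4 x 4 study area
points = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
observed = mean_nn_distances(points, k_max=2)
expected = expected_nn_csr(len(points), area=16.0)
ratio = observed[0] / expected   # > 1 suggests dispersion, < 1 clustering
```

For the regular lattice the ratio is well above 1, as would be expected for a dispersed pattern.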

Point (event) based statistics – NN model

Point (event) based statistics – NN model
Quantities defined on this slide: mean NN distance; variance; NN index (ratio); z-transform.
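For reference, the standard Clark and Evans expressions for these four quantities under CSR are given below. They are assumed, not confirmed, to match the formulas on the slide. Here m is the point density (n divided by the study area), n is the number of points, and the subscripts o and e denote observed and expected values; the variance is that of the mean NN distance.

```latex
\bar{r}_e = \frac{1}{2\sqrt{m}}, \qquad
\sigma_e^2 = \frac{4-\pi}{4\pi m n}, \qquad
R = \frac{\bar{r}_o}{\bar{r}_e}, \qquad
z = \frac{\bar{r}_o - \bar{r}_e}{\sigma_e}
```

Values of R below 1 point towards clustering and values above 1 towards dispersion, with z giving an approximate test against the Normal distribution.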

Point (event) based statistics: Issues
- Are observations n discrete points?
- Sample size (esp. for kth order NN, k>1)
- Model requires density estimation, m
- Boundary definition problems (density and edge effects), which affect all methods
- NN reflexivity of point sets
- Limited use of frequency distribution
- Validity of Poisson model vs alternative models

Frequency distribution of nearest neighbour distances, i.e. the frequency of NN distances in distance bands, say 0-1 km, 1-2 km, etc.
The cumulative frequency distribution is usually denoted
G(d) = #(d_i < d)/n, where d_i are the NN distances and n is the number of measurements, or
F(d) = #(d_i < d)/m, where m is the number of random points used in sampling and the d_i are measured from those random points to their nearest event

Computing G(d) [computing F(d) is similar]
- Find all the NN distances
- Rank them and form the cumulative frequency distribution
- Compare to the expected cumulative frequency distribution
- Similar in concept to the K-S test with the quadrat model, but compute the critical values by simulation rather than table lookup
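A minimal sketch of this procedure follows. It computes G(d) and F(d) by brute force, gives the theoretical CSR curve 1 - exp(-density * pi * d^2) for comparison, and builds a simple min/max simulation envelope in place of tabulated critical values. No edge correction is applied; the function names, the rectangular study area and the choice of 19 simulations are assumptions made for the example.

```python
import math
import random

def nearest_distance(p, targets):
    """Distance from point p to the closest point in targets."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in targets)

def g_function(events, distances):
    """G(d): share of events whose nearest other event is closer than d."""
    nn = [nearest_distance(p, events[:i] + events[i + 1:]) for i, p in enumerate(events)]
    return [sum(1 for v in nn if v < d) / len(nn) for d in distances]

def f_function(events, sample_points, distances):
    """F(d): share of sample points whose nearest event is closer than d."""
    nn = [nearest_distance(s, events) for s in sample_points]
    return [sum(1 for v in nn if v < d) / len(nn) for d in distances]

def g_csr(density, d):
    """Theoretical G(d) under CSR, ignoring edge effects."""
    return 1.0 - math.exp(-density * math.pi * d * d)

def csr_envelope(n, width, height, distances, n_sims, rng):
    """Pointwise min and max of G(d) over n_sims simulated CSR patterns."""
    sims = []
    for _ in range(n_sims):
        pts = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        sims.append(g_function(pts, distances))
    lo = [min(s[k] for s in sims) for k in range(len(distances))]
    hi = [max(s[k] for s in sims) for k in range(len(distances))]
    return lo, hi

events = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
distances = [0.5, 1.0, 1.5]
g_obs = g_function(events, distances)
f_obs = f_function(events, [(0.0, 0.0), (2.0, 2.0)], distances)
lo, hi = csr_envelope(len(events), 4.0, 4.0, distances, 19, random.Random(1))
```

An observed G(d) that falls outside the band between `lo` and `hi` at some distance is evidence against CSR at that scale.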

Point (event) based statistics – clustering (ESDA)
- Is the observed clustering due to natural background variation in the population from which the events arise?
- Over what spatial scales does clustering occur?
- Are clusters a reflection of regional variations in underlying variables?
- Are clusters associated with some feature of interest, such as a refinery, waste disposal site or nuclear plant?
- Are clusters simply spatial or are they spatio-temporal?

Point (event) based statistics – clustering
- kth order NN analysis
- Cumulative distance frequency distribution, G(r)
- Ripley K (or L) function – single or dual pattern
- PCP
- Hot spot and cluster analysis methods

Point (event) based statistics – Ripley K or L
- Construct a circle, radius d, around each point (event), i
- Count the number of other events, labelled j, that fall inside this circle
- Repeat these first two stages for all points i, and then sum the results
- Increment d by a small fixed amount
- Repeat the computation, giving values of K(d) for a set of distances, d
- Adjust to provide the normalised measure L
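The procedure above can be sketched as below. This is a naive estimator without edge correction, scaled as K(d) = (A / n^2) * (number of ordered pairs within d), where A is the study area; the normalisation L(d) = sqrt(K(d) / pi) is the commonly used form and is assumed to be the one intended by the slide. Function names are hypothetical.

```python
import math

def ripley_k(points, area, distances):
    """Naive Ripley K estimate (no edge correction) for each distance d."""
    n = len(points)
    # Distances for each unordered pair i < j
    pair_d = [math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
              for i in range(n) for j in range(i + 1, n)]
    ks = []
    for d in distances:
        # Each unordered pair is counted twice: once from i and once from j
        ordered_pairs = 2 * sum(1 for v in pair_d if v <= d)
        ks.append(area * ordered_pairs / n ** 2)
    return ks

def ripley_l(k_values):
    """Normalised measure L(d) = sqrt(K(d) / pi); under CSR, L(d) is close to d."""
    return [math.sqrt(k / math.pi) for k in k_values]

points = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
distances = [0.5, 1.0]
k_vals = ripley_k(points, area=16.0, distances=distances)
l_vals = ripley_l(k_vals)
```

Under CSR the expected K(d) is pi * d^2, so plotting L(d) - d against d gives a curve that sits near zero for a random pattern, above it for clustering and below it for dispersion.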

Point (event) based statistics – Ripley K

Point (event) based statistics – comments
- CSR vs PCP vs other models
- Data: location, time, attributes, error, duplicates
- Duplicates: deliberate rounding, data resolution, genuine duplicate locations, agreed surrogate locations, deliberate data modification
- Multi-approach analysis is beneficial
- Methods: choice of methods and parameters
- Other factors: borders, areas, metrics, background variation, temporal variation, non-spatial factors
- Rare events and small samples
- Process-pattern vs cause-effect
- ESDA in most instances

Hot spot and cluster analysis – questions
- Where are the main (most intensive) clusters located?
- Are clusters distinct or do they merge into one another?
- Are clusters associated with some known background variable?
- Is there a common size to clusters or are they variable in size?
- Do clusters themselves cluster into higher order groupings?
- If comparable data are mapped over time, do the clusters remain stable or do they move and/or disappear?

Hot spot (and cool-spot) analysis
- Visual inspection of mapped patterns
- Scale issues
- Proximal and duplicate points
- Point representation (size)
- Background variation/controls (risk adjustment)
- Weighted or unweighted
- Hierarchical or non-hierarchical
- Kernel & K-means methods

Hot spot analysis – Hierarchical NN
Cancer incidence data: 1st and 2nd order clusters