A Comparison of Zonal Smoothing Techniques Prof. Chris Brunsdon Dept. of Geography University of Leicester

cb179@leicester.ac.uk

Background: Much social science data comes aggregated over irregular spatial zones, for example census wards, police beat zones, neighbourhood renewal areas and CDRP special areas.

Typical Problems: Changing from one set of geographical units to another. Areas of special concern for crime reduction are not the aggregation units used to report crime rates. Comparing crime rates with social data means working with different aggregation units. One solution: convert to a surface, then re-aggregate to the new zones.
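The re-aggregation step can be sketched as follows. This is a minimal illustration, assuming the surface is stored as a grid of per-pixel expected counts and the new zone is given as a set of (row, col) pixels; the names `reaggregate`, `surface` and `special_area`, and all the numbers, are hypothetical.

```python
def reaggregate(surface, target_pixels):
    """Sum the per-pixel estimates over the pixels that make up a new zone."""
    return sum(surface[r][c] for r, c in target_pixels)


# Hypothetical 3x3 surface of expected crime counts per pixel.
surface = [
    [1.0, 2.0, 0.5],
    [0.0, 3.0, 1.5],
    [2.0, 1.0, 1.0],
]

# Hypothetical special area covering the top-right 2x2 block of pixels.
special_area = {(0, 1), (0, 2), (1, 1), (1, 2)}

estimate = reaggregate(surface, special_area)
```

The new zone does not need to line up with the zones the data were originally reported on; it only needs to be expressible as a set of pixels.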

Factors to Consider: Data collection; statistical issues; software issues; underlying theory; diagnostics; organisational issues.

Background (1): The CAMSTATS web site. Developed at UCL as a consultancy (Muki Haklay). Gives public access to crime data going back to April 2000. Designed so that police officers (or civilians) can update the web page with a single button click. Has run without problems or need for advice or intervention.

Background (2): Crime rates are mapped for a number of areal units: wards, police sectors, neighbourhood renewal areas and special areas.

Approaches: Roughness penalty; pycnophylactic interpolation; naive averaging.

Form of Problem: Estimate an underlying crime risk surface from zonal data. Continuous version of model: the count in each zone is the integral of the risk surface over that zone, plus an error term (in some approaches only).

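Discrete Approximation: the count in each zone is the sum of the risk values of the pixels in that zone, plus an error term. This is an over-specified regression model. NB: the error term appears only in some approaches.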
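One way to write the continuous and discrete models, as a sketch in assumed notation rather than the author's own: y_i is the observed count in zone Z_i, λ is the risk surface, λ_j is the risk in pixel j, x_ij is 1 if pixel j lies in zone i and 0 otherwise, and ε_i is the error term (present only in some approaches).

```latex
% Continuous model: zonal count = integral of the risk surface over the zone
\[
  y_i = \int_{Z_i} \lambda(\mathbf{s}) \, d\mathbf{s} + \varepsilon_i
\]

% Discrete approximation over pixels j, and the same model in matrix form
\[
  y_i = \sum_{j} x_{ij} \lambda_j + \varepsilon_i
  \quad\Longleftrightarrow\quad
  \mathbf{y} = X \boldsymbol{\lambda} + \boldsymbol{\varepsilon}
\]
```

With more pixels than zones, X has more columns than rows, which is what makes the regression over-specified.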

Over-Specified? What does this mean? More variables than observations. The solution is not unique; i.e. for a given zone, either of the following reproduces the zonal count: set all pixels but one to zero and set the remaining one to the crime count; or set all pixels to 1/n of the crime count, where n is the number of pixels in the zone.
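The non-uniqueness is easy to check numerically. A minimal sketch, with a hypothetical crime count and pixel number for a single zone:

```python
count = 12.0  # hypothetical crime count observed for one zone
n = 4         # hypothetical number of pixels in that zone

# Solution 1: all of the crime placed in a single pixel.
spike = [count] + [0.0] * (n - 1)

# Solution 2: the crime spread evenly over the zone.
flat = [count / n] * n

# Both pixel surfaces add up to the same zonal count,
# so the zonal data alone cannot choose between them.
same_total = sum(spike) == sum(flat) == count
```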

A Discrete Roughness Penalty: In fact there are an infinite number of solutions to the equation on the earlier slide. Favour those with a lower roughness penalty (cf. regularization problems). Aim to minimise: sum of squared errors + const. × roughness. Roughness at

This can be solved by matrix algebra. The solution involves: a matrix X containing the information relating pixels to zones (X is an indicator matrix showing which pixel is in which zone); a matrix encapsulating the ‘total roughness’ for all pixels; a constant controlling the roughness penalty; and the observed zonal counts.
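A minimal sketch of such a matrix solution, under stated assumptions: pixels lie on a 1-D line, roughness is the sum of squared differences between neighbouring pixels (giving a matrix R), and the estimate solves (X'X + alpha R) lambda = X'y. The roughness definition, the name `alpha` and the example data are assumptions made for illustration, and plain Python stands in here for the R environment used in the talk.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def roughness_penalty_estimate(zone_of_pixel, counts, alpha):
    """Penalised least squares estimate of the pixel risk values (1-D sketch)."""
    p = len(zone_of_pixel)
    # X'X: entry (j, k) is 1 when pixels j and k belong to the same zone.
    a = [[1.0 if zone_of_pixel[j] == zone_of_pixel[k] else 0.0
          for k in range(p)] for j in range(p)]
    # Add alpha * R, where R = D'D and D takes first differences
    # between neighbouring pixels (assumed roughness measure).
    for j in range(p - 1):
        a[j][j] += alpha
        a[j + 1][j + 1] += alpha
        a[j][j + 1] -= alpha
        a[j + 1][j] -= alpha
    # X'y: each pixel receives the observed count of its own zone.
    b = [counts[zone_of_pixel[j]] for j in range(p)]
    return solve(a, b)


# Hypothetical data: six pixels in two zones with counts 6 and 12.
estimate = roughness_penalty_estimate([0, 0, 0, 1, 1, 1], [6.0, 12.0], alpha=0.1)
```

Larger values of `alpha` give a smoother surface at the cost of a looser fit to the zonal counts; in practice a linear algebra library would replace the hand-written solver.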

Software: The techniques here are not ‘off the shelf’; they need statistical/numerical as well as GIS techniques. Here the ‘R’ package is used: a statistical programming language with good graphical support; open source, with lots of libraries (including GIS-type support).

Pycnophylactic Interpolation: Similar to the roughness penalty, but no errors are allowed (cf. Tobler 1979). Can be solved as a quadratic programming problem.
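A minimal 1-D sketch of the idea, using a simplified iterative smooth-and-rescale scheme in the spirit of Tobler (1979) rather than the quadratic programming formulation mentioned above. The three-pixel smoothing window, the multiplicative rescaling, the iteration count and the example data are all assumptions made for illustration.

```python
def pycnophylactic(zone_of_pixel, counts, iterations=200):
    """Smooth a pixel surface while keeping every zone total exact."""
    p = len(zone_of_pixel)
    sizes = {}
    for z in zone_of_pixel:
        sizes[z] = sizes.get(z, 0) + 1
    # Start from constant density within each zone.
    lam = [counts[z] / sizes[z] for z in zone_of_pixel]
    for _ in range(iterations):
        # Smooth: replace each pixel by the mean of itself and its neighbours.
        smoothed = []
        for j in range(p):
            window = lam[max(0, j - 1): j + 2]
            smoothed.append(sum(window) / len(window))
        # Rescale within each zone so the zonal counts are reproduced exactly.
        totals = {}
        for j, z in enumerate(zone_of_pixel):
            totals[z] = totals.get(z, 0.0) + smoothed[j]
        lam = [smoothed[j] * counts[z] / totals[z] if totals[z] > 0 else 0.0
               for j, z in enumerate(zone_of_pixel)]
    return lam


# Hypothetical data: six pixels in two zones with counts 6 and 12.
surface = pycnophylactic([0, 0, 0, 1, 1, 1], [6.0, 12.0])
```

Because the rescaling is the last step of every iteration, the zone totals match the observed counts whatever the number of iterations.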

Naive Approach: Assume that the density within each areal unit is constant.
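In pixel terms this amounts to dividing each zonal count equally among the pixels of the zone. A minimal sketch, with the function name and example data being hypothetical:

```python
def naive_surface(zone_of_pixel, counts):
    """Give every pixel an equal share of its zone's count."""
    sizes = {}
    for z in zone_of_pixel:
        sizes[z] = sizes.get(z, 0) + 1
    return [counts[z] / sizes[z] for z in zone_of_pixel]


# Hypothetical data: zone 0 has two pixels, zone 1 has four.
surface = naive_surface([0, 0, 1, 1, 1, 1], [4.0, 2.0])
```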

HOUSING DENSITY: Is it sensible to assume intensity of household burglaries is smooth?

Model Modification: Densities can be obtained with David Martin’s SURPOP approach. This modification can be applied to all approaches described earlier.
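Applied to the naive approach, the modification might look like the sketch below: the zonal count is shared out in proportion to the number of households in each pixel, so that risk per household, rather than risk per pixel, is constant within a zone. The per-pixel household counts are assumed to be already available (for example from a population surface); the function name and the numbers are hypothetical.

```python
def household_weighted_surface(zone_of_pixel, counts, households):
    """Share each zonal count among pixels in proportion to households."""
    zone_households = {}
    for z, h in zip(zone_of_pixel, households):
        zone_households[z] = zone_households.get(z, 0.0) + h
    surface = []
    for z, h in zip(zone_of_pixel, households):
        # Risk per household is constant within the zone.
        total = zone_households[z]
        rate = counts[z] / total if total > 0 else 0.0
        surface.append(rate * h)
    return surface


# Hypothetical zone of three pixels; the middle pixel has no housing.
surface = household_weighted_surface([0, 0, 0], [6.0], [10, 0, 20])
```

A pixel with no households receives no burglaries, which a purely spatial smoother cannot guarantee.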

Routine Activity Theory: We now assume risk per household is smooth. Perhaps in line with Cohen & Felson’s routine activity theory? Offenders choose targets according to their usual movement patterns. Familiarity with a pixel suggests familiarity with its neighbours. But potential targets have to be there as well!

Evaluation: Camstats web site (www.met.police.uk/camden/camstats). Monthly household burglary rates from April 2003 to March 2006, aggregated over a number of different zones. Models are calibrated on UK census wards (64x64 pixels), then tested against two special interest areas: Camden Town and King’s Cross.

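Results
Method               King’s Cross   Camden Town
Pycnophylactic (HH)  1.94           2.90
Pycnophylactic       1.60           3.13
Naive (HH)           1.26           3.13
Naive                1.37           3.48
Roughness (HH)       2.05           2.86
Roughness            1.65           3.04
Numbers are mean absolute deviations in estimated burglary counts. Lowest per area: Naive (HH) for King’s Cross (1.26) and Roughness (HH) for Camden Town (2.86). Runner-up: Naive for King’s Cross (1.37) and Pycnophylactic (HH) for Camden Town (2.90).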
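The error measure can be sketched as below, assuming the deviation is averaged over the monthly counts for one test area. The function name and the example figures are hypothetical and are not taken from the Camden data.

```python
def mean_absolute_deviation(observed, estimated):
    """Average absolute difference between observed and estimated counts."""
    if len(observed) != len(estimated):
        raise ValueError("observed and estimated must have the same length")
    return sum(abs(o - e) for o, e in zip(observed, estimated)) / len(observed)


# Hypothetical monthly burglary counts for one test area.
observed = [10, 12, 8]
estimated = [11.0, 9.0, 8.5]
mad = mean_absolute_deviation(observed, estimated)
```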

Discussion: Is simplest best? Further findings show simple estimators work best on areas close to the edge of the region, but smoothing-based approaches work best further inside the region.

Camden Isn’t An ISLAND!

Consequences: Smoothing-based approaches ‘borrow information’ from nearby places (cf. Tobler’s First Law of Geography: everything is related to everything else, but near things are more related than distant things). Because Camden isn’t an island, things are going on beyond the ‘edges’. But we don’t know what they are! So we can’t reliably borrow information. So simpler methods probably perform better near the ‘edges’.

A real-world problem: In practice organisations sub-divide data geographically. But without data sharing, individual regions appear (at least mathematically) as islands!

Conclusions - Further Work? For Camden Town, the roughness penalty performed best. For King’s Cross, the naive method worked best. In both cases, taking household density into account proved best. Edge effects? Merging predictors? Further work: kernel-based approaches...
