The United Kingdom National Area Classification of Output Areas
Daniel Vickers with Phil Rees & Mark Birkin
School of Geography, University of Leeds

What will I be talking about today?
1. Introduction to Area Classification and Output Areas
2. How the classification system was made, including:
   What data goes in?
   Methods of standardisation
   Issues of cluster number selection
   Cluster selection
   Cluster creation
   Naming the clusters
3. How well does the classification discriminate?
   Census data
   Comparing the Core Cities
   Voting patterns
   Deconstructing rural England
4. Mapping the classification
   Focus on Leeds
   Focus on Fife
   A look around the country

What is an Area Classification?
A segmentation system which groups similar neighbourhoods into categories based on the characteristics of their residents: a simplification of complex datasets.
What is an Output Area?
The smallest area for census output. There are 223,060 in the UK:
E&W: 175,434 OAs, minimum size 40 households and 100 people
Scotland: 42,604 OAs, minimum size 20 households and 50 people
NI: 5,022 OAs, minimum size 40 households and 100 people

What Goes In?
41 Census variables covering:
Demographic attributes, including age, ethnicity, country of birth and population density.
Household composition, including living arrangements, family type and family size.
Housing characteristics, including tenure, type and size, and quality/overcrowding.
Socio-economic traits, including education, socio-economic class, car ownership and commuting, and health and care.
Employment attributes, including level of economic activity and employment class type.
How many data inputs are involved?
223,060 Output Areas x 41 variables = 9,145,460 data points

Standardising the Data
Log transformation. Why? It reduces the effect of extreme values (outliers).
Range standardisation between 0 and 1. Why? Problems will occur if there are differing scales or magnitudes among the variables. In general, variables with larger values and greater variation will have more impact on the final similarity measure. It is therefore necessary to standardise the data so that each variable is equally represented in the distance measure.
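
The two steps can be sketched in a few lines of Python. This is an illustration only, not the code used for the classification: the function names are hypothetical, and the use of log(1 + x) (so that zero values are defined) is an assumption, since the slide does not say which form of log transformation was applied.

```python
import math

def log_transform(column):
    """Compress extreme values. log(1 + x) is an assumed form that copes with zeros."""
    return [math.log1p(x) for x in column]

def range_standardise(column):
    """Rescale a variable so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant variable carries no information; map every value to 0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

def standardise(column):
    """Log transformation followed by range standardisation, in the slide's order."""
    return range_standardise(log_transform(column))

# Hypothetical variable with one extreme value.
raw = [0, 9, 99]
scaled = standardise(raw)
```

After range standardisation the base of the logarithm no longer matters, because a change of base only multiplies every value by the same constant.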

Issues of Cluster Number Selection
When choosing the number of clusters for the classification, three main issues needed to be considered.
Issue 1: Analysis of average distance from cluster centres for each cluster number option. The ideal solution would be the number of clusters which gives the smallest average distance from the cluster centre across all clusters.
Issue 2: Analysis of cluster size homogeneity for each cluster number option. It would be useful, where possible, for the clusters to be as similar in size as possible in terms of the number of members in each.
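
A minimal sketch of how Issues 1 and 2 could be measured for one candidate solution is given below. The function name, the choice to average distance over all members (rather than averaging per-cluster means), and the largest-to-smallest size ratio as a homogeneity measure are all assumptions made for illustration; the slides do not specify the exact statistics used.

```python
import math

def cluster_diagnostics(points, labels):
    """Return the mean distance to cluster centre and a size summary for one solution."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)

    total_distance = 0.0
    for members in groups.values():
        # The cluster centre is the mean of its members on every variable.
        centre = [sum(col) / len(members) for col in zip(*members)]
        total_distance += sum(math.dist(p, centre) for p in members)

    sizes = sorted(len(members) for members in groups.values())
    return {
        "mean_distance": total_distance / len(points),  # Issue 1
        "sizes": sizes,                                  # Issue 2
        "size_ratio": sizes[-1] / sizes[0],              # 1.0 means equal sizes
    }

# Hypothetical two-cluster solution on five points.
points = [(0, 0), (2, 0), (10, 0), (10, 4), (10, 2)]
labels = [0, 0, 1, 1, 1]
result = cluster_diagnostics(points, labels)
```

Average distance generally falls as more clusters are added, so in practice it would be compared across cluster number options alongside the size measure rather than minimised on its own.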

Issues of Cluster Number Selection
Issue 3: The number of clusters produced should be as close to the perceived ideal as possible. This means that the number of clusters needs to be of a size that is useful for further analysis.
“At the highest level of aggregation, the cluster groups should be about 6 in number to enable good visualisation and these clusters should also be given descriptive names. At the next level of aggregation, the number of groups should be about 20. This would be good for conceptual customer profiling. At the next level of aggregation, the number of groups should be about 50. This can be used for market propensity measures from the larger commercial surveys.”
(Personal Communication 2003, from Martin Callingham, Independent Market Research Consultant and Birkbeck College, co-editor of Qualitative Market Research: Principle and Practice, Sage, 2003)

Cluster Selection
A three-tier hierarchy: 7, 21 and 52 clusters
First level: target 6, 7 selected, based on analysis of average distance from cluster centre and the size of each cluster.
Second level: target 20, 21 selected, based on analysis of average distance from cluster centre and the size of each cluster.
Third level: target 50, 52 selected, based on the size of each cluster. Each second-level cluster is split into either 2 or 3 groups.

Cluster Creation
UK database: 223,060 OAs
K-means algorithm (SPSS) on the whole database: 7 Super Groups (1, 2, 3, 4, 5, 6, 7)
K-means algorithm (SPSS) on one Super Group, e.g. 4: 3 Groups (4a, 4b, 4c)
K-means algorithm (SPSS) on one Group, e.g. 4b: 3 Sub Groups (4b1, 4b2, 4b3)
Modified K-means clustering:
First level: run as standard k-means.
Second level: the first-level output is split into separate files, one per cluster, and each file is clustered separately.
Third level: the second-level output is split into separate files, one per cluster, and each file is clustered separately.
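
The split-and-recluster idea can be sketched in plain Python as follows. This is not the SPSS procedure used for the classification: the k-means routine is a bare Lloyd's algorithm, the farthest-point initialisation is an assumption chosen to make the sketch deterministic, and the same number of clusters is used for every branch at a level, whereas the real hierarchy varies between 2 and 4 per branch. All function names are hypothetical.

```python
def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def init_centres(points, k):
    # Assumed initialisation: start at the first point, then repeatedly take
    # the point farthest from the centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(sqdist(p, c) for c in centres)))
    return centres

def kmeans(points, k, iters=100):
    """Standard k-means (Lloyd's algorithm); returns one label per point."""
    centres = init_centres(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centre.
        labels = [min(range(k), key=lambda c: sqdist(p, centres[c])) for p in points]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels

def level_label(level, j):
    # Digits at levels 1 and 3, letters at level 2, giving codes such as "4b2".
    return "abcdefghij"[j] if level == 1 else str(j + 1)

def nested_kmeans(points, ks):
    """Cluster all points, then cluster each resulting cluster separately, per level."""
    codes = [""] * len(points)

    def recurse(indices, level):
        if level == len(ks) or len(indices) < ks[level]:
            return
        labels = kmeans([points[i] for i in indices], ks[level])
        for j in range(ks[level]):
            members = [i for i, l in zip(indices, labels) if l == j]
            for i in members:
                codes[i] += level_label(level, j)
            recurse(members, level + 1)  # the "separate file" for this cluster

    recurse(list(range(len(points))), 0)
    return codes

# Hypothetical data: two distant blobs, each made of two smaller blobs.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (100, 0), (100, 1), (110, 0), (110, 1)]
codes = nested_kmeans(points, [2, 2])
```

Because each lower level is clustered within its parent only, a member can never move to a different parent at a lower level, which is what keeps the three tiers strictly nested.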

Cluster Creation
UK
1
  1a: 1a1, 1a2
  1b: 1b1, 1b2
2
  2a: 2a1, 2a2
  2b: 2b1, 2b2, 2b3
  2c: 2c1, 2c2, 2c3
  2d: 2d1, 2d2
3
  3a: 3a1, 3a2, 3a3
  3b: 3b1, 3b2, 3b3
4
  4a: 4a1, 4a2, 4a3
  4b: 4b1, 4b2, 4b3
  4c: 4c1, 4c2
5
  5a: 5a1, 5a2
  5b: 5b1, 5b2
  5c: 5c1, 5c2
6
  6a: 6a1, 6a2
  6b: 6b1, 6b2, 6b3
  6c: 6c1, 6c2, 6c3
7
  7a: 7a1, 7a2, 7a3
  7b: 7b1, 7b2
  7c: 7c1, 7c2, 7c3
  7d: 7d1, 7d2
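
As a quick check, the hierarchy on this slide can be tallied to confirm the 7, 21 and 52 clusters quoted for the three tiers. The dictionary below is transcribed from the slide; the variable names are illustrative only.

```python
# Number of Sub Groups under each Group, transcribed from the hierarchy slide.
SUBGROUPS_PER_GROUP = {
    "1a": 2, "1b": 2,
    "2a": 2, "2b": 3, "2c": 3, "2d": 2,
    "3a": 3, "3b": 3,
    "4a": 3, "4b": 3, "4c": 2,
    "5a": 2, "5b": 2, "5c": 2,
    "6a": 2, "6b": 3, "6c": 3,
    "7a": 3, "7b": 2, "7c": 3, "7d": 2,
}

# Expand to full Sub Group codes such as "4b1", "4b2", "4b3".
sub_groups = [
    f"{group}{n}"
    for group, count in SUBGROUPS_PER_GROUP.items()
    for n in range(1, count + 1)
]

# A Super Group is the leading digit of a Group code.
super_groups = sorted({group[0] for group in SUBGROUPS_PER_GROUP})

tier_sizes = (len(super_groups), len(SUBGROUPS_PER_GROUP), len(sub_groups))
print(tier_sizes)
```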

Naming the Clusters
1: City Centre Melting Pot
  1a: Younger Metropolitan Dwellers
  1b: Older Metropolitan Dwellers
2: Typical Traits
  2a: Transitional Neighbourhoods
  2b: Settled Families
  2c: Established Metropolitan Hinterland
  2d: Young Terraced Families
3: Inner City Multicultural Blend
  3a: Afro-Caribbean Communities
  3b: Asian Influence
4: Blue Collar Communities
  4a: Terraced Routine Workers
  4b: Older Routine Workers
  4c: Young Families, Routine Workers
5: Idyllic Countryside
  5a: Agricultural UK
  5b: Retired to the Countryside
  5c: Families in the Countryside
6: Constraints of Circumstance
  6a: Families of Hardship
  6b: Assisted Existence
  6c: Older Hard Fortune
7: Comfortable Suburban Estates
  7a: Opulent Older Families
  7b: Suburban Transition
  7c: Suburban Melting Pot
  7d: Young Suburban Families
Naming the clusters is a near-impossible task and one that always provokes much debate. However, the task is very important, because a badly chosen name can create a false impression of the people within a cluster. The naming must follow two general principles:
1. It must not offend residents.
2. It must not contradict other classifications or use already established names.

How Well Does It Discriminate?
Super Groups compared: City Centre Melting Pot, Typical Traits, Inner City Multicultural Blend, Blue Collar Communities, Idyllic Countryside, Constraints of Circumstance, Comfortable Suburban Estates

Comparing the Core Cities (and Fife)

Who Does Each Type Vote For?
2001 election data courtesy of Ed Fieldhouse, CCSR, University of Manchester

Deconstructing Rural England (Devon case study)
Percentage in Super Group 5 (Idyllic Countryside). Map classes: 3 – 16, 16 – 39, 39 – 51, 51 – 74
Devon average: 31%
UK average: 12.5%

Focus on Leeds
Super Groups mapped: City Centre Melting Pot, Typical Traits, Inner City Multicultural Blend, Blue Collar Communities, Idyllic Countryside, Constraints of Circumstance, Comfortable Suburban Estates
Boundaries: Community Areas, as defined by Pete Shepherd, School of Geography, University of Leeds (built from Output Areas)
Map appears in forthcoming book “Twenty-First Century Leeds: Geographies of a Regional City”, edited by Rachael Unsworth & John Stillwell

Focus on Fife
Super Groups mapped: City Centre Melting Pot, Typical Traits, Inner City Multicultural Blend, Blue Collar Communities, Idyllic Countryside, Constraints of Circumstance, Comfortable Suburban Estates

Focus on Fife
Total number of OAs in Fife: 2882
City Centre Melting Pot: 5.4%
Typical Traits: 8.9%
Inner City Multicultural Blend: 0%
Blue Collar Communities: 25.8%
Idyllic Countryside: 8.6%
Constraints of Circumstance: 31.6%
Comfortable Suburban Estates: 19.7%

Focus on Fife
Total number of OAs in Fife: 2882
16.8%

Focus on Fife
Total number of OAs in Fife: 2882
11.7%

Consultation
55 respondents so far: 29 academics, 26 local government
Two most confused types: 4 Blue Collar Communities and 6 Constraints of Circumstance
Easiest type to identify: 5 Idyllic Countryside

Where would you like to go?
Belfast, Birmingham, Bradford, Bristol, Cardiff, Dundee, Edinburgh, Glasgow, Liverpool, London, Manchester, Newcastle, Norwich, Nottingham, Southampton, St Andrews
Thank you for listening. Any questions?
