Adding Census Geographical Detail into the British Crime Survey for Modelling Crime Charatdao Kongmuang Naresuan University, Thailand Graham Clarke and Andrew Evans University of Leeds, UK
Background Crime and risk of victimisation are unevenly distributed across population, time and space, and vary dramatically with demographic, socio-economic and area characteristics. Crimes at the small area scale often do not match expectations based on national averages.
Background (cont.) Although the British Crime Survey (BCS) provides rich information about levels of crime and crime victimisation, it cannot be used to explain crime victimisation for small geographical units (data are not currently released below the national level).
Solution Attach the information from the BCS to the more geographically disaggregated census data using a spatial microsimulation technique.
What is Microsimulation? A methodology for building large-scale datasets on individual units such as persons, households or firms, which can be used to simulate the effect of policy or other changes on these micro units.
What is Spatial Microsimulation? A microsimulation that takes space into account. It contains geographical information that can be used to investigate policy impacts.
SimCrime Model A static spatial microsimulation model designed to estimate the likelihood of being a victim of crime and crime rates at the small area level in Leeds.
SimCrime Model (cont.) Combines individual microdata from the British Crime Survey (BCS) with small-area aggregated census data to create synthetic microdata estimates for output areas (OAs) in Leeds, using a Combinatorial Optimisation Simulated Annealing method.
SimCrime Model Specification The synthetic microdata dataset was generated at the census output area level for Leeds using the ‘Simulated Annealing-Based Reweighting Program’. The 514,523 individuals aged 16-74 living in households in Leeds in the 2001 UK Census were recreated.
Simulated Annealing-Based Reweighting Program (generates a population microdata dataset at the output area level) Implemented in Java. The process involves selecting the combination of individuals from the BCS microdata that best fits the known constraints in the selected small areas (from the 2001 UK Census). The process is repeated with the aim of gradually improving the fit between the observed data and the selected combination of individuals from the BCS.
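The selection loop described above can be illustrated with a minimal sketch. This is not the authors' Java program: it is a generic simulated annealing loop in Python, assuming a single hypothetical constraint table, selection with replacement, and made-up parameter names (`steps`, `start_temp`, `cooling`). Each survey record is reduced to the index of the constraint column it falls into.

```python
import math
import random


def tae(selection, survey, target):
    """Total absolute error between the selection's column counts and the target."""
    counts = [0] * len(target)
    for idx in selection:
        counts[survey[idx]] += 1
    return sum(abs(c - t) for c, t in zip(counts, target))


def anneal(survey, target, steps=5000, start_temp=5.0, cooling=0.999, seed=0):
    """Search for a combination of survey records that fits the target counts."""
    rng = random.Random(seed)
    size = sum(target)  # number of people to place in the area
    current = [rng.randrange(len(survey)) for _ in range(size)]
    current_err = tae(current, survey, target)
    best, best_err = list(current), current_err
    temp = start_temp
    for _ in range(steps):
        if best_err == 0:
            break  # perfect fit found
        # Propose swapping one selected record for a random survey record.
        pos = rng.randrange(size)
        old = current[pos]
        current[pos] = rng.randrange(len(survey))
        new_err = tae(current, survey, target)
        delta = new_err - current_err
        # Always accept improvements; accept worse fits with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_err = new_err
            if current_err < best_err:
                best, best_err = list(current), current_err
        else:
            current[pos] = old  # reject the swap
        temp *= cooling
    return best, best_err


# Hypothetical data: ten survey records in three columns, one area of six people.
survey = [0, 0, 1, 1, 2, 2, 2, 0, 1, 2]
target = [3, 2, 1]
best, best_err = anneal(survey, target)
```

Accepting some worse swaps early on is what distinguishes annealing from simple hill climbing: it lets the search escape combinations that cannot be improved by a single swap.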
Data for generating the synthetic micro-population: (1) Census Area Statistics of the 2001 UK Census: constrained tables and the total population in small areas; (2) microdata from the 2001 British Crime Survey.
Census Area Statistics (CAS) Equivalent to the Small Area Statistics (SAS) of the 1971, 1981, and 1991 Censuses. Available for geographical levels down to output area (OA), the smallest unit of the 2001 Census. Note: Each OA contains approximately 290 persons or 125 households.
Variables related to crime (Category: Indicator = High Propensity)
Demographic Characteristics of Offender: Age = Young adult; Sex = Male; Marital Status = Single; Family Status = Broken home, divorce (weak family life); Family Size = Large
Socio-Economic Status of Offender: Income = Low income; Employment status = Unemployed; Education = Less; Deprivation = High level of deprivation
Household Characteristics: Density of living = Substandard; Tenure = Rented
Victim Characteristics: Age = Young adult; Sex = Male; Ethnicity = Minority group; Lifestyle = Away from home; Tenure = Rented, not owner occupied
Neighbourhood types and characteristics: Urbanisation, Population Density, Proximity = High; Inner city, proximity to disadvantaged areas
Constrained Tables: CS004: Age by Sex and Living Arrangements (16 categories); CS047: National Statistics Socio-economic Classification by Tenure (18 categories); CS061: Tenure and Car or Van Availability by Economic Activity (24 categories)
SimCrime Constrained Variables (Variable: Categories)
Age: Aged 16-24; Aged 25-34; Aged 35-49; Aged 50-74
Sex: Male; Female
Living Arrangement: Couple; Not couple
Economic Activity: Employed; Unemployed; Inactive; Full-time student
Tenure Type: Owned; Rented
Car or Van Availability: No car; One car; Two or more cars
Socio-economic Classification: Higher managerial and professional occupations; Lower managerial and professional occupations; Intermediate occupations; Small employers and own account workers; Lower supervisory and technical occupations; Semi-routine occupations; Routine occupations; Never worked and long-term unemployed; Not classified
Discrepancies in census counts between tables Source: 2001 Census Area Statistics. Note: Each cell shows the number of people aged 16-74 living in households.
The constraint tables should be adjusted to minimise discrepancies between the total populations in small areas.
Constraint Tables Adjustment Inputs: the total number of people in the small area (GroupNumber) and each table cell.
Constraint Tables Adjustment (cont.) $$\text{Number of people in each cell} = \frac{\text{Number of people from the constraint table} \times \text{GroupNumber}}{\text{Total sum for each area}}$$
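The adjustment formula can be tried on a small example. The following is a minimal sketch, assuming one constraint table for one area held as a plain list of cell counts; the function name and the figures are hypothetical and are not taken from SimCrime.

```python
def adjust_constraint_table(cell_counts, group_number):
    """Rescale one area's table so that its cells sum to group_number."""
    table_total = sum(cell_counts)
    if table_total == 0:
        # Nothing to rescale; return zeros so the caller can handle the area.
        return [0.0 for _ in cell_counts]
    # cell * GroupNumber / total sum for the area
    return [count * group_number / table_total for count in cell_counts]


# Hypothetical output area: the table sums to 198 but the area total is 200.
adjusted = adjust_constraint_table([90, 60, 48], group_number=200)
adjusted_total = sum(adjusted)
```

The rescaled cells are generally fractional. Whether and how the original program rounds them back to whole persons is not stated in the slides, so no rounding step is shown here.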
What can we get? The adjustment method makes the constraint tables more consistent, or at least can be guaranteed to produce the smallest discrepancy.
The British Crime Survey One of the largest social research surveys conducted in England and Wales (sample of 40,000 households). A victimisation survey (covering crimes whether or not reported to the police). Covers a wide range of topics (1,642 variables). The BCS can now provide limited information at the police force area level, but NOT for smaller geographical units.
Microdata (The 2001 BCS) 1,642 variables with 32,824 records
The Program: The microdata filtering process goes through the entire micro-database and checks whether an individual fits into each column of the constraining tables for the current area. The Simulated Annealing process searches for the best combinations of individuals based on the result of the filtering process.
Output from the Simulated Annealing-Based Reweighting Program: Synthetic Population: a list of individuals containing demographic and socio-economic characteristics (crime variables from the BCS are attached). Error Report: provides information on the difference between the distributions of the constrained tables and the synthetic microdata at the output area level.
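A minimal sketch of the filtering step follows. The slides only say that each individual is checked against the columns of the constraining tables for the current area; the rule used here (keep a record only if its column has a non-zero count in the area) is an assumption, and the record fields, function names and counts are hypothetical.

```python
def column_for(record, table_variables):
    """Return the constraint-table column key that a record falls into."""
    return tuple(record[var] for var in table_variables)


def filter_microdata(records, table_variables, area_counts):
    """Keep records whose column has a non-zero count in the current area."""
    candidates = []
    for record in records:
        if area_counts.get(column_for(record, table_variables), 0) > 0:
            candidates.append(record)
    return candidates


# Hypothetical survey records and one hypothetical output-area table.
records = [
    {"id": 1, "age": "16-24", "sex": "Male"},
    {"id": 2, "age": "25-34", "sex": "Female"},
    {"id": 3, "age": "50-74", "sex": "Female"},
]
area_counts = {
    ("16-24", "Male"): 12,
    ("25-34", "Female"): 0,
    ("50-74", "Female"): 7,
}
candidates = filter_microdata(records, ("age", "sex"), area_counts)
candidate_ids = [record["id"] for record in candidates]
```

Filtering first shrinks the pool that the annealing search draws from, so that no proposed swap introduces a record that cannot fit the area at all.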
Error Report The absolute differences between estimated and expected counts
Distribution of single, widowed or divorced females aged 25-49 living in rented housing
Distribution of high-class, owner-occupier households with at least one car
Evaluation of Synthetic Microdata The synthetic microdata are evaluated in terms of their match to the constraint tables from the census at the output area level.
Evaluation of Synthetic Microdata (cont.) Total Absolute Error (TAE): the measure of difference between the distributions of the constrained table and the synthetic microdata, calculated as the sum of absolute differences between estimated and observed counts. Standardised Absolute Error (SAE): used to compare across tables, calculated as TAE / total expected count.
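Both measures are simple to compute. The sketch below assumes the estimated and expected counts for one table in one output area are given as equal-length lists; the function names and the counts are hypothetical.

```python
def total_absolute_error(estimated, expected):
    """TAE: sum of absolute differences between estimated and expected counts."""
    if len(estimated) != len(expected):
        raise ValueError("estimated and expected must have the same length")
    return sum(abs(e - o) for e, o in zip(estimated, expected))


def standardised_absolute_error(estimated, expected):
    """SAE: TAE divided by the total expected count."""
    total_expected = sum(expected)
    if total_expected == 0:
        raise ValueError("total expected count must be positive")
    return total_absolute_error(estimated, expected) / total_expected


# Hypothetical counts for one table in one output area.
expected_counts = [50, 30, 20]
estimated_counts = [48, 33, 19]
tae_value = total_absolute_error(estimated_counts, expected_counts)
sae_value = standardised_absolute_error(estimated_counts, expected_counts)
```

Here the absolute differences are 2, 3 and 1, so the TAE is 6 and the SAE is 6 / 100 = 0.06. Dividing by the total expected count is what makes areas and tables of different sizes comparable.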
SAE of 0 or perfect fit = 1,318 output areas. Note: The numbers in brackets show the number of output areas in each SAE group. There are 2,439 output areas in Leeds. Source: SimCrime
Spatial distribution of SAE for all constraints at output area level SAE of 0 or perfect fit = 1,212 output areas. Note: The numbers in brackets show the number of output areas in each SAE group. There are 2,439 output areas in Leeds. Source: SimCrime
Modelling Crime Each individual in the BCS has crime variables associated with them, so the microsimulation allows us to make small-area estimates of victims of crime and identify high-risk areas. We assume that if the synthetic population has the same characteristics as the population from the BCS, it will have the same propensity to be a victim of crime.
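Once BCS crime variables are attached to the synthetic population, a small-area rate is a simple aggregation. The sketch below assumes each synthetic record is a pair of an area code and a victim flag; the area codes, the flag and the function name are hypothetical, and the rate is expressed per 1,000 synthetic records rather than reproducing any particular SimCrime output.

```python
from collections import defaultdict


def victim_rate_per_1000(synthetic_population):
    """Victims per 1,000 synthetic records in each area."""
    totals = defaultdict(int)
    victims = defaultdict(int)
    for area, is_victim in synthetic_population:
        totals[area] += 1
        if is_victim:
            victims[area] += 1
    return {area: 1000 * victims[area] / totals[area] for area in totals}


# Hypothetical synthetic records: (area code, victim flag copied from the survey).
synthetic_population = [
    ("OA_A", True), ("OA_A", False), ("OA_A", False), ("OA_A", False),
    ("OA_B", False), ("OA_B", False),
]
rates = victim_rate_per_1000(synthetic_population)
```

Because each record carries its area code, the same aggregation can be run at coarser scales (for example wards) by grouping on a different area key.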
Estimated victim rate per 1,000 households of ‘burglary dwelling’ by ward in Leeds (map labels: Headingley, University)
Conclusion SimCrime effectively adds ‘geography’ to the British Crime Survey. The spatial aspect of the data makes it possible to do analysis at different spatial scales. We demonstrated a method to minimise discrepancies between the totals of the constraint tables.
Conclusion (cont.) The spatial microsimulation has enabled the modelling of crime victimisation at small area levels. Before this, the smallest area for modelling crime in the UK was the police force area level.
More information Modelling Crime: A Spatial Microsimulation Approach (Completed PhD thesis) http://www.geog.leeds.ac.uk/people/old/c.kongmuang/ SimCrime: A Spatial Microsimulation for Crime in Leeds (Working Paper 06/1) http://www.geog.leeds.ac.uk/wpapers/index.html Email: firstname.lastname@example.org Dept. Natural Resources and Environment, Fac. of Agriculture, Natural Resources and Environment, Naresuan University, Muang, Phitsanulok, 65000, THAILAND