Exploring Microsimulation Methodologies for the Estimation of Household Attributes Dimitris Ballas, Graham Clarke, and Ian Turton School of Geography University.

Presentation transcript:

Exploring Microsimulation Methodologies for the Estimation of Household Attributes
Dimitris Ballas, Graham Clarke, and Ian Turton
School of Geography, University of Leeds
4th International Conference on GeoComputation, Mary Washington College, Fredericksburg, Virginia, USA, July 1999

Exploring Microsimulation Methodologies for the Estimation of Household Attributes
- The need for spatially disaggregated microdata
- Microsimulation approaches to microdata generation
- An object-oriented approach to microsimulation modelling
- SimLeeds: 5 different spatial microsimulation modelling approaches to microdata generation

The need for spatially disaggregated microdata
- Policy implications: 'Governments need to predict the outcomes of their actions and produce forecasts at the local level' (Openshaw, 1995a)
- Spatially disaggregated microdata can be extremely useful for regional and social policy analysis
- Efficient representation

What is microsimulation?
- A technique aiming at building large-scale data sets
- Modelling at the microscale
- A means of modelling real-life events by simulating the characteristics and actions of the individual units that make up the system where the events occur

What is microsimulation? An example
- Sex by age by economic position (1991 UK Census SAS table 08)
- Level of qualifications by sex (1991 UK Census SAS table 84)
- Socio-economic group by economic position (1991 UK Census SAS table 92)

What is microsimulation? An example
- Estimate p(x_i, S, A, Q, EP, SEG) given a set of constraints or known probabilities:
  - p(x_i, S, A, EP)
  - p(x_i, Q, S)
  - p(x_i, SEG, EP)
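One way to read this example is to chain the three Census tables into a joint distribution. This rests on an assumption not stated on the slide: that qualifications depend only on sex, and socio-economic group only on economic position. Here S, A, Q, EP and SEG stand for sex, age, qualifications, economic position and socio-economic group, and the index x_i is dropped for brevity:

```latex
p(S, A, Q, \mathrm{EP}, \mathrm{SEG})
  \approx p(S, A, \mathrm{EP}) \; p(Q \mid S) \; p(\mathrm{SEG} \mid \mathrm{EP}),
\qquad
p(Q \mid S) = \frac{p(Q, S)}{\sum_{Q'} p(Q', S)},
\qquad
p(\mathrm{SEG} \mid \mathrm{EP}) = \frac{p(\mathrm{SEG}, \mathrm{EP})}{\sum_{\mathrm{SEG}'} p(\mathrm{SEG}', \mathrm{EP})}
```

If those independence assumptions do not hold, the product is only an approximation to the true joint distribution.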

Microsimulation: tenure allocation procedure
After Clarke, G. P. (1996), Microsimulation: an introduction, in G. P. Clarke (ed.), Microsimulation for Urban and Regional Policy Analysis, Pion, London.

Advantages of microsimulation
- Data linkage
- Spatial flexibility
- Efficiency of storage
- Ability to update and forecast

Drawbacks of microsimulation
- Difficulties in calibrating the model and validating the model outputs
- Large requirements of computational power

An object-oriented approach to microsimulation modelling
- Lists of households vs. occupancy matrices
- Households can be viewed as objects
- A Household class describes the features of all households (e.g. age, sex and marital status of head of household, tenure, etc.)
- The Household class contains methods that operate on the instance variables

Microsimulated household attributes (variables of Household)
- Broad age group of head of household (HH)
- Sex of HH
- Marital status of HH
- Tenure
- Economic activity of HH
- Employment status of HH
- Ethnic group of HH
- Occupation of HH
- Industry of HH
- Educational qualifications of HH
- Hours worked of HH
- Former industry of unemployed of HH

Methods of Household
- GetEcAc(double p)
- GetEcAcCat(double p[])
- GetEthnicity(double[] p)
- GetTenure(double[] p)
- GetOccupation(double[] p)
- GetIndustry(double[] p)
- GetEducation1(double p)
- GetEducation2(double p[])
- GetHours(double[] p)
- GetFormerIndustry(double[] p)
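The slides list only method signatures, so the following is a minimal sketch of what such a class might look like, not the SimLeeds source. The field names, the integer category coding, and the methods `sampleCategory`, `assignTenure` and `assignEthnicGroup` are all hypothetical. It assumes each method receives a probability vector and picks a category by Monte Carlo sampling.

```java
import java.util.Random;

// Hypothetical sketch: names and category codings are illustrative only.
class Household {
    int tenure = -1;       // category index; -1 means "not yet assigned"
    int ethnicGroup = -1;  // category index; -1 means "not yet assigned"

    // Pick a category index by Monte Carlo sampling from the
    // probability vector p (assumed non-negative and summing to 1).
    static int sampleCategory(double[] p, Random rng) {
        double u = rng.nextDouble();   // uniform draw in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < p.length; i++) {
            cumulative += p[i];
            if (u < cumulative) {
                return i;
            }
        }
        return p.length - 1;           // guard against rounding error in the sum
    }

    // Assign tenure from a (conditional) probability distribution.
    void assignTenure(double[] p, Random rng) {
        tenure = sampleCategory(p, rng);
    }

    // Assign ethnic group of the head of household in the same way.
    void assignEthnicGroup(double[] p, Random rng) {
        ethnicGroup = sampleCategory(p, rng);
    }
}
```

Keeping the sampling logic in one static helper means each attribute method only has to supply the appropriate probability vector.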

SimLeeds1: using conditional probabilities from the Small Area Statistics
- Baseline population: SAS table 39
- p(age, sex, employment status): table 8
- p(sex, ethnicity)
Monte Carlo sampling from the above probabilities

SimLeeds1 - The Algorithm
- Step 1 - Read the baseline population and the probability files (generated from the SAS)
- Step 2 - Create an array of household objects based on the baseline population data
- Step 3 - Assign attributes to each household on the basis of random (Monte Carlo) sampling, using the conditional probability distributions derived from the Census
- Step 4 - Output the list of microsimulated households
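The four steps can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' code: the baseline population is reduced to counts of households by sex of head, only one attribute (ethnic group) is assigned, and the class and method names (`SimLeeds1Sketch`, `SimHousehold`, `simulate`) are hypothetical. File reading and writing (Steps 1 and 4) are left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the SimLeeds1 steps for a single small area.
class SimLeeds1Sketch {
    // Minimal household record; category indices are illustrative codings.
    static class SimHousehold {
        final int sexOfHead;   // known from the baseline population
        int ethnicGroup = -1;  // assigned by Monte Carlo sampling in Step 3
        SimHousehold(int sexOfHead) { this.sexOfHead = sexOfHead; }
    }

    // Draw a category index from probability vector p (assumed to sum to 1).
    static int sample(double[] p, Random rng) {
        double u = rng.nextDouble();
        double cumulative = 0.0;
        for (int i = 0; i < p.length; i++) {
            cumulative += p[i];
            if (u < cumulative) return i;
        }
        return p.length - 1;   // guard against rounding error
    }

    // baselineCounts[s]     = number of households whose head has sex s
    // pEthnicityGivenSex[s] = conditional distribution of ethnic group given s
    static List<SimHousehold> simulate(int[] baselineCounts,
                                       double[][] pEthnicityGivenSex,
                                       Random rng) {
        List<SimHousehold> households = new ArrayList<>();
        for (int s = 0; s < baselineCounts.length; s++) {
            for (int k = 0; k < baselineCounts[s]; k++) {
                SimHousehold h = new SimHousehold(s);                 // Step 2
                h.ethnicGroup = sample(pEthnicityGivenSex[s], rng);   // Step 3
                households.add(h);
            }
        }
        return households;   // Step 4: the caller writes this list out
    }
}
```

Because each draw is random, the simulated counts will only approximate the Census tables, which is why the later slides report a Total Absolute Error for each model.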

SimLeeds1 - outputs

SimLeeds2: Using probabilities from the Samples of Anonymised Records (SARs)
- Baseline population: SAS table 39
- p(age, sex, employment status, ethnicity, etc.) conditional upon the attributes of table 39 (obtained from the SARs)
Monte Carlo sampling from the above probabilities

SimLeeds3 - Monte Carlo sampling from SAS-generated conditional probabilities with the use of the Iterative Proportional Fitting (IPF) technique
The IPF procedure can be seen as a method to adjust a two-dimensional matrix iteratively until the row sums and column sums equal some predefined values (Birkin, 1987; Wong, 1992). IPF can also be defined as a mathematical scaling procedure, which ensures that a two-dimensional table of data is adjusted so that its row and column totals agree with row and column totals from alternative sources (Norman, 1999). From a geographical viewpoint, it can be seen as a procedure for generating disaggregated spatial data from spatially aggregated data (Wong, 1992).
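The two-dimensional adjustment described above can be sketched as below. This is a generic textbook form of IPF, not the SimLeeds3 implementation; the class name `IpfSketch`, the method `fit`, and the stopping parameters `maxIter` and `tol` are assumptions made for illustration.

```java
// Hypothetical sketch of two-dimensional Iterative Proportional Fitting.
class IpfSketch {
    // Returns a copy of seed rescaled so that its row sums match rowTargets
    // and its column sums match colTargets (targets must share the same total).
    static double[][] fit(double[][] seed, double[] rowTargets,
                          double[] colTargets, int maxIter, double tol) {
        int rows = seed.length;
        int cols = seed[0].length;
        double[][] m = new double[rows][];
        for (int i = 0; i < rows; i++) {
            m[i] = seed[i].clone();   // leave the caller's seed untouched
        }
        for (int iter = 0; iter < maxIter; iter++) {
            // Scale each row to its target total.
            for (int i = 0; i < rows; i++) {
                double sum = 0.0;
                for (int j = 0; j < cols; j++) sum += m[i][j];
                if (sum > 0.0) {
                    double factor = rowTargets[i] / sum;
                    for (int j = 0; j < cols; j++) m[i][j] *= factor;
                }
            }
            // Scale each column to its target total.
            for (int j = 0; j < cols; j++) {
                double sum = 0.0;
                for (int i = 0; i < rows; i++) sum += m[i][j];
                if (sum > 0.0) {
                    double factor = colTargets[j] / sum;
                    for (int i = 0; i < rows; i++) m[i][j] *= factor;
                }
            }
            // Columns now match; stop once the rows match as well.
            double maxRowError = 0.0;
            for (int i = 0; i < rows; i++) {
                double sum = 0.0;
                for (int j = 0; j < cols; j++) sum += m[i][j];
                maxRowError = Math.max(maxRowError, Math.abs(sum - rowTargets[i]));
            }
            if (maxRowError < tol) break;
        }
        return m;
    }
}
```

For example, a uniform 2 x 2 seed with row targets (30, 70) and column targets (40, 60) converges after one pass to the table [[12, 18], [28, 42]].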

SimLeeds1 - Total Absolute Error (TAE)

SimLeeds2 - Total Absolute Error (TAE)

SimLeeds3 - Total Absolute Error (TAE)

SimLeeds4: Monte Carlo sampling from SARs and SAS generated conditional probabilities with the use of IPF

SimLeeds4 - The Algorithm
- Step 1 - Get conditional probabilities from the SARs and the SAS related to the household attributes that need to be estimated.
- Step 2 - Apply the Iterative Proportional Fitting technique to the above probabilities in order to generate a matrix of micro-probability distribution.
- Step 3 - Generate a list of Household objects ('instantiate' the Household class).
- Step 4 - Read the above micro-probability matrix and assign a type to each Household on the basis of random sampling.
- Step 5 - Export the list of the estimated Households with their associated attributes.
- Step 6 - Export a table with the counts of Households in each category, which can be easily integrated with spatial boundary data in a GIS.

SimLeeds4 - outputs

SimLeeds4 - Total Absolute Error

SimLeeds5 - Reweighting the SARs
Flowchart: the method used to create the population.
- Select tables to describe the small area.
- Select household records from the household SAR and create working tables.
- Compare each working table to each target table. Calculate the error (sum of absolute errors).
- Set the "temperature" (t).
- Replace one of the households in the selection with a new one. Calculate the change in error (Δe).
- If Δe < 0, accept the change. Otherwise, if exp(-Δe/t) > random number (0-1), accept it; else reject the change.
- If error = 0, exit. If the iteration count is reached, reduce the temperature. Otherwise repeat.

SimLeeds5 - The Algorithm
- Step 1 - Read in the SAS tables (raw counts, not probabilities).
- Step 2 - Read the SAR database into household records.
- Step 3 - Select sufficient households at random to populate the tables.
- Step 4 - Apply simulated annealing to find the best-fitting set of households.
- Step 5 - When error = 0 or the iteration count is exceeded, write out the best set of records.
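A minimal sketch of Steps 3 to 5 follows, assuming a single target table and a SAR in which each record falls into exactly one cell of that table. These simplifications, the class name `AnnealSketch`, and the parameters `startTemp`, `cooling` and the fixed 100-iteration cooling interval are illustrative assumptions, not details taken from the slides. The acceptance rule is the one given in the flowchart: accept if Δe < 0 or if exp(-Δe/t) exceeds a uniform random number.

```java
import java.util.Random;

// Hypothetical sketch of reweighting by simulated annealing.
class AnnealSketch {
    // Sum of absolute differences between working and target counts.
    static int totalAbsoluteError(int[] working, int[] target) {
        int error = 0;
        for (int c = 0; c < target.length; c++) {
            error += Math.abs(working[c] - target[c]);
        }
        return error;
    }

    // sarCell[r] = table cell that SAR record r falls into.
    // Returns the indices of the best set of SAR records found.
    static int[] fit(int[] sarCell, int[] target, int maxIter,
                     double startTemp, double cooling, Random rng) {
        int n = 0;
        for (int t : target) n += t;            // households needed
        int[] selected = new int[n];
        int[] working = new int[target.length];
        for (int k = 0; k < n; k++) {           // Step 3: random initial selection
            selected[k] = rng.nextInt(sarCell.length);
            working[sarCell[selected[k]]]++;
        }
        int error = totalAbsoluteError(working, target);
        int[] best = selected.clone();
        int bestError = error;
        double temp = startTemp;
        // Step 4: anneal until a perfect fit or the iteration limit.
        for (int iter = 0; iter < maxIter && bestError > 0; iter++) {
            int k = rng.nextInt(n);                       // household to replace
            int candidate = rng.nextInt(sarCell.length);  // replacement record
            int oldCell = sarCell[selected[k]];
            int newCell = sarCell[candidate];
            working[oldCell]--;
            working[newCell]++;
            int delta = totalAbsoluteError(working, target) - error;
            if (delta < 0 || Math.exp(-delta / temp) > rng.nextDouble()) {
                selected[k] = candidate;                  // accept the change
                error += delta;
                if (error < bestError) {
                    bestError = error;
                    best = selected.clone();
                }
            } else {
                working[oldCell]++;                       // reject: undo the swap
                working[newCell]--;
            }
            if ((iter + 1) % 100 == 0) temp *= cooling;   // reduce temperature
        }
        return best;                                      // Step 5: best set found
    }
}
```

Occasionally accepting a worse selection while the temperature is high lets the search escape local minima that a purely greedy swap procedure could get stuck in; with several target tables this matters more than in the single-table case sketched here.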

SimLeeds5 - outputs

SimLeeds5 - Total Absolute Error

Conclusions
- More thorough tests needed
- The SimLeeds models need to be applied to more EDs and more attributes
- Dynamic microsimulation modelling
- Urban and regional policy evaluation
Paper available on-line at: /