
1 Alternative Futures – ASAP Research Cluster Seminar 16th November 2005
MoSeS Starts for the Promised Land
Andy Turner
Outline
–Introduction
–Population Modelling Progress
–Next Steps
–Feedback

2 Introduction

3 A religious story?
Lost in the Desert?
Our heading?
–The Promised Land
–SIM-UK
–GeoSIM

4 Modelling and Simulation for e-Social Science (MoSeS)
Mark Birkin, Martin Clarke, Phil Rees, Andy Turner, Belinda Wu (School of Geography)
Haibo Chen (Institute for Transport Studies)
Justin Keen (Institute for Health Sciences)
John Hodrien, Paul Townend, Jie Xu (School of Computing)

5 MoSeS is a Node of the National Centre for e-Social Science
http://www.ncess.ac.uk
NCeSS aims to investigate, promote and support the use of eScience in social science research

6 eScience
Based on Grid Computing and collaboration
What is Grid Computing?
–Many definitions…
–A move towards ubiquitous computing
–A service/protocol for sharing Information Technology (IT) resources over the Internet
Computer scientists are building the next generation of computational infrastructure
–‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’ (Tony Blair, 2002)

7 eScience
Grid Computing Environments and The Grid
–Enhance capabilities for IT resource sharing for research
–Are about providing easy and secure access to massive computational resources, software and data, promoting collaborative working of virtual organisations
e-Social Science is eScience targeted and geared for applications more specific to social science, including a major part of geography

8 MoSeS Aims and Objectives
Raise awareness of eScience and eResearch
Develop practical geographical e-Social Science applications demonstrating the potential of Grid Computing
Model the UK human population at individual and higher organisational levels
–households, communities, regions
–disparate and/or geographically diffuse organisations and society
–service orientated government
Develop and package a suite of modelling tools which allows specific research and policy questions to be addressed, with demonstrator applications for:
–Health
–Business
–Transport

9 MoSeS Initial Tasks
Develop methods to generate individual human population data for the UK from 2001 UK human population census data
Develop a Toy Model
–Dynamic agent based microsimulation modelling toolkit and apply it to simulate change in the UK
Develop applications for
–Health
–Business
–Transport

10 MoSeS Challenges
Grid enabling the data and tools
Visualisation
–Google Earth
–Computer Games
Collaboration
Retaining a problem focus
Design and Development

11 MoSeS Current Parallel Developments
Belinda Wu is working on the applications, beginning with a Toy Model for Leeds
Paul Townend is working on Grid Enabling
Andy Turner is focussing on the population modelling
The MoSeS team are meeting regularly and plan a launch some time next year, when we hope to have something impressive to show off to NCeSS colleagues and invited guests from the eScience community, government and business

12 MoSeS Human Population Model
Current focus on the contemporary situation looking forwards over the next 25 years
Primarily data wanted for individuals grouped into households
Need to develop a method to synthesise and enrich data since available census and social survey data are not sufficient in coverage and detail
A method was outlined in the proposal
–This is being implemented and results are being tested

13 Population Modelling Method
To select a fitting set of individual records from the 2001 UK Population Census 3% Individual Sample of Anonymised Records (ISAR) to represent the individuals for regions given by 2001 UK Population Census Area Statistics (CAS)
Initial focus is for regions called Output Areas
–Smallest Census Output Areas
–Typically about 300 people, 100 households
Begin with Leeds and scale up to the UK

14 Combination
Given the population (p) of an Output Area we want to select a sub-sample of this size from the n = 1843525 records in the ISAR
The general formula for finding the number of permutations of size p taken from n objects is:
$^{n}P_{p} = \frac{n!}{(n-p)!}$
Approximately $n^{p}$
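To get a feel for the size of this number, the following is a minimal sketch that works in base-10 logarithms so the values stay representable. It assumes p = 300 (the typical Output Area population quoted earlier); the function names are illustrative only. It also reports the unordered count (combinations), since the order of the selected records does not matter for the fit.

```python
import math

def log10_permutations(n: int, p: int) -> float:
    """log10 of n! / (n - p)!, the number of ordered selections of p from n."""
    return sum(math.log10(n - i) for i in range(p))

def log10_combinations(n: int, p: int) -> float:
    """log10 of n! / (p! (n - p)!), the number of unordered selections of p from n."""
    log10_p_factorial = sum(math.log10(i) for i in range(1, p + 1))
    return log10_permutations(n, p) - log10_p_factorial

N_ISAR = 1843525  # number of ISAR records, as given on the slide
P_OA = 300        # ASSUMPTION: a typical Output Area population

log_perm = log10_permutations(N_ISAR, P_OA)
log_comb = log10_combinations(N_ISAR, P_OA)
log_approx = P_OA * math.log10(N_ISAR)  # the n^p approximation

print(f"log10(permutations)  = {log_perm:.2f}")
print(f"log10(n^p)           = {log_approx:.2f}")
print(f"log10(combinations)  = {log_comb:.2f}")
```

Under these assumptions the n^p approximation differs from the exact permutation count by only a small fraction of one order of magnitude, because p is tiny compared with n.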

15 Computation
Number of potential solutions too great to find the best fitting solution by a brute force search?
–Probably, yes, even using all the computational power of The Grid
–Interestingly the number of potential solutions is even greater for larger regions than Output Areas (although there are fewer of them)
Fortunately we are only interested in specific types of solution and can constrain our search
For some criteria hard constraints are appropriate, and for other variables optimisation within these constraints is the key
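One way a hard constraint can narrow the search is to fix how many records must come from each stratum and only ever sample within strata. The sketch below is an illustration under stated assumptions, not the MoSeS implementation: the field names `age_band` and `is_hrp`, the toy records, the target counts and the choice to sample with replacement are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def constrained_sample(records, targets, rng):
    """Draw records so that every (age_band, is_hrp) stratum meets its target count exactly."""
    strata = defaultdict(list)
    for rec in records:
        strata[(rec["age_band"], rec["is_hrp"])].append(rec)

    selection = []
    for key, count in targets.items():
        if count == 0:
            continue
        pool = strata.get(key, [])
        if not pool:
            raise ValueError(f"no records available for stratum {key}")
        # ASSUMPTION: sampling with replacement, so one sample record may
        # stand in for several people in the synthetic population.
        selection.extend(rng.choices(pool, k=count))
    return selection

# Toy stand-in for the sample of anonymised records (entirely made up).
rng = random.Random(0)
toy_records = [
    {"id": i,
     "age_band": rng.choice(["0-15", "16-64", "65+"]),
     "is_hrp": rng.random() < 0.4}
    for i in range(500)
]

# Hypothetical target counts for one area: 300 people, 100 of them HRPs.
targets = {
    ("0-15", False): 60,
    ("16-64", True): 80,
    ("16-64", False): 100,
    ("65+", True): 20,
    ("65+", False): 40,
}

selection = constrained_sample(toy_records, targets, rng)
counts = Counter((r["age_band"], r["is_hrp"]) for r in selection)
print(len(selection), dict(counts))
```

Every candidate produced this way satisfies the hard constraints by construction, so any optimisation only needs to compare candidates on the remaining variables.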

16 Constraints
What can we constrain to?
–There are limits
  The more detailed the constraint criteria the less likely it can be met
–The ISAR is only a 3% sample
–Specific CAS tabulations
  The aggregations of variables are bespoke
  Beware of errors especially systematically introduced disclosure control measures
–Census data are estimates and contain unknown level of error
What is most important to ensure is right?
–Age/Gender profile
–Number of Household Reference People
–Household Composition
–Social Class
–Health status etc…

17 Getting to Grips with ISAR and CAS data
2001 UK Census data is unusual (like most census data)
–Details are lost by aggregation and accuracy is deliberately worsened via the application of disclosure control measures
–This is done for confidentiality reasons and as users we are forced to appreciate this
–On the one hand this generates jobs, on the other hand, it renders census data almost useless for supporting certain applications
Details on UK Census data including ISAR and CAS are available via
–http://www.statistics.gov.uk/census/
Usefully 2001 CAS tables that do not currently exist can be commissioned
There is an application procedure for gaining access to Controlled Access Microdata Sample (CAMS) records from the 2001 Census
–The data is supposedly better
–It will be hard for us to use due to the way it is controlled

18 CAS
Key Statistics Tables
–31 tabulations
–E.g. KS001 Usually Resident Population
  6 cells
Standard offerings
–53 cross tabulations
–E.g. CS001 Age/Sex/Resident Type
  250 cells
Themed Tables
–6 cross tabulations
–E.g. CT001 Theme Table On All Dependent Children
  348 cells
Univariate Tables
–43 tabulations
–E.g. UV003 Sex
  3 cells

19 Constraint and Optimisation using Key Statistics
As a first step we have constrained by age and ensured that we have the correct number of household reference people
–Makes it easier to construct households for Toy Model
Our fitness function is a simple Sum of Squared Errors (SSE) for a number of aggregate variables
–Measure of the difference between aggregate counts from the ISAR records and the published and aggregated CAS Key Statistics
Initial focus on health and household composition
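A Sum of Squared Errors fitness of this kind can be sketched in a few lines. The example assumes each selected record carries 0/1 indicator flags for the variables being optimised; the variable names, the three toy records and the target counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def aggregate(records, variables):
    """Sum each indicator variable over the selected records."""
    return {v: sum(rec[v] for rec in records) for v in variables}

def sse(selected_records, published_counts):
    """Sum of squared differences between aggregated and published counts."""
    totals = aggregate(selected_records, published_counts.keys())
    return sum((totals[v] - published_counts[v]) ** 2 for v in published_counts)

# Three hypothetical selected records with 0/1 indicator flags.
selected = [
    {"good_health": 1, "limiting_illness": 0},
    {"good_health": 1, "limiting_illness": 1},
    {"good_health": 0, "limiting_illness": 1},
]

# Hypothetical published counts for the same area.
published = {"good_health": 3, "limiting_illness": 1}

# Aggregates are good_health = 2 and limiting_illness = 2,
# so the error is (2 - 3)^2 + (2 - 1)^2.
fit = sse(selected, published)
print(fit)  # 2
```

A lower value means the selected records reproduce the published aggregates more closely, and 0 is an exact match on every variable included.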

20 Optimisation Variables
Health variables
–peopleWhoseGeneralHealthWasGood
–peopleWhoseGeneralHealthWasFairlyGood
–peopleWhoseGeneralHealthWasNotGood
–peopleWithLimitingLongTermIllness
–peopleWithoutLimitingLongTermIllness (Derived)
Household Composition variables
–oneFamilyAndNoChildren (Derived)
–marriedOrCohabitingCoupleWithChildren (Derived)
–loneParentHouseholdsWithChildren (Derived)
(Derived) means calculated from other variables

21 Optimisation and Goodness of Fit
Initially for each Output Area in Leeds we generated 10000 possibly different solutions and picked the best one
Now we are using a genetic algorithm to assist in finding a better solution
–More strategic
–Constraints form genes
–Effectively each genetic bit string is an ordered boolean array for the ISAR AGE0 and HRP order
Currently the genetic algorithm works by breeding and mutation and survival of the fittest
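The breeding, mutation and survival steps can be illustrated with a generic genetic algorithm. This is a simplified sketch, not the encoding described on the slide: a candidate here is assumed to be a plain list of record indices rather than an ordered boolean array, hard constraints are left out, and the toy records, targets and parameter values are all made up.

```python
import random

def fitness(candidate, records, targets):
    """SSE between aggregate counts of the selected records and the targets."""
    return sum(
        (sum(records[i][v] for i in candidate) - t) ** 2
        for v, t in targets.items()
    )

def evolve(records, targets, size, rng,
           pop_size=30, generations=50, mutation_rate=0.05):
    """Generic GA: keep the fitter half, breed and mutate to refill the population."""
    n = len(records)
    population = [[rng.randrange(n) for _ in range(size)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda c: fitness(c, records, targets))
        history.append(fitness(population[0], records, targets))
        survivors = population[: pop_size // 2]  # survival of the fittest
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, size)         # one-point crossover (breeding)
            child = mum[:cut] + dad[cut:]
            child = [rng.randrange(n) if rng.random() < mutation_rate else gene
                     for gene in child]          # mutation
            children.append(child)
        population = survivors + children
    best = min(population, key=lambda c: fitness(c, records, targets))
    return best, history

rng = random.Random(42)
# Toy records with 0/1 flags (hypothetical variable names).
toy_records = [
    {"good_health": int(rng.random() < 0.7), "limiting_illness": int(rng.random() < 0.2)}
    for _ in range(200)
]
toy_targets = {"good_health": 14, "limiting_illness": 4}

best, history = evolve(toy_records, toy_targets, size=20, rng=rng)
best_fit = fitness(best, toy_records, toy_targets)
print("first generation best:", history[0], "final best:", best_fit)
```

Because the fitter half always survives unchanged, the best fitness can never get worse from one generation to the next. The first generation is itself a small random search, so the earlier generate-many-and-pick-the-best approach corresponds to stopping after that first sort.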

22 Next Steps 1
Constraints
–Additional constraint by gender
  Should improve household formation
  Need to use Standard CAS cross tabulations
  –Problems due to confidentiality
    »Perhaps need to consider larger regions than Output Areas
–Beginning to investigate what other constraints are possible
Leeds UK
–Identify problem Output Areas
Optimisation
–Use more optimisation variables
–Experiment with the genetic algorithm

23 Next Steps 2
Testing
–Examine results
Mapping
–Optimised variables
–Exogenous variables
Grid Enabling
–Data
–Provenance
Toy Model
Publication

24 MoSeS Recap
We are developing a dynamic geographic microsimulation of the UK
–A model comprising individual people that occupy the UK environment and move about it through time interacting in numerous ways
–Each individual will have family, household and social networks and reasonably complex characteristics and behaviour
–The idea is to build a platform for simulating change in the UK for ASAP

25 Thank you!
Any feedback or questions?
Please email
–A.G.D.Turner@leeds.ac.uk
http://www.ncess.ac.uk

26 Acknowledgements
Thanks to all involved in the production of the maps that I grabbed off the internet for the start of this presentation

