International Symposium on Grid Computing 2010 Applications on Humanities & Social Sciences I Taipei, Taiwan (2010-03-10) GENESIS Social Simulation Modelling.


International Symposium on Grid Computing 2010
Applications on Humanities & Social Sciences I
Taipei, Taiwan (2010-03-10)
GENESIS Social Simulation Modelling Progress
Andy Turner

Overview
Introduction
Geographic Dynamic Simulation Models
 – A demographic model
 – Developing a traffic model
Scaling Up to City Size Models
 – Java OutOfMemoryError Handling
Considerations and Further Work

Introduction
Andy Turner
 – Blogger
 – GENESIS
GENESIS
 – Generative e-Social Science for Socio-Spatial Simulation
 – A second phase research node of the UK National Centre for e-Social Science

My Paper, Presentation Slides, and Notes from this meeting can all be found via the following URL:

Geographic Dynamic Simulation Models
Geographic means it is about interaction on or near the surface of the Earth
 – It is about people and their environment
Dynamic means the model represents change over time
 – Continuous change may be modelled
   In an event-based or scheduled way, where each event triggers other events
   With a specific temporal resolution
A simulation is a run of a model
 – Based on input data and configuration
The model is a simplification of reality

Two Agent-Based GENESIS Dynamic Simulation Models (DSM)
1. Demographic model
 – Run with time steps of a day and run for years
 – Initially aspatial
2. Traffic model
 – Run with time steps of seconds and run for days
 – Inherently spatial

Results can be reproduced although the models are stochastic in nature
Results are driven by a seeded pseudo-random number generator
 – Results can be easily reproduced provided the same input data, model configuration and random seed are used
 – A range of results can be generated for different random seeds
Iterators are used to go through collections of Objects during processing
 – The order in which objects are retrieved via the iterator has no effect: the data after the iteration is the same each time
MoSeS meets NEC 10th March 2008
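The reproducibility point can be illustrated with a minimal sketch, assuming the standard java.util.Random class as the generator (the slides do not say which generator GENESIS uses). The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class SeededRunSketch {
    // Draws n values from a generator created with the given seed.
    // Hypothetical stand-in for "one simulation run".
    static long[] draw(long seed, int n) {
        Random random = new Random(seed);
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextLong();
        }
        return values;
    }

    public static void main(String[] args) {
        // Same seed: the two runs produce identical sequences.
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5)));
        // Different seeds: the runs give a different sample of results.
        System.out.println(Arrays.equals(draw(1L, 5), draw(2L, 5)));
    }
}
```

Keeping the seed alongside the input data and configuration is what would make a stochastic run repeatable in this sketch.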

Basic Demographic Model
Deals with birth and death
Starts with an initial population
 – Composed of males and females
Males and females may have different age-specific fertility and mortality rates
At each step I simulate:
 – Death
 – Birth
 – Pregnancy
 – Miscarriage

Basic Demographic Model Detail
All living Persons (Agents) are tested at each step to see if they die:
 – Tests are done by asking for the next number from a pseudo-random sequence and comparing it with the respective age- and gender-specific mortality rate
In the first simple model
 – The miscarriage rate was fixed for all ages, but was reasonably high
 – The gestation period was fixed at 266 days
 – At birth a single Agent is formed, with a 50% chance of being male and a 50% chance of being female
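The daily step described above might look roughly like the following. This is a sketch under stated assumptions, not the GENESIS code: all rates are invented for illustration, the class and method names are hypothetical, and the order of the tests within a step is a guess. Only the 266-day gestation period and the 50% sex split come from the slides.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

class DemographicStepSketch {
    static final int GESTATION_DAYS = 266;          // from the slides
    static final double DAILY_MISCARRIAGE = 0.001;  // hypothetical fixed rate

    static final class Person {
        final boolean female;
        int ageInDays;
        int daysPregnant = -1; // -1 means not pregnant

        Person(boolean female, int ageInDays) {
            this.female = female;
            this.ageInDays = ageInDays;
        }
    }

    // Hypothetical daily rates; a real model would look these up by age and gender.
    static double dailyMortality(Person p) {
        return p.ageInDays > 70 * 365 ? 0.0005 : 0.00005;
    }

    static double dailyFertility(Person p) {
        int years = p.ageInDays / 365;
        return (years >= 18 && years < 45) ? 0.0005 : 0.0;
    }

    // Advances the population by one day: death, then miscarriage/birth, then conception.
    static void step(List<Person> population, Random random) {
        List<Person> newborns = new ArrayList<>();
        Iterator<Person> it = population.iterator();
        while (it.hasNext()) {
            Person p = it.next();
            // Compare the next pseudo-random number with the mortality rate.
            if (random.nextDouble() < dailyMortality(p)) {
                it.remove();
                continue;
            }
            p.ageInDays++;
            if (!p.female) {
                continue;
            }
            if (p.daysPregnant >= 0) {
                if (random.nextDouble() < DAILY_MISCARRIAGE) {
                    p.daysPregnant = -1;
                } else if (++p.daysPregnant >= GESTATION_DAYS) {
                    p.daysPregnant = -1;
                    newborns.add(new Person(random.nextBoolean(), 0)); // 50% female
                }
            } else if (random.nextDouble() < dailyFertility(p)) {
                p.daysPregnant = 0;
            }
        }
        population.addAll(newborns);
    }

    // Runs a small seeded simulation and returns the final population size.
    static int run(long seed, int days) {
        Random random = new Random(seed);
        List<Person> population = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            population.add(new Person(i % 2 == 0, 25 * 365));
        }
        for (int d = 0; d < days; d++) {
            step(population, random);
        }
        return population.size();
    }
}
```

Calling run with the same seed and number of days twice returns the same population size, which is the reproducibility property the model relies on.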

Example Output
_Year 299
 – _TotalDeathsInYear
 – _TotalBirthsInYear
 – _TotalConceptionsInYear
 – _TotalMiscarriagesInYear 2138
Output Directory

Demographic considerations
How realistic do the simulation results become?
To begin with, the initial population is not seeded with pregnancies
 – A large number of Persons are added into the simulation 266 days after it starts
 – Over time miscarriage helps to even out the number of new births on each day
Sharply increasing fertility probabilities at a specific age result in a secondary cohort effect
 – The first newborn population tend to all have babies on the same day too

Example Platform
Intel(R) Core(TM) 2 Duo CPU 2.4GHz 2.39 GHz, 1.95 GB RAM
Ran from within NetBeans
Java opts
 – -Xmx512m -Xms512m

Example Run Simulation Parameters
_MaximumNumberOfAgents =
_MaximumNumberOfAgentsPerAgentCollection =
InitialFemalePopulation = 1000
InitialMalePopulation = 1000
TotalYears = 300

Example Run Data
No input data
All data generated by simulation model
 – Files Directories
Size 1.92 GB ( bytes)
Size on disk 4.48 GB ( bytes)

Developing a Traffic Model
Modelling people's movements at an individual level requires a way to store the location of each agent
As a first step, agents were positioned in a confined region on a Euclidean 2D plane and made to move around it randomly by being repositioned at each time tick
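A minimal sketch of that first step, assuming a rectangular region and a uniform random offset per tick. The region size, the step size and all names are hypothetical; the slides do not give these details.

```java
import java.util.Random;

class RandomWalkSketch {
    // Hypothetical size of the confined region.
    static final double WIDTH = 100.0;
    static final double HEIGHT = 100.0;

    static final class Agent {
        double x;
        double y;

        Agent(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // Repositions the agent by a random offset of at most maxStep along each axis,
    // clamping the result so the agent stays inside the region.
    static void tick(Agent agent, double maxStep, Random random) {
        double dx = (random.nextDouble() * 2.0 - 1.0) * maxStep;
        double dy = (random.nextDouble() * 2.0 - 1.0) * maxStep;
        agent.x = Math.max(0.0, Math.min(WIDTH, agent.x + dx));
        agent.y = Math.max(0.0, Math.min(HEIGHT, agent.y + dy));
    }
}
```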

Further developing a Traffic Model
Next the concept of destinations was developed
 – Rather than necessarily having a different destination at each time tick, an Agent might be assigned a destination beyond its maximum range of movement in a time tick
Scheduling Agents' movements
Collecting and using other data
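The idea of a destination that lies beyond one tick's range can be sketched as follows, assuming straight-line movement on the plane (routing along roads is not modelled here). All names are hypothetical.

```java
class DestinationMoveSketch {
    static final class Agent {
        double x;
        double y;
        final double destX;
        final double destY;

        Agent(double x, double y, double destX, double destY) {
            this.x = x;
            this.y = y;
            this.destX = destX;
            this.destY = destY;
        }
    }

    // Moves the agent towards its destination by at most maxRange.
    // Returns true once the destination has been reached.
    static boolean tick(Agent agent, double maxRange) {
        double dx = agent.destX - agent.x;
        double dy = agent.destY - agent.y;
        double distance = Math.hypot(dx, dy);
        if (distance <= maxRange) {
            agent.x = agent.destX;
            agent.y = agent.destY;
            return true;
        }
        // Step along the unit vector pointing at the destination.
        agent.x += dx / distance * maxRange;
        agent.y += dy / distance * maxRange;
        return false;
    }
}
```

With a destination 10 units away and a range of 3 units per tick, the agent in this sketch arrives on the fourth tick and keeps the same destination throughout.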

A model for Leeds seeded with restricted UK Census data
Focus on commuting
 – Journeys from home to work and back home again
 – Data: 2001 UK census special travel statistics, which give the home origin and work destination at Output Area (OA) level
 – There are over 200 thousand OAs in the UK
Uses OpenStreetMap (OSM) road data
 – 3rd party libraries
   Traveling Salesman OSM routing library
   GeoTools

Example output Output Directory

Memory Handling
This is the crux of the paper I am presenting here
I use memory handling to scale my model so that I can run with billions of agents on an average machine
Warning: the next few slides show Java code, which I shall try to explain so you can all understand

A method to get the amount of available JVM memory
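The code from this slide is not in the transcript. A minimal sketch of such a method, assuming the standard java.lang.Runtime calls are used (the class and method names are hypothetical):

```java
class AvailableMemorySketch {
    // Estimates how many more bytes the JVM could allocate:
    // the maximum heap size minus what is currently in use.
    static long availableMemory() {
        Runtime runtime = Runtime.getRuntime();
        long used = runtime.totalMemory() - runtime.freeMemory();
        return runtime.maxMemory() - used;
    }

    public static void main(String[] args) {
        System.out.println("Available bytes: " + availableMemory());
    }
}
```

The value is only an estimate: it changes as other threads allocate and as the garbage collector runs, and maxMemory() returns Long.MAX_VALUE when the heap has no configured limit.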

A method to try to prevent OutOfMemoryError being thrown
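The code from this slide is not in the transcript. One plausible shape for such a method is sketched below, on the assumption that the model checks available memory before doing work and swaps data out to disk while memory is short. The method name, its parameters and the use of functional interfaces are all hypothetical choices made so that the sketch can be run on its own.

```java
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

class LowMemoryGuardSketch {
    // Swaps data out until at least requiredBytes are reported as available.
    // availableBytes: reports the memory currently available (assumed hook).
    // swapOutSomething: writes some data to disk and returns false
    //                   when there is nothing left to swap (assumed hook).
    // Returns true if enough memory is available on exit.
    static boolean ensureMemory(LongSupplier availableBytes,
                                long requiredBytes,
                                BooleanSupplier swapOutSomething) {
        while (availableBytes.getAsLong() < requiredBytes) {
            if (!swapOutSomething.getAsBoolean()) {
                return false; // nothing more can be swapped out
            }
        }
        return true;
    }
}
```

In practice the first argument would report the JVM's own figures from java.lang.Runtime, and the third would write an AgentCollection to its file location and drop the in-memory reference.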

A typical public method

A simple case of handling OutOfMemoryError
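The code from this slide is not in the transcript. A simple case might catch the error, free some memory and retry once, as in the sketch below. The names are hypothetical, and catching OutOfMemoryError is generally fragile, so this only illustrates the pattern.

```java
import java.util.function.Supplier;

class OomeRetrySketch {
    // Runs the task; if it throws OutOfMemoryError, frees memory and tries once more.
    // A second OutOfMemoryError is allowed to propagate to the caller.
    static <T> T callWithRetry(Supplier<T> task, Runnable freeMemory) {
        try {
            return task.get();
        } catch (OutOfMemoryError e) {
            freeMemory.run(); // e.g. release a reserve and swap data to disk
            return task.get();
        }
    }
}
```

The retry is only safe if the task leaves the model's data in a consistent state when it fails part-way through, which is an assumption of this sketch.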

A method for initialising a memory reserve to free up if an OutOfMemoryError is encountered

Methods for creating and freeing up a memory reserve
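The code from these slides is not in the transcript. A memory reserve can be sketched as a block of memory that is held while things are going well and released when an OutOfMemoryError is encountered, so that the handling code has room to work. The names and the use of a byte array are assumptions.

```java
class MemoryReserveSketch {
    // The reserve is a plain byte array held only so that it can be released later.
    private static byte[] reserve;

    // Allocates the reserve; call this early, while memory is plentiful.
    static void initReserve(int bytes) {
        reserve = new byte[bytes];
    }

    // Drops the only reference so the garbage collector can reclaim the block.
    static void clearReserve() {
        reserve = null;
    }

    static boolean hasReserve() {
        return reserve != null;
    }
}
```

After the error has been handled and data swapped out, the reserve would be created again so that it is available the next time memory runs short.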

A method for swapping data
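The code from this slide is not in the transcript. Swapping could be done with standard Java serialization, as in this sketch, which writes a HashSet of numeric IDs to a file and reads it back. The choice of serialization, the file layout and all names are assumptions.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;

class SwapSketch {
    // Writes the collection to its file so the in-memory copy can be dropped.
    static void swapOut(HashSet<Long> data, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a previously swapped-out collection back into memory.
    @SuppressWarnings("unchecked")
    static HashSet<Long> swapIn(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashSet<Long>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After swapOut the caller would set its own reference to null, and call swapIn only when the data is next needed.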

Agents and Data Structures
Agents
 – Each of these has a unique numeric ID
 – Have a file location
AgentCollections
 – These are HashSets of Agents
 – Have a file location
IDCollections
 – These are HashSets of Agent numeric IDs
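These structures could be declared along the following lines. This is a sketch only: the field names, the use of a String for the file location and the ID-based equality are assumptions, not the GENESIS class definitions.

```java
import java.util.HashSet;

class AgentStructuresSketch {
    static final class Agent {
        final long id;             // unique numeric ID
        final String fileLocation; // where the Agent is stored when swapped out

        Agent(long id, String fileLocation) {
            this.id = id;
            this.fileLocation = fileLocation;
        }

        // Two Agents are the same if they share an ID, so a HashSet holds each once.
        @Override
        public boolean equals(Object other) {
            return other instanceof Agent && ((Agent) other).id == id;
        }

        @Override
        public int hashCode() {
            return Long.hashCode(id);
        }
    }

    static final class AgentCollection {
        final HashSet<Agent> agents = new HashSet<>();
        final String fileLocation;

        AgentCollection(String fileLocation) {
            this.fileLocation = fileLocation;
        }
    }

    static final class IdCollection {
        final HashSet<Long> ids = new HashSet<>();
    }
}
```

An IdCollection is much smaller than the AgentCollection it refers to, so in this sketch it could stay in memory while the Agents themselves are swapped to disk.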

Data Structure Settings directory levels can store (one hundred thousand million) files.
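One way to spread a very large number of files over nested directories is to derive the path from the numeric ID, as sketched here. The limit of entries per directory and the number of levels are parameters of the sketch, not the settings used in GENESIS, and the names are hypothetical.

```java
class DirectoryLayoutSketch {
    // Builds a relative path for the file of the given ID.
    // Each leaf directory holds up to perDirectory files and each
    // higher directory holds up to perDirectory subdirectories.
    static String pathFor(long id, int perDirectory, int levels) {
        StringBuilder path = new StringBuilder(Long.toString(id)); // file name
        long rest = id / perDirectory;
        for (int i = 0; i < levels; i++) {
            path.insert(0, (rest % perDirectory) + "/");
            rest /= perDirectory;
        }
        return path.toString();
    }

    // Number of files the layout can hold: perDirectory^(levels + 1).
    static long capacity(int perDirectory, int levels) {
        long files = perDirectory;
        for (int i = 0; i < levels; i++) {
            files = Math.multiplyExact(files, (long) perDirectory);
        }
        return files;
    }
}
```

For example, with 100 entries per directory and 3 directory levels, ID 1234567 maps to 1/23/45/1234567 and the layout holds 100,000,000 files.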

Considerations and Further Work
I am happy to help anyone wanting to get my code running on their laptops while I am here
 – There is no setup documentation at the moment and not everything that is needed is in the Subversion repository, but it is all online and open source
Meetings with colleagues at ASGC and ASCSR...

More to do...
There is no end to the task of detailing a digital version of what has happened and what might happen, let alone considering post-casting what might have happened
 – Social simulation is fun and geographical, so please get involved if you are interested and we can work together on this

Acknowledgements
GENESIS is funded by the UK ESRC
 – RES
Thanks for help and support from:
 – The international e-Research community
 – The University of Leeds School of Geography Centre for Computational Geography

Thank you