
1 DIMACS European demographic and movement data for modelling Steve Leach, Phil Sansom, Iain Barrass, Ian Hall Microbial Risk Assessment Centre for Emergency Preparedness and Response Health Protection Agency, UK October 2008

2 Spatially Structured Models? Not always required/useful? Depends on the question(s) Applications – Better understanding spatial aspects of the dynamics of some diseases Planning realistic interventions – spatial targeting of control strategies Enhanced data – Numerical (demographic, resident populations, working populations, population movements) Geographic data – analysis and visualisation purposes (boundaries). Scale & Extent – ? Depends on the question, the scenario & the infectious agent’s epidemiology

3 Example National Extent - Smallpox Speed/Extent of National Spread? How might we best control an outbreak at National level? Dependencies? Numbers initially infected; Transmissibility; Efficacy of case finding & contact tracing; Safety (Deaths) of vaccine. Mass Vaccination, Middle Ground, Targeted Controls? Requires a spatially explicit model → a more local model is a reasonable choice. Extent = UK?

4 Motivation – Historical Outbreak Data: Smallpox Kerrod E, Geddes AM, Regan M, Leach S (2005) Surveillance and control measures during smallpox outbreaks. Emerg Infect Dis. 11:291-7. Liverpool 1902/3 Edinburgh 1942

5 Hall IM, Egan JR, Barrass I, Gani R, Leach S (2007) Comparison of smallpox outbreak control strategies using a spatial metapopulation model. Epidemiol Infect. 135:1133-44. Modelling Impacts of Different Intervention Strategies - Some Spatially Localised Modelling - Metapopulation Models Smallpox - Modelling and Tracking Potential Epidemics Over Space and Time Beowulf Cluster Within Patch dynamics Between Patch dynamics
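
The within-patch and between-patch dynamics named on this slide can be illustrated with a minimal deterministic sketch. This is not the model of Hall et al. (2007): the SIR structure, the `coupling` matrix, the Euler time step and all parameter values are assumptions chosen purely for illustration.

```python
def step(S, I, R, beta, gamma, coupling, dt):
    """Advance a deterministic SIR metapopulation by one Euler step.

    S, I, R  : lists of compartment sizes, one entry per patch
    coupling : coupling[i][j] is the weight with which infectives in
               patch j contribute to the force of infection in patch i
               (assumed here: each row sums to 1)
    """
    n = len(S)
    N = [S[i] + I[i] + R[i] for i in range(n)]
    new_S, new_I, new_R = [], [], []
    for i in range(n):
        # Within-patch term (j == i) plus between-patch terms (j != i)
        prevalence = sum(coupling[i][j] * I[j] / N[j] for j in range(n))
        infections = beta * S[i] * prevalence * dt
        recoveries = gamma * I[i] * dt
        new_S.append(S[i] - infections)
        new_I.append(I[i] + infections - recoveries)
        new_R.append(R[i] + recoveries)
    return new_S, new_I, new_R


def simulate(S, I, R, beta, gamma, coupling, dt, steps):
    """Repeat the Euler step a fixed number of times."""
    for _ in range(steps):
        S, I, R = step(S, I, R, beta, gamma, coupling, dt)
    return S, I, R


# Two hypothetical patches; the outbreak is seeded in the first one only.
coupling = [[0.95, 0.05],
            [0.05, 0.95]]
S, I, R = simulate([9990.0, 10000.0], [10.0, 0.0], [0.0, 0.0],
                   beta=0.5, gamma=0.2, coupling=coupling, dt=0.1, steps=300)
```

The off-diagonal entries of `coupling` are where movement data such as commuting flows would enter; setting them to zero decouples the patches entirely.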

6 Spatial spread Spread of disease away from seed in London

7 Smallpox outbreak Infected districts three weeks after a covert release of smallpox in London

8 Some Results [Table, garbled in extraction: columns Eo, Itrigg, Qdelay* against Vaccine Fatality Rate 0, 1, 2, 5, 10; cell values as extracted: 0----- 02377 5001114 100003999 1003555 50011212 250001199 100127,88 50011124 10000000112 10001112 5000111] With low numbers of index cases, district mass vaccination holds as an optimal strategy with increasing values of Itrigg. Baseline interventions are only optimal with low numbers of index cases and when interventions are implemented immediately; this is unlikely unless the release is overt. Nationwide mass vaccination is only optimal for large numbers of index cases and conservative assumptions regarding the vaccine fatality rate.

9 Data Requirements Good National Data - UK: Reliable; Coherent; Base demographics; Regular population movements

10 Census + Surveys Describe England, Wales, Scotland according to 2001 Administration: Regions; Counties and UAs: 142 (67 in 1991); Districts: 434 (459 in 1991); Wards: 10420

11 Contemporary population movement data - Commuting

12 Data Requirements Still To Be Integrated Systematically: “More Random” population movements → National Surveys: Tourism, Shopping, Business, Pleasure, etc.

13 Population Movements – Extent Individual European Countries Origin–Destination data linking homes and workplaces have been identified for: Denmark France United Kingdom Other data sets may exist for other countries.
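
Origin–destination data of this kind can be summarised into resident-worker and workplace totals by summing over destinations and origins respectively. The following is a minimal sketch; the district names and commuter counts are invented, and the `(home, workplace) -> count` layout is an assumption rather than the format of any of the national data sets listed.

```python
from collections import defaultdict


def worker_totals(od_flows):
    """Summarise home-to-workplace flows.

    od_flows maps (home_district, work_district) -> number of commuters.
    Returns two dicts: workers by district of residence, and workers by
    district of workplace.
    """
    resident_workers = defaultdict(int)
    workplace_population = defaultdict(int)
    for (home, work), count in od_flows.items():
        resident_workers[home] += count
        workplace_population[work] += count
    return dict(resident_workers), dict(workplace_population)


# Hypothetical flows between two districts, for illustration only
od_flows = {
    ("A", "A"): 800,
    ("A", "B"): 200,
    ("B", "A"): 50,
    ("B", "B"): 950,
}
residents, workplaces = worker_totals(od_flows)
```

The difference between the two totals for a district indicates whether it is a net importer or exporter of workers during the day.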

14 International/European Models But when planning for the course of a disease in one country or region, it may be necessary to consider the impacts both from and to the surrounding area. However, it may not be necessary to model the entire region in the same level of detail as the area of primary interest. Nevertheless some international data at some spatial resolution is going to be required

15 Example – Pandemic Influenza International Spread → European Perspective, National Importation Scenarios (Surveillance), Geographic Spread, Local epidemics vs. National Epidemic. International Air Transport Data; International Passenger Survey; District based UK model

16 Experience of Setting up a Data Warehouse Data from many sources have been collected and integrated for modelling through the projects MODELREL, INFTRANS & FLUMODCONT, at sub-country, country, European and international extents. Some of these data are free and can be easily shared. Some data are relatively expensive.

17 Challenges – Spatially Structured Models One basic data requirement is the resident population of each geographic region to be modelled. The geographic regions for which population counts are most often recorded are geo-political regions (e.g. administrative regions). These regions pose a variety of challenges: - Several different, non-nested geographies - Irregular shapes - Uneven distribution of population

18 European Data Eurostat – The NUTS System Eurostat established the NUTS system in order to provide a single uniform breakdown of territorial units for the production of regional statistics for the European Union. NUTS is a hierarchical system which seeks to equalise the average population of regions within a country at any given level.
Level    Minimum     Maximum
NUTS 1   3 million   7 million
NUTS 2   800,000     3 million
NUTS 3   150,000     800,000
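
The NUTS population bands can be written down as a small lookup, which makes it easy to check whether the average population of a set of regions falls within the band for a given level. This is a sketch only: the function name and the example populations are hypothetical, and the inclusive treatment of the band edges is an assumption.

```python
# Bands for the average regional population at each NUTS level
NUTS_BANDS = {
    1: (3_000_000, 7_000_000),
    2: (800_000, 3_000_000),
    3: (150_000, 800_000),
}


def average_within_band(level, populations):
    """Return True if the mean regional population lies in the band for `level`."""
    low, high = NUTS_BANDS[level]
    mean = sum(populations) / len(populations)
    return low <= mean <= high


# Hypothetical regional populations, invented for illustration
example_regions = [420_000, 610_000, 275_000]
fits_nuts3 = average_within_band(3, example_regions)
fits_nuts2 = average_within_band(2, example_regions)
```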

19 European Level Census Data Detailed demographic data are available from Eurostat at NUTS 3 level, dividing the EEA into nearly 1,500 regions. Key statistics include: Population by sex, 5 year age group, marital & cohabitational status; Active population by sex, 5 year age group and status; Households by sex, 5 year age group, status, type & size. Data are mostly complete for the EU but some gaps remain. Data are free to use.

20 More Detailed European Census Data Eurostat holds LAU 2 or commune level 2001 Census data, normally only available to the Commission. This divides the EEA into over 110,000 regions. Key statistics include: Population by sex & 10-15 year age groups; Active population by age group; Residential housing by status. Data are mostly complete for the EU, but some gaps remain, the data are not always coherent with boundary data, and they are not necessarily freely available.

21 European Census Data – Administrative Boundaries Digital administrative boundaries compatible with NUTS 3 data are available free from Eurostat at a scale of 1:3,000,000. Digital administrative boundaries compatible with LAU 2 data are available from EuroGeographics at a scale of 1:100,000, but at a substantial cost.

22 European Census Data - CORINE JRC has disaggregated the Eurostat commune level census data over the 100m grid of the CORINE land use dataset. Population counts are exact at commune level. Analysis of sub-commune data for the UK reveals inaccuracies. Aggregated to 1km or similar, accuracy should exceed comparable datasets. Data are complete for EU27. Data are free to use.
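
Aggregating a fine population grid to a coarser one, as suggested above for 100m to 1km, amounts to summing the counts in each block of cells. The sketch below assumes the grid is held as a list of rows whose dimensions divide exactly by the aggregation factor; it does not read the JRC data or handle projections, and the small example grid is invented.

```python
def aggregate_grid(grid, factor):
    """Sum population counts over non-overlapping factor x factor blocks.

    grid   : list of rows, each a list of cell counts
    factor : number of fine cells per coarse cell along each axis
             (10 would take a 100 m grid to a 1 km grid)
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = [[0.0] * (cols // factor) for _ in range(rows // factor)]
    for r in range(rows):
        for c in range(cols):
            coarse[r // factor][c // factor] += grid[r][c]
    return coarse


# Small hypothetical 4 x 4 grid aggregated by a factor of 2
fine = [
    [1, 2, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 6, 0],
    [0, 7, 0, 0],
]
coarse = aggregate_grid(fine, 2)
```

Because each coarse cell is a plain sum, the total population is preserved while cell-level allocation errors within a block cancel out.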

23 International “Census” Data – LandScan The LandScan dataset comprises a worldwide population database compiled on a 30″ x 30″ (arc-second) latitude/longitude grid. Census counts (at sub-national level) were apportioned to each grid cell based on likelihood coefficients, which are derived from proximity to roads, slope, land cover, nighttime lights, and other information. LandScan has been developed as part of the ORNL Global Population Project for estimating ambient populations at risk. However, the data carry a cost for non-academic groups.
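
The apportioning step described above can be sketched as a proportional split of a census count across grid cells according to their likelihood coefficients. This is an illustration of the general idea only, not the ORNL algorithm; the function name and the coefficient values are hypothetical.

```python
def apportion(census_count, likelihoods):
    """Split a census count across cells in proportion to their coefficients.

    likelihoods : one non-negative coefficient per grid cell in the
                  census unit (higher means more likely to be populated)
    """
    total = sum(likelihoods)
    if total <= 0:
        raise ValueError("at least one cell needs a positive coefficient")
    return [census_count * w / total for w in likelihoods]


# Hypothetical census unit of 1,000 people covering four cells
cells = apportion(1000, [1, 3, 0, 6])
```

A cell with a coefficient of zero receives no population, and the cell totals always sum back to the census count for the unit.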

24 International “Census” Data – GRUMP The GPW and GRUMP data sets provide worldwide population databases compiled on a 30″ x 30″ (arc-second) latitude/longitude grid, including: Population; Population Density; Settlement Points; Urban Extents. The data are less heavily modelled than LandScan and therefore have lower effective resolution. Data are free to use.

25 International Census Data – other sources World populations at national level may be obtained for free from a number of sources including: CIA World Factbook; United Nations. There is no other single source of detailed contemporary socio-demographic data or sub-national populations and compatible geo-data. It is possible to obtain sub-national population data for the super-states of China and India, and for other larger countries such as the US.

26 National Population Movements other than Commuting Travel surveys conducted in individual countries may yield distributions of distances, frequencies and timings for trips, disaggregated by age, sex, purpose and method of transport.

27 European Population Movements Eurostat collects air passenger origin–destination data for Europe including: Domestic and international flights; Data by month, quarter or year; Data by country, region or airport; Some charter flights. Data are free to use.

28 International Population Movements The United Nations World Tourism Organization’s Yearbook of Tourism Statistics contains origin–destination data on an annual basis for around 240 countries, including: Arrivals at national borders; Arrivals at accommodation establishments; Overnight stays in accommodation establishments; All modes of transport. Negligible cost.

29 International Population Movements The International Civil Aviation Organization’s On-Flight Origin–Destination database includes: Data by city pair; Data by quarter or year; Only scheduled flights. Some cost.

30 International Population Movements The International Air Transport Association’s On-Flight Origin–Destination database includes: Data by city, country or region; Monthly data; 70% of scheduled passengers. Data may only represent ~50% of passenger traffic. Data costs are likely to be high.
Comparison across Datasets (country pairs)
Dataset   Trips   Connections
ICAO      350M    1,100
IATA      400M    1,100
WTO       850M    10,500

31 International Models – Problems & Solutions? Data are currently poor or not coherent for: Some types of regular and more random cross-border population movements in Europe; Regular and more random international population movements worldwide; Locally worse(?) for places elsewhere in the world. Ignore the Gaps!? Approximations? Gravity-type models; Diffusion; Combinations. Possible factors: Population density; Distance; Socio-economic factors; Length of shared borders. Explaining data? Predicting/Estimating Future?
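
A gravity-type approximation of the kind listed above estimates the flow between two places from their populations and the distance between them. The sketch below uses one common textbook form, flow = k * P_i^alpha * P_j^beta / d^gamma; the exponents, the constant `k` and the use of great-circle distance are assumptions for illustration, and in practice they would be fitted to whatever movement data are available.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def gravity_flow(pop_i, pop_j, distance_km,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Gravity-type flow estimate; all parameter defaults are placeholders."""
    if distance_km <= 0:
        raise ValueError("distance must be positive")
    return k * pop_i ** alpha * pop_j ** beta / distance_km ** gamma


# Hypothetical pair of regions 10 km apart
flow = gravity_flow(100, 200, 10.0)
```

Other factors named on the slide, such as socio-economic indicators or the length of shared borders, could enter as additional multiplicative terms, each with its own fitted exponent.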

