Network Dynamics and Simulation Science Laboratory A Data-driven Epidemiological Model Stephen Eubank, Christopher Barrett, Madhav V. Marathe GIACS Conference.

Presentation transcript:

Network Dynamics and Simulation Science Laboratory A Data-driven Epidemiological Model Stephen Eubank, Christopher Barrett, Madhav V. Marathe GIACS Conference on "Data in Complex Systems" Palermo, Italy, April

Data-driven epidemiological models
I. Complex system
II. Data-driven, individual-based simulation
III. Privacy and accuracy issues

What’s so complex about epidemiology?
Consider an “outbreak” among 4 people.
[Figure legend: susceptible, infectious, removed]

Outbreaks can be represented as Markov processes
A given configuration of the system probabilistically transitions into any of several other configurations. Even a small system has many possible configurations.
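
To make "many possible configurations" concrete, the sketch below enumerates every configuration of the 4-person example, assuming each person is in exactly one of the three states shown on the previous slide (susceptible, infectious, removed). The names `STATES` and `all_configurations` are illustrative, not from the talk.

```python
from itertools import product

# Assumed state labels: susceptible, infectious, removed.
STATES = ("S", "I", "R")

def all_configurations(n_people):
    """Enumerate every assignment of one health state to each person."""
    return list(product(STATES, repeat=n_people))

configs = all_configurations(4)
n_configs = len(configs)  # 3**4 configurations for 4 people

# Upper bound on the number of transition probabilities: one per ordered
# pair of configurations. Many pairs are impossible (e.g. R back to S),
# so this is only a bound.
n_possible_edges = n_configs ** 2
```

The count grows as 3^N, which is why estimating a probability for every edge of the chain quickly becomes impractical.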

Very little data is available to estimate this process
– Historically, we (partially) observe 1 or 2 Markov chains.
– We want to estimate transition probabilities on every edge.

Aggregation simplifies the model … at the cost of reduced information content.
$p(C'_{t+1} \mid C'_t)$ is less informative than $p(C_{t+1} \mid C_t)$ when $C'$ aggregates $C$, for example into the counts (#S, #I).
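
A small sketch of what aggregation discards, assuming the aggregate is simply the number of people in each state. It counts how many individual-level configurations of a 4-person system collapse onto each aggregate; `aggregate` and `multiplicity` are illustrative names.

```python
from collections import Counter
from itertools import product

STATES = ("S", "I", "R")  # assumed state labels

def aggregate(config):
    """Collapse an individual-level configuration to counts (#S, #I, #R)."""
    counts = Counter(config)
    return (counts["S"], counts["I"], counts["R"])

full = list(product(STATES, repeat=4))
aggregated = {aggregate(c) for c in full}

# How many individual-level configurations share each aggregate.
multiplicity = Counter(aggregate(c) for c in full)
```

Every configuration that shares an aggregate becomes indistinguishable, so who is infectious, and therefore who is exposed, is no longer represented.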

Other assumptions further simplify the model … but are unwarranted in social systems, where components are
1. Heterogeneous (distinguishable)
2. Intentional (behavior not determined by physical laws)

Aggregation naturally makes contact with observations
Observations of outbreaks often ignore heterogeneity and intention, and provide only point estimates.

“An approximate answer to the right problem is worth a good deal more than an exact answer to an approximate problem” - J. Tukey
“All models are wrong, but some are useful” - G.E.P. Box
A system is complex “if its behavior crucially depends on the details of its parts.” - G. Parisi

Interaction approach simplifies process itself
Interactions among system components completely determine transition probabilities among configurations.
[Figure: one diagram “replaced with” another; images not recoverable]

Calibrating with unexpectedly rich data
– For aerosol-borne pathogens, the probability of transmission is related to physical proximity, duration, etc.
– The interaction approach reduces to estimating a social network.
– There is much more data available for this than for outbreaks.
– But it is not directly observable.
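
The slide says only that transmission probability is "related to" proximity and duration. As one hedged illustration, the sketch below assumes a constant hazard per hour of contact and independent contacts. Both the functional form and the names `contact_transmission_prob`, `infection_prob` and `rate_per_hour` are assumptions for illustration, not the model used in the talk.

```python
import math

def contact_transmission_prob(duration_hours, rate_per_hour):
    """Assumed form: constant transmission hazard while two people are in contact."""
    return 1.0 - math.exp(-rate_per_hour * duration_hours)

def infection_prob(contact_durations, rate_per_hour):
    """Probability that a susceptible person is infected by at least one
    infectious contact, assuming the contacts act independently."""
    p_escape = 1.0
    for duration in contact_durations:
        p_escape *= 1.0 - contact_transmission_prob(duration, rate_per_hour)
    return 1.0 - p_escape

# Example: two infectious contacts lasting 1 hour and 2 hours.
p = infection_prob([1.0, 2.0], rate_per_hour=0.1)
```

Under these assumptions the only person-specific inputs are who was in contact with whom and for how long, which is exactly the social network the slide says must be estimated.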

How can we estimate a social network?

A possible approach we didn’t use
– Consider a subset of random networks subject to certain constraints.
– Constraints should be relevant to the global dynamics, i.e. epidemics.
– But what are those? A “chicken or the egg” problem:
“It would seem offhand that a taxonomy of ‘nets’ … would arise naturally from the consideration of the statistical parameters... But the statistical parameters themselves are singled out on the basis of taxonomic considerations, which have yet to be clarified.”
- Anatol Rapoport and William Horvath, Behav Sci. 1961, 6, 279–291

Questions to drive model development
1. What is the optimal targeted allocation of antivirals used prophylactically or therapeutically to mitigate an influenza pandemic?
2. What combination of targeted antivirals and feasible, community-based, non-pharmaceutical interventions (e.g. closing schools, allowing liberal leave from work) can best delay an outbreak from becoming epidemic for several months?
Questions 1 and 2 imply that models must compare changes in the social network with changes in transmissibility.
This is an example of policy informatics for complex systems.

Interventions specified naturally by effect on network
No single “knob” reduces overall transmission by 50%.

Step 1. Create a synthetic population
Census data
– Individual demographics: age and gender
– Household characteristics: size and income

Successive refinement of synthetic data
– Start from a proto-population, e.g. a list of ids.
– Add observed data.
– Capture correlations in data using statistical models (iterative proportional fitting from Public Use Microdata).
ID          Gender   Household
1           M        …
… x 10^8    F        1.2 x 10^8
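
Iterative proportional fitting can be sketched in a few lines. The version below is a minimal two-dimensional illustration, not the laboratory's implementation: it rescales a seed cross-tabulation (standing in for a microdata sample) until its row and column sums match target margins (standing in for census totals). The seed values, the targets, and the names `ipf`, `seed`, `row_targets` and `col_targets` are invented for the example.

```python
def ipf(seed, row_targets, col_targets, iterations=100, tol=1e-9):
    """Rescale `seed` so its margins match the targets, alternating row and column scaling."""
    table = [row[:] for row in seed]
    for _ in range(iterations):
        # Scale each row to its target total.
        for i, target in enumerate(row_targets):
            s = sum(table[i])
            if s > 0:
                table[i] = [v * target / s for v in table[i]]
        # Scale each column to its target total.
        for j, target in enumerate(col_targets):
            s = sum(row[j] for row in table)
            if s > 0:
                for row in table:
                    row[j] *= target / s
        # Columns now match; stop once the rows also match.
        err = max(abs(sum(table[i]) - t) for i, t in enumerate(row_targets))
        if err < tol:
            break
    return table

# Hypothetical 2x2 example: rows and columns are two demographic categories.
seed = [[1.0, 2.0], [3.0, 4.0]]   # joint counts in a small sample
row_targets = [40.0, 60.0]        # known totals for the row variable
col_targets = [30.0, 70.0]        # known totals for the column variable
fitted = ipf(seed, row_targets, col_targets)
```

Because each step only multiplies whole rows or columns, the cross-product ratios of the seed are preserved; this is the sense in which the fitted table keeps the correlations seen in the sample while matching the published totals.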

Step 2. Assign activities, locations & times
Locations
– Dun & Bradstreet data
Activity surveys
– Matched to households by demographics
– Matched to locations by activity type & travel time

Successive refinement of synthetic data
– Surveys are very different kinds of data sources than census.
– This step depends on data fusion capability.
– Some values may be outcomes of very large games, not statistical models.
ID         Gender  Household   Activities    Activity Locations  Activity Times
1          M       1           School, shop  …                   …:00, 3:…
… x 10^8   F       1.2 x 10^8  Work, social  …                   …:00, 7:30

So far: a typical family’s day
[Figure: timeline of one family’s day. Activities: Home, Work, Lunch, Work, Shopping, Daycare, School. Travel modes: Carpool, Bus, Car. Horizontal axis: time.]

Overlapping families’ days create a social network
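
One way to picture how overlapping schedules become a network: treat two people as in contact whenever they occupy the same location at the same time, and weight the edge by the shared duration. The sketch below assumes that rule; the `visits` records and the function `contact_network` are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical visit records: (person_id, location_id, start_hour, end_hour)
visits = [
    (1, "school", 8.0, 15.0),
    (2, "school", 9.0, 12.0),
    (3, "work", 9.0, 17.0),
    (1, "shop", 15.5, 16.5),
    (3, "shop", 16.0, 18.0),
]

def contact_network(visits):
    """Return {frozenset({p, q}): total hours p and q spent co-located}."""
    by_location = defaultdict(list)
    for person, location, start, end in visits:
        by_location[location].append((person, start, end))

    contacts = defaultdict(float)
    for occupants in by_location.values():
        for (p, s1, e1), (q, s2, e2) in combinations(occupants, 2):
            overlap = min(e1, e2) - max(s1, s2)
            if p != q and overlap > 0:
                contacts[frozenset((p, q))] += overlap
    return dict(contacts)

network = contact_network(visits)
```

Here persons 1 and 2 share three hours at school, persons 1 and 3 share half an hour at the shop, and persons 2 and 3 never meet. A real population would need room-level mixing rules inside large locations, which this sketch ignores.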

Successive refinement of synthetic data
– Gives us a generative model for contacts.
– More powerful than traditional encapsulated agents.
– Note: each byte of data per person adds ~300 MB to the database.
ID         Gender  Household   Activities    Activity Locations  Activity Times  Contacts            Contact Duration
1          M       1           School, shop  …                   …:00, 3:00      2, 3, 4; 836, 289   5:20, 0:…
… x 10^8   F       1.2 x 10^8  Work, social  …                   …:00, 7:30      …                   …

Using data for purposes other than intended
Possibly the only epidemiological model that has been calibrated using automobile traffic counts! (Because the same activity model generates both transportation demand and contact networks.)

Activities adapt to situation & generate network changes
[Figure label: Home]

Derive disease interaction from social network
Interactions only need to get a few things right:
– Susceptibility
– Infectivity as a function of time since exposure

Modeling pandemic influenza
– Nobody knows what pandemic flu will look like.
– Assume something like seasonal flu, but with less immunity.
– Create several “flu” bugs in silico:
  – Moderate (10% attack rate)
  – Strong (…% attack rate)
  – Catastrophic (> 50% attack rate)
– For each, fix other characteristics:
  – Incubation period: 2-3 days
  – Infectious period: 2-5 days
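
The fixed characteristics above can be turned into a per-person disease course. The sketch below assumes the incubation and infectious periods are drawn uniformly from the stated ranges and that infectivity is a simple on/off step. The slide specifies only the ranges, so both choices, and the names `sample_course` and `infectivity`, are assumptions.

```python
import random

def sample_course(rng):
    """Draw one person's incubation and infectious periods, in days.
    Uniform sampling within the slide's ranges is an assumption."""
    incubation = rng.uniform(2.0, 3.0)
    infectious = rng.uniform(2.0, 5.0)
    return incubation, infectious

def infectivity(days_since_exposure, incubation, infectious):
    """Assumed step profile: fully infectious after incubation, until recovery."""
    if incubation <= days_since_exposure < incubation + infectious:
        return 1.0
    return 0.0

rng = random.Random(0)  # seeded for reproducibility
incubation, infectious = sample_course(rng)
```

Scaling the step height, or replacing the step with a curve, is where the "moderate", "strong" and "catastrophic" strains would differ in a fuller model.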

Resolution, fidelity, and accuracy are different
– Resolution describes the level of aggregation, e.g. individuals vs populations.
– Fidelity describes the completeness of the representation’s features, e.g. age vs (age, gender, income, household size, education).
– Accuracy describes the correctness of features and correlations, e.g. is mixing by age derived from the social network correct?
– “Validity” (always for a particular question) depends on all 3.

Effect of changes in social networks (above) on disease dynamics (below)

Characterizing the resulting network

Degree Distribution, location-location

Degree Distribution, people-people

Sensitivity to parameters

Assortative Mixing
Static people-people projection is assortative
– by degree (~0.25)
– but not as strongly by age, income, household size, …
This is like other social networks, and unlike
– technological networks
– Erdős-Rényi random graphs
– Barabási-Albert networks
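
One common definition of assortativity by degree (Newman's coefficient) is the Pearson correlation between the degrees found at the two ends of an edge. Whether the ~0.25 figure was computed exactly this way is not stated in the slides. The sketch below implements that definition for a simple undirected edge list; `degree_assortativity` is an illustrative name.

```python
from collections import defaultdict

def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all edges,
    counting each undirected edge in both directions."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var > 0 else float("nan")

star = [(0, 1), (0, 2), (0, 3)]   # hub linked to leaves: disassortative
path = [(0, 1), (1, 2), (2, 3)]
r_star = degree_assortativity(star)
r_path = degree_assortativity(path)
```

Positive values mean high-degree people tend to be connected to other high-degree people, which is the pattern the slide reports for the people-people projection.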

Removing high degree people useless

Removing high degree locations better

Summary
– Complex systems models are hungry for detail (= data).
– Privacy & extrapolation require “synthetic” data, combining observations (declarative), statistical models, and simulation results (procedural).
– Validity of synthetic data depends on resolution, fidelity, accuracy, and the question it is intended to answer.

When is this model simpler?
Notation: x and y are states of a component at time t and t+1
1. Components’ states are updated independently: # parameters → …
2. Interactions are pairwise independent: # parameters → …
3. Most components do not interact directly: # parameters → …
4. Only one state transition, S → I, is affected by interactions: # parameters → …

Architecture

Computational Resources
Demonstration experiment
– 8 experiments (exp ids: 1083 to 1090)
– 24 cells with 200 days and 25 reps
Computations performed
– 291 million contacts * 200 days * 25 reps * 24 cells = … quadrillion transmission evaluations
Time Requirements
– Single processor: 2 years 340 days
– Small cluster (10 nodes, 4 cores): 26 days 18 hours
– Current IDAC cluster: > 3 hours
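
The timing figures can be checked with simple arithmetic. The sketch below assumes a 365-day year, 4 cores on each of the 10 nodes, and perfect linear speedup. It also multiplies out the four factors listed for a single experiment; the slide's own total is missing from the transcript and may include factors not listed here, so no comparison with it is attempted.

```python
# Factors as listed on the slide.
contacts = 291_000_000
days, reps, cells = 200, 25, 24
listed_product = contacts * days * reps * cells

# "2 years 340 days" on a single processor, assuming 365-day years.
single_cpu_days = 2 * 365 + 340

# Small cluster: 10 nodes with 4 cores each (assumed per node), ideal speedup.
cores = 10 * 4
cluster_days = single_cpu_days / cores
whole_days = int(cluster_days)
remaining_hours = (cluster_days - whole_days) * 24
```

Under these assumptions the small-cluster estimate comes out at 26 days and 18 hours, matching the slide, which suggests the quoted figure is the single-processor time divided by 40 cores.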

Example Located Synthetic Population

Example Route Plans
[Figure labels: first person in household; second person in household]

Time Slice of a Typical Family’s Day

Original research contributions are solicited covering a variety of topics including but not limited to:
– data gathering in complex system research
– data mining of large sets of data: methods and tools
– dealing with personal data in complex system research
– data sharing of proprietary data for research purposes
– legal aspects of the use of proprietary data for research purposes
– characteristics of data used in complex system research in the fields of biology, biomedicine, consumer preferences, finance, social networks, sociology, traffic problems and telecommunication

How much does detail matter?
Interaction picture:
– Dynamics of outbreak depend on topology
– How and how much?
– What differences in network topology are relevant to prevention/mitigation?
What statistics capture the difference?
Answer staring us in the face (see above):
– Overall attack rate is a function of the topology of the network
Other measures for other questions:
– Attack rate by transmissibility as function of edges retained
– Vulnerability of a subset as function of edges retained
– Distribution of vulnerabilities as function of edges retained

Edge deletion in a graph
– RTI synthesized poultry farm network
– In collaboration with UPenn, studying outbreaks
– National network, essentially complete graph
  – Distribution of weights
– Attack rate as function of edges retained
– Attack rate by transmissibility as function of edges retained
– Vulnerability of a subset as function of edges retained
– Distribution of vulnerabilities as function of edges retained
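
"Attack rate as function of edges retained" can be illustrated with a toy experiment. The sketch below is not the laboratory's simulator: it swaps in a simple bond-percolation outbreak, where each retained edge transmits independently with probability equal to the transmissibility, deletes edges uniformly at random, and ignores edge weights. All names (`retain_edges`, `attack_rate`, `mean_attack_rate`) are illustrative.

```python
import random

def retain_edges(edges, fraction, rng):
    """Keep a uniformly random subset containing `fraction` of the edges."""
    return rng.sample(edges, round(fraction * len(edges)))

def attack_rate(n_nodes, edges, transmissibility, seed_node, rng):
    """Fraction of nodes reached from `seed_node` when each edge
    transmits independently with probability `transmissibility`."""
    neighbors = {v: [] for v in range(n_nodes)}
    for u, v in edges:
        if rng.random() < transmissibility:
            neighbors[u].append(v)
            neighbors[v].append(u)
    infected = {seed_node}
    frontier = [seed_node]
    while frontier:
        u = frontier.pop()
        for v in neighbors[u]:
            if v not in infected:
                infected.add(v)
                frontier.append(v)
    return len(infected) / n_nodes

def mean_attack_rate(n_nodes, edges, fraction, transmissibility, reps, rng):
    """Average attack rate over `reps` random deletions and random seeds."""
    total = 0.0
    for _ in range(reps):
        kept = retain_edges(edges, fraction, rng)
        seed_node = rng.randrange(n_nodes)
        total += attack_rate(n_nodes, kept, transmissibility, seed_node, rng)
    return total / reps

# Toy stand-in for an "essentially complete" network of 30 farms.
n = 30
complete = [(u, v) for u in range(n) for v in range(u + 1, n)]
rng = random.Random(42)
curve = {f: mean_attack_rate(n, complete, f, 0.1, 50, rng) for f in (1.0, 0.5, 0.1)}
```

Plotting `curve` for several transmissibility values would give the "attack rate by transmissibility as function of edges retained" family of curves named on the slide.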

Model comparison
Compare outcomes of same scenarios
– Compare distributions of outcomes of similar scenarios
– Compare distributions of summary statistics of outcomes of similar scenarios
– Compare distributions of answers to questions about similar scenarios

Adds up to serious informatics challenge
– Managing the refinement process
– Integrating various data sources & simulations
– Curating the database
– Providing HPC services
– Providing analysis support