Characterizing Infectious Disease Outbreaks: Traditional and Novel Approaches Laura F White 15 October 2013
2009 Influenza A H1N1 Pandemic The H1N1 pandemic was first noticed in February in Mexico. A large outbreak occurred early on in La Gloria, a small village outside of Mexico City. It was studied extensively in the first report on H1N1 (Fraser, Donnelly et al. “Pandemic potential of a strain of influenza A (H1N1): early findings”, Science Express, 11 May 2009).
Edgar Hernandez (four years old): first confirmed case
Cases reported in La Gloria
Quantitative Issues How do we determine how fast the disease is spreading? – Reproductive number, serial interval How do we determine how severe the disease is? – Attack rate, case fatality ratio – A topic for another talk! How do we determine what interventions will be most effective? – Mathematical modeling, network models, etc. – Estimates of severity and transmission by age group
Importance of parameter estimates Good information leads to good policy. School closure is expensive – Important to determine if it will really help. If $R_0 < 2$, some estimate that influenza can be controlled. Information on $R_0$ and the serial interval can give a good picture of how a disease might spread.
Source: Fraser et al (2004)
Impact of the serial interval
Some of the challenges in infectious diseases Dependency in the data. – Chain of infection. Undetected cases. – Asymptomatic, but still infectious. – Unable to detect with existing surveillance. Need to act fast with little information.
Simple approach Assume exponential growth for the first part of an epidemic. $t_d$ is the doubling time of the epidemic and $D$ is the average serial interval; together they are used to solve for $R_0$. This approach is overly simplistic and sensitive.
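The formula from this slide is not preserved in the transcript. As an illustration only, one commonly used relationship under exponential growth is $R_0 = 1 + D \ln 2 / t_d$, which follows from a growth rate $r = \ln 2 / t_d$ and the approximation $R_0 \approx 1 + rD$. The sketch below assumes that form; the function name and the input values are hypothetical.

```python
import math

def r0_from_doubling_time(doubling_time, serial_interval):
    """Assumed approximation: R_0 = 1 + r * D, with growth rate r = ln(2) / t_d."""
    growth_rate = math.log(2) / doubling_time
    return 1.0 + growth_rate * serial_interval

# Hypothetical inputs: a mean serial interval of 3 days and two candidate doubling times.
estimate_slow = r0_from_doubling_time(doubling_time=3.0, serial_interval=3.0)
estimate_fast = r0_from_doubling_time(doubling_time=2.0, serial_interval=3.0)
print(round(estimate_slow, 3), round(estimate_fast, 3))
```

Changing the doubling time by a single day moves the estimate noticeably, which is one way to see why the approach is sensitive.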
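Mathematical models SIR Model: Susceptible → Infected → Recovered. Susceptible individuals move to Infected at rate (Contact Rate)*(Transmission Probability)*Infected; infected individuals move to Recovered at rate 1/(duration of infectiousness). $R_0$ = (attack rate)(contact rate)(duration of infectiousness)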
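A minimal sketch of the SIR model as a simple forward-Euler simulation, working with population proportions. The parameter values are hypothetical, and the step size and function name are choices made for this illustration rather than anything from the talk.

```python
def simulate_sir(beta, gamma, s_init, i_init, r_init, days, dt=0.1):
    """Forward-Euler integration of the SIR equations on proportions S, I, R."""
    s, i, r = s_init, i_init, r_init
    trajectory = [(0.0, s, i, r)]
    for step in range(1, int(round(days / dt)) + 1):
        new_infections = beta * s * i * dt   # flow Susceptible -> Infected
        recoveries = gamma * i * dt          # flow Infected -> Recovered
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory

# Hypothetical parameter values, for illustration only.
contact_rate = 10.0              # contacts per person per day
transmission_probability = 0.05  # probability of transmission per contact
duration_infectious = 3.0        # days

beta = contact_rate * transmission_probability
gamma = 1.0 / duration_infectious
basic_reproductive_number = beta / gamma  # transmission prob. * contact rate * duration

trajectory = simulate_sir(beta, gamma, s_init=0.999, i_init=0.001, r_init=0.0, days=200)
final_size = trajectory[-1][3]  # proportion ever infected by the end of the run
print(round(basic_reproductive_number, 2), round(final_size, 2))
```

Note that the output is a single deterministic trajectory, with no error bounds.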
Mathematical Models: Uses Modeling vaccination programs Determining optimal intervention strategies to halt or control an epidemic HIV transmission routes Estimating parameters of disease
Mathematical Models: Limitations Make a lot of assumptions. – Must plug in a lot of values in order to get estimates. Do not allow for randomness in processes; they always give a number as the answer, with no error bounds. – Stochastic epidemic model. Can oversimplify the problem. – Challenge to achieve balance between making the model too simple and too complex.
References Hethcote, Herbert W. The Mathematics of Infectious Diseases. SIAM Review, Vol. 42, No. 4, Dec. Anderson and May. Infectious Diseases of Humans: Dynamics and Control. Oxford University Press, 1992.
Wallinga & Teunis Network based method to estimate the reproductive number each day of an epidemic. Requires knowledge of the serial interval. Requires that all cases have been observed and the epidemic is over. Originated to analyze SARS. American Journal of Epidemiology, 2004
[Figure: cases arranged by day of symptom onset, Day 1 to Day 6; each marker = infected person]
[Figure: the same timeline, highlighting all possible infectors of a given case]
[Figure: the same timeline with case i, a possible infector j, and candidate links labelled $p_1$, $p_2$, $p_3$] $p_t$ = probability of being infected by a case that appeared t days prior.
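Wallinga & Teunis If g(t) is the distribution of the serial interval, then the relative probability that case i has been infected by case j is $p_{ij} = g(t_i - t_j) / \sum_{k \neq i} g(t_i - t_k)$, where $t_i$ is the symptom onset time of case i. The effective reproductive number for cases on day j is then $R_j = \sum_i p_{ij}$.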
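A minimal sketch of this estimator: for each case, the serial-interval weights of all other cases are normalized to sum to one, and each potential infector accumulates its share. The onset days and the serial interval distribution below are hypothetical, and the function name is an assumption of this illustration.

```python
def wallinga_teunis(onset_days, serial_interval_pmf):
    """Return one reproductive-number estimate per case.

    onset_days: day of symptom onset for each case.
    serial_interval_pmf: dict mapping a lag in days to g(lag); missing lags count as 0.
    """
    n = len(onset_days)
    r_case = [0.0] * n
    for i in range(n):
        # Weight of case j as the infector of case i: g(t_i - t_j).
        weights = [
            serial_interval_pmf.get(onset_days[i] - onset_days[j], 0.0) if j != i else 0.0
            for j in range(n)
        ]
        total = sum(weights)
        if total == 0.0:
            continue  # no plausible infector in the data (e.g. the index case)
        for j in range(n):
            r_case[j] += weights[j] / total  # relative probability that j infected i
    return r_case

# Hypothetical data: seven cases and a three-day serial interval distribution.
onset_days = [1, 2, 2, 3, 4, 4, 5]
serial_interval_pmf = {1: 0.5, 2: 0.3, 3: 0.2}
r_case = wallinga_teunis(onset_days, serial_interval_pmf)

# Average the case-level estimates by onset day to get a daily estimate.
r_by_day = {}
for day in sorted(set(onset_days)):
    values = [r for d, r in zip(onset_days, r_case) if d == day]
    r_by_day[day] = sum(values) / len(values)
print(r_by_day)
```

The estimate for the most recent cases is zero here because nobody infected by them has been observed yet, which illustrates why the method requires that the epidemic be over.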
WT - SARS
White & Pagano Statistical method, using probability models to estimate the serial interval and reproductive number (Statistics in Medicine, 2008). Assume that we observe daily counts of new cases, $N = (N_0, N_1, \ldots, N_T)$. Let $X_{ij}$ be the number of cases with symptoms on day j that were infected by a case with symptoms on day i.
White & Pagano
Method Using this scheme, we make some probabilistic assumptions and get a likelihood equation in which $p_j$ describes the serial interval (i.e. the probability of having symptoms j days after the infector). Use numerical methods to get MLEs of $R_0$ and p.
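The likelihood equation itself is not preserved in this transcript. The sketch below assumes a Poisson form in which the expected number of cases on day t is $\mu_t = R_0 \sum_j p_j N_{t-j}$, and it simplifies the problem by holding the serial interval probabilities fixed, in which case the maximizer of $R_0$ has a closed form. The case counts, the serial interval values, and the function names are hypothetical.

```python
import math

def expected_cases(counts, r0, serial_pmf):
    """mu_t = R_0 * sum_j p_j * N_{t-j} for t = 1..T, where serial_pmf[0] is p_1."""
    mus = []
    for t in range(1, len(counts)):
        pressure = sum(
            p * counts[t - j]
            for j, p in enumerate(serial_pmf, start=1)
            if t - j >= 0
        )
        mus.append(r0 * pressure)
    return mus

def log_likelihood(counts, r0, serial_pmf):
    """Poisson log-likelihood of the counts on days 1..T (assumed form)."""
    total = 0.0
    for n_t, mu in zip(counts[1:], expected_cases(counts, r0, serial_pmf)):
        if mu > 0.0:
            total += n_t * math.log(mu) - mu - math.lgamma(n_t + 1)
        elif n_t > 0:
            return float("-inf")  # cases observed where none are possible
    return total

def mle_r0(counts, serial_pmf):
    """Closed-form maximizer of the Poisson likelihood when the p_j are held fixed."""
    return sum(counts[1:]) / sum(expected_cases(counts, 1.0, serial_pmf))

# Hypothetical daily counts and serial interval probabilities p_1, p_2, p_3.
counts = [1, 2, 3, 5, 8, 12, 18]
serial_pmf = [0.2, 0.5, 0.3]
r0_hat = mle_r0(counts, serial_pmf)
print(round(r0_hat, 3))
```

Estimating the serial interval jointly with $R_0$, as the method does, would require a numerical optimizer over both sets of parameters rather than this closed form.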
H1N1 Example In April the public became aware of a novel strain of influenza that was affecting Mexico. Fraser, Donnelly et al published the initial report in Science on 11 May 2009. They estimated the reproductive number to be between 1.4 and 1.6 and the average serial interval to be 1.91 days.
H1N1 Example We obtained data from the CDC with information on each confirmed and suspected case (1,368 cases) as of May. Only some of these cases had a date of symptom onset.
Influenza A/H1N1: Serial Interval Spanish work estimated the average serial interval to be 3.5 days, range = 1-6 days. – Used contact tracing data. Seasonal influenza (Cowling et al, 2009) – 3.6 days, SD = 1.6 – From a household contact study
Influenza A/H1N1: R 0 estimates Mexico: (Cruz-Pacheco et al) Mexico: less than (Boelle et al) Japan: 2.3 (Nishiura et al) Netherlands: less than 1 (Hahne et al) US: (White et al)
Influenza A/H1N1: USA
Missing dates of symptom onset – All cases have report date but many lack date of symptom onset. – Calculate the distribution of time between reported date and symptom onset for those with both. – Impute a date of symptom onset for those with missing information from the observed distribution.
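The imputation steps above can be sketched as follows, assuming a single imputation that resamples from the empirical delay distribution. The case records, field names, and function name are hypothetical.

```python
import random
from datetime import date, timedelta

def impute_onset_dates(cases, rng):
    """Fill missing onset dates by sampling an observed report-minus-onset delay."""
    observed_delays = [
        (c["report"] - c["onset"]).days for c in cases if c["onset"] is not None
    ]
    completed = []
    for c in cases:
        onset = c["onset"]
        if onset is None:
            # Draw a delay from cases with both dates and count back from the report date.
            onset = c["report"] - timedelta(days=rng.choice(observed_delays))
        completed.append({"report": c["report"], "onset": onset})
    return completed

# Hypothetical line list: every case has a report date, some lack an onset date.
cases = [
    {"report": date(2009, 5, 1), "onset": date(2009, 4, 28)},
    {"report": date(2009, 5, 2), "onset": date(2009, 4, 27)},
    {"report": date(2009, 5, 3), "onset": None},
    {"report": date(2009, 5, 4), "onset": date(2009, 5, 3)},
    {"report": date(2009, 5, 5), "onset": None},
]
completed = impute_onset_dates(cases, random.Random(0))
for row in completed:
    print(row["report"], row["onset"])
```

Repeating the draw several times and re-estimating each time would be one way to carry the imputation uncertainty through to the final estimates.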
Reporting delay distribution
Other issues in the data Imported cases – Make an adjustment in the estimation method to account for those who were known to have traveled to Mexico. Reporting delay – The decline in cases as it gets closer to May 8 is likely due to reporting delays, rather than a true drop off in case numbers. – Augment the data at the end, using the reporting delay distribution.
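The talk does not say exactly how the data were augmented. One simple possibility, shown here purely as an assumption, is to divide each recent count by the probability that a case with that onset day would already have been reported. The counts, the delay distribution, and the function name are hypothetical.

```python
def augment_recent_counts(counts_by_onset, delay_pmf):
    """Scale up counts near the end of the series for cases not yet reported.

    counts_by_onset[t]: cases with onset on day t reported so far (last index = last day).
    delay_pmf[d]: probability that a case is reported d days after symptom onset.
    """
    last_day = len(counts_by_onset) - 1
    adjusted = []
    for t, observed in enumerate(counts_by_onset):
        # Probability that a case with onset on day t has been reported by the last day.
        prob_reported = sum(delay_pmf[: last_day - t + 1])
        adjusted.append(observed / prob_reported if prob_reported > 0 else float("nan"))
    return adjusted

# Hypothetical inputs: the observed counts drop off over the last few days.
counts_by_onset = [10, 12, 15, 9, 2]
delay_pmf = [0.1, 0.4, 0.3, 0.2]
adjusted = augment_recent_counts(counts_by_onset, delay_pmf)
print([round(x, 2) for x in adjusted])
```

Counts from early days are left essentially unchanged, while the most recent days are scaled up the most.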
Estimates in the USA Using the White & Pagano method with the modifications mentioned, we get estimates for $R_0$ and the serial interval in the initial outbreak in the US.
Serial interval estimate Using data up to and including April 27, 2009. Using data up to and including April 25, 2009.
Heterogeneity Variation in transmission between adults and kids, geographically, etc. Can lead to better policy decisions – Who gets vaccinated first? – Social distancing measures that might be most effective?
Overview Social mixing matrices Glass method Modification of Wallinga and Teunis Modification of White and Pagano
Social mixing To understand who is most culpable for transmission, we typically need to understand how people interact. There are many approaches to this, but we choose the one that is currently most popular: social mixing matrices.
PolyMod study Large European study – Belgium, Finland, Great Britain, Germany, Italy, Luxembourg, the Netherlands, and Poland 97,904 contacts among 7,290 participants Participants record number and nature of contacts in a diary Contact matrices were created to describe all close contacts and separately, close contacts that involve physical touch
Table 1. Mossong J, Hens N, Jit M, Beutels P, et al. (2008) Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases. PLoS Med 5(3): e74. doi: /journal.pmed
Figure 1. The Mean Proportion of Contacts That Involved Physical Contact, by Duration, Frequency, and Location of Contact in All Countries (Mossong et al, 2008)
Figure 2. The Distribution by Location and by Country of (A) All Reported Contacts and (B) Physical Contacts Only (Mossong et al, 2008)
Figure 3. Smoothed Contact Matrices for Each Country Based on (A) All Reported Contacts and (B) Physical Contacts Weighted by Sampling Weights (Mossong et al, 2008)
Other studies Similar studies have been conducted in South Africa and Vietnam. The first of this nature was in the Netherlands (Wallinga et al, 2006). Johnstone-Robertson et al (2011) carried out a very similar study in a South African township.
Approaches Glass et al, 2011 – Estimate R for adults and children – Do not require transmission data Modify Wallinga and Teunis method – Estimate R t (and R 0 ) across age groups. – Require contact information. Moser and White, 2013 (in preparation) – Bayesian approach to the problem – Modify White & Pagano method to incorporate age contact information – Incorporate contact information as a prior distribution
Approach 1: Glass et al Modify Wallinga & Teunis and White & Pagano methods to estimate R for children and adults. Assume a form for a reproduction matrix, $M = \begin{pmatrix} m_{CC} & m_{CA} \\ m_{AC} & m_{AA} \end{pmatrix}$, where $m_{ij}$ describes the number of cases of type i infected by cases of type j. Some pre-specified structure must be imposed on the matrix M in order to estimate the $m_{ij}$.
Matrix constraints Source: Glass et al, 2011
Modification of White & Pagano
The likelihood is maximized over the $m_{ij}$ to obtain estimates. Applying constraints to M creates relationships between the $m_{ij}$, and they become identifiable.
Modification of Wallinga & Teunis
Approach 1: simulation study True $R_C$ = 2.5 and true $R_A$ = 1. L, M and U are the 3rd, median and 98th percentiles over 100 simulations.
Approach 1: Japanese influenza data Wallinga & Teunis Approach
Approach 1: Japanese influenza data White & Pagano Method
APPROACH 2: MODIFICATION OF WALLINGA AND TEUNIS Heterogeneity
Approach 2: modification of Wallinga & Teunis Source: White, Archer and Pagano (submitted, 2013) Similar to Glass et al, allow the probability of infection to be impacted by more than just distance apart in time: the relative probability that one case infected another combines the probability of a serial interval of length j-i with a similarity measure (similar to the matrices used by Glass et al).
Approach 2: modification of Wallinga & Teunis Similar to Glass et al, but we do not assume any structure on a similarity matrix, D=(d ij ). We use available data to define this matrix and are able to obtain estimates of R j for a large number of age groups (or spatial locations, etc.)
Similarity measures Individuals who are “close” together are more likely to infect each other and have larger similarity measures. Can be used to address probability of infection between different geographical regions, age groups, etc.
Similarity measures Use a matrix to define the similarity measure. $x_{ij}$ describes the amount of contact individuals in group i have with those in group j.
Similarity Matrix
             Age group 1  Age group 2  Age group 3
Age group 1  x_11         x_12         x_13
Age group 2  x_21         x_22         x_23
Age group 3  x_31         x_32         x_33
Basic similarity measures Matrix of all 1’s: original estimator – Implies that transmission is equally likely among all individuals Diagonal matrix: transmission only occurs within homogeneous groups (no mixing) – Comparable to applying original method to each homogeneous group separately Can also use matrix that describes contact patterns
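A minimal sketch of how a similarity matrix could enter the Wallinga & Teunis weights, assuming each candidate infector's serial-interval weight is simply multiplied by the similarity between the two cases' groups before normalizing. The exact form used in the modified method is not shown in this transcript, and the data, the indexing convention (infectee group first), and the function name are hypothetical.

```python
def weighted_wallinga_teunis(onset_days, groups, serial_pmf, similarity):
    """Per-case R estimates with infector weights g(t_i - t_j) * d[group_i][group_j]."""
    n = len(onset_days)
    r_case = [0.0] * n
    for i in range(n):
        weights = [0.0] * n
        for j in range(n):
            if j != i:
                lag = onset_days[i] - onset_days[j]
                weights[j] = serial_pmf.get(lag, 0.0) * similarity[groups[i]][groups[j]]
        total = sum(weights)
        if total > 0.0:  # skip cases with no plausible infector
            for j in range(n):
                r_case[j] += weights[j] / total
    return r_case

# Hypothetical data: seven cases in two groups (0 and 1).
onset_days = [1, 2, 2, 3, 4, 4, 5]
groups = [0, 0, 1, 1, 0, 1, 1]
serial_pmf = {1: 0.5, 2: 0.3, 3: 0.2}

all_ones = [[1.0, 1.0], [1.0, 1.0]]  # equal mixing: the original estimator
diagonal = [[1.0, 0.0], [0.0, 1.0]]  # no mixing between groups

r_equal = weighted_wallinga_teunis(onset_days, groups, serial_pmf, all_ones)
r_within = weighted_wallinga_teunis(onset_days, groups, serial_pmf, diagonal)
print([round(r, 2) for r in r_equal])
print([round(r, 2) for r in r_within])
```

With the diagonal matrix, every infection is attributed to an infector in the same group, which matches running the original method on each group separately.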
Example: Pandemic Influenza In South Africa Source: Archer et al (2009) Between 6/15/2009 and 11/23/2009 there were 12,630 confirmed cases
Age Analysis JSM 2012 We restrict our attention to Gauteng Province (the most populous) to limit geographic effects. Use two sources of information on contact patterns between age groups: – PolyMod Study (Mossong et al, 2008) – Study in South African township (Johnstone-Robertson et al, 2011)
PolyMod contact trace matrix Great Britain, all contacts
South African township contact matrix Source: Johnstone-Robertson et al, AJE, 2011
Estimate of R t
Estimate of R t (a) All contacts involving physical touch; (b) all close contacts
Estimates of R 0 by age group (a) All contacts involving physical touch; (b) all close contacts
APPROACH 3: MODIFICATION OF WHITE AND PAGANO Heterogeneity
Moser and White Modification of the White and Pagano method to estimate R 0 and incorporate heterogeneity in the population Revise the likelihood to incorporate heterogeneity in the reproductive numbers Consider the scenario where we look at adults and kids only (2 group scenario) – R A and R C are the reproductive numbers for adults and children, respectively
Moser and White Reparameterize the problem to allow for inclusion of contact matrix information – $q_{hg}$ is the probability that an individual of type h has contact with an individual of type g – Example: $R_{CA} = q_{CA} R_C$ – $R_C = R_{CA} + R_{CC}$
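A small sketch of this reparameterization, splitting each group's reproductive number across the groups it has contact with. The reproductive numbers and contact probabilities below are hypothetical, as is the function name; the identity $R_C = R_{CA} + R_{CC}$ holds when each row of q sums to one.

```python
def split_reproductive_numbers(r_by_group, contact_probs):
    """Return R_hg = q_hg * R_h for each infector type h and contacted type g."""
    return {
        h: {g: q_hg * r_by_group[h] for g, q_hg in row.items()}
        for h, row in contact_probs.items()
    }

# Hypothetical values: C = children, A = adults.
r_by_group = {"C": 2.5, "A": 1.0}
contact_probs = {
    "C": {"C": 0.8, "A": 0.2},  # q_CC, q_CA
    "A": {"C": 0.3, "A": 0.7},  # q_AC, q_AA
}
r_split = split_reproductive_numbers(r_by_group, contact_probs)
print(r_split["C"]["A"], r_split["C"]["C"])
```

In the Bayesian approach described later in the talk, the q's would carry prior distributions informed by contact surveys rather than fixed values like these.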
Derivation of Likelihood Function: Two Group Example (3 Day Serial Interval) $N_{tC}$ and $N_{tA}$ are the numbers of child and adult cases on day t. $X^{C}_{0A2}$ = adults infected on day 2 by a child from day 0.
\[
\begin{aligned}
\text{Day 0:}\quad & N_{0C}, \quad N_{0A} \\
\text{Day 1:}\quad & N_{1C} = X^{C}_{0C1} + X^{A}_{0C1} \\
& N_{1A} = X^{C}_{0A1} + X^{A}_{0A1} \\
\text{Day 2:}\quad & N_{2C} = X^{C}_{0C2} + X^{A}_{0C2} + X^{C}_{1C2} + X^{A}_{1C2} \\
& N_{2A} = X^{C}_{0A2} + X^{A}_{0A2} + X^{C}_{1A2} + X^{A}_{1A2} \\
\text{Day 3:}\quad & N_{3C} = X^{C}_{0C3} + X^{A}_{0C3} + X^{C}_{1C3} + X^{A}_{1C3} + X^{C}_{2C3} + X^{A}_{2C3} \\
& N_{3A} = X^{C}_{0A3} + X^{A}_{0A3} + X^{C}_{1A3} + X^{A}_{1A3} + X^{C}_{2A3} + X^{A}_{2A3} \\
\text{Day 4:}\quad & N_{4C} = X^{C}_{1C4} + X^{A}_{1C4} + X^{C}_{2C4} + X^{A}_{2C4} + X^{C}_{3C4} + X^{A}_{3C4} \\
& N_{4A} = X^{C}_{1A4} + X^{A}_{1A4} + X^{C}_{2A4} + X^{A}_{2A4} + X^{C}_{3A4} + X^{A}_{3A4} \\
& \;\;\vdots \\
\text{Day } T\text{:}\quad & N_{TC}, \quad N_{TA}
\end{aligned}
\]
Is Mixing Assortative? In the sums above, terms are labelled $R_C$ or $R_A$ according to the type of the infector. Keeping only the infections caused by children from day 0 leaves $X^{C}_{0C1}$, $X^{C}_{0C2}$, $X^{C}_{0C3}$ (labelled $R_{CC}$) and $X^{C}_{0A1}$, $X^{C}_{0A2}$, $X^{C}_{0A3}$ (labelled $R_{CA}$).
Updated Likelihood The likelihood can be written in terms of $N_{tg}$, the number of cases on day t from group g. How do we maximize this likelihood?
Estimation We could try a frequentist approach, but there are issues with identifiability – We have four parameters to estimate and, similar to Glass et al, would need to impose constraints on the q’s in order to get estimates. Alternative approach: MCMC with prior information – Use contact frequency matrices from survey data to inform the priors of the q’s
Issues Reporting differences across age groups – How might this impact our results? – Example: kids are much more likely to show up at the clinic and have their cases reported. Adults are more likely to stay home. Non-uniformity of contact patterns globally? Other issues?
Final thoughts Quantitative methods are essential to informing policy decisions in a disease outbreak Issues we want to address: – Severity – Transmissibility – Heterogeneity – Uncertainty Challenges with dependency in the data, unobserved events, etc.
Thanks! Funding source: National Institute Of General Medical Sciences of the National Institutes of Health under Award Number U54GM