Data and Statistics: New methods and future challenges Phil O’Neill University of Nottingham.


1 Data and Statistics: New methods and future challenges Phil O’Neill University of Nottingham

2 Professors: How they spend their time

3

4 1. High-resolution genetic data 2. Model assessment

5

6 Gardy 2011 NEJM

7 “High-resolution genetic data”: what are they?
- individual-level data on the pathogen
- can be taken at single or multiple time points
- high-dimensional, e.g. whole genome sequences
- proportion of individuals sampled could be high or low
- becoming far more common due to cost reduction

8 “High-resolution genetic data”: what use are they?
- better inference about transmission paths
- more reliable estimates of epi quantities?
- understand evolution of the pathogen

9 .

10 . A C C C T T G G G A A A.....

11 Modelling and Data Analysis methods
Two kinds of approaches exist:
1. Separate genetic and epidemic components (e.g. Volz, Rasmussen)
2. Combine genetic and epidemic components (e.g. Ypma, Worby, Morelli)

12 1. Separate genetic and epidemic components, e.g.:
- estimate phylogenetic tree
- given the tree, fit epidemic model
or
- cluster individuals into genetically similar groups
- given the groups, fit multi-type epidemic model
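The "cluster, then fit" route can be illustrated with a minimal sketch. It assumes aligned sequences of equal length, uses Hamming distance as the measure of genetic similarity, and links two individuals when their distance is at most a chosen threshold (single linkage). The function names, the threshold, and the toy sequences are all hypothetical; the slide does not specify a clustering method.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_by_threshold(seqs: dict, max_dist: int) -> list:
    """Single-linkage clustering: individuals whose sequences differ in at
    most max_dist positions are placed in the same group (assumed rule)."""
    parent = {name: name for name in seqs}

    def find(n):
        # Follow parent pointers to the group representative.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= max_dist:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy data, invented for illustration only.
seqs = {
    "case1": "ACCCTTGGGAAA",
    "case2": "ACCCTTGGGAAT",
    "case3": "ACGCTTGGGAAT",
    "case4": "TTGCAAGGCAAT",
}
clusters = cluster_by_threshold(seqs, max_dist=1)
```

Each resulting group would then be treated as one "type" when fitting a multi-type epidemic model.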

13 1. Separate genetic and epidemic components
+ “Simple” approach
+ Avoids complex modelling
- Ignores any relationship between transmission and genetic information

14 2. Combine genetic and epidemic components, e.g.:
- model genetic evolution explicitly
- define model featuring both genetic and epidemic parts

15 2. Combine genetic and epidemic components
+ “Integrated” approach
- Is modelling too detailed?
- Initial conditions: typical sequence?
+/- Model differences between individuals instead?

16 1. High-resolution genetic data 2. Model assessment

17 “Model assessment”: what is it?
- Does our model fit the data?
- Is there a better model?

18 “Model assessment”: why do it?
- Poor fit casts doubt on conclusions from modelling
- Model choice can be a tool for directly addressing questions of interest

19 Linear regression: $y_k = a x_k + b + e_k$, $e_k \sim N(0, v)$
Minimise distance of model mean from observed data
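For reference, the linear-regression case can be sketched in a few lines. This assumes "distance" means the sum of squared differences between the observations and the model mean (ordinary least squares); the function name and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y_k = a*x_k + b + e_k.

    Returns the slope a, intercept b, the residuals y_k - (a*x_k + b),
    and the usual unbiased estimate of the error variance v.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    v_hat = sum(r * r for r in residuals) / (n - 2)
    return a, b, residuals, v_hat


# Invented data for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
a, b, residuals, v_hat = fit_line(xs, ys)
```

Here the residuals are the natural tool for checking fit, which is what makes the regression case straightforward compared with outbreak data.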

20

21 For outbreak data:
- What are the right residuals?
- Should observed or unobserved data be compared to the model? (Streftaris and Gibson)
- Mean model may only be available via simulation
- Is the mean the right quantity to consider?

22

23 Simulation-based approaches to model fit:
- Forward simulation: “close” to data?
- Choice of summary statistics?
- Close ties to ABC methods (McKinley, Neal)
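A minimal sketch of the forward-simulation idea, under several assumptions not made on the slide: the model is a Markov SIR epidemic with infection rate βSI and removal rate γI, the summary statistic is the final size, and "close" means within a fixed tolerance of the observed value. All names and parameter values are hypothetical.

```python
import random


def simulate_final_size(beta, gamma, n_susceptible, n_infective, rng):
    """Simulate one SIR outbreak; return the number of initial
    susceptibles who were ever infected."""
    s, i = n_susceptible, n_infective
    while i > 0:
        # The next event is an infection with probability
        # beta*s*i / (beta*s*i + gamma*i) = beta*s / (beta*s + gamma).
        if rng.random() < beta * s / (beta * s + gamma):
            s, i = s - 1, i + 1
        else:
            i -= 1
    return n_susceptible - s


def fraction_close(observed, beta, gamma, n_susceptible, n_infective,
                   tolerance, n_sims=2000, seed=1):
    """Proportion of simulated final sizes within `tolerance` of the
    observed final size."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        size = simulate_final_size(beta, gamma, n_susceptible,
                                   n_infective, rng)
        if abs(size - observed) <= tolerance:
            hits += 1
    return hits / n_sims


# Illustrative values only.
p_close = fraction_close(observed=30, beta=0.04, gamma=1.0,
                         n_susceptible=50, n_infective=1, tolerance=5)
```

A small value of `p_close` suggests the fitted model rarely reproduces the observed summary. The accept-if-close step is the same one used in ABC rejection sampling, where parameters would be drawn from a prior rather than fixed.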

24 Approaches to model choice
- Hypermodels/saturated models
- Bayesian non-parametric methods
- Bayesian methods e.g. RJMCMC
- Mixture models

25 Hypermodels/saturated models, e.g. infection rates βS or βSI or βSI^0.5 in an SIR model? Instead use βSI^α and estimate α (O’Neill and Wen)
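The nesting of the candidate infection rates inside one hypermodel can be made concrete. In this sketch the exponent is called `alpha`, a name chosen here for illustration; setting it to 0, 1 or 0.5 recovers the three candidate rates from the slide.

```python
def infection_rate(beta, s, i, alpha):
    """Overall infection rate beta * S * I**alpha.

    alpha = 0   gives beta * S
    alpha = 1   gives beta * S * I
    alpha = 0.5 gives beta * S * sqrt(I)
    """
    return beta * s * i ** alpha


# Illustrative values: beta = 2, S = 10, I = 4.
rates = {alpha: infection_rate(2.0, 10, 4, alpha) for alpha in (0, 0.5, 1)}
```

Estimating the exponent from data then replaces a discrete choice between models with inference on a single continuous parameter.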

26 Bayesian non-parametric methods, e.g. infection rate β(t)SI or β(t) in an SIR model; estimate β(t) in a Bayesian non-parametric manner using Gaussian process machinery (Kypraios, O’Neill and Xu; Knock and Kypraios)

27

28 Reversible Jump MCMC, e.g. distinct models (usually a small number); estimate Bayes factors by running MCMC on the union of the parameter spaces (O’Neill; Neal and Roberts; Knock and O’Neill)
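Once such a sampler has been run, the Bayes factor follows from how often the chain visits each model. The sketch below assumes the chain output includes a model indicator taking the values 1 and 2 and that the prior model probabilities are known; it shows only this post-processing step, not the reversible-jump moves themselves. The function name is hypothetical.

```python
from collections import Counter


def bayes_factor_from_indicators(model_indicators, prior_prob_1=0.5):
    """Estimate the Bayes factor of model 1 against model 2 as
    (posterior odds) / (prior odds), with posterior odds estimated by
    the ratio of visit counts in the chain."""
    counts = Counter(model_indicators)
    if counts[1] == 0 or counts[2] == 0:
        raise ValueError("chain must visit both models")
    posterior_odds = counts[1] / counts[2]
    prior_odds = prior_prob_1 / (1.0 - prior_prob_1)
    return posterior_odds / prior_odds


# Toy indicator sequence, invented for illustration.
bf = bayes_factor_from_indicators([1, 1, 1, 2])
```

In practice the estimate is unreliable when one model is visited very rarely, which is one motivation for the alternative approaches listed in the talk.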

29 Mixture models, e.g. given two models (g, h), create mixture model f(x) = α g(x) + (1-α) h(x); estimation of α enables estimation of Bayes factors (Kypraios and O’Neill)
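One way to see why the mixture weight carries information about the Bayes factor is the following sketch. It rests on assumptions the slide does not state: the weight has a Uniform(0, 1) prior, and the model parameters have independent priors, so that after integrating them out the likelihood is weight × m_g + (1 − weight) × m_h, where m_g and m_h are the two marginal likelihoods. Under those assumptions the posterior mean of the weight is (2B + 1) / (3(B + 1)) with B = m_g / m_h, which can be inverted. The function names are hypothetical, and this is not claimed to be the estimator used in the cited work.

```python
def posterior_mean_weight(m_g, m_h):
    """Posterior mean of the mixture weight under a Uniform(0, 1) prior,
    given the marginal likelihoods of the two component models."""
    return (2.0 * m_g + m_h) / (3.0 * (m_g + m_h))


def bayes_factor_from_weight(mean_weight):
    """Invert the relation above: B = (3E - 1) / (2 - 3E), where E is the
    posterior mean of the weight. E must lie strictly between 1/3 and 2/3."""
    if not (1.0 / 3.0 < mean_weight < 2.0 / 3.0):
        raise ValueError("posterior mean must lie in (1/3, 2/3)")
    return (3.0 * mean_weight - 1.0) / (2.0 - 3.0 * mean_weight)


# Round trip with invented marginal likelihoods: B should be 3.
e = posterior_mean_weight(3.0, 1.0)
b = bayes_factor_from_weight(e)
```

In an application the posterior mean would be estimated from MCMC output for the mixture model rather than computed from known marginal likelihoods.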

30

