Presentation on theme: "Introduction to Monte Carlo Markov chain (MCMC) methods"— Presentation transcript:
1 Introduction to Monte Carlo Markov chain (MCMC) methods Lecture 3
2 Lecture Contents How does WinBUGS fit a regression? Gibbs Sampling. Convergence and burn-in. How many iterations? Logistic regression example.
3 Linear regression Let us revisit our simple linear regression model. To this model we added the following priors in WinBUGS. Ideally we would sample from the joint posterior distribution.
4 Linear Regression ctd. In this case we can sample from the joint posterior as described in the last lecture. However, this is not the case for all models, and so we will now describe other simulation-based methods that can be used. These methods come from a family called Markov chain Monte Carlo (MCMC) methods, and here we will focus on a method called Gibbs Sampling.
5 MCMC Methods Goal: To sample from the joint posterior distribution. Problem: For complex models this involves multidimensional integration. Solution: It may be possible to sample from the conditional posterior distributions. It can be shown that after convergence such a sampling approach generates dependent samples from the joint posterior distribution.
6 Gibbs Sampling When we can sample directly from the conditional posterior distributions, the algorithm is known as Gibbs Sampling. This proceeds as follows for the linear regression example: Firstly, give all unknown parameters starting values. Next, loop through the following steps:
7 Gibbs Sampling ctd. Sample from each conditional posterior distribution in turn, conditioning on the current values of the other parameters. These steps are then repeated, with the generated values from this loop replacing the starting values. The chain of values produced by this procedure is known as a Markov chain, and it is hoped that this chain converges to its equilibrium distribution, which is the joint posterior distribution.
8 Calculating the conditional distributions In order for the algorithm to work we need to sample from the conditional posterior distributions. If these distributions have standard forms then it is easy to draw random samples from them. Mathematically, we write down the full posterior and treat all parameters as constants apart from the parameter of interest. We then try to match the resulting formula to a standard distribution.
9 Matching distributional forms If a parameter θ follows a Normal(μ, σ²) distribution then we can write its density, up to proportionality, as a function of θ. Similarly, if θ follows a Gamma(α, β) distribution then we can write its density in the same way.
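The formulae for this slide are not reproduced in the transcript. As a hedged reconstruction, the standard kernels that are usually matched against are shown below; the Gamma is written with β as a rate parameter, which is an assumption chosen to agree with the WinBUGS dgamma convention.

```latex
% Normal(\mu, \sigma^2): the exponent is quadratic in \theta
p(\theta) \propto \exp\!\left[-\frac{(\theta-\mu)^2}{2\sigma^2}\right]
          \propto \exp\!\left[-\frac{1}{2\sigma^2}\,\theta^2 + \frac{\mu}{\sigma^2}\,\theta\right]

% Gamma(\alpha, \beta), with \beta a rate parameter
p(\theta) \propto \theta^{\alpha-1}\exp(-\beta\theta), \qquad \theta > 0
```

Read in reverse, a conditional proportional to exp(aθ² + bθ) with a < 0 is Normal with variance −1/(2a) and mean −b/(2a), and a conditional proportional to θ^a exp(−bθ) is Gamma(a + 1, b).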
13 Algorithm Summary Repeat the following three steps: 1. Generate β0 from its Normal conditional distribution. 2. Generate β1 from its Normal conditional distribution. 3. Generate 1/σ² from its Gamma conditional distribution.
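The three steps above can be sketched in Python for the simple linear regression y_i = β0 + β1 x_i + e_i with e_i ~ N(0, σ²). This is a minimal sketch, not WinBUGS's own implementation: the priors (β0, β1 ~ Normal(0, prior_var) and 1/σ² ~ Gamma(eps, eps)), the simulated data, the burn-in of 500, and all function and variable names are assumptions made for illustration.

```python
import random

def gibbs_linear_regression(x, y, n_iter=3000, prior_var=1e6, eps=1e-3, seed=1):
    """Gibbs sampler for y_i = beta0 + beta1*x_i + e_i, e_i ~ N(0, 1/tau).

    Assumed (illustrative) priors: beta0, beta1 ~ N(0, prior_var) and
    tau = 1/sigma^2 ~ Gamma(shape=eps, rate=eps).
    """
    rng = random.Random(seed)
    n = len(x)
    sum_x2 = sum(xi * xi for xi in x)
    beta0, beta1, tau = 0.0, 0.0, 1.0  # arbitrary starting values
    chain = []
    for _ in range(n_iter):
        # Step 1: beta0 | beta1, tau, y is Normal
        prec = n * tau + 1.0 / prior_var
        mean = tau * sum(yi - beta1 * xi for xi, yi in zip(x, y)) / prec
        beta0 = rng.gauss(mean, prec ** -0.5)
        # Step 2: beta1 | beta0, tau, y is Normal
        prec = tau * sum_x2 + 1.0 / prior_var
        mean = tau * sum(xi * (yi - beta0) for xi, yi in zip(x, y)) / prec
        beta1 = rng.gauss(mean, prec ** -0.5)
        # Step 3: tau | beta0, beta1, y is Gamma(shape, rate)
        sse = sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))
        shape = eps + n / 2.0
        rate = eps + sse / 2.0
        tau = rng.gammavariate(shape, 1.0 / rate)  # gammavariate expects a scale
        chain.append((beta0, beta1, tau))
    return chain

# Simulated data with known values beta0 = 1, beta1 = 2, sigma = 0.5.
data_rng = random.Random(0)
x = [i / 10 - 2.5 for i in range(51)]
y = [1.0 + 2.0 * xi + data_rng.gauss(0.0, 0.5) for xi in x]

chain = gibbs_linear_regression(x, y)
kept = chain[500:]  # discard an assumed burn-in of 500 iterations
post_beta0 = sum(s[0] for s in kept) / len(kept)
post_beta1 = sum(s[1] for s in kept) / len(kept)
print(post_beta0, post_beta1)
```

Each pass through the loop uses the most recently generated values of the other parameters, which is what makes the stored draws a Markov chain.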
14 Convergence and burn-in Two questions that immediately spring to mind are: We start from arbitrary starting values, so when can we safely say that our samples are from the correct distribution? After this point, how long should we run the chain for and store values?
15 Checking Convergence This is the researcher's responsibility! Convergence is to a target distribution (the required posterior), not to a single value as in ML methods. Once convergence has been reached, samples should look like a random scatter about a stable mean value.
16 Convergence Convergence occurs here at around 100 iterations.
17 Checking convergence 2 One approach (in WinBUGS) is to run many long chains with widely differing starting values. WinBUGS also has the Brooks-Gelman-Rubin diagnostic, which is based on the ratio of between-chain to within-chain variances (ANOVA). This diagnostic should approach 1.0 at convergence. MLwiN has other diagnostics that we will cover on Wednesday.
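The variance-ratio idea can be illustrated with the classic Gelman-Rubin potential scale reduction factor. This is a sketch of the textbook variance form for a single parameter, not WinBUGS's own code; the function name and the simulated stand-in chains are assumptions.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])
    within = mean([variance(c) for c in chains])        # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])   # B: variance of chain means, scaled by n
    pooled = (n - 1) / n * within + between / n         # pooled variance estimate
    return (pooled / within) ** 0.5

rng = random.Random(42)
# Three chains that all sample the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
# Three chains stuck around different values.
stuck = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 3.0, 6.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed, r_stuck)
```

Values close to 1 are consistent with the chains having reached a common distribution; values well above 1 indicate that the chains still disagree.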
18 Demo of multiple chains in WinBUGS Here we transfer to the computer for a demonstration with the regression example of multiple chains (also mention node info)
19 Demo of multiple chains in WinBUGS The average 80% interval within chains (blue) and the pooled 80% interval between chains (green) should both converge to stable values. The ratio of pooled to average interval width (red) should converge to 1.
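The interval-width version of the diagnostic described on this slide can be sketched as follows. The empirical-quantile rule, the function names and the simulated stand-in chains are assumptions for illustration; WinBUGS additionally plots these quantities over increasing iteration ranges, which is not reproduced here.

```python
import random

def interval_width(draws, coverage=0.8):
    """Width of the central empirical interval with the given coverage."""
    s = sorted(draws)
    lower = s[int((1 - coverage) / 2 * (len(s) - 1))]
    upper = s[int((1 + coverage) / 2 * (len(s) - 1))]
    return upper - lower

def interval_ratio(chains, coverage=0.8):
    """Pooled interval width divided by the average within-chain width."""
    pooled = interval_width([d for c in chains for d in c], coverage)
    average = sum(interval_width(c, coverage) for c in chains) / len(chains)
    return pooled / average

rng = random.Random(5)
agree = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(3)]
disagree = [[rng.gauss(mu, 1.0) for _ in range(1000)] for mu in (0.0, 3.0, 6.0)]

ratio_agree = interval_ratio(agree)
ratio_disagree = interval_ratio(disagree)
print(ratio_agree, ratio_disagree)
```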
20 Convergence in more complex models Convergence in linear regression is (almost) instantaneous. Here is an example of slower convergence.
21 How many iterations after convergence? After convergence, further iterations are needed to obtain samples for posterior inference. More iterations = more accurate posterior estimates. MCMC chains are dependent samples, and so the dependence or autocorrelation in the chain will influence how many iterations we need. Accuracy of the posterior estimates can be assessed by the Monte Carlo standard error (MCSE) for each parameter. Methods for calculating MCSE are given in later lectures.
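To show how autocorrelation affects accuracy, here is a minimal sketch of one common MCSE estimator, the batch means method. It is not necessarily the method used in the later lectures; the function name, the number of batches and the simulated chains (independent draws versus an AR(1) chain with correlation 0.9) are all assumptions.

```python
import random
from statistics import mean, variance

def mcse_batch_means(draws, n_batches=20):
    """Monte Carlo standard error of the chain mean via batch means."""
    size = len(draws) // n_batches
    batch_means = [mean(draws[b * size:(b + 1) * size]) for b in range(n_batches)]
    # Batch means are roughly independent, so their variance / number of
    # batches estimates the variance of the overall mean.
    return (variance(batch_means) / n_batches) ** 0.5

rng = random.Random(7)
n = 20000
independent = [rng.gauss(0.0, 1.0) for _ in range(n)]

# AR(1) chain with the same marginal variance but strong autocorrelation.
rho = 0.9
correlated = [rng.gauss(0.0, 1.0)]
for _ in range(n - 1):
    correlated.append(rho * correlated[-1] + (1 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0))

mcse_independent = mcse_batch_means(independent)
mcse_correlated = mcse_batch_means(correlated)
print(mcse_independent, mcse_correlated)
```

For the same number of iterations the autocorrelated chain gives a noticeably larger MCSE, so it needs a longer run to reach the same accuracy.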
22 Inference using posterior samples from MCMC runs A powerful feature of MCMC and the Bayesian approach is that all inference is based on the joint posterior distribution. We can therefore address a wide range of substantive questions by appropriate summaries of the posterior. We typically report either the mean or the median of the posterior samples for each parameter of interest as a point estimate. The 2.5% and 97.5% percentiles of the posterior sample for each parameter give a 95% posterior credible interval (an interval within which the parameter lies with probability 0.95).
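These summaries are simple to compute from stored draws. The sketch below assumes the draws for one parameter are held in a Python list; the function name, the simple empirical-percentile rule and the simulated stand-in draws are illustrative assumptions.

```python
import random
from statistics import mean, median

def summarize(draws):
    """Point estimates and a 95% credible interval from posterior draws."""
    s = sorted(draws)
    n = len(s)
    lower = s[int(0.025 * (n - 1))]  # empirical 2.5% percentile
    upper = s[int(0.975 * (n - 1))]  # empirical 97.5% percentile
    return {"mean": mean(s), "median": median(s), "interval95": (lower, upper)}

# Stand-in for post-burn-in MCMC output for a single parameter.
rng = random.Random(3)
draws = [rng.gauss(2.0, 0.5) for _ in range(10000)]

summary = summarize(draws)
print(summary)
```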
23 Derived Quantities Once we have a sample from the posterior we can answer lots of questions simply by investigating this sample. Examples: What is the probability that θ > 0? What is the probability that θ1 > θ2? What is a 95% interval for θ1/(θ1 + θ2)? See later for examples of these sorts of derived quantities.
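Each of these questions reduces to evaluating the quantity at every stored iteration and summarising the result. The draws below are simulated stand-ins rather than output from a real model, and the variable names are assumptions.

```python
import random

rng = random.Random(11)
n = 10000
# Stand-ins for stored posterior draws; in practice these come from the MCMC run,
# with theta1[i] and theta2[i] taken from the same iteration i.
theta = [rng.gauss(0.5, 1.0) for _ in range(n)]
theta1 = [rng.gammavariate(3.0, 1.0) for _ in range(n)]
theta2 = [rng.gammavariate(2.0, 1.0) for _ in range(n)]

# P(theta > 0): proportion of draws above zero.
p_positive = sum(t > 0 for t in theta) / n

# P(theta1 > theta2): compare the two parameters iteration by iteration.
p_greater = sum(a > b for a, b in zip(theta1, theta2)) / n

# 95% interval for theta1 / (theta1 + theta2): compute the ratio per iteration,
# then take its empirical 2.5% and 97.5% percentiles.
ratio = sorted(a / (a + b) for a, b in zip(theta1, theta2))
ratio_interval = (ratio[int(0.025 * (n - 1))], ratio[int(0.975 * (n - 1))])

print(p_positive, p_greater, ratio_interval)
```

Pairing the draws by iteration matters: it preserves any posterior correlation between θ1 and θ2.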
24 Logistic regression example In the practical that follows we will look at the following dataset of rat tumours and fit a logistic regression model to it (columns: Dose level, Number of rats, Number with tumors): 1441342
25 Logistic regression model A standard Bayesian logistic regression model for this data can be written as follows. WinBUGS can fit this model, but can we write out the conditional posterior distributions and use Gibbs Sampling?
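The model formula is not reproduced in the transcript. A standard form, given here as an assumption, treats the number of tumours y_i out of n_i rats at dose x_i as binomial with a logit link; the Normal priors and the prior variance m are illustrative and may differ from those on the original slide.

```latex
y_i \sim \mathrm{Binomial}(n_i, p_i), \qquad
\mathrm{logit}(p_i) = \log\frac{p_i}{1-p_i} = \beta_0 + \beta_1 x_i,

\beta_0 \sim \mathrm{Normal}(0, m), \qquad \beta_1 \sim \mathrm{Normal}(0, m)
```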
26 Conditional distribution for β0 This distribution is not a standard distribution, and so we cannot simply simulate from a standard random number generator. However, both WinBUGS and MLwiN can fit this model using MCMC. We will, however, not see how until day 5.
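To see why the form is non-standard, suppose (as an assumption, since the slide's formula is not in the transcript) that y_i tumours out of n_i rats at dose x_i are binomial with logit(p_i) = β0 + β1 x_i, and that β0 has prior density p(β0). Treating β1 as a constant, the conditional posterior for β0 is then:

```latex
p(\beta_0 \mid y, \beta_1) \;\propto\; p(\beta_0)\,
\prod_i \frac{\exp\{y_i(\beta_0 + \beta_1 x_i)\}}
             {\left[1 + \exp(\beta_0 + \beta_1 x_i)\right]^{n_i}}
```

The denominator cannot be rearranged into the Normal or Gamma kernels used for the linear regression, so no standard distribution can be matched.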
27 Hints for the next practical In the next practical you will be creating WinBUGS code for a logistic regression model. In this practical you get less help, so I suggest looking at the Seeds example in the WinBUGS examples. The Seeds example is more complicated than what you require, but it shows the necessary WinBUGS statements.