Course on Bayesian Methods
Basics (continued): Models for proportions and means
Francisco José Vázquez Polo [www.personales.ulpgc.es/fjvpolo.dmc]
Miguel Ángel Negrín Hernández [www.personales.ulpgc.es/mnegrin.dmc]

Markov Chain Monte Carlo Methods
Goals:
- To make inference about model parameters
- To make predictions
Why simulation? Because the posterior distribution is usually not analytically tractable.

Monte Carlo integration and MCMC
Monte Carlo integration:
- Draw independent samples from the required distribution
- Use sample averages to approximate expectations
Markov chain Monte Carlo (MCMC):
- Draws samples by running a Markov chain that is constructed so that its limiting distribution is the joint distribution of interest
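Written out, the Monte Carlo integration idea is to approximate an expectation by a sample average:

$$E_\pi[f(X)] = \int f(x)\,\pi(x)\,dx \;\approx\; \frac{1}{N}\sum_{i=1}^{N} f(x_i), \qquad x_i \stackrel{iid}{\sim} \pi$$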

Markov chains
A Markov chain is a sequence of random variables X_0, X_1, X_2, ...
At each time t ≥ 0, the next state X_{t+1} is sampled from a distribution P(X_{t+1} | X_t) that depends only on the state at time t
- P(· | ·) is called the "transition kernel"
Under certain regularity conditions, the iterates from a Markov chain will gradually converge to draws from a unique stationary (invariant) distribution
- i.e. the chain will forget its initial state
- As t increases, the sampled points X_t will look increasingly like (correlated) samples from the stationary distribution
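"Stationary" means the distribution π is left unchanged by one step of the chain:

$$\pi(x') = \int P(x' \mid x)\,\pi(x)\,dx$$

MCMC methods construct the kernel P so that this π is the posterior distribution of interest.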

Markov chains
Suppose:
- The chain is run for N (a large number of) iterations
- We throw away the output from the first m iterations (burn-in)
- The regularity conditions are met
Then, by the ergodic theorem, we can use averages of the remaining samples to estimate means.
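Concretely, the posterior mean of any function f of the parameters θ is estimated by the ergodic average of the post-burn-in draws:

$$\hat{E}[f(\theta) \mid x] = \frac{1}{N-m}\sum_{t=m+1}^{N} f\big(\theta^{(t)}\big)$$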

Gibbs sampling: one way to construct the transition kernel
Seminal references:
- Geman and Geman (IEEE Trans. Pattn. Anal. Mach. Intel., 1984)
- Gelfand and Smith (JASA, 1990)
- Hastings (Biometrika, 1970)
- Metropolis, Rosenbluth, et al. (J. Chem. Phys., 1953)

Gibbs sampling: one way to construct the transition kernel
Subject to regularity conditions, the joint distribution is uniquely determined by the "full conditional distributions"
- The full conditional distribution for a model quantity is the distribution of that quantity conditional on assumed known values of all the other quantities in the model
This breaks a complicated, high-dimensional problem into a large number of simpler, low-dimensional problems.

Gibbs sampling: one way to construct the transition kernel
Example: inference about a normal mean and variance, both unknown.
Model: x_1, ..., x_n | μ, h iid ~ N(μ, 1/h), where h = 1/σ² is the precision
Priors: μ ~ N(a_0, 1/b_0) and h ~ G(n_0/2, s_0/2)
We want posterior means and posterior credible sets for μ and σ².

 ( , σ 2 |x)  h ((n+n 0 )/2-1) exp{(-1/2)[b 0 (  -a 0 ) 2 +s 0 h+h  i (x i -  )²]} Gibbs sampling: one way to construct the transition kernel Posterior distribution (Bayes’ theorem): This is not a known distribution

Gibbs sampling: one way to construct the transition kernel
We can obtain the conditional distributions:

$$\pi(\mu \mid h, x) = \frac{\pi(\mu, h \mid x)}{\pi(h \mid x)} = \frac{\pi(\mu, h \mid x)}{\int \pi(\mu, h \mid x)\, d\mu}$$

$$\pi(h \mid \mu, x) = \frac{\pi(\mu, h \mid x)}{\pi(\mu \mid x)} = \frac{\pi(\mu, h \mid x)}{\int \pi(\mu, h \mid x)\, dh}$$

Gibbs sampling: one way to construct the transition kernel
Both full conditionals are standard distributions:

$$\pi(\mu \mid h, x) \;\propto\; \exp\Big\{-\frac{b_0 + nh}{2}\Big(\mu - \frac{a_0 b_0 + nh\bar{x}}{b_0 + nh}\Big)^{2}\Big\} \;\;\Rightarrow\;\; \mu \mid h, x \sim N\Big(\frac{a_0 b_0 + nh\bar{x}}{b_0 + nh},\; \frac{1}{b_0 + nh}\Big)$$

$$\pi(h \mid \mu, x) \;\propto\; h^{(n_0+n)/2 - 1} \exp\Big\{-\frac{s_0 + \sum_i (x_i - \mu)^2}{2}\cdot h\Big\} \;\;\Rightarrow\;\; h \mid \mu, x \sim G\Big(\frac{n_0+n}{2},\; \frac{s_0 + \sum_i (x_i - \mu)^2}{2}\Big)$$

Gibbs sampling: one way to construct the transition kernel
Gibbs sampler algorithm for the normal model:
1. Choose initial values μ^(0), (σ²)^(0)
2. At each iteration t, generate a new value for each parameter, conditional on the most recent values of all the other parameters

Gibbs sampling: one way to construct the transition kernel
Step 0. Initial values: θ^(0) = (μ_0, h_0)
Step 1. To generate the next simulation θ^(1) = (μ_1, h_1):
- Simulate μ_1 from π(μ | h = h_0, x) (a normal distribution)
- Simulate h_1 from π(h | μ = μ_1, x) (a gamma distribution)
- This updates (μ_0, h_0) to (μ_1, h_1)
...
Step k. Update θ^(k) = (μ_k, h_k) from θ^(k-1).
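WinBUGS (introduced below) builds these conditional updates automatically from a model specification. As a minimal sketch, assuming the data vector x, its length n and the hyperparameters a0, b0, n0 and s0 are supplied as data, this model could be written in the BUGS language as:

model {
   for (i in 1:n) {
      x[i] ~ dnorm(mu, h)       # BUGS parameterises the normal by mean and precision
   }
   mu ~ dnorm(a0, b0)           # normal prior on the mean, with precision b0
   sh <- n0 / 2                 # shape of the G(n0/2, s0/2) prior on the precision
   ra <- s0 / 2                 # rate of that prior
   h ~ dgamma(sh, ra)
   sigma2 <- 1 / h              # the variance, as a node WinBUGS can monitor
}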

Other MCMC techniques
- Metropolis-Hastings: Metropolis et al. (1953), Hastings (1970), Tierney (1994), Chib and Greenberg (1995), Robert and Casella (1999)
- Data augmentation: Tanner and Wong (1987)

Software: WinBUGS
What is WinBUGS? "Bayesian inference Using Gibbs Sampling"
- A general-purpose program for fitting Bayesian models
- Developed by David Spiegelhalter, Andrew Thomas, Nicky Best, and Wally Gilks at the Medical Research Council Biostatistics Unit, Institute of Public Health, Cambridge, UK
- Excellent documentation, including two volumes of examples
- Web page: (see "Where to get the WinBUGS software" below)

What does WinBUGS do?
- Enables the user to specify a model in a simple language
- Constructs the transition kernel of a Markov chain with the joint posterior as its stationary distribution, and simulates a sample path of the resulting chain
- Generates random variables from standard densities using standard algorithms
- Uses the Metropolis algorithm to generate from nonstandard full conditionals

Topics: WinBUGS
Deciding how many chains to run
Choosing initial values
- Do not confuse initial values with priors!
- Priors are part of the model. Initial values are part of the computing strategy used to fit the model
- Priors must not be based on the current data
- The best choices of initial values are values in a high-posterior-density region of the parameter space. If the prior is not very strong, maximum likelihood estimates (from the current data) are excellent choices of initial values, if they can be calculated.

Initial values
Initial values are not like priors!
Choosing initial values:
- Run more than one chain, with initial values selected to give you information about sampler performance
- Initial values may be generated from the priors
- Initial values may be based on frequentist estimates

Initial values
- Initial values must be specified for variance components
- WinBUGS can usually generate initial values for the other parameters automatically
- But it is often advantageous to specify even those that WinBUGS could generate

Topics: WinBUGS
In the simple models we have encountered so far, the MCMC sampler will converge quickly even with a poor choice of initial values.
- In more complicated models, choosing initial values in low-posterior-density regions may make the sampler take a huge number of iterations to finally start drawing from a good approximation to the true posterior.
Assessing whether the sampler has converged:
- How many initial iterations need to be discarded in order that the remaining samples are drawn from a distribution close enough to the true stationary distribution to be usable for estimation and inference?

Convergence assessment
How many initial iterations need to be discarded in order that the remaining samples are drawn from a distribution close enough to the true stationary distribution to be usable for estimation and inference? The discarded initial portion is the "burn-in":

θ^(0), θ^(1), ..., θ^(m) | θ^(m+1), ..., θ^(N)

where θ^(0), ..., θ^(m) is the burn-in sample.

Topics: WinBUGS
- Once we are drawing from the right distribution, how many samples are needed in order to provide the desired precision in estimation and inference? A guide is the Monte Carlo error of a posterior mean, which for approximately independent draws is

$$\text{MC error} \approx \sqrt{\frac{\sigma^2}{N-m}}$$

- Choosing model parameterizations and MCMC algorithms that will lead to convergence in a reasonable amount of time.

Monte Carlo errors
- Reported in the "stats" output
- Similar to the standard error of the mean, but adjusted for the autocorrelation in the sample
- It will get smaller as more iterations are run
- Use it to guide your decisions about how many iterations you need to run after burn-in is done

Decide whether to:
- Discard some initial burn-in iterations and use the remaining sampler output for inference, or
- Take some corrective action:
  - Run more iterations
  - Change the parameterization of the model

Using MCMC for Bayesian inference
1. Specify a Bayesian model
2. Construct a Markov chain whose target distribution is the joint posterior distribution of interest
3. Run one or more chain(s) for as long as you can stand it
4. Assess convergence
  - Burn-in
  - Monte Carlo error
5. Inference

Using MCMC for Bayesian inference
History plots: early graphical methods of convergence assessment
- Trajectories of the sampler output for each model unknown
- Can quickly reveal failure to reach stationarity
- Can give qualitative information about sampler behaviour
- Cannot confirm that convergence has occurred

Monitor:
- All types of model parameters, not only the parameters of substantive interest
- Sample paths, graphically
- Autocorrelations
- Cross-correlations between parameters
Apply more than one diagnostic, including one or more that uses information about the specific model.

Conclusions
- MCMC methods have enabled the fitting of complex, realistic models
- Use of MCMC methods requires careful attention to:
  - Model parameterization
  - MCMC sampler algorithms
  - Choice of initial values
  - Convergence assessment
  - Output analysis
- Ongoing research in theoretical verification of convergence, MCMC acceleration, and exact sampling holds great promise

WinBUGS

Where to get the WinBUGS software
From the web site of the MRC Biostatistics Unit in Cambridge (the BUGS project home page).
Once you have downloaded the files, you need to register with the BUGS project for a key that will let you use the full version.
Manuals:
- The WinBUGS manual, available online
- WinBUGS examples, volumes 1 and 2, with explanations, available online

Specifying the model
The first stage in model fitting is to specify your model in the BUGS language. This can be done either graphically (with the code then written from your graph) or directly in the BUGS language.
Graphically: DOODLE

Specifying the model
Starting WinBUGS
- Click on the WinBUGS icon or program file. You will get a message about the license conditions, which you can read and then close. Now explore the menus:
- HELP: manuals and examples
- FILE: allows you to open existing files, or to start a new file to program your own example in the BUGS language
- DOODLE: NEW is what you need to use to start a file for a new model specified in graphical format

Specifying the model
Example – a simple proportion
Inference is required for the proportion (π) of people who are willing to pay a certain amount to visit a park.
X = (0,0,0,0,1,0,0,0,1,1,1,0,1,1,1) (1: yes, 0: no)
Sample size = 15
x_1, x_2, ..., x_n iid ~ Bin(1, π), i.e. Bernoulli(π)

Example – a simple proportion
Starting WinBUGS: DOODLE – NEW
Press 'ok'

Example – a simple proportion
Basic instructions:
- Click the left mouse button to create a "doodle" (node)
- Press CTRL + Del to delete a doodle
- To create a "plate", press CLICK + CTRL
- Press CTRL + Del to delete a plate

Example – a simple proportion
Nodes:
- Constants
- Stochastic nodes
- Deterministic nodes
The relationships between nodes (arrows):
- Stochastic dependence
- Logical function
To create an arrow, highlight the "destination" node and press CTRL + CLICK on the "origin" node. Repeating the process deletes the arrow.

Example – a simple proportion
We have to create a node for each variable or constant included in the model:
- An oval for stochastic nodes (we choose the density and its parameters)
- A rectangle for constants
- A plate for the nodes x_i

Example – a simple proportion
Now we add the arrows that represent the relations between the nodes.
Copy and paste the doodle into a new file.

Example – a simple proportion
Data
Now you will have to enter the data. The following list format will do (notice that WinBUGS is case sensitive):

list(n = 15, alpha = 1, beta = 1, x = c(0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1))

Initial values (optional; WinBUGS can generate them):

list(phi = 0.1)

Example – a simple proportion
Specifying the model
The first stage in model fitting is to specify your model in the BUGS language (doodle or code). You then go to the Model menu, and you will get a Specification Tool with buttons. Click on the "check model" button. Any error messages will appear in the grey bar at the bottom of the screen. If all is well, you will get the message "model is syntactically correct". You will also see that the "compile" and "load data" buttons have their text darkened. You are ready for the next step.

Example – a simple proportion
Load data
Data can be entered either in list format or in rectangular format (columns). Once you have typed in your data, highlight the word "list" (or the column headers, for rectangular format) and use the "load data" button on the Specification Tool to read the data.
Compile
You are now ready to compile your model, using the "compile" button on the Specification Tool. Again, check the error messages. If all is well, you will get a "model compiled" message.
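For illustration, the same observations could be supplied in rectangular format: one column per array, headed by the array name and terminated by END (scalar constants such as n, alpha and beta would still be loaded from a list):

x[]
0
0
0
0
1
0
0
0
1
1
1
0
1
1
1
END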

Example – a simple proportion
Initial values
All nodes that are not given as data, or derived from other nodes, need to be given initial values. This is done with the Specification Tool, either by setting them explicitly from values you type in ("load inits" button) or by generating a random value from the prior ("gen inits" button). WinBUGS 1.4 allows you to run more than one chain at the same time; see the Specification Tool above. If you want more chains, you will need to set different initial values for each one.

Example – a simple proportion
Generating the samples – updating
You are now ready to generate samples and to examine the simulated output. To start the sampler, go to Model and then Update, and you will get an Updating Tool. You can select how many updates you get for each press of the "update" button, and how often the screen is refreshed to show how sampling is proceeding. Updating does the sampling but does not store any values. In MCMC methods you usually want to run the sampler for some time, to be sure it is stable, before you start storing values.

Example – a simple proportion
Storing values and summarising results
After an initial run of values, go to the Inference menu and then Samples, and you will get a Sample Monitor Tool. You start by entering the parameters you want to monitor in the "node" box, and for each one press "set". If you also press "trace", you will see a plot of the samples as they are generated. Now go back to the Updating Tool and generate some samples.

Example – a simple proportion
Now go back to the Sample Monitor Tool to look at the various ways of displaying your results or summary statistics. The most useful buttons are:
- history: shows a plot of all the samples you have generated
- density: gives a kernel density estimate of the posterior
- stats: gives summary statistics, including the mean, s.d., median and percentiles, which can be set with the panel on the right; these can be used for credible intervals. You will also get a Monte Carlo error for the mean, indicating how well the mean of the posterior has been estimated from your number of samples.
- autoC: a plot of the autocorrelation in the chains

Example – a simple proportion
Results:
- Click on "stats": mean = (real mean = ), median = , 95% Bayesian interval = (0.246, ), MC error =
- Click on "density": the kernel density estimate of the posterior
- Click on "history": all the chain

Example – a simple proportion
Exercise: repeat the analysis with new simulations.

Example – a simple proportion
Code: Doodle – Write Code
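Doodle – Write Code turns the doodle into BUGS code. A minimal sketch of what that generated code should look like for this model, assuming the stochastic node is named phi (matching the initial-values list above) and n, alpha and beta come from the data list:

model {
   for (i in 1:n) {
      x[i] ~ dbern(phi)        # Bernoulli likelihood: 1 = willing to pay, 0 = not
   }
   phi ~ dbeta(alpha, beta)    # Beta prior; alpha = beta = 1 gives a uniform prior on (0,1)
}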