1 Getting started with WinBUGS Mei LU Graduate Research Assistant Dept. of Epidemiology, MD Anderson Cancer Center Some material was taken from James and.

Slides:



Advertisements
Similar presentations
Introduction to Monte Carlo Markov chain (MCMC) methods
Advertisements

Other MCMC features in MLwiN and the MLwiN->WinBUGS interface
“Students” t-test.
Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review By Mary Kathryn Cowles and Bradley P. Carlin Presented by Yuting Qi 12/01/2006.
Bayesian inference “Very much lies in the posterior distribution” Bayesian definition of sufficiency: A statistic T (x 1, …, x n ) is sufficient for 
Statistical Techniques I EXST7005 Lets go Power and Types of Errors.
1 Graphical Diagnostic Tools for Evaluating Latent Class Models: An Application to Depression in the ECA Study Elizabeth S. Garrett Department of Biostatistics.
Bayesian statistics – MCMC techniques
Computing the Posterior Probability The posterior probability distribution contains the complete information concerning the parameters, but need often.
Chapter 8 Estimating Single Population Parameters
Introduction to Hypothesis Testing
Inference.ppt - © Aki Taanila1 Sampling Probability sample Non probability sample Statistical inference Sampling error.
Using ranking and DCE data to value health states on the QALY scale using conventional and Bayesian methods Theresa Cain.
Copyright © 2014, 2013, 2010 and 2007 Pearson Education, Inc. Chapter Hypothesis Tests Regarding a Parameter 10.
Inferences About Process Quality
Department of Geography, Florida State University
Inference for regression - Simple linear regression
HAWKES LEARNING SYSTEMS math courseware specialists Copyright © 2010 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Chapter 14 Analysis.
Sampling Theory Determining the distribution of Sample statistics.
Introduction to WinBUGS Olivier Gimenez. A brief history  1989: project began with a Unix version called BUGS  1998: first Windows version, WinBUGS.
8.1 Inference for a Single Proportion
Introduction to MCMC and BUGS. Computational problems More parameters -> even more parameter combinations Exact computation and grid approximation become.
Bayesian Inference, Basics Professor Wei Zhu 1. Bayes Theorem Bayesian statistics named after Thomas Bayes ( ) -- an English statistician, philosopher.
Topics: Statistics & Experimental Design The Human Visual System Color Science Light Sources: Radiometry/Photometry Geometric Optics Tone-transfer Function.
A Beginner’s Guide to Bayesian Modelling Peter England, PhD EMB GIRO 2002.
WinBUGS Demo Saghir A. Bashir Amgen Ltd, Cambridge, U.K. 4 th January 2001.
10.2 Tests of Significance Use confidence intervals when the goal is to estimate the population parameter If the goal is to.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 6 The Standard Deviation as a Ruler and the Normal Model.
A Comparison of Two MCMC Algorithms for Hierarchical Mixture Models Russell Almond Florida State University College of Education Educational Psychology.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
Lecture 16 Section 8.1 Objectives: Testing Statistical Hypotheses − Stating hypotheses statements − Type I and II errors − Conducting a hypothesis test.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 10 Comparing Two Populations or Groups 10.1.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparing Two Proportions.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 10 Comparing Two Populations or Groups 10.1.
6.1 Inference for a Single Proportion  Statistical confidence  Confidence intervals  How confidence intervals behave.
Statistics : Statistical Inference Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University 1.
Latent Class Regression Model Graphical Diagnostics Using an MCMC Estimation Procedure Elizabeth S. Garrett Scott L. Zeger Johns Hopkins University
Basics of Biostatistics for Health Research Session 1 – February 7 th, 2013 Dr. Scott Patten, Professor of Epidemiology Department of Community Health.
Statistical Inference Statistical inference is concerned with the use of sample data to make inferences about unknown population parameters. For example,
Item Parameter Estimation: Does WinBUGS Do Better Than BILOG-MG?
Sampling Design and Analysis MTH 494 Lecture-21 Ossam Chohan Assistant Professor CIIT Abbottabad.
Bayesian Modelling Harry R. Erwin, PhD School of Computing and Technology University of Sunderland.
Learning Objectives After this section, you should be able to: The Practice of Statistics, 5 th Edition1 DESCRIBE the shape, center, and spread of the.
Chapter 9: Introduction to the t statistic. The t Statistic The t statistic allows researchers to use sample data to test hypotheses about an unknown.
Uncertainty and confidence Although the sample mean,, is a unique number for any particular sample, if you pick a different sample you will probably get.
Sampling Theory Determining the distribution of Sample statistics.
The inference and accuracy We learned how to estimate the probability that the percentage of some subjects in the sample would be in a given interval by.
Copyright © 2010 Pearson Education, Inc. Slide
Hierarchical Models. Conceptual: What are we talking about? – What makes a statistical model hierarchical? – How does that fit into population analysis?
Sampling Distributions
Chapter Nine Hypothesis Testing.
Two-Sample Hypothesis Testing
CHAPTER 10 Comparing Two Populations or Groups
Chapter 8: Inference for Proportions
CHAPTER 10 Comparing Two Populations or Groups
Combining Random Variables
Hypothesis Testing: Hypotheses
Markov Chain Monte Carlo
Ch13 Empirical Methods.
CHAPTER 10 Comparing Two Populations or Groups
Let’s do a Bayesian analysis
CHAPTER 9 Testing a Claim
Chapter 7: Sampling Distributions
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
CHAPTER 10 Comparing Two Populations or Groups
Presentation transcript:

1 Getting started with WinBUGS Mei LU Graduate Research Assistant Dept. of Epidemiology, MD Anderson Cancer Center Some material was taken from James and Thomas, Dept. of Geography, Florida State University.

2 Background to BUGS ● The BUGS (Bayesian inference Using Gibbs Sampling) project is concerned with flexible software for the Bayesian analysis of complex statistical models using Markov chain Monte Carlo (MCMC) methods. The project began in 1989 in the MRC Biostatistics Unit and led initially to the `Classic' BUGS program, and then onto the WinBUGS software developed jointly with the Imperial College School of Medicine at St Mary's, London. Development now also includes the OpenBUGS project in the University of Helsinki, Finland. There are now many versions of BUGS, which can be confusing. `Classic' BUGSWinBUGSOpenBUGS

3 Introduction to WinBUGS ● WinBUGS is part of the BUGS project, which aims to make practical MCMC methods available to applied statisticians. WinBUGS can use either a standard `point-and-click' windows interface for controlling the analysis, or can construct the model using a graphical interface called DoodleBUGS. WinBUGS is a stand-alone program, although it can be called from other software. For a version that BUGS (BRugs) that sits within the R statistical package, see the Open Bugs site. ● WINBUGS is free and can be download from the website:

4 Introduction to WINBUGS ● Register to get full version access. ● The WINBUGS user manual and Example vol I and Vol II provide a reference for beginners and novices. ● A good way to approach a problem with WinBUGS to scan through the examples to find a problem like yours. ● When you click on examples, you will open them in what is called a compound document. This organizes programs, graphs, and explanations into a single file.

5 Example 1 Inference is required about the proportion of people in the population who are unemployed. Let us call this value “pi”. Think: The true value of pi is in the range between 0 and 1, where 0 means no one is unemployed and 1 means everyone is. Realistically, we might have some prior information on the value of pi. For instance, newspaper reports, economic theory, previous surveys, etc. Your prior can be as informative as you like. To get things started, lets assume that you choose a prior as a random value from a beta distribution restricted between the values of 0.1 and 0.6.

6 Example 1 1. Comparing running 50,000 samples with 5000 samples to see the difference: The MC error is the Monte Carlo error, it decreases as the number of samples increases. It helps in deciding when enough samples have been taken. Since we know the density is flat on top, we can reduce the wiggles by increasing the number of samples. We will get smoother density and the reduction in the MC error after increasing number of samples.

7 Example 1: 2.Rerun analysis with slightly different model: Restrict prior value range between 0.2 and 0.45 The new results are added to the Log window. Note that the x- and y-axes scales are different in the two graphs. We see that the range of possible values is constrained and that the mean value shifts to the left (is smaller). The graph indicates that we believe the true value for pi is bounded, but that within the bounds any value is equally likely.

8 Example 1: 3.Inference by combining your prior with data The real power of WinBUGS comes from the ability to combine your prior beliefs with data you have in hand, thus allowing you to make inferences. ● Continuing with our unemployment example, suppose you have results from a small sample of 14 people. A total of n = 14 people were surveyed and r = 4 of them were unemployed.These data will have a binomial distribution with proportion pi and denominator N.

9 Prior Posterior Note how the data changes our view of the unemployment rate. For one thing, the data gives us reason to think that the unemployment rate pi is less than 0.4. There is still plenty of uncertainty, but it is less than before we took the sample. The standard deviation (sd) of pi from the prior is but is from the posterior. The 95% credible interval shrinks accordingly. For more practice, and to see the effect of a larger sample on the posterior, rerun the model with n=140, and r = 40.

10 Posterior Thus as we increase our information about pi through more data, the influence of the prior on the posterior decreases. This is seen by the fact that the posterior density looks less like the original prior distribution.

11 Example 2 ● Suppose that we take another survey, perhaps at some time later. This time 12 different people were asked and 5 said they were unemployed. What is the evidence that the underlying rates for the two surveys is really different? Note: From the first sample, 4/14 or 29% of the people were not employed. From the second sample, 5/12 or 42% of the people are unemployed. Looking only at the percentages, the difference appears to be large. ● Using WinBUGS we can determine how much we know about the differences in the rates from these two small surveys by calculating a posterior for the difference. Alternatively, we can calculate a posterior for the ratio.

12 Example 3: Using WinBUGS imputing missing data ● The missing elements of the data matrix are treated as if they were unknown model parameters. An NA in the appropriate cell stands for missing value. ● User can develop a model, which contains the joint distribution of all observed and unobserved including missing data quantities ● A posterior distribution over all parameters and missing data is obtained

13 Diagnostic graphs to check covergence in WinBUGS ● Two plots can be constructed in order to investigate whether or not convergence of MCMC has been achieved. ● Time-series plot: which is plot of successive parameter estimates against the iteration number. If convergence is met then there is no trend in the plot. ● Autocorrelation function (ACF) plot, which is used to examine the relationship of successive parameter estimates. DA algorithm can be said to converge in k cycles if the autocorrelation for all parameters have converged to zero by lag K.

14 summary ● This tutorial describes the basics of using WinBUGS. ● It explains how to make directed graphs using Doodle. A directed graph provides visual representation of a statistical model. The graphs simplify complex models, communicate the structure of the problem, and provide the basis for computation. ● It explains how to compile, load data and run WinBUGS code. The code can be written automatically from the directed graph. ● It explains how to view output using a log file. The density and statistics give information about the posterior distribution that can be used for drawing inferences. ● It introduce how to deal with missing covariate problem in WinBUGS. ● The examples were easy as they relied on conjugate distributions (beta, binomial). Also, there was only one unknown parameter in example #1,two unknown parameters in example #2. WinBUGS works well for these types of problems. For more complicated problems with lots of parameters, we cannot be so sure. WinBUGS provides a set of diagnostic tools(trace and auto graphs) that allow you to check whether things are working properly. ● Things can go wrong and the help manual contains a warning. Using examples from the manual is a good path to follow, but always look critically at your results to see if they make sense. Use the diagnostic tools.