“Users of scooters must transfer to a fixed seat” (Notice in Brisbane Taxi)

Introduction to BUGS in Genetic Epidemiology
“Bayesian Inference Using Gibbs Sampling”
Lindon Eaves, Tim York
Leuven, August 2008

Government Health Warning
(From: WinBUGS manual, page 1)

Goals
- Give an overview of MCMC: what it is and why use it.
- Give some simple examples of the building blocks of applications: means, variances, twin correlations.
- Show how the building blocks can be developed for more complex (“interesting”, “challenging”) problems, to give a flavor of what MCMC might be used for.

Apology
I don’t know much and probably don’t know what I am talking about. But I hope others will see possibilities and help deepen understanding.

Thanks
Allattin Erkanli, Nick Martin, and the staff and students of QIMR.

Critical Source: MRC BUGS Project bsu.cam.ac.uk/bugs/

Why bother?
- Intellectual: a different (“Bayesian”) way of thinking about statistics and statistical modeling.
- Pragmatic: “model liberation” – you can do a bunch of cool new stuff much more easily than within the ML framework.
- Pragmatic: learn more about the data for less computational effort.
- Pragmatic: “fast”.

Payoff
- Estimate parameters of complex models.
- Obtain subject parameters (e.g. “genetic and environmental factor scores”) at no extra cost.
- Obtain confidence intervals and other summary statistics (s.e.’s, quantiles, etc.) at no extra cost.
- Automatic imputation of missing data (“data augmentation”).
- Fast (a 35-item IRT model in 500 twins with covariates takes about 1–2 hours on a laptop).
- Insight, flexibility.

Generally: MCMC seems to help with models that require multi-dimensional integration to compute the likelihood.

Some applications
- Non-linear latent variables (G×E interaction).
- Multidimensional, multi-category, multi-item IRT in twins.
- Genetic effects on developmental change in multiple indicators of puberty (latent growth curves).
- Hierarchical mixed models for fixed and random effects of G, E and G×E in multi-symptom (“IRT”) twin data.
- Genetic survival models.
- Mixture models.
- Estimating (genetic?) factor scores – (?genetic) counseling, cf. estimation of breeding values.
- Imputation of missing values.
- Model selection (“reversible jump” MCMC).
- Non-normal outcomes (e.g. symptom counts).
- Awkward designs: incomplete randomization, batches.

This introduction
- Introduce the ideas.
- Practice using WinBUGS.
- Run some basic examples.
- Look at an application to genetic IRT.
- Other stuff?

Some resources:
- Gilks WR, Richardson S, Spiegelhalter DJ (1996). Markov Chain Monte Carlo in Practice. Boca Raton: Chapman & Hall.
- Gelman A, Carlin JB, Stern HS, Rubin DB (2004). Bayesian Data Analysis (2nd ed.). Boca Raton: Chapman & Hall.
- Spiegelhalter DJ, Thomas A, Best N, Lunn D (2004). WinBUGS User Manual. Cambridge, England: MRC BUGS Project. [Downloaded with WinBUGS – also Examples Vols. I and II]
- Maris G, Bechger TM (2005). An Introduction to the DA-T Gibbs Sampler for the Two-Parameter Logistic (2PL) Model and Beyond. Psicológica, 26.
- Laine M. Applications of Reversible Jump MCMC. University of Helsinki, Department of Mathematics.
- Lunn DJ, Whittaker JC, Best N (2006). A Bayesian toolkit for genetic association studies. Genet Epidemiol, 30(3): 231–47.

Basic Ideas
- Bayesian estimation (vs. ML)
- “Monte Carlo”
- “Markov chain”
- Gibbs sampler

(Maximum) Likelihood
- Compute the (log-)likelihood of getting the data given values of the model parameters and an assumed distribution.
- Search for the parameter values that maximize the likelihood (the “ML” estimates).
- Compare models by comparing likelihoods.
- Obtain confidence intervals from contour plots (i.e. repeated ML conditional on selected parameters).
- Obtain s.e.’s by differentiating L.
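In symbols (a standard formulation, not on the original slide): for data y and parameters θ, the likelihood and the ML estimate are

$$L(\theta; y) = p(y \mid \theta), \qquad \hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta}\, \log L(\theta; y).$$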

Problem with ML
- Many models require integration over the values of latent variables (e.g. non-linear random effects).
- That integral must be evaluated for each likelihood and for the derivatives with respect to each parameter.
- “Expensive” when the number of dimensions is large (?days), especially for confidence intervals.
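The integration problem can be written out explicitly (standard notation, added here for concreteness): with latent variables b, the marginal likelihood is

$$L(\theta; y) = \int p(y \mid b, \theta)\, p(b \mid \theta)\, db,$$

and the dimension of the integral grows with the number of latent variables, so each likelihood evaluation – and an ML search needs many – can become prohibitively slow.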

Maximum Likelihood (ML)
- “Thinks” (theoretically) about parameters and data separately: P(data|parameters).
- “Thinks” (practically) of integration, searching and finding confidence intervals as separate numerical problems (quadrature; search, e.g. Newton-Raphson; numerical differentiation).

Markov Chain Monte Carlo (MCMC, MC²)
- “Thinks” (theoretically) that there is no difference between parameters and data – seeks the distribution of the parameters given the data, P(parameters|data) {Bayesian estimation}.
- “Thinks” (practically) that integration, search and interval estimation constitute a single process addressed by a single unifying algorithm {Gibbs sampling}.

“Parameter”
Anything that isn’t data: means, components of variance, correlations – but also subjects’ scores (not just their distributions) on latent traits (genetic liabilities, factor scores), and missing data points.

Basic approach
- “Bayesian” = considers the joint distribution of parameters and data.
- “Monte Carlo” = simulation.
- “Markov chain” = a sequence of simulations (a “time series”) designed to converge on samples from the posterior distribution of θ given D.
- “Gibbs sampler” = the method of conducting the simulations – cycles through all the parameters, simulating a new value of each conditional on D and every other parameter.

“Bayesian”
- Considers the joint distribution of all parameters (θ) and data (D): P(θ, D).
- Seeks the “posterior distribution” of θ given D: P(θ|D).
- Needs the “prior” distribution P(θ), which we don’t know.
- Start out by assuming some prior distribution (“uninformative” priors, i.e. encompassing a wide range of possible parameter values) and seek to refine it using the data.
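The machinery behind this is Bayes’ theorem:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\, P(\theta)}{P(D)} \;\propto\; P(D \mid \theta)\, P(\theta),$$

i.e. posterior ∝ likelihood × prior. The normalizing constant P(D) is a high-dimensional integral; MCMC lets us sample from the posterior without ever computing it.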

“Monte Carlo”
- Computer simulation of unknown parameters (“nodes”) from an assumed distribution (“computer intensive”).
- If the distribution is assumed [e.g. a mean and variance], then successive simulations represent samples from that assumed distribution.
- i.e. we can estimate the “true” distribution from a large number of (simulated) samples – and obtain any properties of the distribution (means, s.d.s, quantiles) to any desired degree of precision.
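The underlying estimator (standard Monte Carlo, added for concreteness): if θ⁽¹⁾, …, θ⁽ᴹ⁾ are samples from the target distribution, then for any function g

$$E[g(\theta)] \approx \frac{1}{M} \sum_{m=1}^{M} g\!\left(\theta^{(m)}\right),$$

with a Monte Carlo error that shrinks as M grows – hence “any desired degree of precision”.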

“Markov Chain”
- The “true” prior distribution P(θ) is unknown.
- A “Markov chain” is a series of outcomes (e.g. sets of data points), each contingent on the previous outcome; under certain conditions the chain reaches a sequence in which the underlying distribution no longer changes (the “stationary distribution”).
- Start with an assumed prior distribution and construct (simulate) a Markov chain for the given D that converges to samples from the posterior distribution P(θ|D).
- Then use a (large enough) set of samples from the stationary distribution to characterize the properties of the desired posterior distribution.
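Formally (a standard definition, not spelled out on the slide): if the chain moves from θ to θ′ according to a transition kernel K, a distribution π is stationary when

$$\pi(\theta') = \int K(\theta' \mid \theta)\, \pi(\theta)\, d\theta.$$

MCMC algorithms are constructed so that the stationary distribution is exactly the posterior, π(θ) = P(θ|D).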

“Gibbs Sampler”
- An algorithm for generating Markov chains over multiple parameters conditional on D.
- Takes each parameter in turn and generates a new value conditional on the data and every other parameter.
- A cycle through all the parameters is “one iteration”; repeat until the chain converges to its stationary distribution.
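One iteration can be written as follows (standard notation): with parameters θ = (θ₁, …, θₖ), iteration t+1 draws, for j = 1, …, k,

$$\theta_j^{(t+1)} \sim P\!\left(\theta_j \,\middle|\, \theta_1^{(t+1)}, \ldots, \theta_{j-1}^{(t+1)}, \theta_{j+1}^{(t)}, \ldots, \theta_k^{(t)}, D\right).$$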

Idea of the Gibbs Sampler
Suppose we want to draw samples (x, y) from a bivariate normal. We could do it by sampling x from a univariate normal, then sampling y from the univariate normal of y|x, then a new x|y, and so on. Successive samples of (x, y) are then samples from the bivariate normal.
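For a standard bivariate normal with correlation ρ (zero means, unit variances), the full conditionals are themselves univariate normals – a standard result:

$$x \mid y \sim N(\rho y,\; 1 - \rho^2), \qquad y \mid x \sim N(\rho x,\; 1 - \rho^2),$$

so alternating draws from these two conditionals is exactly the Gibbs sampler for this model.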

WinBUGS
- Free.
- Simple language (“R-like”). [“Open” version: OpenBUGS]
- PC version (BUGS is also available for mainframes).
- Graphical interface (OK, but it is usually easier to write or adapt existing code).
- Well documented, good examples.
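To give a flavor of the language, here is a minimal sketch (mine, not from the slides) of a BUGS model for estimating a mean and variance from normal data; the data vector `y` and sample size `N` are assumed to be supplied separately in the data file:

```
model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu, tau)      # likelihood: each observation is normal
  }
  mu ~ dnorm(0, 1.0E-6)        # vague ("uninformative") prior on the mean
  tau ~ dgamma(0.001, 0.001)   # vague prior on the precision
  sigma2 <- 1 / tau            # derived node: the variance
}
```

Note that BUGS parameterizes the normal by mean and precision (1/variance), not by mean and variance.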

Problems
- Convergence criteria (“mixing”).
- Model comparison?
- Sensitivity to priors.
- Model identification.
- Error messages are sometimes obscure.
- Data set-up can be a pain.
- Some models can be troublesome (e.g. latent class analysis).

Bottom line
If you can figure out how you would simulate it, you can probably “BUGS-it.” You need to be clear and explicit about the model and its assumptions.
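As an illustration of that principle, here is a sketch (my own, under stated assumptions, not taken from the slides) of one of the “building blocks” mentioned in the Goals: a twin correlation, modeled as a bivariate normal with a Wishart prior on the precision matrix. The N×2 matrix of twin-pair scores `y` and the Wishart scale matrix `R` are assumed to be supplied as data:

```
model {
  for (i in 1:N) {
    y[i, 1:2] ~ dmnorm(mu[], T[,])     # each twin pair is bivariate normal
  }
  mu[1] ~ dnorm(0, 1.0E-6)             # vague priors on the twin means
  mu[2] ~ dnorm(0, 1.0E-6)
  T[1:2, 1:2] ~ dwish(R[,], 2)         # Wishart prior on the precision matrix
  Sigma[1:2, 1:2] <- inverse(T[,])     # derived: the covariance matrix
  rho <- Sigma[1, 2] / sqrt(Sigma[1, 1] * Sigma[2, 2])  # the twin correlation
}
```

Fitting the same model separately to MZ and DZ pairs gives the pair of twin correlations on which classical genetic analyses are built.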

Today
- Tour BUGS.
- Run some simple examples of easy problems.
- Illustrate an application to IRT.

“Ladies and gentlemen… start your engines…”