Bayesian Statistics Simon French

The usual view of statistics
What does the data – and only the data – tell us in relation to the research questions of interest?
By focusing on the data alone, we are ‘clearly’ being objective …

But …
… classical/frequentist statistical methods contain hidden subjective choices.
– Why choose 1% or 5% as significance levels?
– Why choose a minimum variance unbiased estimator rather than a maximum likelihood estimator, which might be biased but lead to tighter bounds?
– …

The Bayesian paradigm …
… is explicitly subjective. It models judgements and explores their implications
– probabilities to represent beliefs and uncertainties
– (and utilities to represent values and costs, so that inferences lead transparently to decisions)
… is based upon a model of an idealised (consistent, rational) scientist
… focuses first on the individual scientist; then, by varying the scientist’s beliefs, enables the exploration of potential consensus.
For a Bayesian, knowledge is based on consensus.

The Bayesian view of statistics
What are we uncertain about, and how does the data reduce that uncertainty?
not
What does the data – and only the data – tell us in relation to the research questions of interest?

Rev. Thomas Bayes, 1701?–1761
Main work published posthumously: T. Bayes (1763). An essay towards solving a problem in the doctrine of chances. Phil. Trans. Roy. Soc.
Bayes theorem – inverse probability

Bayes theorem
Posterior probability ∝ likelihood × prior probability
p(θ | x) ∝ p(x | θ) × p(θ)
– p(θ): probability distribution of the parameters; our knowledge before the experiment
– p(x | θ): likelihood of the data given the parameters; our knowledge of the design of the experiment or survey and the actual data
– p(θ | x): probability distribution of the parameters given the data; our knowledge after the experiment
– There is a constant of proportionality, but it is ‘easy’ to find because probability adds (integrates) to one
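
The remark about the constant can be written out explicitly. This is the standard normalised form of the proportionality above, using the same symbols; θ′ is only a dummy integration variable introduced here.

```latex
p(\theta \mid x)
  = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},
\qquad
p(x) = \int p(x \mid \theta')\, p(\theta')\, d\theta'
```

For a discrete parameter the integral becomes a sum over the possible values of θ.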

Medical Test
Probability of having disease = 0.001
– i.e. 1 in 1000
– Probability of not having disease = 0.999
Test has a 95% chance of detecting the disease if present, but a 2% chance of falsely detecting it if absent
– False negative rate = 5%; false positive rate = 2%
[Diagram: Disease and Test nodes. GeNIe Software]
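
As a rough illustration of what these numbers imply, the sketch below applies Bayes theorem to a positive test result. It assumes the prevalence is 0.001 (read off from "1 in 1000"); the function and argument names are made up for this example.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes theorem for two hypotheses."""
    # Joint probability of a positive test with and without the disease.
    positive_and_disease = sensitivity * prior
    positive_and_healthy = false_positive_rate * (1.0 - prior)
    # Dividing by the total probability of a positive test normalises the result.
    return positive_and_disease / (positive_and_disease + positive_and_healthy)


# Assumed inputs: prevalence 1 in 1000, 95% detection rate, 2% false positives.
p_disease = posterior_given_positive(
    prior=0.001, sensitivity=0.95, false_positive_rate=0.02
)
print(f"P(disease | positive test) = {p_disease:.4f}")
```

Under these assumptions the posterior probability stays below 5%, because false positives among the many healthy people outnumber true positives among the few who have the disease.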

Simple Bayes Normal Model:

Bayes Theorem as applied to Statistics
Toss a biased coin 12 times; obtain 9 heads
[Figure: prior and posterior densities]

Bayesian Estimation
Toss a biased coin 12 times; obtain 9 heads
[Figure: prior and posterior densities]
Take mean, median or mode
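
A minimal sketch of these point estimates for the coin example, using a grid approximation to the posterior. The slides do not state the prior that was plotted, so a uniform prior on [0, 1] is assumed here; `grid_posterior` and its arguments are illustrative names only.

```python
def grid_posterior(heads, tosses, prior=lambda theta: 1.0, n_grid=2001):
    """Approximate the posterior on an evenly spaced grid of theta in [0, 1]."""
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    # Unnormalised posterior: prior times the binomial likelihood kernel.
    unnorm = [prior(t) * t**heads * (1.0 - t) ** (tosses - heads) for t in thetas]
    total = sum(unnorm)  # the normalising constant on the grid
    return thetas, [u / total for u in unnorm]


# Assumption: uniform prior (the default), 9 heads in 12 tosses.
thetas, weights = grid_posterior(heads=9, tosses=12)

# Posterior mean: probability-weighted average of theta.
post_mean = sum(t * w for t, w in zip(thetas, weights))

# Posterior mode: grid point with the largest posterior weight.
post_mode = max(zip(weights, thetas))[1]

# Posterior median: first grid point where the cumulative weight reaches 0.5.
cumulative = 0.0
post_median = None
for t, w in zip(thetas, weights):
    cumulative += w
    if cumulative >= 0.5:
        post_median = t
        break

print(f"mean={post_mean:.3f} median={post_median:.3f} mode={post_mode:.3f}")
```

With a uniform prior the exact posterior is Beta(10, 4), whose mean is 10/14 and whose mode is 9/12; the grid values should sit very close to these. A different prior would shift all three estimates.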

Bayesian confidence interval
Toss a biased coin 12 times; obtain 9 heads
[Figure: prior and posterior densities]
Highest 95% density
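
One way to approximate a highest-density interval is to keep adding grid points in order of decreasing posterior density until they hold 95% of the probability. The sketch below does this for the coin example, again assuming a uniform prior (the slides do not give the prior); all names are illustrative, and the min/max step relies on the posterior being unimodal.

```python
def hpd_interval(thetas, weights, mass=0.95):
    """Smallest set of grid points holding `mass` probability, as (low, high)."""
    # Visit grid points from highest to lowest posterior weight.
    order = sorted(range(len(thetas)), key=lambda i: weights[i], reverse=True)
    covered = 0.0
    chosen = []
    for i in order:
        chosen.append(thetas[i])
        covered += weights[i]
        if covered >= mass:
            break
    # For a unimodal posterior the chosen points form one contiguous interval.
    return min(chosen), max(chosen)


# Grid posterior for 9 heads in 12 tosses under an assumed uniform prior.
n_grid = 2001
thetas = [i / (n_grid - 1) for i in range(n_grid)]
unnorm = [t**9 * (1.0 - t) ** 3 for t in thetas]
total = sum(unnorm)
weights = [u / total for u in unnorm]

low, high = hpd_interval(thetas, weights, mass=0.95)
print(f"95% highest-density interval: ({low:.3f}, {high:.3f})")
```

Because the posterior is skewed, this interval is not the same as the equal-tailed interval that cuts 2.5% from each end.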

Bayesian hypothesis test
Toss a biased coin 12 times; obtain 9 heads
[Figure: prior and posterior densities]
To test H_0: θ_1 > 0.6, look at Prob(θ_1 > 0.6)
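
The posterior probability of the hypothesis can be estimated by simulation. This sketch assumes a uniform Beta(1, 1) prior, which is not stated in the slides; with 9 heads and 3 tails the posterior is then Beta(10, 4), and the standard-library `random.betavariate` draws from it. The seed and sample size are arbitrary choices.

```python
import random

# Assumed prior Beta(1, 1); conjugate update adds heads and tails to its parameters.
a_post = 1 + 9  # prior a + number of heads
b_post = 1 + 3  # prior b + number of tails

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [rng.betavariate(a_post, b_post) for _ in range(100_000)]

# Monte Carlo estimate of Prob(theta_1 > 0.6 | data): fraction of draws above 0.6.
prob_h0 = sum(d > 0.6 for d in draws) / len(draws)
print(f"Prob(theta_1 > 0.6 | 9 heads in 12 tosses) ~ {prob_h0:.3f}")
```

Under this assumed prior the estimate comes out at roughly 0.83, which is read directly as the probability that the hypothesis is true given the data.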

But why do any of these?
Just report the posterior. It encodes all that is known about θ_1.

Bayesian decision analysis
[Diagram]
– Decision?
– Science: model uncertainties with probabilities
– Values: model preferences with multi-attribute utilities
– Data: observe data X = x from p_X(· | θ)
– Bayes Theorem → Combine → Advice
– Feedback to future decisions
– Statistics; Decision and Risk Analysis

Bayes Calculations
Analytic approaches
– conjugate families of distributions
– Kalman filters
Numerical integration
– Quadrature
– Asymptotic expansions
Markov Chain Monte Carlo (MCMC)
– Gibbs Sampling, Particle filters
– Almost any distributions and models
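
To give a feel for the MCMC route, here is a small random-walk Metropolis sampler for the coin example (9 heads in 12 tosses). Note that this is Metropolis, not the Gibbs sampling named on the slide, and it assumes a uniform prior; the step size, seed, chain length and burn-in are arbitrary illustrative choices.

```python
import math
import random


def log_posterior(theta, heads=9, tails=3):
    """Log of the unnormalised posterior under an assumed uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(n_samples, step=0.2, start=0.5, seed=1):
    """Random-walk Metropolis: propose theta + noise, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    theta, log_p = start, log_posterior(start)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        log_p_new = log_posterior(proposal)
        # Only the ratio of posteriors is needed, so the normalising constant cancels.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            theta, log_p = proposal, log_p_new
        samples.append(theta)
    return samples


chain = metropolis(20_000)
kept = chain[2_000:]  # discard an initial burn-in stretch
mcmc_mean = sum(kept) / len(kept)
print(f"MCMC estimate of the posterior mean: {mcmc_mean:.3f}")
```

For this one-parameter problem the conjugate answer (mean 10/14 under a uniform prior) is available analytically, so the chain serves only as a check; the point of MCMC is that the same recipe still works when no closed form exists.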

Modelling uncertainty
It might be better to say that Bayesians practise uncertainty modelling.
There are simple modelling strategies and tools for this
– hierarchical modelling
– belief nets
– …

Bayes theorem
p(θ | x) ∝ p(x | θ) × p(θ) = p(x, θ)
In real problems, x and θ are multi-dimensional
– with ‘big data’, very high dimensional
Can we restructure p(x, θ) to be easier to work with?
– e.g. to draw in and use independence structures, etc.

Hierarchical Models
Simple Bayes Normal Model:
Three Stage Bayes Normal Model:
[Diagram: a top-level parameter above θ_1, θ_2, …, θ_n, which sit above the observations X_1, X_2, …, X_n]

The Asia Belief Net
[Diagram with nodes: Visit to Asia?; Smoking?; Tuberculosis; Lung Cancer; Bronchitis; X-Ray Result?; Dyspnea?]

Subjectivity vs Objectivity
Bayesian statistics is explicitly subjective
Science is (thought to be) objective
⇒ controversy!

Importance of prior
Different priors lead to different conclusions ⇒ subjective ⇒ not scientific?
Can use:
– ignorant (vague, non-informative) prior to ‘let data speak for themselves’
– precise prior to capture agreed common knowledge
– sensitivity analysis to explore the importance of the priors
Indeed, can use sensitivity analysis to explore agreements and disagreements on many aspects of the model, not just the prior.
If Science is about a consensus on knowledge, then exploring a range of priors helps establish precisely that.
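
A prior sensitivity analysis can be as simple as rerunning the update under several priors and comparing the results. The sketch below does this for the coin data (9 heads, 3 tails) using conjugate Beta priors; the three priors and their labels are hypothetical choices made for illustration, not ones taken from the slides.

```python
def beta_binomial_update(a, b, heads, tails):
    """Conjugate update: a Beta(a, b) prior becomes Beta(a + heads, b + tails)."""
    return a + heads, b + tails


# Hypothetical priors: two vague ones and one strongly centred on a fair coin.
priors = {
    "uniform Beta(1, 1)": (1.0, 1.0),
    "Jeffreys Beta(0.5, 0.5)": (0.5, 0.5),
    "sceptical Beta(20, 20)": (20.0, 20.0),
}

posterior_means = {}
for name, (a, b) in priors.items():
    a_post, b_post = beta_binomial_update(a, b, heads=9, tails=3)
    # The mean of a Beta(a, b) distribution is a / (a + b).
    posterior_means[name] = a_post / (a_post + b_post)
    print(f"{name:>24}: posterior mean = {posterior_means[name]:.3f}")
```

The two vague priors give similar posterior means, while the strongly held fair-coin prior pulls the estimate back towards 0.5. How far the answers spread shows how much the conclusion depends on the prior rather than on the 12 tosses.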

All analysis assumes a model …
Another subjective choice, and one not often addressed in any discussion of methodology
– the same is true in classical/frequentist statistics
Bayesian analysis provides an assessment of uncertainties in the context of the assumed model
– the same is true in classical/frequentist statistics: e.g. p-values
Real-world uncertainty includes these, but also more that arises from the fact that the model is not the real world

BUGS Software
Bayesian inference Using Gibbs Sampling
– Lunn, D.J., Thomas, A., Best, N. and Spiegelhalter, D. (2000). WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing, 10:325–337.
– Lunn, D.J., Jackson, C., Best, N., Thomas, A. and Spiegelhalter, D. (2013). The BUGS Book: a Practical Introduction to Bayesian Analysis. London, Chapman and Hall.

Reading
– W.M. Bolstad (2007). Introduction to Bayesian Statistics. 2nd edn, Hoboken, NJ, John Wiley and Sons.
– P.M. Lee (2012). Bayesian Statistics: An Introduction. 4th edn, Chichester, John Wiley and Sons.
– R. Christensen, W. Johnson, A. Branscum and T.E. Hanson (2011). Bayesian Ideas and Data Analysis. Boca Raton, CRC/Chapman and Hall.
– P. Congdon (2001). Bayesian Statistical Modelling. Chichester, John Wiley and Sons.
– S. French and D. Rios Insua (2000). Statistical Decision Theory. London, Arnold.
– A. O'Hagan and J. Forster (2004). Bayesian Statistics. London, Edward Arnold.
– J.M. Bernardo and A.F.M. Smith (1994). Bayesian Theory. Chichester, John Wiley and Sons.

ISBA: International Society for Bayesian Analysis
– Many resources and guide to software, literature, etc.
– Newsletter
– Open journal: Bayesian Analysis

Thank you