Bayes for beginners Methods for dummies 27 February 2013 Claire Berna

Presentation transcript:

Bayes for Beginners. Methods for Dummies, 27 February 2013. Claire Berna and Lieke de Boer.

Bayes' rule
Given the marginal probabilities p(A), p(B), and the joint probability p(A,B), we can write the conditional probabilities:
p(B|A) = p(A,B) / p(A)
p(A|B) = p(A,B) / p(B)
This is known as the product rule.
Eliminating p(A,B) gives Bayes' rule:
p(B|A) = p(A|B) p(B) / p(A)

Example
The lawn is wet. We assume it is wet because it has rained overnight. What is the probability that it has rained overnight given this observation?
p(r|w) = p(w|r) p(r) / p(w)
p(w|r): Likelihood. How likely is the evidence (a wet lawn) if it has rained?
p(r|w): Posterior. How probable is our hypothesis given the observed evidence?
p(r): Prior. The probability of rain on that day; how probable was our hypothesis before observing the evidence?
p(w): Marginal. How probable is the new evidence under all possible hypotheses?

Example (continued)
p(w=1|r=1) = 0.95
p(w=1|r=0) = 0.20
p(r=1) = 0.01
The probability p(w) is a normalisation term and can be found by marginalisation. This is known as the sum rule:
p(w=1) = Σ_r p(w=1, r) = p(w=1, r=0) + p(w=1, r=1)
       = p(w=1|r=0) p(r=0) + p(w=1|r=1) p(r=1)
Then
p(r=1|w=1) = p(w=1|r=1) p(r=1) / [p(w=1|r=0) p(r=0) + p(w=1|r=1) p(r=1)] = 0.046
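Not part of the original slides: a minimal Python sketch of this calculation, using the numbers quoted above.

```python
# Rain / wet-lawn example, numbers from the slide.
p_r = 0.01              # prior p(r=1): probability it rained overnight
p_w_given_r = 0.95      # likelihood p(w=1 | r=1)
p_w_given_not_r = 0.20  # p(w=1 | r=0)

# Sum rule: marginal probability that the lawn is wet
p_w = p_w_given_r * p_r + p_w_given_not_r * (1 - p_r)

# Bayes' rule: posterior probability that it rained, given the lawn is wet
p_r_given_w = p_w_given_r * p_r / p_w
print(round(p_r_given_w, 3))  # 0.046
```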

Did I Leave The Sprinkler On?
A single observation with multiple potential causes (not mutually exclusive). Both rain, r, and the sprinkler, s, can cause my lawn to be wet, w.
Generative model: p(w, r, s) = p(r) p(s) p(w|r,s)

Did I Leave The Sprinkler On?
The probability that the sprinkler was on, given I've seen that the lawn is wet, is given by Bayes' rule:
p(s=1|w=1) = p(w=1|s=1) p(s=1) / p(w=1)
           = p(w=1, s=1) / [p(w=1, s=1) + p(w=1, s=0)]
where the joint probabilities are obtained by marginalisation from the generative model p(w, r, s) = p(r) p(s) p(w|r,s):
p(w=1, s=1) = Σ_{r=0}^{1} p(w=1, r, s=1) = p(w=1, r=0, s=1) + p(w=1, r=1, s=1)
            = p(r=0) p(s=1) p(w=1|r=0, s=1) + p(r=1) p(s=1) p(w=1|r=1, s=1)
p(w=1, s=0) = Σ_{r=0}^{1} p(w=1, r, s=0) = p(w=1, r=0, s=0) + p(w=1, r=1, s=0)
            = p(r=0) p(s=0) p(w=1|r=0, s=0) + p(r=1) p(s=0) p(w=1|r=1, s=0)

Numerical Example
Bayesian models force us to be explicit about exactly what it is we believe.
p(r=1) = 0.01
p(s=1) = 0.02
p(w=1|r=0, s=0) = 0.001
p(w=1|r=0, s=1) = 0.97
p(w=1|r=1, s=0) = 0.90
p(w=1|r=1, s=1) = 0.99
These numbers give
p(s=1|w=1) = 0.67
p(r=1|w=1) = 0.31
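Not part of the original slides: a minimal Python sketch that marginalises the generative model with these numbers, as a check on the quoted posteriors.

```python
# Sprinkler example: joint p(w, r, s) = p(r) p(s) p(w | r, s), numbers from the slide.
p_r = {1: 0.01, 0: 0.99}
p_s = {1: 0.02, 0: 0.98}
p_w1_given = {(0, 0): 0.001, (0, 1): 0.97, (1, 0): 0.90, (1, 1): 0.99}  # p(w=1 | r, s)

def joint_w1(r, s):
    """p(w=1, r, s) from the generative model."""
    return p_r[r] * p_s[s] * p_w1_given[(r, s)]

p_w1 = sum(joint_w1(r, s) for r in (0, 1) for s in (0, 1))   # marginal p(w=1)
p_s1_given_w1 = sum(joint_w1(r, 1) for r in (0, 1)) / p_w1   # p(s=1 | w=1)
p_r1_given_w1 = sum(joint_w1(1, s) for s in (0, 1)) / p_w1   # p(r=1 | w=1)
print(f"{p_s1_given_w1:.3f} {p_r1_given_w1:.3f}")  # 0.665 0.309, i.e. the slide's 0.67 and 0.31 up to rounding
```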

Look next door
Rain r will make both my lawn w1 and next door's lawn w2 wet, whereas the sprinkler s only affects mine.
p(w1, w2, r, s) = p(r) p(s) p(w1|r,s) p(w2|r)

After looking next door
Use Bayes' rule again, with the joint probabilities obtained by marginalisation:
p(w1=1, w2=1, s=1) = Σ_{r=0}^{1} p(w1=1, w2=1, r, s=1)
p(w1=1, w2=1, s=0) = Σ_{r=0}^{1} p(w1=1, w2=1, r, s=0)
p(s=1|w1=1, w2=1) = p(w1=1, w2=1, s=1) / [p(w1=1, w2=1, s=1) + p(w1=1, w2=1, s=0)]

Explaining Away
Numbers are the same as before; in addition, p(w2=1|r=1) = 0.90.
Now we have
p(s=1|w1=1, w2=1) = 0.21
p(r=1|w1=1, w2=1) = 0.80
The wetness of my grass has been explained away by the rain: observing my neighbour's wet lawn makes rain much more probable, which accounts for my wet lawn and lowers the probability that the sprinkler was on.
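Not in the slides: a sketch of the explaining-away computation. The slide does not state p(w2=1|r=0); the value 0.10 used below is an assumption chosen for illustration (it reproduces the quoted posteriors approximately).

```python
# Explaining away: p(w1, w2, r, s) = p(r) p(s) p(w1 | r, s) p(w2 | r).
p_r = {1: 0.01, 0: 0.99}
p_s = {1: 0.02, 0: 0.98}
p_w1_given = {(0, 0): 0.001, (0, 1): 0.97, (1, 0): 0.90, (1, 1): 0.99}  # p(w1=1 | r, s)
p_w2_given = {1: 0.90, 0: 0.10}  # p(w2=1 | r); the r=0 value is an ASSUMPTION, not given on the slide

def joint(r, s):
    """p(w1=1, w2=1, r, s) from the generative model."""
    return p_r[r] * p_s[s] * p_w1_given[(r, s)] * p_w2_given[r]

evidence = sum(joint(r, s) for r in (0, 1) for s in (0, 1))  # p(w1=1, w2=1)
p_s1 = sum(joint(r, 1) for r in (0, 1)) / evidence           # p(s=1 | w1=1, w2=1)
p_r1 = sum(joint(1, s) for s in (0, 1)) / evidence           # p(r=1 | w1=1, w2=1)
print(round(p_s1, 2), round(p_r1, 2))                        # 0.21 0.8, matching the slide
```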

The CHILD network
A probabilistic graphical model for newborn babies with congenital heart disease; a decision-making aid piloted at Great Ormond Street Hospital (Spiegelhalter et al. 1993).

Bayesian inference in neuroimaging
Used when comparing two models: is model A better than model B?
Used when assessing the inactivity of a brain area: P(H0).

Assessing the inactivity of a brain area
Classical approach:
• estimate parameters (obtain a test statistic)
• define the null
• apply the decision rule: if the test statistic is sufficiently unlikely under the null, reject H0
Bayesian PPM:
• invert the model (obtain the posterior pdf)
• define the null
• apply the decision rule: if the posterior probability of the null exceeds a threshold, accept H0

Bayesian paradigm: the likelihood function
GLM: y = f(θ) + ε
From an assumption about the noise (e.g. that it is small and normally distributed), we can create a likelihood function for the data with θ fixed: p(y|θ).

So θ needs to be fixed... priors
The prior probability of θ depends on: the model you want to compare, the data, and previous experience.
Likelihood: p(y|θ)
Prior: p(θ)
Bayes' rule: p(θ|y) = p(y|θ) p(θ) / p(y)

Bayesian inference: precision weighting
The posterior combines the prior and the likelihood, each weighted by its precision (precision = 1/variance). (From Jean Daunizeau.)
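Not part of the original slides: a minimal sketch of precision weighting for a Gaussian prior and Gaussian likelihood. The posterior precision is the sum of the two precisions, and the posterior mean is their precision-weighted average; the numbers below are arbitrary illustrative values.

```python
# Precision weighting: combine a Gaussian prior over theta with a Gaussian likelihood.
# Precision = 1 / variance. All numbers are illustrative assumptions.
prior_mean, prior_var = 0.0, 4.0   # prior p(theta) = N(0, 4)
data_mean, data_var = 2.0, 1.0     # likelihood p(y | theta) = N(theta, 1), observed y = 2

prior_prec = 1.0 / prior_var
data_prec = 1.0 / data_var

post_prec = prior_prec + data_prec                                    # posterior precision
post_mean = (prior_prec * prior_mean + data_prec * data_mean) / post_prec
print(post_mean, 1.0 / post_prec)  # 1.6 0.8 -- posterior mean and variance
```

The posterior mean is pulled towards whichever source (prior or data) has the higher precision.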

Bayesian inference: the forward/inverse problem
Forward problem: given my model, what is the probability of observing the data? The likelihood p(y|θ).
Inverse problem: given the observed data, what is the probability of the parameters? The posterior distribution p(θ|y).
(From Jean Daunizeau.)

Bayesian inference: Occam's razor
'The hypothesis that makes the fewest assumptions should be selected'; 'Plurality should not be assumed without necessity.'
Complicated models are penalised under Bayesian model comparison: the model evidence reflects both how well a model fits the data and how 'simple' it is. (From Jean Daunizeau.)
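Not in the original slides: a toy sketch of how the model evidence embodies Occam's razor, comparing a 'simple' coin model with bias fixed at 0.5 against a 'complex' model with a uniform prior over the bias (Beta-Binomial marginal likelihood). All numbers are illustrative assumptions.

```python
from math import comb

def evidence_simple(k, n):
    """Model evidence p(k heads in n flips | M_simple): coin bias fixed at 0.5."""
    return comb(n, k) * 0.5 ** n

def evidence_complex(k, n):
    """Model evidence p(k heads in n flips | M_complex): uniform prior over the bias,
    giving the Beta-Binomial marginal likelihood 1 / (n + 1)."""
    return 1.0 / (n + 1)

for k, n in [(5, 10), (9, 10)]:
    print(k, n, evidence_simple(k, n), evidence_complex(k, n))
# 5 heads in 10: the simple model has higher evidence (it fits and assumes little).
# 9 heads in 10: the more flexible model wins, since it can accommodate the bias.
```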

Bayesian inference: hierarchical models (hierarchy, causality)

References:
- Will Penny's course on Bayesian Inference, FIL, 2013. http://www.fil.ion.ucl.ac.uk/~wpenny/bayes-inf/
- J. Pearl (1988). Probabilistic Reasoning in Intelligent Systems. San Mateo, CA: Morgan Kaufmann.
- Previous MfD presentations
- Jean Daunizeau's SPM course at the FIL
Thanks to Ged for his feedback!