Bayesian Statistics on a Shoestring Assaf Oron, May 2008


1 Bayesian Statistics on a Shoestring
Stat 391 – Lecture 12, Assaf Oron, May 2008

2 Bayes’ Rule – and “Bayesians”
Bayes lived and proved his rule a long time ago. The rule, and the updating principle associated with it, belong to all branches of statistics. The term "Bayesian statistics" is modern. Depending on whom you ask, it may represent:
- A perspective and toolset, which are useful for many tasks;
- The only way to do statistics intelligently;
- ...An irrational cult! (it's somewhat of a generational gap right now)
I will try to present Bayesian statistics via the first description above.

3 The Basic Principle
Recall the trick we did a few weeks ago: calling the density the "likelihood" and viewing it as a function of the parameters, with the data held fixed. Recall also, more recently, the awkward jargon used to describe confidence intervals. These somewhat inelegant fixes can be traced back to an asymmetry:
- The data are modeled as following some probability distribution;
- The parameters are modeled as fixed, though usually unknown.
What if we decided that the parameters are random, too?...

4 The Basic Principle (2)
Let's view the data as an r.v. called X; the parameters are, of course, θ. Write down Bayes' rule, using densities:

f(θ | x) = f(x | θ) f(θ) / f(x)

- f(x | θ) is the 'regular' ("frequentist") likelihood of the data given fixed parameter values;
- f(θ) is the 'prior' density of the parameters (based on previous knowledge, usually unrelated to the current data);
- f(x) = ∫ f(x | θ) f(θ) dθ, the marginal probability of the data over all possible parameter configurations, is not a function of θ and is irrelevant for estimation.

5 The Basic Principle (3)
...so the Bayesian way of writing Bayes' rule is usually this:

p(θ | x) ∝ p(x | θ) p(θ)

- p(θ | x) is the posterior distribution of the parameters, given the data;
- p(θ) is the prior distribution of the parameters, before seeing the data.
(Since we omitted the marginal probability of the data, the equation becomes a proportionality. We don't care: since we know the LHS is a density, we can "find" the missing factor automatically by normalizing the integral of the LHS to 1.)
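The normalization step can be made concrete with a small grid approximation. This is a sketch with assumed numbers (7 successes in 10 Bernoulli trials, and a flat prior on p); none of these values come from the slides:

```python
import numpy as np

# Grid approximation of a posterior: prior * likelihood, then normalize.
p_grid = np.linspace(0.001, 0.999, 999)      # candidate parameter values
prior = np.ones_like(p_grid)                 # flat prior on the grid
likelihood = p_grid**7 * (1 - p_grid)**3     # Binomial kernel, constants dropped
unnormalized = likelihood * prior
posterior = unnormalized / unnormalized.sum()  # "find the missing factor": sum to 1

print(posterior.sum())                        # ≈ 1 by construction
print(p_grid[posterior.argmax()])             # posterior mode, ≈ 0.7
```

Dropping the constants in the likelihood is harmless for exactly the reason on the slide: normalization recovers them automatically.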

6 Bayesian Estimation
Bayesian estimation is based primarily on probability calculations from the posterior. The most common Bayesian point estimates are the posterior mean (i.e., E[θ | x]), median, or mode. These can be framed as solutions to different loss-minimization problems (squared-error, absolute-error, and 0-1 loss, respectively).
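All three point estimates can be read off a grid posterior. A sketch, reusing the assumed 7-of-10 example with a flat prior (the resulting posterior has the shape of a Beta(8, 4)):

```python
import numpy as np

p_grid = np.linspace(0.001, 0.999, 999)
post = p_grid**7 * (1 - p_grid)**3           # kernel of the posterior
post /= post.sum()                           # normalize on the grid

post_mean = np.sum(p_grid * post)            # E[theta | x], ≈ 8/12 ≈ 0.667
post_mode = p_grid[np.argmax(post)]          # MAP estimate, ≈ 0.700
cdf = np.cumsum(post)
post_median = p_grid[np.searchsorted(cdf, 0.5)]  # first point with CDF >= 0.5, ≈ 0.676

print(post_mean, post_median, post_mode)
```

Note the three estimates differ because this posterior is skewed; for a symmetric posterior they would coincide.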

7 A Brief History of Bayesianism
The Bayesian idea has been around for a while, but sat mostly on the shelf for practical reasons: if you take two arbitrary distributions for data and prior, you will usually end up with an intractably complicated posterior. (For each "common" data distribution, there exists at least one type of prior that fits it well; it is known as the "conjugate prior".) With the advent of computing, a statistical-simulation technology known as MCMC ("Markov Chain Monte Carlo") has made (nearly) any combination of distributions possible to compute, sometimes instantly.
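A minimal sketch of the MCMC idea, using a random-walk Metropolis sampler on the same assumed 7-of-10 example with a flat prior (the slides do not specify any particular sampler; the tuning constants here are illustrative):

```python
import math
import random

def log_post(p):
    """Log posterior kernel: Binomial likelihood, flat prior on (0, 1)."""
    if not 0 < p < 1:
        return float("-inf")
    return 7 * math.log(p) + 3 * math.log(1 - p)

random.seed(0)
p, samples = 0.5, []
for _ in range(20000):
    prop = p + random.gauss(0, 0.1)          # random-walk proposal
    # Accept with probability min(1, post(prop) / post(p)), done on the log scale.
    if math.log(random.random()) < log_post(prop) - log_post(p):
        p = prop
    samples.append(p)

burned = samples[5000:]                      # discard burn-in
print(sum(burned) / len(burned))             # ≈ posterior mean, about 2/3
```

Notice the sampler only ever needs the *unnormalized* posterior: the intractable marginal of the data cancels in the acceptance ratio, which is exactly why MCMC made arbitrary prior/likelihood combinations practical.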

8 Conjugate Prior Hands-on
The conjugate prior for the Binomial is the Beta. That is: X ~ Binomial(n, p) and p ~ Beta(α, β) should match nicely. Write out the kernel of the posterior (i.e., the essential form, keeping only terms with x or p in them):

p^x (1 − p)^(n − x) · p^(α − 1) (1 − p)^(β − 1)

Simplify this a bit further; can you recognize the form of the posterior?
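The conjugacy can also be checked numerically. A sketch with assumed numbers (x = 7 successes in n = 10 trials, Beta(2, 2) prior): collecting exponents in the kernel above gives a Beta(α + x, β + n − x) posterior, here Beta(9, 5), and a grid computation should agree with that closed form:

```python
import math

n, x, a, b = 10, 7, 2, 2                      # assumed data and prior

def beta_pdf(p, a, b):
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * p**(a - 1) * (1 - p)**(b - 1)

# Numeric posterior: likelihood kernel times prior, normalized by a crude integral.
grid = [i / 1000 for i in range(1, 1000)]
unnorm = [p**x * (1 - p)**(n - x) * beta_pdf(p, a, b) for p in grid]
z = sum(unnorm) * 0.001
numeric = [u / z for u in unnorm]

# Closed-form conjugate posterior: Beta(a + x, b + n - x).
analytic = [beta_pdf(p, a + x, b + n - x) for p in grid]
max_err = max(abs(u - v) for u, v in zip(numeric, analytic))
print(max_err)                                # should be tiny
```

The agreement is what "conjugate" means in practice: the posterior stays inside the prior's family, with updated parameters.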

9 Advantages of Bayesian Methods
- A symmetry (data and parameters both random) that is conceptually attractive
- Can incorporate prior subject-matter information (from scientists, etc.) that should play a role in evaluating the data
- Hypothesis tests, model selection, and confidence intervals become easier
- The risk of using the wrong model (= "model misspecification") can be reduced
- More complete information about the parameters

10 Advantages of Bayesian Methods (2)
- Avoids some of the counter-intuitive side effects of MLE calculations
- Ability to fit complicated models, estimate complicated parameters, and accommodate errors in "fixed" values
- In many cases, a random interpretation fits the parameters better than a fixed one: opinion polls and human behavior; ecology and demographics (come to think of it, natural populations are never really fixed)

11 Drawbacks of Bayesian Methods
- Symmetry? Not really. "It's Tortoises all the way down": the prior needs parameters too, and those had better be fixed, or else we face the same problem again, one level up
- The prior affects our estimates, whether or not it is really based on expert knowledge
- A workaround known as "flat" or "improper" priors has made things worse in many ways: if you use them, you may find yourself not having a valid posterior distribution at all

12 Drawbacks of Bayesian Methods (2)
- The choice of prior form and details adds yet another arbitrary element to the already-tenuous connection between model and reality
- MCMC simulations have a lot of "moving parts" and are not trivial to diagnose for problems
- Socially, the approach carries "hype" and dogmatic "group-think" overtones that are not helpful
- In many cases, a random interpretation of the parameters is simply not appropriate

