Applied Bayesian Analysis for the Social Sciences
Philip Pendergast
Computing and Research Services, Department of Sociology
Sponsored by Computing and Research Services and the Institute of Behavioral Science

Suspending Disbelief: Faith in Classical Statistics
What are some issues that we have with classical statistics? Think back to your introductory class…

Suspending Disbelief: Faith in Classical Statistics
– Conducting an infinite number of experiments / repeated sampling
– Assuming that some parameter θ is unknown but has a fixed value
– P-value worship
– Null hypothesis testing
– Multiple comparisons
– Strict data assumptions, often unmet
– Confidence interval interpretation
– Small samples are an issue

The Coin Flip
Frequentist
– We can determine the bias of a coin (b) by repeatedly flipping it and counting heads. As long as we repeat the process enough times, we should be able to estimate the “true” bias of the coin.
– If p < .05 for the test of b = 0.5, we reject the null hypothesis that the coin is unbiased.
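
As a concrete sketch of the frequentist test (the flip counts here are invented for illustration), this is a one-line exact binomial test in R:

    # Hypothetical data: 100 flips, 61 heads (counts invented for illustration)
    heads <- 61
    flips <- 100

    # Exact two-sided binomial test of H0: b = 0.5
    test <- binom.test(heads, flips, p = 0.5)
    test$p.value  # reject the null of an unbiased coin if this is below .05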

The Nail Flip
Frequentist
– We determine the bias of a nail (b) by repeatedly flipping it and counting “heads” (landing on its flat base).
– If p < .05 for the test of b = 0.5, we reject the null hypothesis that the nail is unbiased.
Does this seem reasonable? Don’t we know that the nail is biased?

Classical Statistics is Atheoretical
Science is an iterative process; we should learn from past research. Theory should guide how we analyze data.
– Typically, beyond the lit. review, theory informs:
Variable selection
Model building
Choice of model (e.g., SEM, HLM)
NOT the actual way parameters are estimated in the analysis

Bayesian Statistics and Theory
Bayesian statistics considers θ to be unknown, possessing a probability distribution that reflects our degree of uncertainty about it. We take theory and uncertainty into consideration when estimating θ.
The Posterior: a probability distribution for θ given the data on hand.
The Data: need only meet the assumption of exchangeability.
The Prior: a distribution based on prior knowledge about θ, and our certainty in that knowledge.
p(θ|y) ∝ p(y|θ)p(θ)
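
The proportionality can be made concrete with a simple grid approximation; a minimal sketch in R, using an arbitrary prior and invented flip counts:

    # Grid of candidate values for theta (here, a coin's bias)
    theta <- seq(0, 1, length.out = 1000)

    # Prior: Beta(2, 2), a mild belief in fairness (an arbitrary illustrative choice)
    prior <- dbeta(theta, 2, 2)

    # Likelihood of hypothetical data (7 heads in 10 flips) at each theta
    likelihood <- dbinom(7, size = 10, prob = theta)

    # Posterior is proportional to likelihood times prior; normalize over the grid
    posterior <- likelihood * prior
    posterior <- posterior / sum(posterior)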

The Nail Flip
Bayesian
– Prior Beliefs: We consult several nail experts, who are relatively certain that nails will land on their heads only 1 in 50 times, or 2% of the time.
– Data on Hand: We flip the nail 100 times.
– Posterior: We sample from the posterior distribution, the probability of θ given our prior beliefs and our data, to see whether the experts’ opinions are reasonable and/or whether our nail shares a similar bias with other nails.
Well, if we examine the anatomy of the nail…
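
With a beta prior centered on the experts’ 2%, the posterior has a closed form; a sketch in R (the prior shape and the observed head count are assumptions for illustration):

    # Expert prior: Beta(2, 98), which has mean 2 / (2 + 98) = 0.02, i.e. "2% heads"
    a <- 2; b <- 98

    # Hypothetical data: 3 heads in 100 flips (an invented count)
    heads <- 3; flips <- 100

    # Beta prior + binomial data give a beta posterior (see conjugacy below)
    post_a <- a + heads          # 5
    post_b <- b + flips - heads  # 195

    # Posterior mean and 95% credible interval for the nail's bias
    post_a / (post_a + post_b)
    qbeta(c(0.025, 0.975), post_a, post_b)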

Priors and Subjectivity
“B-B-B-Bbbbut wait, aren’t these priors subjective? We are objective scientists!”
– Variable selection, model choice, and research questions are all subjective decisions. By making these subjective decisions explicit, we open ourselves to critique and are forced to thoughtfully choose and defend our priors.
– If we have no good theory, we must choose a prior that lets the data speak for itself.

Choosing Sensible Priors
How much do we know? How accurate do we take this information to be?
– Informative priors: historical data, expert opinion, past research findings, theoretical implications.
– Non-informative priors: a uniform distribution over a sensible range of values.
If the prior has high precision (1/σ²) or N is small, the prior will heavily influence the posterior distribution. If the prior has low precision or N is large, the data will influence the posterior more.
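
For a normal mean with known variance, this trade-off is explicit: the posterior mean is a precision-weighted average of the prior mean and the sample mean. A small R sketch with invented numbers:

    # Prior on the mean mu: mean 0 with precision 1 (i.e. sd 1); values invented
    prior_mean <- 0; prior_prec <- 1

    # Hypothetical data: n observations, sample mean 2, known sd sigma = 4
    n <- 10; ybar <- 2; sigma <- 4
    data_prec <- n / sigma^2  # the data's precision grows with N

    # Posterior mean: precision-weighted average of prior mean and sample mean
    post_mean <- (prior_prec * prior_mean + data_prec * ybar) /
      (prior_prec + data_prec)
    post_mean  # approaches ybar as N grows; approaches prior_mean as prior precision grows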

Conjugate Prior Distributions
A conjugate prior, when combined with the data, yields a posterior distribution in the same family as the prior.

Data distribution    Conjugate prior
Normal               Normal or Uniform
Poisson              Gamma
Binomial             Beta
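
For instance, the gamma prior from the table updates with Poisson data exactly as the beta prior did with binomial data; a sketch with invented counts:

    # Gamma(a, b) prior on a Poisson rate (shape a, rate b; illustrative values)
    a <- 2; b <- 1

    # Hypothetical event counts from 5 observation periods
    y <- c(3, 1, 4, 2, 3)

    # Conjugate update: the posterior is Gamma(a + sum(y), b + n), same family as the prior
    post_a <- a + sum(y)
    post_b <- b + length(y)

    # Posterior mean of the rate and a 95% credible interval
    post_a / post_b
    qgamma(c(0.025, 0.975), shape = post_a, rate = post_b)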

The Posterior Distribution and Monte Carlo Integration
Recall that p(θ|y) is a probability distribution. It is computationally demanding to directly derive summary measures of p(θ|y). Instead, we repeatedly sample from p(θ|y) and summarize the distribution formed by these samples.
– This is called Monte Carlo integration.
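
Continuing the nail example, whose posterior is a known beta distribution, Monte Carlo integration amounts to drawing from that distribution and summarizing the draws; a minimal R sketch:

    # 10,000 draws from the Beta(5, 195) posterior found in the nail example
    draws <- rbeta(10000, 5, 195)

    # Summaries of the posterior are just summaries of the draws
    mean(draws)                        # posterior mean
    quantile(draws, c(0.025, 0.975))   # 95% credible interval
    mean(draws < 0.05)                 # P(bias < 5% | data)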

Markov Chain Monte Carlo (MCMC), Explained

Markov Chains, Continued
We specify the number of chains as well as the number of iterations each one makes. The chains “dance” around the posterior from their starting values, moving toward areas of higher density. Chains stabilize around the posterior mean. Once they have stabilized, we discard the early iterations (burn-in samples). Estimates of the posterior come from the post-burn-in period.
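
In R, chains are commonly stored as coda mcmc objects, and discarding burn-in is a windowing operation; a minimal sketch with a stand-in chain:

    library(coda)

    # Stand-in chain: 11,000 draws of one parameter (real draws come from a sampler)
    chain <- mcmc(rnorm(11000))

    # Discard the first 1,000 iterations as burn-in and keep the rest
    kept <- window(chain, start = 1001)

    # Post-burn-in draws are what we summarize; the trace plot shows stabilization
    summary(kept)
    traceplot(chain)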

Bayesian Analysis (Finally!)
– Decide on a model.
– Specify the number of Markov chains, the number of iterations, a burn-in period, and your prior beliefs.
– Run model diagnostics to check for convergence.
– Compare results of models with different specifications of priors, parameters, etc. to see which best “returns” the data in hand or achieves the best model fit (e.g., BIC, Bayes factor, deviance).
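
A sketch of this workflow using MCMCpack’s MCMCregress; the data frame dat and its variables y and x are hypothetical placeholders:

    library(MCMCpack)  # loads coda as well

    # Hypothetical data frame dat with outcome y and predictor x; run two chains
    fit1 <- MCMCregress(y ~ x, data = dat, burnin = 1000, mcmc = 10000, seed = 1)
    fit2 <- MCMCregress(y ~ x, data = dat, burnin = 1000, mcmc = 10000, seed = 2)

    # Convergence checks: Gelman-Rubin across chains, Geweke within a chain
    gelman.diag(mcmc.list(fit1, fit2))  # values near 1 suggest convergence
    geweke.diag(fit1)

    # Trace plots, density plots, and posterior summaries
    plot(fit1)
    summary(fit1)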

Overcoming Classical Shortcomings
– Conducting an infinite number of experiments / repeated sampling → Use only the data on hand; no extrapolating to other potential(ly conflicting) data.
– Assuming that some parameter θ is unknown but has a fixed value → Directly estimate our uncertainty about θ.
– P-value worship → Report HDIs and thoughtfully draw conclusions.
– Null hypothesis testing → More meaningful hypothesis testing (e.g., comparing different priors).
– Multiple comparisons → Not an issue.
– Strict data assumptions, often unmet → Minimal assumptions (exchangeability).
– Confidence interval interpretation → The HDI shows the believability (probability) of values.
– Small samples are an issue → Still useful, given strong priors.

References
Kruschke, J. K. (2011). Doing Bayesian Data Analysis: A Tutorial with R and BUGS. Oxford: Academic Press.
Kaplan, D. (2014). Bayesian Statistics for the Social Sciences. New York: Guilford Press.

R “MCMCpack” Tutorial
Run simple models predicting job satisfaction as a function of income.
– One model uses an uninformative prior (specifically, the uniform distribution).
– The other uses an informed prior from earlier data.
– Compare the Bayes factors to see which model “retrieves” the data better (i.e., is a better fit).
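
A minimal sketch of that comparison (the data frame dat, the variables satisfaction and income, and all prior values are hypothetical placeholders; a near-flat proper prior stands in for the uniform so the marginal likelihood can be computed):

    library(MCMCpack)

    # Model 1: near-flat proper prior (tiny precision B0 stands in for the uniform)
    m1 <- MCMCregress(satisfaction ~ income, data = dat,
                      b0 = 0, B0 = 0.001,
                      marginal.likelihood = "Laplace")

    # Model 2: informative prior from earlier data (prior mean 0.5, precision 10;
    # invented values standing in for real historical estimates)
    m2 <- MCMCregress(satisfaction ~ income, data = dat,
                      b0 = 0.5, B0 = 10,
                      marginal.likelihood = "Laplace")

    # Bayes factor comparison: which prior specification retrieves the data better?
    BayesFactor(m1, m2)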

R “MCMCpack” Tutorial
– Open R.
– Click “Packages” → “Set CRAN mirror…” → pick any mirror in the US.
– Open “Packages” again → “Install packages…” → scroll down to MCMCpack. Say “yes” to a new library.
– Type library(MCMCpack) to load the package; also type library(foreign) to enable reading of the Stata file.
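
Equivalently, everything can be done from the R console (the .dta file name below is a placeholder):

    # Install and load MCMCpack; foreign ships with R and reads Stata files
    install.packages("MCMCpack")
    library(MCMCpack)
    library(foreign)

    # Read the workshop data (replace with the actual .dta path)
    dat <- read.dta("workshop_data.dta")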