CS 594: Empirical Methods in HCC Introduction to Bayesian Analysis


Dr. Debaleena Chattopadhyay, Department of Computer Science, debchatt@uic.edu, debaleena.com, hci.cs.uic.edu

Statistics

Statistics is the study of uncertainty. How do we measure it? How do we make decisions in its presence? One way to deal with uncertainty in a quantified manner is to think about probabilities. When rolling a fair six-sided die, we may ask: what is the probability that the die shows a four? But how about asking: is this a fair die? What is the probability that the die is fair?

Three frameworks to measure uncertainty

Classical framework: outcomes that are equally likely have equal probabilities. In the case of rolling a fair six-sided die, there are six possible outcomes, all equally likely, so the probability of rolling a four is one in six.

Frequentist framework: posit a hypothetical infinite sequence of events, and look at the relative frequency within that hypothetical sequence. For a fair six-sided die, imagine rolling it an infinite number of times; if it is fair, a four will show up one sixth of the time. So we can again define the probability of rolling a four as one in six.

Bayesian framework: the Bayesian perspective is a personal one. Your probability represents your own measure of uncertainty, and it takes into account what you know about a particular problem. What about the question "is this a fair die?" The frequentist approach tries to be objective in how it defines probabilities, but sometimes that objectivity is illusory.

Credits: https://www.coursera.org/learn/bayesian-statistics/home/welcome
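The frequentist definition can be illustrated by simulation: the relative frequency over many rolls approximates the probability. A minimal sketch (the seed and the 100,000-roll count are arbitrary choices for reproducibility):

```python
import random

random.seed(42)  # fixed seed so the simulation is reproducible

n_rolls = 100_000
rolls = [random.randint(1, 6) for _ in range(n_rolls)]

# Relative frequency of rolling a four approximates P(four) = 1/6
freq_four = rolls.count(4) / n_rolls
print(f"Relative frequency of a four: {freq_four:.4f} (theory: {1/6:.4f})")
```

With 100,000 rolls the relative frequency lands within about one percentage point of 1/6, echoing the "hypothetical infinite sequence" idea.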

Other Shortcomings in Frequentist Statistics

P-values depend on the sample size and the sampling distribution. Confidence intervals (C.I.s) are not probability distributions, so they do not provide the most probable value for a parameter.

Bayesian -- personal perspective

Bayesian inference uses prior knowledge to allocate and reallocate credibility across possibilities. In Bayesian statistics, probability is interpreted as a description of how certain you are that some statement, or proposition, is true. This is an inherently subjective approach to probability, but it rests on a mathematically rigorous foundation, and in many cases it leads to much more intuitive results than the frequentist approach. We can quantify such probabilities by thinking about what would constitute a fair bet. For example, we might ask: what is the probability it rains tomorrow?

Conditional Probability

Conditional probability arises when we consider two events that are related to each other: the probability of event A given that event B has occurred is P(A | B) = P(A and B) / P(B).
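The definition can be checked by enumerating the outcomes of the die example. A minimal sketch computing P(roll is a four | roll is even), where the events and the `Fraction` bookkeeping are illustrative choices:

```python
from fractions import Fraction

# Event A: the roll is a four; event B: the roll is even
A = {4}
B = {2, 4, 6}

# Each of the six faces of a fair die is equally likely
p_B = Fraction(len(B), 6)
p_A_and_B = Fraction(len(A & B), 6)

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_A_given_B = p_A_and_B / p_B
print(p_A_given_B)  # 1/3
```

Knowing the roll is even shrinks the sample space to {2, 4, 6}, so the probability of a four rises from 1/6 to 1/3.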

Bayes’ Theorem

P(A | B) = P(B | A) P(A) / P(B). It lets us reverse a conditional probability when we know the base rate P(A).

Example

An early test for HIV antibodies is known as the ELISA test. It is a fairly accurate test: over 90% of the time, it gives the correct result. Specifically, P(+ | HIV) = 0.977 and P(- | no HIV) = 0.926. A study found that the probability of a North American having HIV was about 0.0026. If we randomly select someone from North America, test them, and they test positive for HIV, what is the probability that they actually have HIV given that they tested positive? The test is over 90% accurate, so what sort of number do you expect to get in this case?
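Bayes' theorem answers the slide's question directly. A sketch plugging in the numbers above:

```python
# Numbers from the slide
p_pos_given_hiv = 0.977      # sensitivity: P(+ | HIV)
p_neg_given_no_hiv = 0.926   # specificity: P(- | no HIV)
p_hiv = 0.0026               # prevalence: P(HIV) among North Americans

# Total probability of testing positive (law of total probability)
p_pos = (p_pos_given_hiv * p_hiv
         + (1 - p_neg_given_no_hiv) * (1 - p_hiv))

# Bayes' theorem: P(HIV | +) = P(+ | HIV) P(HIV) / P(+)
p_hiv_given_pos = p_pos_given_hiv * p_hiv / p_pos
print(f"P(HIV | +) = {p_hiv_given_pos:.3f}")  # about 0.033
```

Despite a test that is "over 90% accurate," the posterior probability is only about 3.3%, because the disease is rare and false positives from the large HIV-negative population swamp the true positives.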

Likelihood Recap

The Bernoulli distribution is used when there are two possible outcomes, such as flipping a coin (heads or tails), or any situation with a success or a failure. We write X ~ B(p), where p is the probability of a success (heads): P(X = 1) = p, P(X = 0) = 1 - p, so f(X = x | p) = f(x | p) = p^x (1 - p)^(1 - x).
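The Bernoulli PMF above translates directly into code. A minimal sketch (the function name and example p = 0.7 are illustrative):

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """f(x | p) = p^x (1 - p)^(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("Bernoulli outcomes are 0 or 1")
    return p ** x * (1 - p) ** (1 - x)

# A biased coin with P(heads) = 0.7
print(bernoulli_pmf(1, 0.7))  # 0.7 -> probability of heads
print(bernoulli_pmf(0, 0.7))  # ~0.3 -> probability of tails
```

The single formula p^x (1 - p)^(1 - x) collapses the two cases P(X = 1) = p and P(X = 0) = 1 - p into one expression, which is what makes the likelihood of many observations easy to write as a product.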

Likelihood

Consider a hospital where 400 patients are admitted over a month for heart attacks; a month later, 72 of them have died and 328 have survived. What is our estimate of the mortality rate? We must first establish our reference population: perhaps heart attack patients in the region, or heart attack patients admitted to this hospital. Both are reasonable, but in this case the actual data are not a random sample from either of those populations. So let us think of all the people in the region who might possibly have a heart attack and might possibly be admitted to this hospital.

Likelihood

Say each patient's outcome comes from a Bernoulli distribution: Yi ~ B(θ), where θ is unknown, and P(Yi = 1) = θ for every admitted individual ("success" here is mortality). The probability density function (PDF) of the data follows from this model. The likelihood is the PDF viewed as a function of θ. Maximum likelihood chooses θ so as to maximize the likelihood value, and the Maximum Likelihood Estimate (MLE) is the value of θ that achieves this maximum.
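For the hospital data (72 deaths out of 400), the Bernoulli log-likelihood and its maximizer can be sketched as follows; the grid search is just a way to confirm numerically that the MLE equals the sample proportion:

```python
import math

n, deaths = 400, 72  # hospital data from the slide

def log_likelihood(theta: float) -> float:
    """Log-likelihood of theta for n Bernoulli trials with `deaths` successes."""
    return deaths * math.log(theta) + (n - deaths) * math.log(1 - theta)

# Grid search over candidate theta values in (0, 1)
grid = [i / 1000 for i in range(1, 1000)]
mle = max(grid, key=log_likelihood)

print(mle)         # 0.18
print(deaths / n)  # 0.18, the sample proportion
```

The numerical maximizer agrees with the closed-form result: for Bernoulli data, the MLE of θ is simply the observed proportion of successes.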

Steps of Bayesian Data Analysis

1. Identify the data relevant to the research questions. Which data variables are to be predicted, and which are supposed to act as predictors?
2. Define a descriptive model for the relevant data. The mathematical form and its parameters should be meaningful and appropriate to the theoretical purposes of the analysis.
3. Specify a prior distribution over the parameters.
4. Use Bayesian inference to re-allocate credibility across parameter values.
5. Interpret and check that the posterior distribution is meaningful.
6. Posterior predictive check: verify that the posterior predictions mimic the data with reasonable accuracy. If not, consider a different descriptive model.

Parameter Estimation

Posterior Belief Distribution

Posterior = Likelihood * Prior / Evidence, i.e., P(θ | data) = P(data | θ) P(θ) / P(data).
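For the hospital mortality example, choosing a conjugate Beta prior makes this update available in closed form. A sketch assuming a uniform Beta(1, 1) prior (the prior choice is illustrative, not from the slides):

```python
# Beta-Binomial conjugate update: Beta(a, b) prior + Bernoulli/Binomial data
a_prior, b_prior = 1, 1        # uniform prior over the mortality rate theta
deaths, survived = 72, 328     # hospital data from the slides

# The posterior is Beta(a + deaths, b + survived)
a_post = a_prior + deaths
b_post = b_prior + survived

post_mean = a_post / (a_post + b_post)
print(f"Posterior: Beta({a_post}, {b_post}), mean = {post_mean:.4f}")
```

The posterior mean (about 0.182) sits very close to the MLE of 0.18 because 400 observations dominate the weak uniform prior; with less data, the prior would pull the estimate more strongly.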

High Density Interval (HDI)

The HDI is formed from the posterior distribution after observing the new data. Since the HDI is derived from the posterior probability distribution, the 95% HDI contains the 95% most credible parameter values. Unlike a frequentist C.I., the 95% HDI is guaranteed to contain the parameter with 95% probability.
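A 95% HDI can be computed numerically by keeping the highest-density values of the posterior until they cover 95% of the probability mass. A sketch on a discretized Beta(73, 329) posterior (the hospital example under a uniform prior; the grid resolution is an arbitrary choice):

```python
import math

a, b = 73, 329  # Beta posterior for the hospital mortality example

def beta_log_pdf(theta: float, a: float, b: float) -> float:
    """Log density of Beta(a, b), using lgamma for the normalizing constant."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

# Discretize the posterior on a fine grid over (0, 1)
n_grid = 100_000
thetas = [(i + 0.5) / n_grid for i in range(n_grid)]
mass = [math.exp(beta_log_pdf(t, a, b)) / n_grid for t in thetas]

# Greedily keep the highest-density points until they cover 95% of the mass
ranked = sorted(zip(mass, thetas), reverse=True)
total, kept = 0.0, []
for m, t in ranked:
    total += m
    kept.append(t)
    if total >= 0.95:
        break

lo, hi = min(kept), max(kept)
print(f"95% HDI: [{lo:.4f}, {hi:.4f}]")
```

Because the points are ranked by density, every value inside the interval is more credible than every value outside it, which is exactly the HDI property the slide describes.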