Likelihood intervals vs. coverage intervals


Likelihood intervals vs. coverage intervals
Günter Zech

(What is better: to exclude a parameter with high likelihood by a coverage interval, or to have bad coverage in a likelihood ratio interval?)

Likelihood ratio intervals: a parameter value is included if the observation has a relatively high p.d.f. under it, in competition with the other parameter values. Log-likelihoods add, so several measurements combine into a single interval.

Coverage intervals: a parameter value is included if the observation has a high goodness-of-fit probability under it (no competition between parameter values). About 68% of such intervals contain the true value, but they cannot be combined.
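A minimal sketch of the "log-likelihoods add" property, assuming two hypothetical Gaussian measurements of the same parameter (all numbers below are illustrative, not from the slides):

```python
# Minimal sketch: two Gaussian measurements of the same parameter mu
# combine into one likelihood-ratio interval by adding log-likelihoods.
# All numbers are illustrative assumptions.
import numpy as np
from scipy.stats import norm

mu = np.linspace(0.0, 10.0, 2001)
logL = (norm.logpdf(4.0, loc=mu, scale=1.0)      # measurement 1: 4.0 +- 1.0
        + norm.logpdf(5.5, loc=mu, scale=0.8))   # measurement 2: 5.5 +- 0.8

inside = 2 * (logL.max() - logL) <= 1.0          # ~68% likelihood-ratio interval
print(f"combined interval: [{mu[inside].min():.2f}, {mu[inside].max():.2f}]")
# Coverage intervals come with no such combination rule: each measurement
# yields its own interval, and the two cannot simply be merged.
```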

Example:
H1: N(7.5, 2.5)    H2: N(50, 100)    (mean, standard deviation)
Measurement: t = 10.1  →  LR ≈ 26 in favor of H1.

The observation is more than one standard deviation off the H1 prediction, but only about half a standard deviation off the H2 prediction. The likelihood ratio favors hypothesis H1; the frequentist view excludes H1 at one standard deviation but supports H2.
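A quick numerical check of these figures (a minimal sketch; it assumes the second argument of N(·, ·) above is the standard deviation):

```python
# Quick check of the numbers on this slide.
from scipy.stats import norm

t = 10.1
lr = norm.pdf(t, loc=7.5, scale=2.5) / norm.pdf(t, loc=50, scale=100)
print(f"LR = {lr:.1f}")                          # ~25, in line with LR ~ 26 above
print(f"H1 pull = {(t - 7.5) / 2.5:.2f} sigma")  # ~1.04: more than 1 sigma off H1
print(f"H2 pull = {(t - 50) / 100:.2f} sigma")   # ~-0.40: well within H2
```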

Continuous version:

Assume a theory f(t|m) that contains the discrete hypotheses H1 and H2 as specific values of the parameter m. [The formula was shown as a figure on the original slide.]

The likelihood ratio limit does not cover. The coverage interval excludes high-likelihood values. Which is preferable?
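To make the conflict concrete, here is an illustrative sketch. The width function sigma(m) below is an assumption, chosen only so that the family reproduces H1 = N(7.5, 2.5) and roughly H2 = N(50, 100); it is not the formula from the slide:

```python
# Illustrative sketch only: sigma(m) is an assumed width function, not the
# formula from the original slide.
import numpy as np
from scipy.stats import norm

t = 10.1                                   # the observation
m = np.linspace(5.0, 25.0, 4001)           # parameter scan
sigma = m**2 / 22.5                        # assumption: sigma(7.5) = 2.5
logL = norm.logpdf(t, loc=m, scale=sigma)

in_lr = 2 * (logL.max() - logL) <= 1.0     # ~68% likelihood-ratio interval
in_cov = np.abs(t - m) <= sigma            # central 68% coverage band

print(f"LR interval:       [{m[in_lr].min():.1f}, {m[in_lr].max():.1f}]")
print(f"coverage interval: [{m[in_cov].min():.1f}, {m[in_cov].max():.1f}]")
i = np.argmin(np.abs(m - 7.5))
print(f"m = 7.5 in LR interval: {in_lr[i]}, in coverage interval: {in_cov[i]}")
# m = 7.5 has high likelihood yet lies just outside the coverage band,
# while the band stays open toward large m (sigma grows faster than m - t),
# keeping values that the likelihood ratio strongly disfavors.
```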

Another illustration: assume an earthquake theory following a formula similar to the one on the previous slide, with different constants and the specific parameter values:

H1: predicts an earthquake at 2008, Feb. 13 ± 1 day
H2: predicts an earthquake at 2010, Feb. 13 ± 2 years

Assume the earthquake happens on 2008, Feb. 15. A frequentist would exclude H1 at 2 standard deviations! Clearly H1 should not be excluded; but why exclude H2, even though the observation is only about one standard deviation off its prediction? Answer: only one parameter value can be correct. If the parameter value corresponding to H1 is very likely, H2 must be unlikely.

Conclusion

Problems with coverage intervals:
- They do not take into account that parameter values are exclusive: if parameter value 1 applies, parameter value 2 cannot.
- They ignore the precision of a prediction.
- They contain only part of the information provided by the data.

All of this is due to the violation of the likelihood principle. We should not require likelihood ratio intervals to cover. Coverage intervals that exclude regions of high likelihood are very problematic.

Some complementary remarks

Overcoverage: There is nothing wrong with conservative upper limits. It is better to present a conservative limit than a doubtful one.

Including systematic errors (more specifically: calibration uncertainties, theoretical uncertainties, and other uncertainties that do not improve with 1/sqrt(N)): the only reasonable way is to introduce a Bayesian p.d.f. and integrate out the corresponding parameter. For upper limits one should be conservative; it is better to overestimate the errors than to underestimate them.

Priors: Priors should incorporate prior knowledge, not be selected such that they produce the output you like.
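A rough sketch of "integrating out" a nuisance parameter in this spirit: a Poisson counting experiment whose signal efficiency carries a Gaussian systematic uncertainty, averaged over its prior before the upper limit is extracted. The counting setup, all numbers, and the Gaussian prior are illustrative assumptions, not from the slides:

```python
# Rough sketch of integrating out a systematic uncertainty: a Poisson
# counting experiment whose signal efficiency carries a Gaussian
# uncertainty. All numbers and the prior are illustrative assumptions.
import numpy as np
from scipy.stats import norm, poisson

n_obs = 3                    # observed events (assumed)
b = 1.2                      # expected background (assumed)
eff0, d_eff = 0.8, 0.08      # efficiency and its systematic (assumed)

eff = np.linspace(eff0 - 4 * d_eff, eff0 + 4 * d_eff, 401)
prior = norm.pdf(eff, eff0, d_eff)

def marginal_likelihood(s):
    """Poisson likelihood averaged over the efficiency prior."""
    return np.trapz(poisson.pmf(n_obs, s * eff + b) * prior, eff)

# Bayesian 95% upper limit with a flat prior in s >= 0.
s = np.linspace(0.0, 25.0, 2501)
posterior = np.array([marginal_likelihood(si) for si in s])
cdf = np.cumsum(posterior) / posterior.sum()
print(f"95% upper limit: s < {s[np.searchsorted(cdf, 0.95)]:.2f}")
```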