Statistical Analysis of Systematic Errors and Small Signals Reinhard Schwienhorst University of Minnesota 10/26/99.

Slides:



Advertisements
Similar presentations
Modeling of Data. Basic Bayes theorem Bayes theorem relates the conditional probabilities of two events A, and B: A might be a hypothesis and B might.
Advertisements

Estimation, Variation and Uncertainty Simon French
Probability and Statistics Basic concepts II (from a physicist point of view) Benoit CLEMENT – Université J. Fourier / LPSC
Psychology 290 Special Topics Study Course: Advanced Meta-analysis April 7, 2014.
Statistics review of basic probability and statistics.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #21.
I’ll not discuss about Bayesian or Frequentist foundation (discussed by previous speakers) I’ll try to discuss some facts and my point of view. I think.
Copyright 2004 David J. Lilja1 Errors in Experimental Measurements Sources of errors Accuracy, precision, resolution A mathematical model of errors Confidence.
Statistics for Particle Physics: Intervals Roger Barlow Karlsruhe: 12 October 2009.
Probability theory Much inspired by the presentation of Kren and Samuelsson.
1 Nuisance parameters and systematic uncertainties Glen Cowan Royal Holloway, University of London IoP Half.
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
Principles of the Global Positioning System Lecture 10 Prof. Thomas Herring Room A;
Standard error of estimate & Confidence interval.
Regression Analysis Regression analysis is a statistical technique that is very useful for exploring the relationships between two or more variables (one.
1 1 Slide Statistical Inference n We have used probability to model the uncertainty observed in real life situations. n We can also the tools of probability.
10.3 Estimating a Population Proportion
Chap 8-1 Copyright ©2013 Pearson Education, Inc. publishing as Prentice Hall Chapter 8 Confidence Interval Estimation Business Statistics: A First Course.
1 Probability and Statistics  What is probability?  What is statistics?
Statistical Decision Theory
Chapter 8 Probability Section R Review. 2 Barnett/Ziegler/Byleen Finite Mathematics 12e Review for Chapter 8 Important Terms, Symbols, Concepts  8.1.
Harrison B. Prosper Workshop on Top Physics, Grenoble Bayesian Statistics in Analysis Harrison B. Prosper Florida State University Workshop on Top Physics:
Dr. Gary Blau, Sean HanMonday, Aug 13, 2007 Statistical Design of Experiments SECTION I Probability Theory Review.
Section 8.1 Estimating  When  is Known In this section, we develop techniques for estimating the population mean μ using sample data. We assume that.
Introduction Osborn. Daubert is a benchmark!!!: Daubert (1993)- Judges are the “gatekeepers” of scientific evidence. Must determine if the science is.
LECTURER PROF.Dr. DEMIR BAYKA AUTOMOTIVE ENGINEERING LABORATORY I.
Lab 3b: Distribution of the mean
G. Cowan Lectures on Statistical Data Analysis Lecture 1 page 1 Lectures on Statistical Data Analysis London Postgraduate Lectures on Particle Physics;
Statistical Data Analysis and Simulation Jorge Andre Swieca School Campos do Jordão, January,2003 João R. T. de Mello Neto.
Practical Statistics for Particle Physicists Lecture 3 Harrison B. Prosper Florida State University European School of High-Energy Physics Anjou, France.
Bayesian vs. frequentist inference frequentist: 1) Deductive hypothesis testing of Popper--ruling out alternative explanations Falsification: can prove.
Statistics In HEP Helge VossHadron Collider Physics Summer School June 8-17, 2011― Statistics in HEP 1 How do we understand/interpret our measurements.
A taste of statistics Normal error (Gaussian) distribution  most important in statistical analysis of data, describes the distribution of random observations.
Section 10.1 Confidence Intervals
5.1 Chapter 5 Inference in the Simple Regression Model In this chapter we study how to construct confidence intervals and how to conduct hypothesis tests.
Confidence Intervals First ICFA Instrumentation School/Workshop At Morelia, Mexico, November 18-29, 2002 Harrison B. Prosper Florida State University.
Statistical Decision Theory Bayes’ theorem: For discrete events For probability density functions.
PROBABILITY IN OUR DAILY LIVES
2. Introduction to Probability. What is a Probability?
Sampling and estimation Petter Mostad
1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #24.
G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.
Analysis of Experimental Data; Introduction
1 Introduction to Statistics − Day 3 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.
G. Cowan Lectures on Statistical Data Analysis Lecture 4 page 1 Lecture 4 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
2005 Unbinned Point Source Analysis Update Jim Braun IceCube Fall 2006 Collaboration Meeting.
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
G. Cowan Lectures on Statistical Data Analysis Lecture 12 page 1 Statistical Data Analysis: Lecture 12 1Probability, Bayes’ theorem 2Random variables and.
Systematics in Hfitter. Reminder: profiling nuisance parameters Likelihood ratio is the most powerful discriminant between 2 hypotheses What if the hypotheses.
A New Upper Limit for the Tau-Neutrino Magnetic Moment Reinhard Schwienhorst      ee ee
Confidence Intervals and Hypothesis Testing Mark Dancox Public Health Intelligence Course – Day 3.
10.1 Estimating with Confidence Chapter 10 Introduction to Inference.
Canadian Bioinformatics Workshops
Confidence Intervals Lecture 2 First ICFA Instrumentation School/Workshop At Morelia, Mexico, November 18-29, 2002 Harrison B. Prosper Florida State University.
Confidence Intervals Cont.
Physics 114: Lecture 13 Probability Tests & Linear Fitting
(5) Notes on the Least Squares Estimate
BAYES and FREQUENTISM: The Return of an Old Controversy
Ex1: Event Generation (Binomial Distribution)
Lecture 4 1 Probability (90 min.)
Statistical NLP: Lecture 4
Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.
Lecture 4 1 Probability Definition, Bayes’ theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests general.
CS639: Data Management for Data Science
Statistics for HEP Roger Barlow Manchester University
Presentation transcript:

Statistical Analysis of Systematic Errors and Small Signals Reinhard Schwienhorst University of Minnesota 10/26/99

Reinhard Schwienhorst 2 Outline Introduction Statistics –Introduction –Philosophies Confidence limit and systematic uncertainty –Typical Problem –What is a 90% confidence limit? –Feldman-Cousins method –Bayesian method Systematic uncertainty –Bayesian –Frequentist –Cousins-Highland Conclusions

Reinhard Schwienhorst 3 Introduction Physics experiments are usually out to –Discover something Find events that cannot be explained by the standard model Find a few events above a background ® Statistics of small numbers –Measure something very precisely Analyze many events in detail Have very good control over the experiment ® Systematic uncertainty

Reinhard Schwienhorst 4 Probability to observe x in a single measurement is P(x|  true ) –  true stands for all parameters that describe the probability distribution function (Poisson: mean, Gaussian: mean and sigma) Bayes theorem: For two sets A and B: P(A|B) P(B) = P(B|A) P(A) – P(A) is the probability to find A (out of all the possible events in the world) –P(B|A) is the probability to find B in the set A Statistics Introduction AB

Reinhard Schwienhorst 5 The limiting relative frequency of a certain outcome: True values can never be determined precisely Includes several assumptions –Experiment is repeatable, paremeters don’t change, each measurement has the same probability,... Statistics Philosophies Subjective: Intuitive definition Included frequentist statement Make statements about the probability of a true value with Bayes theorem: FrequentistBayesian Probability is:

Reinhard Schwienhorst 6 Typical Problem Search for events generated in a non-Standard Model process The number of observed events is given by where: –n found : the found number of events, probabilistic variable –n se : the number of signal events, fixed but unknown –S: the sensitivity, with systematic uncertainty –n bg : the number of background events due to ordinary SM processes, fixed and known The experiment tries to determine if n se  0 Usually a 90% confidence interval is given

Reinhard Schwienhorst 7 What is a 90% Confidence Interval? If I repeat an experiment many times (and create a confidence interval in each experiment), the true value  t will lie inside the interval 90% of the time. Statement about many (hypothetical) experiments We (Physicists) like to argue in Frequentist terms –We try to convince others in Frequentist language If I observe x 0 in a single experiment, 90% of the possible values for the true value  t lie inside the Bayesian interval. Statement about the true value We (Physicists) like to think and feel in Bayesian terms –We form our own opinion with Bayesian intuition FrequentistBayesian

Reinhard Schwienhorst 8 Frequentist philosophy For each possible theoretical value for n se, find the distribution of n found, n bg and S are fixed For each of those distributions, find the 90% acceptance interval –Add n found values to the interval I until P(I|n se )  90% –Use the likelihood ratio P(n found |n se )/P(n found |n se,best ) to decide which values to add next Make a plot of n se vs I (confidence belt) Open the box, take out the experimental result (n obs ) and read the confidence interval off the belt Feldman-Cousins Method

Reinhard Schwienhorst 9 Feldman-Cousins Method Poisson distribution and likelihood ratio * : Poisson distribution,  =3  : distribution of the “best” value  : ratio R Confidence belt

Reinhard Schwienhorst 10 P(n found |n se,S,n bg ) is a Poisson distribution f P (n found ) Likelihood to find n se is Depends on the prior belief in n se, expressed through f 0 (n se ) –Usually f 0 (n se )=constant For large values of n se, the result is identical to the frequentist method Bayesian Solution

Reinhard Schwienhorst 11 What is Systematic Uncertainty? Not an error Not clear, possibilities: aS changes from event to event, due to variation in –pulseheight, position, temperature, … bS changes from experiment to experiment –S determined by MC –Uncertainty due to finite # of MC events cUncertainty about S (?) Not an error Degree of belief in S I think S is 0.8, but I am not sure It expresses how well we understand our detector FrequentistBayesian Example: sensitivity

Reinhard Schwienhorst 12 Solve as before; also integrate over the uncertainty in S: f 0 (n se,S) includes a distribution for S –in our example a Gaussian with mean 0.8 and width 0.1 S is called an influence variable –it affects the result, but we don’t care what it is Bayesian Treatment of Systematic Uncertainty

Reinhard Schwienhorst 13 Problems with the Bayesian Treatment What should the prior f(n e,  S,  S ) be? –The experimenter should decide What is f(S mean,  S |  S,true,  S,true )? –Typically, one sets  S =  S,true and assumes a Gaussian distribution for f(S mean |  S,true,  S,true ) What is the interpretation of f(S mean,  S |  S,true,  S,true )? –Degree of belief –Level of trust in S mean

Reinhard Schwienhorst 14 Frequentist Treatment of Systematic Uncertainty Should we even do this? If we don’t know the sensitivity of our detector, we should –improve our calibration –make more tests –generate and analyze more MC events –not publish our results (?) If we know the sensitivity of our detector, it is not a probabilistic variable and we should use the (known) value.

Reinhard Schwienhorst 15 Treatment of Frequentist Systematic Uncertainty a a Event-to-event variation interpretation: –Use calibration data to get the response variations of detector elements Testbeam Cosmic ray tests Subsets of real data –Put variations in pulseheight, position resolution, … into the MC simulation –Use MC simulation to get the average S –Generate the confidence belt according to

Reinhard Schwienhorst 16 Treatment of Frequentist Systematic Uncertainty b Global variation interpretation –Uncertainty in flux normalization, # of MC events The probability distribution for S is a Gaussian with mean  and width  In the experiment, we obtain one measurement for S and an uncertainty ® Estimate mean and width from the experiment ® Create a multidimensional confidence belt to get confidence intervals for n se and S (covariance matrix) –But we don’t care about a limit for S –quote the worst-case interval for n se –or average (integrate) over S –or ?

Reinhard Schwienhorst 17 Another View of Systematic Uncertainty Conditions change through the run of an experiment –The experiment is practically not “repeatable” Event-to-event uncertainties are usually built into the MC –GEANT, detector simulation Control data is used to estimate the global uncertainty in S –In E872, use  CC events to check the  CC signal –If control data agrees with control MC (based on statistical tests), the systematic uncertainty is at most small uncertainty due to # of control events is not a systematic error generate enough control data to make the uncertainty negligible –But : comparing MC and data is really a matter of belief how do we deal with flux normalization etc.?

Reinhard Schwienhorst 18 Frequentist treatment of the Poisson variable Bayesian treatment of the systematic uncertainty Average (integrate) Then use Frequentist methods to get a confidence interval Cousins-Highland solution

Reinhard Schwienhorst 19 Coverage Frequentist concept For a true value inside my 90% confidence interval: – Generate many experiments according to the probability distribution. –For each experiment, find the confidence interval. –The sum of all probabilities for intervals that contain the true value should be at least 0.9 (the intervals should cover the true value) Simplified: The true value will be covered by the confidence interval in 90% of the experiments trying to measure it. Bayesians don’t care about coverage

Reinhard Schwienhorst 20 Conclusions Systematic errors should be minimized as much as possible If they are non-negligible, they should be handled (interpreted) with caution Different methods exist for dealing with small signals The statistical analysis of small signals is under debate (Frequentist  Bayesian) There will be a workshop at CERN in January ( )