1 Practical Statistics for Physicists CERN Summer Students July 2014 Louis Lyons Imperial College and Oxford CMS expt at LHC



2 Topics 1) Introduction 2) Bayes and Frequentism 3) χ² and Likelihoods 4) Higgs: Example of Search for New Physics. Time for discussion

3 Introductory remarks What is Statistics? Probability and Statistics Why errors? Random and systematic errors Combining errors Combining experiments Binomial, Poisson and Gaussian distributions

4 What do we do with Statistics?
Parameter Determination (best value and range), e.g. Mass of Higgs = 80 ± 2
Goodness of Fit: Does data agree with our theory?
Hypothesis Testing: Does data prefer Theory 1 to Theory 2?
(Decision Making: What experiment shall I do next?)
Why bother? HEP is expensive and time-consuming, so it is worth investing effort in statistical analysis → better information from data

5 Probability and Statistics
Example: Dice
THEORY → DATA: Given P(5) = 1/6, what is P(20 5’s in 100 trials)? If unbiassed, what is P(n evens in 100 trials)?
DATA → THEORY: Given 20 5’s in 100 trials, what is P(5)? And its error? Given 60 evens in 100 trials, is it unbiassed? Or is P(evens) = 2/3?

6 Probability and Statistics
Example: Dice
Given P(5) = 1/6, what is P(20 5’s in 100 trials)?
Given 20 5’s in 100 trials, what is P(5)? And its error? [Parameter Determination]
If unbiassed, what is P(n evens in 100 trials)?
Given 60 evens in 100 trials, is it unbiassed? [Goodness of Fit]
Or is P(evens) = 2/3? [Hypothesis Testing]
N.B. Parameter values not sensible if goodness of fit is poor/bad

7 Why do we need errors? Affects conclusion about our result e.g. Result / Theory = If ± 0.050, data compatible with theory If ± 0.005, data incompatible with theory If ± 0.7, need better experiment Historical experiment at Harwell testing General Relativity

8 Random + Systematic Errors
Random/Statistical: Limited accuracy, Poisson counts. Spread of answers on repetition (method of estimating)
Systematics: May cause shift, but not spread
e.g. Pendulum: $g = 4\pi^2 L/\tau^2$, $\tau = T/n$
Statistical errors: T, L
Systematics: T, L
Calibrate: Systematic → Statistical
More systematics: Formula for undamped, small amplitude, rigid, simple pendulum
Might want to correct to g at sea level: Different correction formulae
Ratio of g at different locations: Possible systematics might cancel. Correlations relevant

9 Presenting result
Quote result as $g \pm \sigma_{stat} \pm \sigma_{syst}$
Or combine errors in quadrature → $g \pm \sigma$
Other extreme: Show all systematic contributions separately
Useful for assessing correlations with other measurements
Needed for: using improved outside information, combining results, using measurements to calculate something else.

10 Combining errors
$z = x - y$
$\delta z = \delta x - \delta y$ [1]
Why $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$? [2]
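One way to see why [2] holds rather than a linear sum, sketched under the assumption (not spelled out in the slide text) that the errors on x and y are uncorrelated and average to zero over repeated measurements. Angle brackets denote that average.

```latex
\begin{align}
\delta z &= \delta x - \delta y \\
\sigma_z^2 \equiv \langle (\delta z)^2 \rangle
  &= \langle (\delta x)^2 \rangle + \langle (\delta y)^2 \rangle
     - 2\,\langle \delta x\,\delta y \rangle \\
  &= \sigma_x^2 + \sigma_y^2
  \qquad \text{if } \langle \delta x\,\delta y \rangle = 0
\end{align}
```

The cross term is the only place the sign in $z = x - y$ appears, so the same result holds for $z = x + y$. If x and y are correlated, the cross term does not vanish and must be kept.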

11 Rules for different functions
1) Linear: $z = k_1 x_1 + k_2 x_2 + \dots$
$\sigma_z = k_1\sigma_1 \;\&\; k_2\sigma_2$, where & means “combine in quadrature”
N.B. Fractional errors NOT relevant
e.g. z = x - y, with z = your height, x = position of head wrt moon, y = position of feet wrt moon
x and y measured to 0.1%: z could be -30 miles

12 Rules for different functions
2) Products and quotients: $z = x^{\alpha} y^{\beta} \dots$
$\sigma_z/z = \alpha\,\sigma_x/x \;\&\; \beta\,\sigma_y/y$
Useful for $x^2$, $xy$, $x/\sqrt{y}$, ...

13 3) Anything else: $z = z(x_1, x_2, \dots)$
$\sigma_z = (\partial z/\partial x_1)\,\sigma_1 \;\&\; (\partial z/\partial x_2)\,\sigma_2 \;\&\; \dots$
OR numerically:
$z_0 = z(x_1, x_2, x_3, \dots)$
$z_1 = z(x_1 + \sigma_1, x_2, x_3, \dots)$
$z_2 = z(x_1, x_2 + \sigma_2, x_3, \dots)$
$\sigma_z = (z_1 - z_0) \;\&\; (z_2 - z_0) \;\&\; \dots$
N.B. All formulae approximate (except 1)): they assume small errors
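The numerical recipe can be sketched in a few lines of Python. The function name `propagate` and the pendulum numbers are illustrative assumptions, not taken from the lecture. The inputs are assumed uncorrelated, as in the quadrature rule.

```python
import math

def propagate(f, values, sigmas):
    """Shift each input by its sigma in turn and combine the shifts in quadrature.

    Assumes uncorrelated inputs and small errors (hypothetical helper).
    """
    z0 = f(*values)
    shifts = []
    for i, sigma in enumerate(sigmas):
        shifted = list(values)
        shifted[i] += sigma              # z_i = z(..., x_i + sigma_i, ...)
        shifts.append(f(*shifted) - z0)  # z_i - z_0
    return z0, math.sqrt(sum(d * d for d in shifts))

def g_pendulum(L, tau):
    """g = 4 pi^2 L / tau^2 for a simple pendulum of length L and period tau."""
    return 4 * math.pi**2 * L / tau**2

# Illustrative (made-up) measurements: L = 1.000 +- 0.001 m, tau = 2.007 +- 0.002 s
g, sigma_g = propagate(g_pendulum, [1.000, 2.007], [0.001, 0.002])
print(f"g = {g:.3f} +- {sigma_g:.3f} m/s^2")
```

For this product-and-quotient form the result can be cross-checked against rule 2): the fractional error on g should be close to the fractional error on L combined in quadrature with twice the fractional error on τ.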

14 N.B. Better to combine data!
BEWARE: combining 100 ± 10 with 1 ± 1 gives 2 ± 1? Or 50.5 ± 5?
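The two candidate answers can be reproduced with a short sketch. It assumes that the combination meant here is the standard inverse-variance weighted average, since the transcript does not show the formula. The function names are hypothetical.

```python
import math

def weighted_average(values, sigmas):
    """Inverse-variance weighted mean and its error (assumes independent results)."""
    weights = [1.0 / s**2 for s in sigmas]
    total_weight = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total_weight
    return mean, 1.0 / math.sqrt(total_weight)

def unweighted_average(values, sigmas):
    """Plain mean, with the individual errors propagated in quadrature."""
    n = len(values)
    return sum(values) / n, math.sqrt(sum(s**2 for s in sigmas)) / n

values, sigmas = [100.0, 1.0], [10.0, 1.0]
w_mean, w_sigma = weighted_average(values, sigmas)
u_mean, u_sigma = unweighted_average(values, sigmas)
print(f"weighted:   {w_mean:.2f} +- {w_sigma:.2f}")
print(f"unweighted: {u_mean:.2f} +- {u_sigma:.2f}")
```

The weighted average comes out near 2 ± 1 and the unweighted one near 50.5 ± 5. The two inputs differ by about ten standard deviations, so neither number should be trusted without first asking whether the measurements are consistent.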

15 Difference between averaging and adding
Isolated island with conservative inhabitants. How many married people?
Number of married men = 100 ± 5 K
Number of married women = 80 ± 30 K
Total = 180 ± 30 K
CONTRAST
Wtd average = 99 ± 5 K, Total = 198 ± 10 K
GENERAL POINT: Adding (uncontroversial) theoretical input can improve precision of answer
Compare “kinematic fitting”
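The island numbers can be checked with the sketch below. It assumes that the theoretical input is "number of married men = number of married women", so the two counts are measurements of the same quantity. Variable names are illustrative.

```python
import math

men, sigma_men = 100.0, 5.0        # thousands
women, sigma_women = 80.0, 30.0    # thousands

# Adding: no theoretical input, errors combined in quadrature
total_added = men + women
sigma_added = math.hypot(sigma_men, sigma_women)

# Averaging: impose men == women, take the inverse-variance weighted mean,
# then the total is twice that mean
w_men, w_women = 1.0 / sigma_men**2, 1.0 / sigma_women**2
average = (w_men * men + w_women * women) / (w_men + w_women)
sigma_average = 1.0 / math.sqrt(w_men + w_women)
total_constrained = 2.0 * average
sigma_constrained = 2.0 * sigma_average

print(f"added:       {total_added:.0f} +- {sigma_added:.0f} K")
print(f"constrained: {total_constrained:.0f} +- {sigma_constrained:.0f} K")
```

The constrained total is about three times more precise because the poorly measured count is effectively replaced by the well measured one.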

16 Binomial Distribution
Fixed N independent trials, each with same prob of success p. What is prob of s successes?
e.g. Throw dice 100 times. Success = ‘6’. What is prob of 0, 1, ..., 49, 50, 51, ..., 99, 100 successes?
Efficiency of track reconstruction = 98%. For 500 tracks, prob that 490, 491, ..., 500 reconstructed.
Angular distribution is cos θ? Prob of 52/70 events with cos θ > 0? (More interesting is statistics question)

17 $P_s = \frac{N!}{(N-s)!\,s!}\,p^s (1-p)^{N-s}$, as is obvious
Expected number of successes = $\sum s P_s = Np$, as is obvious
Variance of no. of successes = $Np(1-p)$
Variance ~ Np for p ~ 0; ~ N(1-p) for p ~ 1
NOT Np in general. NOT s ± √s
e.g. 100 trials, 99 successes, NOT 99 ± 10
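A quick numerical check of the mean and variance, as a minimal sketch. The dice example (N = 100, p = 1/6) follows the earlier slide, and `binomial_pmf` is a hypothetical helper name.

```python
import math

def binomial_pmf(s, N, p):
    """P_s = N! / ((N-s)! s!) * p^s * (1-p)^(N-s)"""
    return math.comb(N, s) * p**s * (1 - p)**(N - s)

N, p = 100, 1 / 6   # 100 throws of a die, success = '6'
probs = [binomial_pmf(s, N, p) for s in range(N + 1)]

mean = sum(s * P for s, P in enumerate(probs))
variance = sum((s - mean)**2 * P for s, P in enumerate(probs))
print(f"mean = {mean:.4f}, Np = {N * p:.4f}")
print(f"variance = {variance:.4f}, Np(1-p) = {N * p * (1 - p):.4f}")

# The slide's warning: for p close to 1 the spread is far smaller than sqrt(s)
spread_99 = math.sqrt(100 * 0.99 * 0.01)
print(f"p = 0.99, N = 100: spread = {spread_99:.2f}, not 10")
```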

18 Statistics: Estimate p and $\sigma_p$ from s (and N)
$p = s/N$
$\sigma_p^2 = \frac{1}{N}\,\frac{s}{N}\left(1 - \frac{s}{N}\right)$
If s = 0, p = 0 ± 0? If s = N, p = 1.0 ± 0?
Limiting cases:
● p = const, N → ∞: Binomial → Gaussian, μ = Np, σ² = Np(1-p)
● N → ∞, p → 0, Np = const: Binomial → Poisson, μ = Np, σ² = Np
{N.B. Gaussian continuous and extends to -∞}
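The estimate and the edge cases the slide questions can be seen in a minimal sketch (`estimate_p` is a hypothetical name; the 20-in-100 input is the dice example from the earlier slide):

```python
import math

def estimate_p(s, N):
    """Naive estimate p = s/N with sigma_p^2 = (1/N) (s/N) (1 - s/N)."""
    p_hat = s / N
    sigma_p = math.sqrt(p_hat * (1 - p_hat) / N)
    return p_hat, sigma_p

for s in (20, 0, 100):
    p_hat, sigma_p = estimate_p(s, 100)
    print(f"s = {s:3d}, N = 100: p = {p_hat:.2f} +- {sigma_p:.2f}")
```

When every trial fails or every trial succeeds, the formula returns an error of exactly zero. That is the reason for the question marks on the slide: the estimated error comes from the estimated p, and it is unreliable at the extremes.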

19 Binomial Distributions

20 Poisson Distribution
Prob of n independent events occurring in time t when rate is r (constant)
e.g. events in bin of histogram
NOT radioactive decay for t ~ τ
Limit of Binomial (N → ∞, p → 0, Np → μ)
$P_n = e^{-rt}(rt)^n/n! = e^{-\mu}\mu^n/n!$ (μ = rt)
$\langle n \rangle = rt = \mu$ (No surprise!)
$\sigma_n^2 = \mu$, “n ± √n”. BEWARE 0 ± 0?
μ → ∞: Poisson → Gaussian, with mean = μ, variance = μ
Important for χ²
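The mean, the variance and the binomial limit can be checked numerically. This is a minimal sketch: μ = 3 and the values of N are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math

def poisson_pmf(n, mu):
    """P_n = exp(-mu) * mu^n / n!"""
    return math.exp(-mu) * mu**n / math.factorial(n)

def binomial_pmf(s, N, p):
    """P_s = N! / ((N-s)! s!) * p^s * (1-p)^(N-s)"""
    return math.comb(N, s) * p**s * (1 - p)**(N - s)

mu = 3.0  # illustrative value

# Mean and variance, summing far enough out that the neglected tail is negligible
ns = range(60)
mean = sum(n * poisson_pmf(n, mu) for n in ns)
variance = sum((n - mean)**2 * poisson_pmf(n, mu) for n in ns)
print(f"mean = {mean:.6f}, variance = {variance:.6f}")

# Binomial -> Poisson as N grows with Np = mu held fixed
max_diffs = []
for N in (10, 100, 10000):
    p = mu / N
    diff = max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, mu)) for n in range(11))
    max_diffs.append(diff)
    print(f"N = {N:5d}: largest |binomial - Poisson| for n <= 10 is {diff:.2e}")
```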

21 For your thought
Poisson: $P_n = e^{-\mu}\mu^n/n!$
$P_0 = e^{-\mu}$, $P_1 = \mu e^{-\mu}$, $P_2 = (\mu^2/2)\,e^{-\mu}$
For small μ, $P_1 \sim \mu$, $P_2 \sim \mu^2/2$
If probability of 1 rare event ~ μ, why isn’t probability of 2 events ~ μ²?

22 Approximately Gaussian Poisson Distributions

23 Gaussian or Normal
Significance of σ:
i) RMS of Gaussian = σ (hence factor of 2 in definition of Gaussian)
ii) At x = μ ± σ, $y = y_{max}/\sqrt{e} \approx 0.606\,y_{max}$ (i.e. σ = half-width at ‘half’-height)
iii) Fractional area within μ ± σ = 68%
iv) Height at max = $1/(\sigma\sqrt{2\pi})$
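Properties ii) to iv) can be verified with the standard library alone. This is a minimal sketch: the function names are hypothetical, and the area uses the standard relation between the Gaussian integral and the error function.

```python
import math

def gaussian(x, mu, sigma):
    """Normalised Gaussian: exp(-(x-mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))"""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def area_within(k):
    """Fractional area within mu +- k sigma, via the error function."""
    return math.erf(k / math.sqrt(2))

mu, sigma = 0.0, 1.0
ratio = gaussian(mu + sigma, mu, sigma) / gaussian(mu, mu, sigma)
print(f"y(mu + sigma) / y_max = {ratio:.3f}")               # property ii)
print(f"area within 1 sigma   = {area_within(1):.4f}")      # property iii)
print(f"height at max         = {gaussian(mu, mu, sigma):.4f}")  # property iv)

# Two-sided tail areas, the quantity that matters for goodness of fit
for k in (1, 2, 3):
    print(f"fraction outside +-{k} sigma: {1 - area_within(k):.4f}")
```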

24 Area in tail(s) of Gaussian

25 Relevant for Goodness of Fit
