# Luca Stanco – INFN - Padova, 2 December 2010: Combining p-values, i.e. what happens to SIGNIFICANCE when the next event comes?


Combining p-values, i.e. what happens to the SIGNIFICANCE when the next event comes? There are two ways, a Frequentist way and a Bayesian way: 1) difficult, correct; 2) easy, approximate.

I assume that everybody knows what p-values and the H0/H1 hypotheses are (otherwise please refer to e.g. http://pdg.lbl.gov/2010/reviews/rpp2010-rev-statistics.pdf ). As a shortcut:

- p-value = probability of the less probable region under the H0 hypothesis
- 1-p = significance of the H1 hypothesis (power 1-β, where β is the probability of a type II error), only in the case of 1 random variable!

First way


Exercise: suppose the 2nd event has a p-value similar to that of the first one. The combination then gives 2.98 sigmas. Of course, with the FISHER rule we neglect any correlation! Moreover, the rule is somewhat wrong when the 2 p-values are quite different: p1 = 0.1, p2 = 0.0001 → pTOT = 0.00012 > p2.
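The numbers in this exercise can be checked with a minimal sketch of the FISHER rule for two independent p-values. The closed form used here is the standard one for two p-values (the tail of a chi-square with 4 degrees of freedom); the function names, the two-sided sigma convention, and the single-event p-value of 0.0178 are assumptions made for illustration and are not stated on the slide.

```python
import math
from statistics import NormalDist


def fisher_combine_two(p1: float, p2: float) -> float:
    """Combine two independent p-values with Fisher's rule.

    Under H0, -2*ln(p1*p2) follows a chi-square distribution with 4 degrees
    of freedom, whose tail probability reduces to p1*p2*(1 - ln(p1*p2)).
    """
    prod = p1 * p2
    return prod * (1.0 - math.log(prod))


def two_sided_sigmas(p: float) -> float:
    """Convert a p-value to Gaussian sigmas (two-sided convention, assumed)."""
    return NormalDist().inv_cdf(1.0 - p / 2.0)


# Case quoted on the slide: two quite different p-values.
p_tot = fisher_combine_two(0.1, 0.0001)
print(f"pTOT = {p_tot:.6f}")  # larger than p2 = 0.0001

# Hypothetical input: two events with the same single-event p-value.
p_single = 0.0178  # assumed value, not given on this slide
p_pair = fisher_combine_two(p_single, p_single)
print(f"combined p = {p_pair:.5f}, {two_sided_sigmas(p_pair):.2f} sigmas")
```

With these assumed inputs the combined p-value for the unequal pair comes out slightly above p2, which is the behaviour the slide criticises, and the equal pair lands close to the quoted 2.98 sigmas.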

It turns out that the FISHER rule is too conservative in the case of two independent Poissonians, the lowest limiting p-value being: [equation missing from the transcript]. In the simplest case of no correlation, with 2 candidates as before, the result is 3.39 sigmas. BUT the final result should be even greater, since that probability is: [equation missing from the transcript]. This is a simple demonstration that the FISHER rule is CONSERVATIVE and not so good for discrete cases.

WHY is the Bayesian way “difficult”? If we simulate 1 million pseudo-experiments for 1 candidate, then for 2 candidates we should a priori simulate (1 million)^2 = 10^12!! Some tricks may be applied:

- Integrating the likelihood over a “normal domain” (simply connected)
- Computing 1-p
- Decoupling variables as much as possible (this is formally correct)

Then a multivariate likelihood computation is affordable.

In the example of the simplest OPERA case the correct result is: 3.60 sigmas.

Values shown in the slide's table: 98.22%, 1.77%, 0.01% (listed twice); 96.452%, 1.739%, 1.742%, 0.018%, 0.031%, 0%. Error due to the limited number of pseudo-experiments.
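As a rough cross-check of the percentages quoted above, the sketch below assumes (this reading is not stated in the transcript) that 98.22%, 1.77% and 0.01% are the single-candidate probabilities of three outcomes, and that the two candidates are fully independent, so each joint probability is simply a product of two of these values. The variable names are illustrative.

```python
# Single-candidate probabilities quoted on the slide, read here (an assumption)
# as the probabilities of three mutually exclusive outcomes.
marginal = [0.9822, 0.0177, 0.0001]

# For fully independent candidates the joint probability factorises.
joint = [[a * b for b in marginal] for a in marginal]

# Print the 3x3 joint table in percent.
for row in joint:
    print("  ".join(f"{100 * x:8.4f}%" for x in row))

# Compare the largest cell with the simulated value quoted on the slide.
quoted_00 = 0.96452
print(f"difference on the first cell: {100 * abs(joint[0][0] - quoted_00):.3f}%")
```

Under these assumptions the products agree with the quoted 96.452%, 1.739%/1.742% and 0.031% to within a few hundredths of a percent, which is in line with the slide's remark about the limited number of pseudo-experiments.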


Backup

Feldman-Cousins has “no meaning” in the case of few events (<5) and more than 1 random variable. Junk's method (Modified Frequentist Technique, arXiv:hep-ex/9902006v1, 5 Feb 1999) may be used instead; it is valid only for fully independent searches. For example, it is used by D0 for the Higgs search, but:

- CDF uses Bayes
- the two methods agree within 10% on the single channel and 1% overall
- Tevatron decided to release the official result based on the CDF/Bayes analysis.
