
Perspective of SCMA IV from Particle Physics. Louis Lyons, Particle Physics, Oxford (CDF experiment, Fermilab). SCMA IV, Penn State.




Slide 1: Perspective of SCMA IV from Particle Physics. Louis Lyons, Particle Physics, Oxford (CDF experiment, Fermilab), l.lyons@physics.ox.ac.uk. SCMA IV, Penn State, 15th June 2006.

Slide 2: Topics
- Basic Particle Physics analyses
- Similarities between Particle Physics and Astrophysics issues
- Differences
- What Astrophysicists do particularly well
- What Particle Physicists have learnt
- Conclusions

Slide 3: Particle Physics
- What it is
- Typical experiments
- Typical data
- Typical analysis


Slide 5: Typical Experiments

Experiment    Energy      Beams      # events        Result
LEP           200 GeV     e+ e-      10^7 Z          N_ν = 2.987 ± 0.008
BaBar/Belle   10 GeV      e+ e-      10^8 B anti-B   CP violation
Tevatron      2000 GeV    p anti-p   "10^14"         SUSY?
LHC           14000 GeV   p p        (2007...)       Higgs?
K2K           ~3 GeV      ν_μ        100             ν oscillations


Slide 7: K2K (from KEK to Kamioka): Long-Baseline Neutrino Oscillation Experiment

Slide 8: CDF at Fermilab

Slide 9: Typical Analysis
Parameter determination: dn/dt = (1/τ) exp(-t/τ). Worry about backgrounds, t resolution, and t-dependent efficiency.
1) Reconstruct tracks
2) Select real events
3) Select wanted events
4) Extract t from L and v
5) Model signal and background
6) Likelihood fit for the lifetime and its statistical error
7) Estimate the systematic error
Result: τ ± σ_τ(stat) ± σ_τ(syst)
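The likelihood-fit step above (step 6) can be sketched numerically. This is a minimal toy, not the full CDF analysis: the sample size, true lifetime, and seed are illustrative choices, and backgrounds and resolution are ignored.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Toy unbinned maximum-likelihood fit of a lifetime.
rng = np.random.default_rng(42)
tau_true = 2.0
t = rng.exponential(tau_true, size=10_000)    # measured decay times

def neg_log_likelihood(tau):
    # dn/dt = (1/tau) exp(-t/tau)  =>  -ln L = N ln(tau) + sum(t)/tau
    return len(t) * np.log(tau) + t.sum() / tau

res = minimize_scalar(neg_log_likelihood, bounds=(0.1, 10.0), method="bounded")
tau_hat = res.x                               # MLE; analytically equals mean(t)
sigma_stat = tau_hat / np.sqrt(len(t))        # asymptotic statistical error
```

For the exponential the MLE is just the sample mean of t, so the numerical fit is mainly a check of the machinery.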

Slide 10: Typical Analysis
Hypothesis testing: peak, or statistical fluctuation?

Slide 11: Similarities
- Large data sets (ATLAS: event = Mbyte; total = 10 Pbytes)
- Experimental resolution
- Systematics
- Separating signal from background
- Parameter estimation
- Testing models (versus alternative?)
- Search for signals: setting limits or discovery
- SCMA and PHYSTAT

Slide 12: Differences
- Bayes or Frequentism?
- Background
- Specific astrophysics issues: time dependence, spatial structures, correlations, non-parametric methods, visualisation, cosmic variance
- Blind analyses

Slide 13: Bayesian versus Frequentism

                         Bayesian                                   Frequentist
Basis of method          Bayes' Theorem → posterior                 Uses pdf for data, for fixed parameters
                         probability distribution
Meaning of probability   Degree of belief                           Frequentist definition
Prob of parameters?      Yes                                        Anathema
Needs prior?             Yes                                        No
Choice of interval?      Yes                                        Yes (except F+C)
Data considered          Only data you have                         ... + other possible data
Likelihood principle?    Yes                                        No

Slide 14: Bayesian versus Frequentism (continued)

                          Bayesian                                  Frequentist
Ensemble of experiments   No                                        Yes (but often not explicit)
Final statement           Posterior probability distribution        Parameter values → data is likely
Unphysical/empty ranges   Excluded by prior                         Can occur
Systematics               Integrate over prior                      Extend dimensionality of the frequentist construction
Coverage                  Unimportant                               Built-in
Decision making           Yes (uses cost function)                  Not useful

Slide 15: Bayesianism versus Frequentism
"Bayesians address the question everyone is interested in, by using assumptions no-one believes."
"Frequentists use impeccable logic to deal with an issue of no interest to anyone."

Slide 16: Differences
- Bayes or Frequentism?
- Background
- Specific astrophysics issues: time dependence, spatial structures, correlations, non-parametric methods, visualisation, cosmic variance
- Blind analyses

Slide 17: What Astrophysicists do well
- Glorious pictures
- Sharing data; making data publicly available
- Dealing with large data sets
- Visualisation
- Funding for Astrostatistics
- Statistical software

Slide 18: [Pictures] The Whirlpool Galaxy; the width of the Z0 → 3 light neutrinos.

Slide 19: What Astrophysicists do well
- Glorious pictures
- Sharing data; making data publicly available
- Dealing with large data sets
- Visualisation
- Funding for Astrostatistics
- Statistical software

Slide 20: What Particle Physicists now know
- The Δ(ln L) = 1/2 rule
- Unbinned L_max and goodness of fit
- Prob(data | hypothesis) ≠ Prob(hypothesis | data)
- Comparing 2 hypotheses: Δ(χ²) ≠ χ²
- Bounded parameters: Feldman and Cousins
- Use the correct L (the Punzi effect)
- Blind analyses

Slide 21: The Δln L = -1/2 rule
If L(μ) is Gaussian, the following definitions of σ are equivalent:
1) the RMS of L(μ)
2) 1/√(-d² ln L/dμ²)
3) ln L(μ ± σ) = ln L(μ₀) - 1/2
If L(μ) is non-Gaussian, these are no longer the same.
"Procedure 3) above still gives an interval that contains the true value of the parameter μ with 68% probability."
Heinrich: CDF note 6438 (see the CDF Statistics Committee web page); Barlow: PHYSTAT05.
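Rule 3) can be illustrated numerically: scan ln L and find where it has dropped by 1/2 from its maximum. This sketch uses the exponential lifetime example; the sample size and seed are arbitrary choices, and at this sample size the likelihood is nearly Gaussian, so the interval half-width should be close to the curvature estimate τ̂/√N.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
t = rng.exponential(1.0, size=5_000)
N, s = len(t), t.sum()

def lnL(tau):
    # unbinned lnL for dn/dt = (1/tau) exp(-t/tau)
    return -N * np.log(tau) - s / tau

tau_hat = s / N                         # maximises lnL analytically
target = lnL(tau_hat) - 0.5             # the Delta(lnL) = -1/2 level

tau_lo = brentq(lambda x: lnL(x) - target, 0.5 * tau_hat, tau_hat)
tau_hi = brentq(lambda x: lnL(x) - target, tau_hat, 2.0 * tau_hat)
```

For a strongly non-Gaussian L(μ) the same scan still runs, but the resulting interval no longer matches the other two definitions of σ, which is the point of the slide.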

Slide 22: Coverage
How often does the quoted range for a parameter include the parameter's true value?
N.B. Coverage is a property of the METHOD, not of a particular experimental result.
Coverage can vary with μ. Study the coverage of different methods for a Poisson parameter μ, from the observed number of events n. Hope for: coverage at the nominal value for all μ.

Slide 23: Coverage
If P = α for all μ: "correct coverage".
P < α for some μ: "undercoverage" (this is serious!).
P > α for some μ: "overcoverage" (conservative; loss of rejection power).

Slide 24: Coverage: likelihood approach (not frequentist)
P(n; μ) = e^(-μ) μ^n / n!  (Joel Heinrich, CDF note 6438)
Interval: -2 ln λ ≤ 1, with λ = P(n; μ) / P(n; μ_best).
This UNDERCOVERS.
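Heinrich's undercoverage claim is easy to check by direct simulation: generate many Poisson experiments at a chosen true μ and count how often the -2 ln λ ≤ 1 interval contains it. The true value μ = 3 below is an illustrative choice; the coverage varies with μ.

```python
import numpy as np

def neg2_log_lambda(n, mu):
    # -2 ln[ P(n; mu) / P(n; mu_best) ] with mu_best = n (and 0 ln 0 := 0)
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(n > 0, n * np.log(n / mu), 0.0)
    return 2.0 * (mu - n + term)

rng = np.random.default_rng(0)
mu_true = 3.0
n = rng.poisson(mu_true, size=200_000)
coverage = (neg2_log_lambda(n, mu_true) <= 1.0).mean()
# coverage comes out below the nominal 68.3%: the method undercovers here
```

At μ = 3 only n = 2, 3, 4 satisfy -2 ln λ ≤ 1, and their total Poisson probability is about 0.62, below the nominal 68.3%.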

Slide 25: Frequentist central intervals NEVER undercover (but are conservative at both ends).

Slide 26: Intervals from χ² = (n - μ)²/μ with Δχ² = 0.1 (nominally 24.8% coverage). NOT frequentist: the actual coverage varies between 0% and 100%.

Slide 27: Unbinned L_max and goodness of fit?
Parameters are found by maximising L. So is a larger L better than a smaller L, and does L_max give the goodness of fit??
[Figure: Monte Carlo distribution of the unbinned L_max, with "Great? Good? Bad?" marked along the L_max axis.]

Slide 28: Unbinned L_max and goodness of fit?
Not necessarily. In the pdf p(data; params), the parameters are fixed and the data vary; in the likelihood L(params; data), the data are fixed and the parameters vary.
e.g. p(t|λ) = λ e^(-λt): as a pdf in t, the maximum is at t = 0; as a likelihood in λ, the maximum is at λ = 1/t.

Slide 29: Example 1: L_max and goodness of fit?
Fit an exponential to times t₁, t₂, t₃, ... [Joel Heinrich, CDF 5639]
With L = Π (1/τ) exp(-tᵢ/τ), the maximised likelihood is ln L_max = -N(1 + ln t_av),
i.e. it depends only on the AVERAGE t but is INDEPENDENT OF THE DISTRIBUTION OF t (except for ........). (The average t is a sufficient statistic.)
The variation of L_max in Monte Carlo is due to variations in the samples' average t, but NOT to better or worse fits: samples with the same average t have the same L_max.
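Heinrich's closed form is easy to verify numerically: evaluate the unbinned ln L at its maximum τ̂ = t_av and compare with -N(1 + ln t_av). The sample below is arbitrary; any set of positive times gives the same agreement.

```python
import numpy as np

rng = np.random.default_rng(7)
t = rng.exponential(1.3, size=1_000)       # any sample of times will do
N, t_av = len(t), t.mean()

# lnL(tau) for dn/dt = (1/tau) exp(-t/tau), evaluated at tau_hat = t_av:
lnL_max = np.sum(-np.log(t_av) - t / t_av)

closed_form = -N * (1.0 + np.log(t_av))    # depends on data ONLY via t_av
```

Because only t_av enters, reshuffling the individual tᵢ while keeping their mean fixed cannot change L_max, however badly the reshuffled sample matches an exponential shape.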

Slide 30: Example 2: L_max and goodness of fit?
pdf ∝ 1 + α cos²θ. The pdf (and hence the likelihood) depends only on cos²θᵢ, and so is insensitive to the sign of cos θᵢ.
So the data can be in very bad agreement with the expected distribution (e.g. all the data having cos θ < 0) and L_max does not know about it. An example of a general principle.

Slide 31: Example 3: L_max and goodness of fit?
Fit a Gaussian with variable μ and fixed σ:
ln L_max = N(-0.5 ln 2π - ln σ) - 0.5 Σ(xᵢ - x_av)²/σ² = constant - 0.5 N variance(x)/σ²
i.e. L_max depends only on variance(x), which is not relevant for fitting μ (μ_est = x_av). A smaller-than-expected variance(x) gives a larger L_max, so a worse fit can have a larger L_max and a better fit a lower L_max.

Slide 32: Transformation properties of the pdf and of L
Lifetime example: dn/dt = λ e^(-λt). Change the observable from t to y = √t:
dn/dy = (dn/dt)(dt/dy) = 2yλ exp(-λy²)
So (a) the pdf CHANGES, but (b) ∫ (dn/dt) dt = ∫ (dn/dy) dy over corresponding ranges, i.e. the corresponding integrals of the pdf are INVARIANT.

Slide 33: Now for the likelihood
When the parameter changes from λ to τ = 1/λ:
(a') L does not change: dn/dt = (1/τ) exp(-t/τ), so L(τ; t) = L(λ = 1/τ; t), because identical numbers occur in the evaluations of the two L's.
BUT (b') ∫ L(λ) dλ ≠ ∫ L(τ) dτ over corresponding ranges.
So it is NOT meaningful to integrate L. (However, ........)
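Both (a') and (b') can be seen numerically for a single observed time. The value t = 2 and the integration range are arbitrary illustrative choices; the integrals are crude Riemann sums over the same numerical range in each parametrisation.

```python
import numpy as np

t = 2.0  # one observed decay time (illustrative)

def L_lam(lam):                 # likelihood in the rate parametrisation
    return lam * np.exp(-lam * t)

def L_tau(tau):                 # same likelihood in the lifetime parametrisation
    return (1.0 / tau) * np.exp(-t / tau)

# (a') the VALUE is invariant under lambda -> tau = 1/lambda:
value_ok = np.isclose(L_lam(0.5), L_tau(2.0))

# (b') the INTEGRAL is not: same numerical range, very different numbers
grid = np.linspace(1e-3, 20.0, 200_000)
dx = grid[1] - grid[0]
int_lam = np.sum(L_lam(grid)) * dx      # close to 1/t^2 = 0.25
int_tau = np.sum(L_tau(grid)) * dx      # a quite different number
```

The mismatch in (b') is exactly why "68% of the area under L" is a metric-dependent statement, as the next slide says.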

Slide 34: CONCLUSION: integrating L is NOT a recognised statistical procedure.
[It is metric dependent: a τ range can agree with τ_pred while the corresponding λ range is inconsistent with 1/τ_pred.]
BUT 1) it could be regarded as a "black box"; 2) it can be made respectable via a Bayes posterior: Posterior(λ) ∝ L(λ) × Prior(λ) [and Prior(λ) can be constant].

Slide 35:
                       pdf(t; λ)                                    L(λ; t)
Value of function      Changes when the observable is transformed   INVARIANT under transformation of the parameter
Integral of function   INVARIANT under transformation of the        Changes when the parameter is transformed
                       observable
Conclusion             Maximum probability density is not           Integrating L is not very sensible
                       very sensible

Slide 36: P(Data; Theory) ≠ P(Theory; Data)
Theory = male or female. Data = pregnant or not pregnant.
P(pregnant; female) ~ 3%

Slide 37: P(Data; Theory) ≠ P(Theory; Data)
P(pregnant; female) ~ 3%, but P(female; pregnant) >>> 3%.

Slide 38: P(Data; Theory) ≠ P(Theory; Data)
The Higgs search at CERN: is the data consistent with the Standard Model, or with the Standard Model + Higgs?
End of Sept 2000: the data were not very consistent with the S.M.: Prob(Data; S.M.) < 1%, a valid frequentist statement.
Turned by the press into Prob(S.M.; Data) < 1%, and therefore Prob(Higgs; Data) > 99%, i.e. "it is almost certain that the Higgs has been seen".

Slide 39: p-value ≠ Prob of the hypothesis being correct
Given data and H0 = the null hypothesis, construct a statistic T (e.g. χ²).
p-value = probability{T ≥ t_observed}, assuming H0 is true.
If p = 10⁻³, what is the probability that H0 is true?
e.g. try to identify a μ in a beam (H0: particle = μ) with π contamination. Prob(H0) depends on a) the similarity of the μ and π masses, and b) the relative populations of μ and π:
if N(π) ~ N(μ), prob(H0) ≈ 0.5; if N(π) ≪ N(μ), prob(H0) ~ 1; if N(π) ~ 10 N(μ), prob(H0) ~ 0.1.
i.e. prob(H0) varies with the p-value, but is not equal to it.
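The dependence on the relative populations is a one-line Bayes calculation. The detector-response probabilities below are invented purely for illustration; the point is that the same "p-value-like" data give very different values of P(H0) as the prior populations change.

```python
# Toy Bayes calculation for the mu-vs-pi identification example.
def prob_H0(p_data_mu, p_data_pi, n_mu, n_pi):
    # posterior P(particle = mu | data), priors proportional to populations
    return p_data_mu * n_mu / (p_data_mu * n_mu + p_data_pi * n_pi)

p_mu, p_pi = 0.9, 1e-3        # hypothetical response probabilities

equal_beams = prob_H0(p_mu, p_pi, n_mu=1.0, n_pi=1.0)   # N(pi) ~ N(mu)
pi_flood    = prob_H0(p_mu, p_pi, n_mu=1.0, n_pi=1e4)   # N(pi) >> N(mu)
# Same likelihoods, wildly different P(H0): the p-value alone cannot
# tell you the probability that H0 is true.
```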

Slide 40: p-value ≠ Prob of the hypothesis being correct
From an after-conference-banquet speech: "Of those results that have been quoted as significant at the 99% level, about half have turned out to be wrong!" Supposed to be funny, but in fact it is perfectly OK.

Slide 41: PARADOX
A histogram with 100 bins; fit 1 parameter p. S_min: χ² with NDF = 99 (expected χ² = 99 ± 14).
For our data, S_min(p₀) = 90. Is p₁ acceptable if S(p₁) = 115?
1) YES: it is a very acceptable χ² probability.
2) NO: σ_p is defined by S(p₀ + σ_p) = S_min + 1 = 91, but S(p₁) - S(p₀) = 25, so p₁ is 5σ away from the best value.
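The two viewpoints of the paradox can be put in numbers directly: the absolute goodness of fit at p₁ versus the parameter statement from ΔS.

```python
import numpy as np
from scipy.stats import chi2

ndf = 99
S_min, S_p1 = 90.0, 115.0

# 1) As an absolute goodness of fit, S(p1) = 115 for 99 dof is fine:
p_gof = chi2.sf(S_p1, ndf)        # a perfectly respectable p-value

# 2) As a PARAMETER statement, Delta S = 25 puts p1 far from p0:
n_sigma = np.sqrt(S_p1 - S_min)   # 5 standard deviations
```

The resolution, of course, is that the two numbers answer different questions: "does this model describe the data at all?" versus "is p₁ compatible with the best-fit p₀?".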


Slide 46: Comparing data with different hypotheses

Slide 47: χ² with ν degrees of freedom?
1) ν = (number of data points) - (number of free parameters)? This holds only asymptotically (quite apart from the Poisson → Gaussian approximation). Examples:
a) Fit a flattish histogram with y = N{1 + 10⁻⁶ cos(x - x₀)}, x₀ a free parameter.
b) Neutrino oscillations, with almost degenerate parameters:
y ~ 1 - A sin²(1.27 Δm² L/E)  (2 parameters)
→ 1 - A (1.27 Δm² L/E)²  for small Δm² (effectively 1 parameter)

Slide 48: χ² with ν degrees of freedom?
2) Is the difference in χ² distributed as χ²?
H0 is true; also fit with H1, which has k extra parameters, e.g. look for a Gaussian peak on top of a smooth background:
y = C(x) + A exp{-0.5 ((x - x₀)/σ)²}
Is χ²(H0) - χ²(H1) distributed as χ² with ν = k = 3?
This is relevant for assessing whether an enhancement in the data is just a statistical fluctuation or something more interesting.
N.B. Under H0 (y = C(x)), A = 0 lies on the boundary of the physical region, and x₀ and σ are undefined.

Slide 49: Is the difference in χ² distributed as χ²?
Demortier: H0 = quadratic background; H1 = ......... + a Gaussian of fixed width.
Protassov, van Dyk, Connors, ...: H0 = continuum; (a) H1 = narrow emission line; (b) H1 = wider emission line; (c) H1 = absorption line. Nominal significance level = 5%.

Slide 50: Is the difference in χ² distributed as χ²?
So the Δχ² distribution needs to be determined by Monte Carlo.
N.B. 1) Determining Δχ² for hypothesis H1 when the data are generated according to H0 is not trivial, because there will be lots of local minima.
2) If we are interested in a 5σ significance level, this needs a lot of MC simulations.
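A toy Monte Carlo shows why Δχ² need not follow a χ² distribution. This sketch covers only the simplest boundary effect, a signal amplitude constrained to A ≥ 0 with x₀ and σ held fixed (so one extra parameter, unlike the full three-parameter problem on these slides, where x₀ and σ are also undefined under H0): under H0 the free-fit amplitude estimate is approximately Gaussian, z = Â/σ_A, and the bound pins the fit at A = 0 whenever z < 0.

```python
import numpy as np

rng = np.random.default_rng(3)
z = rng.standard_normal(100_000)   # free-fit amplitude / its error, under H0

# With the physical bound A >= 0 the fit sticks at the boundary for z < 0:
delta_chi2 = np.where(z > 0.0, z * z, 0.0)

frac_at_zero = (delta_chi2 == 0.0).mean()   # ~ 1/2: impossible for a chi2_1
tail = (delta_chi2 > 3.84).mean()           # ~ 2.5%, not the chi2_1 5%
```

Half the toys land exactly at Δχ² = 0, and the tail beyond the nominal χ²₁ 95% point is half its naive size; with the undefined x₀ and σ added, the distortion has no such simple closed form, hence the Monte Carlo.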


Slide 56: X_obs = -2 now gives an upper limit.


Slide 59: Getting L wrong: the Punzi effect
Giovanni Punzi @ PHYSTAT2003, "Comments on L fits with variable resolution".
Separating two close signals when the resolution σ varies event by event and is different for the two signals, e.g.:
1) Signal 1: 1 + cos²θ; Signal 2: isotropic; different parts of the detector give different σ.
2) Mass M (or lifetime τ): different numbers of tracks → different σ_M (or σ_τ).

Slide 60: The Punzi effect
Events are characterised by xᵢ and σᵢ. A events are centred on x = 0; B events are centred on x = 1.
L(f)_wrong = Π [f G(xᵢ, 0, σᵢ) + (1 - f) G(xᵢ, 1, σᵢ)]
L(f)_right = Π [f p(xᵢ, σᵢ; A) + (1 - f) p(xᵢ, σᵢ; B)]
Using p(S, T) = p(S|T) p(T):
p(xᵢ, σᵢ | A) = p(xᵢ | σᵢ, A) p(σᵢ | A) = G(xᵢ, 0, σᵢ) p(σᵢ | A)
So L(f)_right = Π [f G(xᵢ, 0, σᵢ) p(σᵢ | A) + (1 - f) G(xᵢ, 1, σᵢ) p(σᵢ | B)]
If p(σ|A) = p(σ|B), then L_right = L_wrong, but NOT otherwise.

Slide 61: Punzi's Monte Carlo
A: G(x, 0, σ_A); B: G(x, 1, σ_B); f_A = 1/3.

σ_A     σ_B      L_wrong: f_A (σ_f)    L_right: f_A (σ_f)
1.0     1.0      0.336(3)  0.08        same
1.0     1.1      0.374(4)  0.08        0.333(0)
1.0     2.0      0.645(6)  0.12        0.333(0)
1-2     1.5-3    0.514(7)  0.14        0.335(2)  0.03
1.0     1-2      0.482(9)  0.09        0.333(0)

1) L_wrong is OK when p(σ|A) = p(σ|B), but is otherwise BIASED.
2) L_right is unbiased; L_wrong can be biased (enormously)!
3) L_right gives a smaller σ_f than L_wrong.
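One configuration from Punzi's table can be roughly re-run in a few lines. This is a sketch, not his exact study: the seed and sample size are arbitrary, and here p(σ|A) and p(σ|B) are delta functions (all A events have σ = 1, all B events σ = 2), which makes L_right trivial to write down.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# A ~ G(x, 0, 1), B ~ G(x, 1, 2), f_A = 1/3, as in one row of the table.
rng = np.random.default_rng(5)
N, f_true = 20_000, 1.0 / 3.0
nA = rng.binomial(N, f_true)
x = np.concatenate([rng.normal(0.0, 1.0, nA), rng.normal(1.0, 2.0, N - nA)])
s = np.concatenate([np.full(nA, 1.0), np.full(N - nA, 2.0)])

def nll_wrong(f):   # ignores that p(sigma|A) != p(sigma|B)
    return -np.log(f * norm.pdf(x, 0.0, s) + (1 - f) * norm.pdf(x, 1.0, s)).sum()

def nll_right(f):   # includes p(sigma|A), p(sigma|B) (delta functions here)
    return -np.log(f * norm.pdf(x, 0.0, s) * (s == 1.0)
                   + (1 - f) * norm.pdf(x, 1.0, s) * (s == 2.0)).sum()

fit = lambda nll: minimize_scalar(nll, bounds=(1e-4, 1 - 1e-4), method="bounded").x
f_wrong, f_right = fit(nll_wrong), fit(nll_right)
# f_right lands near 1/3; f_wrong is biased far upward, as in Punzi's table.
```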

Slide 62: Explanation of the Punzi bias
A events with σ = 1 centred on x = 0; B events with σ = 2 centred on x = 1.
[Figure: the actual distribution versus the fitting function; N_A/N_B is variable, but the same for A and B events.]
The fit gives an upward bias for N_A/N_B because (i) that is much better for the A events, and (ii) it does not hurt too much for the B events.

Slide 63: Another scenario for the Punzi problem: particle identification (PID)
Mass from time-of-flight for π and K: originally the positions of the peaks are constant, but the K peak approaches the π peak at large momentum.
Punzi case: σᵢ variable, (σᵢ)_A ≠ (σᵢ)_B. PID case: σᵢ ~ constant, p_K ≠ p_π.
COMMON FEATURE: separation/error ≠ constant. Where else??
MORAL: beware of event-by-event variables whose pdfs do not appear in L.

Slide 64: Avoiding the Punzi bias
- Include p(σ|A) and p(σ|B) in the fit. (But then, for example, particle identification may be determined more by the momentum distribution than by the PID variable.)
OR
- Fit each range of σᵢ separately and add the results: Σ(N_A)ᵢ → (N_A)_total, and similarly for B. The incorrect method using L_wrong takes a weighted average of the (f_A)ⱼ, assumed to be independent of j.
Talk by Catastini at PHYSTAT05.

Slide 65: BLIND ANALYSES
Why blind analysis? Methods of blinding:
- Study the procedure with simulation only
- Add a random number to the result*
- Look at only a first fraction of the data
- Keep the signal box closed
- Keep the MC parameters hidden
- Keep the fraction visible for each bin hidden
After the analysis is unblinded, ........
* Luis Alvarez's suggestion re the "discovery" of free quarks
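The "add a random number to the result" method from the list above can be sketched in a few lines. The offset scale of 5.0 is an arbitrary illustrative choice; in practice the offset (or its seed) is kept where the analysers cannot see it until all analysis choices are frozen.

```python
import numpy as np

# Blinding by a hidden additive offset.
rng = np.random.default_rng()            # seed deliberately not recorded
_hidden_offset = rng.normal(0.0, 5.0)    # kept secret until unblinding

def blind(result):
    """What the analyser is allowed to look at while tuning the analysis."""
    return result + _hidden_offset

def unblind(blinded_result):
    """Applied once, after the analysis is frozen."""
    return blinded_result - _hidden_offset
```

Because the offset is purely additive, statistical and systematic error estimates can be finalised on the blinded value and carry over unchanged at unblinding.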

Slide 66: Conclusions
Common problems, with scope for learning from each other: large data sets, separating signal, testing models, signal/discovery.
Targetted workshops, e.g. SAMSI: Jan-May 2006; Banff: July 2006 (limits with nuisance parameters; significance tests; signal-background separation).
Summer schools: Spain, FNAL, ......
Thanks to Steffen Lauritzen, Lance Miller, Subir Sarkar, Roberto Trotta, .........

Slide 67: An excellent conference: a very big THANK YOU to Jogesh and Eric!




