Bayesian Methods in Particle Physics: From Small-N to Large
Harrison B. Prosper, Florida State University
SCMA IV, June 2006
Outline
Measuring Zero
Bayesian Fit
Finding Needles in Haystacks
Summary
Measuring Zero
Measuring Zero – 1
In the mid-1980s, an experiment at the Institut Laue Langevin (Grenoble, France) searched for evidence of neutron-antineutron oscillations, a characteristic prediction of certain Grand Unified Theories.
CRISP Experiment, Institut Laue Langevin
[Figure: apparatus schematic showing the magnetic shield around the neutron gas; field on → count B (oscillations suppressed), field off → count N]
Measuring Zero – 2
Count the number of signal + background events, N. Suppress the putative signal and count the background events, B, independently.
Results: N = 3, B = 7
Measuring Zero – 3
Classic 2-parameter counting experiment:
N ~ Poisson(s + b)
B ~ Poisson(b)
Infer a statement of the form: Pr[s < u(N, B)] ≥ 0.9
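As a minimal numerical sketch (not part of the talk), the joint likelihood of this counting model can be written down directly; the counts N = 3 and B = 7 are taken from the slide above:

```python
from math import exp, factorial

def poisson_pmf(n, mu):
    """Poisson probability mass: mu^n e^(-mu) / n!"""
    return mu ** n * exp(-mu) / factorial(n)

def likelihood(s, b, N=3, B=7):
    """Joint likelihood f(N, B | s, b) = Poisson(N | s+b) * Poisson(B | b)."""
    return poisson_pmf(N, s + b) * poisson_pmf(B, b)
```

At s = 0 the joint likelihood, viewed as a function of b, is proportional to b^(N+B) e^(-2b) and so peaks at b = (N + B)/2 = 5.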
Measuring Zero – 4
In 1984, no exact solution existed in the particle physics literature! Moreover, calculating exact confidence intervals is, according to Kendall and Stuart, “a matter of very considerable difficulty.”
Measuring Zero – 5
Exact in what way? Over some ensemble of statements of the form 0 < s < u(N, B), at least 90% of them should be true, whatever the true values of s and b. Neyman (1937)
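Neyman's criterion can be checked empirically. The sketch below (an illustration, not the experiment's actual procedure) computes a flat-prior Bayesian 90% upper limit for the simpler case of a known background, then estimates its frequentist coverage over an ensemble of pseudo-experiments:

```python
import numpy as np

def upper_limit(N, b, cl=0.9, s_max=50.0, n_grid=2001):
    """90% credible upper limit on s from N ~ Poisson(s+b), flat prior on s,
    background b assumed known (a simplification of the two-count problem)."""
    s = np.linspace(0.0, s_max, n_grid)
    logpost = N * np.log(s + b) - (s + b)       # log posterior up to a constant
    post = np.exp(logpost - logpost.max())
    cdf = np.cumsum(post)
    cdf /= cdf[-1]
    return s[np.searchsorted(cdf, cl)]

def coverage(s_true, b, n_trials=500, seed=1):
    """Fraction of pseudo-experiments in which 0 < s_true < u(N) holds."""
    rng = np.random.default_rng(seed)
    N = rng.poisson(s_true + b, n_trials)
    u = np.array([upper_limit(n, b) for n in N])
    return float(np.mean(u >= s_true))
```

For small true signals this flat-prior limit tends to cover at or above the nominal 90%, illustrating that a Bayesian interval can, but need not, satisfy Neyman's frequentist criterion.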
Measuring Zero – 6
Tried a Bayesian approach:
f(s, b|N) = f(N|s, b) π(s, b) / f(N)
          = f(N|s, b) π(b|s) π(s) / f(N)
Step 1. Compute the marginal likelihood:
f(N|s) = ∫ f(N|s, b) π(b|s) db
Step 2.
f(s|N) = f(N|s) π(s) / ∫ f(N|s) π(s) ds
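A numerical sketch of the two steps, assuming (for illustration only) a flat prior on s and taking π(b|s) = π(b) to be the Gamma(B+1, 1) density implied by the background count B = 7 under a flat prior:

```python
import math
import numpy as np

N, B = 3, 7
rng = np.random.default_rng(2)
b = rng.gamma(B + 1, 1.0, size=20000)            # samples from pi(b)

def marginal_likelihood(s):
    """Step 1: f(N|s) = integral of Poisson(N|s+b) pi(b) db, by Monte Carlo."""
    return float(np.mean((s + b) ** N * np.exp(-(s + b)))) / math.factorial(N)

# Step 2: f(s|N) = f(N|s) pi(s) / normalization, flat pi(s), on a grid
s_grid = np.linspace(0.0, 20.0, 201)
fNs = np.array([marginal_likelihood(s) for s in s_grid])
norm = np.sum(0.5 * (fNs[1:] + fNs[:-1]) * np.diff(s_grid))   # trapezoid rule
post = fNs / norm
```

With N = 3 observed against an expected background of about 8, the posterior density for s peaks at s = 0, as one expects for a null result.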
But is there a signal?
1. Hypothesis testing (J. Neyman): H0: s = 0 versus H1: s > 0
2. p-value (R.A. Fisher): H0: s = 0
3. Decision theory (J.M. Bernardo, 1999): discrepancy, a “distance” between models
Bayesian Fit
Bayesian Fit
Problem: Given counts
Data: N = N1, N2, …, NM
Signal model: A = A1, A2, …, AM
Background model: B = B1, B2, …, BM
where M is the number of bins (or pixels), find the admixture of A and B that best matches the observations N.
Problem (DØ, 2005)
Observations = Background model + Signal model (M)
Bayesian Fit – Details
Assume a model of the form Ni ~ Poisson(a Ai + b Bi), an admixture of the signal and background shapes with scale parameters a and b. Marginalize over a and b.
Bayesian Fit – Pr(Model)
Moreover, one can compute f(N|pa, pb) for different signal models M, in particular for models M that differ by the value of a single parameter. Then compute the probability of model M:
Pr(M|N) = ∫dpa ∫dpb f(N|pa, pb, M) π(pa, pb|M) π(M) / π(N)
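A toy sketch of this model-probability computation, with invented three-bin templates, a single flat-prior scale parameter per model standing in for (pa, pb), and equal prior probabilities π(M):

```python
import numpy as np

def log_marginal(N, template, mu_max=5.0, n_grid=400):
    """log of integral Prod_i Poisson(N_i | mu*t_i) pi(mu) dmu for template t,
    flat prior on the scale mu; terms common to all models are dropped."""
    mu = np.linspace(1e-3, mu_max, n_grid)
    lam = np.outer(mu, template)                      # shape (n_grid, n_bins)
    loglik = np.sum(N * np.log(lam) - lam, axis=1)    # up to sum(log N_i!)
    m = loglik.max()
    return m + np.log(np.sum(np.exp(loglik - m)))     # up to a common grid factor

def model_probabilities(N, templates):
    """Pr(M|N) with equal prior probabilities pi(M)."""
    logZ = np.array([log_marginal(N, t) for t in templates])
    w = np.exp(logZ - logZ.max())
    return w / w.sum()

templates = [np.array([10.0, 5.0, 1.0]), np.array([1.0, 5.0, 10.0])]
N_obs = np.array([20, 10, 2])        # resembles twice the first template
probs = model_probabilities(N_obs, templates)
```

Because the observed counts match the shape of the first template, its marginal likelihood, and hence its model probability, dominates.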
Bayesian Fit – Results (DØ, 1997)
[Figure: P(M|N) versus top quark mass hypothesis (GeV)]
mass = ± 4.5 GeV
signal = 33 ± 8
background = 50.8 ± 8.3
Finding Needles in Haystacks
The Needles
Single top quark events: 0.88 pb and 1.98 pb
The Haystacks
W boson events: 2700 pb
signal : noise = 1 : 1000
The Needles and the Haystacks
Finding Needles – 1
The optimal solution is to compute
p(S|x) = p(x|S) p(S) / [p(x|S) p(S) + p(x|B) p(B)]
Every signal/noise discrimination method is ultimately an algorithm to approximate p(S|x), or a function thereof.
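A direct sketch of this optimal discriminant, using invented unit-width Gaussian class densities for signal and background:

```python
import math

def p_signal(x, p_x_S, p_x_B, prior_S=0.5):
    """Bayes' rule: p(S|x) = p(x|S)p(S) / [p(x|S)p(S) + p(x|B)p(B)]."""
    num = p_x_S(x) * prior_S
    return num / (num + p_x_B(x) * (1.0 - prior_S))

def gauss(mu):
    """Unit-width Gaussian density centered at mu (toy class density)."""
    return lambda x: math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)

p_S = p_signal(0.0, gauss(+1.0), gauss(-1.0))    # = 0.5 by symmetry
```

With signal at +1 and background at -1 and equal priors, p(S|x) rises smoothly from 0 to 1 as x crosses the midpoint.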
Finding Needles – 2
Problem: Given a sample D = (x, y), x = (x1, …, xN), y = (y1, …, yN), of N labeled events, where x are the data and y the labels, find a function f(x, w), with parameters w, that approximates p(S|x):
p(w|x, y) = p(x, y|w) p(w) / p(x, y)
          = p(y|x, w) p(x|w) p(w) / [p(y|x) p(x)]
          = p(y|x, w) p(w) / p(y|x), assuming p(x|w) = p(x)
Finding Needles – 3
Likelihood for classification:
p(y|x, w) = ∏i f(xi, w)^yi [1 − f(xi, w)]^(1−yi)
where yi = 0 for background events and yi = 1 for signal events.
If f(x, w) is flexible enough, then maximizing p(y|x, w) with respect to w yields f = p(S|x), asymptotically.
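A sketch of maximizing this likelihood with the simplest choice of f(x, w), a logistic function of one variable, on invented Gaussian toy data (signal at +1, background at -1). For these densities the exact discriminant is p(S|x) = 1/(1 + e^(-2x)), so the fitted slope should approach 2:

```python
import numpy as np

def fit_logistic(x, y, lr=1.0, n_steps=2000):
    """Maximize p(y|x,w) = prod_i f(x_i,w)^y_i (1-f(x_i,w))^(1-y_i)
    for f(x,w) = sigmoid(w0 + w1*x), by gradient ascent on log p(y|x,w)."""
    w = np.zeros(2)
    X = np.column_stack([np.ones_like(x), x])
    for _ in range(n_steps):
        f = 1.0 / (1.0 + np.exp(-X @ w))
        w += lr * X.T @ (y - f) / len(y)    # gradient of mean log-likelihood
    return w

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(+1, 1, 500), rng.normal(-1, 1, 500)])
y = np.concatenate([np.ones(500), np.zeros(500)])
w = fit_logistic(x, y)
```

A neural network generalizes this by replacing the linear argument of the sigmoid with a flexible nonlinear function, as the next slides describe.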
Finding Needles – 4
However, in a Bayesian calculation it is more natural to average with respect to the posterior density:
f(x|D) = ∫ f(x, w) p(w|D) dw
Questions:
1. Do suitably flexible functions f(x, w) exist?
2. Is there a feasible way to do the integral?
Answer 1: Yes!
[Figure: neural network diagram with inputs x1, x2, hidden-layer parameters (u, a), output parameters (v, b), and output f(x, w)]
A neural network is an example of a Kolmogorov function, that is, a function capable of approximating arbitrary mappings f: R^n → R. The parameters w = (u, a, v, b) are called weights.
Answer 2: Yes!
Computational method: Generate a Markov chain of K points {w} whose stationary density is p(w|D), and average over the stationary part of the chain. Map the problem to that of a “particle” moving in a spatially varying “potential” and use methods of statistical mechanics to generate states (p, w) with probability ∝ exp(−H), where the “Hamiltonian” is H = p^2/2 − log p(w|D), with “momentum” p.
Hybrid Markov Chain Monte Carlo
Computational method, continued: For fixed H, traverse the (p, w) space using Hamilton’s equations, which guarantees that all points consistent with H are visited with equal probability. To allow exploration of states with differing values of H, one periodically introduces random changes to the momentum p.
Software: Flexible Bayesian Modeling by Radford Neal
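A minimal sketch of the algorithm just described (not Neal's software): leapfrog integration of Hamilton's equations at fixed H, with the momentum refreshed randomly each iteration and a Metropolis test to correct for discretization error. It is checked here on a standard normal "posterior":

```python
import numpy as np

def hmc_sample(logp, grad_logp, w0, n_samples=2000, eps=0.1, n_leapfrog=20, seed=4):
    """Hybrid Monte Carlo with H(w, p) = p^2/2 - log p(w|D)."""
    rng = np.random.default_rng(seed)
    w = np.atleast_1d(np.asarray(w0, dtype=float))
    samples = []
    for _ in range(n_samples):
        p = rng.normal(size=w.shape)                  # refresh momentum
        w_new, p_new = w.copy(), p.copy()
        p_new = p_new + 0.5 * eps * grad_logp(w_new)  # leapfrog: half step in p
        for _ in range(n_leapfrog):
            w_new = w_new + eps * p_new
            p_new = p_new + eps * grad_logp(w_new)
        p_new = p_new - 0.5 * eps * grad_logp(w_new)  # trim back to a half step
        dH = (0.5 * p_new @ p_new - logp(w_new)) - (0.5 * p @ p - logp(w))
        if rng.random() < np.exp(min(-float(dH), 0.0)):  # Metropolis accept/reject
            w = w_new
        samples.append(w.copy())
    return np.array(samples)

# check on a standard normal target: log p(w) = -w^2/2
draws = hmc_sample(lambda w: -0.5 * float(w @ w), lambda w: -w, [3.0])
```

Averaging f(x, w) over the post-burn-in draws then approximates the posterior-averaged predictor f(x|D) of the previous slide.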
Example – Finding SUSY!
[Figure: transverse momentum spectra; signal shown as the black curve]
Signal : Noise = 1 : 25,000
Example – Finding SUSY!
Distribution of f(x|D) beyond 0.9, assuming L = 10 fb^-1
[Table: S, B, and S/√B as a function of the cut on f(x|D)]
Signal : Noise = 1 : 20
Summary
Bayesian methods have been at the heart of several important results in particle physics. However, there is considerable room for expanding their domain of application.
A couple of current issues:
Is there a signal? Is the Bernardo approach useful in particle physics?
Fitting: Is there a practical (Bayesian?) method to test whether or not an N-dimensional function fits an N-dimensional swarm of points?