
Slide 1: Bayesian Statistics in Analysis

Harrison B. Prosper, Florida State University
Workshop on Top Physics: from the TeVatron to the LHC
Grenoble, October 19, 2007

Slide 2: Outline

- Introduction
- Inference
- Model Selection
- Summary

Slide 3: Introduction

Blaise Pascal (1670), Thomas Bayes (1763), Pierre Simon de Laplace (1812)

Slide 4: Introduction

Let P(A) and P(B) be probabilities assigned to statements, or events, A and B, and let P(AB) be the probability assigned to the joint statement AB. Then the conditional probability of A given B is defined by

P(A|B) = P(AB) / P(B)

P(A) is the probability of A without the restriction specified by B; P(A|B) is the probability of A when we restrict to the conditions specified by statement B.

Slide 5: Introduction

From P(AB) = P(A|B) P(B) = P(B|A) P(A) we deduce immediately Bayes' theorem:

P(A|B) = P(B|A) P(A) / P(B)

Bayesian statistics is the application of Bayes' theorem to problems of inference.
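As a concrete numerical illustration (the numbers here are my own, not from the talk), here is Bayes' theorem applied to a toy event-selection question: given that an event passes a cut, what is the probability that it is signal?

```python
# Toy illustration (all numbers are assumptions, not from the talk):
# A = "event is signal", B = "event passes the cut".
p_signal = 0.01           # prior P(A): 1% of events are signal
p_pass_given_sig = 0.90   # P(B|A)
p_pass_given_bkg = 0.05   # P(B|not A)

# Total probability P(B), then Bayes' theorem for P(A|B).
p_pass = p_pass_given_sig * p_signal + p_pass_given_bkg * (1 - p_signal)
p_sig_given_pass = p_pass_given_sig * p_signal / p_pass
print(p_sig_given_pass)   # ~0.154: the cut raises P(signal) from 1% to ~15%
```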

Slide 6: Inference

Slide 7: Inference

The Bayesian approach to inference is conceptually simple and always the same: compute Pr(Data|Model), then compute

Pr(Model|Data) = Pr(Data|Model) Pr(Model) / Pr(Data)

- Pr(Model) is called the prior. It is the probability assigned to the Model irrespective of the Data.
- Pr(Data|Model) is called the likelihood.
- Pr(Model|Data) is called the posterior probability.

Slide 8: Inference

In practice, inference is done using the continuous form of Bayes' theorem:

p(θ|D) = ∫ p(D|θ, ω) π(θ, ω) dω / p(D)

where θ are the parameters of interest and ω denotes all other parameters of the problem, which are referred to as nuisance parameters. The integral over ω is the marginalization; p(D|θ, ω) is the likelihood, π(θ, ω) the prior density, and p(θ|D) the posterior density.
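A minimal numerical sketch of this recipe (the model and all numbers are my own toy choices): the posterior for the parameter of interest θ is obtained by integrating the likelihood times the prior over the nuisance parameter ω on a grid.

```python
# Toy sketch of continuous Bayes with marginalization. Assumed model:
# a measurement y = theta + omega + noise, Gaussian likelihood, Gaussian
# prior on the nuisance offset omega, flat prior on theta.
import numpy as np

y, sigma_y = 5.0, 1.0                          # observed value and resolution
theta = np.linspace(0.0, 10.0, 201)[:, None]   # parameter of interest (grid)
omega = np.linspace(-3.0, 3.0, 121)[None, :]   # nuisance offset (grid)

likelihood = np.exp(-0.5 * ((y - theta - omega) / sigma_y) ** 2)
prior_omega = np.exp(-0.5 * omega ** 2)        # omega ~ N(0, 1)

# Marginalize over omega, then normalize over theta.
posterior = np.trapz(likelihood * prior_omega, x=omega.ravel(), axis=1)
posterior /= np.trapz(posterior, x=theta.ravel())
print(theta.ravel()[np.argmax(posterior)])     # posterior mode, ~5.0
```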

Slide 9: Example – 1

Model: the datum is a count N, with likelihood

P(N|s, b) = (s + b)^N exp[−(s + b)] / N!

where s is the mean signal count and b is the mean background count.

Task: infer s, given N and prior information.

Slide 10: Example – 1

Apply Bayes' theorem:

p(s, b|N) = P(N|s, b) π(s, b) / P(N)

π(s, b) is the prior density for s and b, which encodes our prior knowledge of the signal and background means. The encoding is often difficult and can be controversial.

Slide 11: Example – 1

First factor the prior: π(s, b) = π(b|s) π(s). Then define the marginal likelihood

P(N|s) = ∫ P(N|s, b) π(b|s) db

and write the posterior density for the signal as

p(s|N) = P(N|s) π(s) / P(N)
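A small numerical sketch of this example (the prior for b and all numbers are my assumptions): compute the marginal likelihood P(N|s) on a grid of s by integrating over b, then normalize under a flat prior π(s).

```python
# Toy version of Example 1 (all numbers assumed): posterior for the signal
# mean s given a count N, marginalizing the background mean b over its prior.
import numpy as np
from scipy.stats import poisson, gamma

N = 10                                  # observed count
s_grid = np.linspace(0.0, 20.0, 201)
b_grid = np.linspace(0.01, 30.0, 300)

# Assumed prior for b: a gamma density (of the kind derived on the next
# slides from a Monte Carlo estimate); here simply b ~ Gamma(shape=5, scale=1).
prior_b = gamma.pdf(b_grid, a=5.0, scale=1.0)

# Marginal likelihood P(N|s) = integral of Poisson(N|s+b) * pi(b) db
like = poisson.pmf(N, s_grid[:, None] + b_grid[None, :])
marginal = np.trapz(like * prior_b, x=b_grid, axis=1)

posterior = marginal / np.trapz(marginal, x=s_grid)   # flat prior pi(s)
print(s_grid[np.argmax(posterior)])                   # posterior mode for s
```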

Slide 12: The Background Prior Density (Example – 1)

Suppose that the background has been estimated from a Monte Carlo simulation of the background process, yielding B events that pass the cuts. Assume that the probability for the count B is given by P(B|λ) = Poisson(B, λ), where λ is the (unknown) mean count of the Monte Carlo sample. We can infer the value of λ by applying Bayes' theorem to the Monte Carlo background experiment:

p(λ|B) = P(B|λ) π(λ) / P(B)

Slide 13: The Background Prior Density (Example – 1)

Assuming a flat prior π(λ) = constant, we find p(λ|B) = Gamma(λ; 1, B+1) = λ^B exp(−λ) / B!.

Often the mean background count b in the real experiment is related to the mean count λ in the Monte Carlo experiment linearly, b = kλ, where k is an accurately known scale factor, for example, the ratio of the data to Monte Carlo integrated luminosities. The background can then be estimated as follows.

Slide 14: The Background Prior Density (Example – 1)

The posterior density p(λ|B) now serves as the prior density for the background b in the real experiment: π(b) db = p(λ|B) dλ with b = kλ. We can therefore write

π(b) = p(λ = b/k | B) / k = (b/k)^B exp(−b/k) / (k B!)

and use this prior in the marginal likelihood P(N|s) = ∫ P(N|s, b) π(b) db.

Slide 15: Example – 1

The calculation of the marginal likelihood, with the gamma prior above, yields the finite sum

P(N|s) = exp(−s) (k+1)^−(B+1) Σ_{r=0}^{N} (s^r / r!) C(N−r+B, B) [k/(k+1)]^{N−r}

where C(n, m) is the binomial coefficient.
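As a sanity check on the sum above, here is a sketch (with illustrative numbers of my own choosing) that compares the closed form against direct numerical integration over b:

```python
# Sketch: check the closed-form marginal likelihood P(N|s) against direct
# numerical integration of Poisson(N|s+b) times the gamma prior over b.
import numpy as np
from math import comb, exp, factorial
from scipy import integrate

def marginal_likelihood_sum(s, N, B, k):
    """Closed-form P(N|s): finite sum from integrating Poisson(N|s+b)
    against the prior pi(b) = (b/k)^B exp(-b/k) / (k B!)."""
    pref = exp(-s) * (k + 1.0) ** (-(B + 1))
    total = sum((s**r / factorial(r)) * comb(N - r + B, B)
                * (k / (k + 1.0)) ** (N - r) for r in range(N + 1))
    return pref * total

def marginal_likelihood_numeric(s, N, B, k):
    """Direct integration of Poisson(N|s+b) * pi(b) over b."""
    def integrand(b):
        poisson = (s + b) ** N * np.exp(-(s + b)) / factorial(N)
        prior = (b / k) ** B * np.exp(-b / k) / (k * factorial(B))
        return poisson * prior
    val, _ = integrate.quad(integrand, 0.0, np.inf)
    return val

N, B, k = 7, 12, 0.5   # illustrative numbers only
for s in (0.0, 2.0, 5.0):
    print(s, marginal_likelihood_sum(s, N, B, k),
             marginal_likelihood_numeric(s, N, B, k))
```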

Slide 16: Example – 2: Top Mass – Run I

The data d are partitioned into K bins and modeled by a sum of N sources of strengths p; the numbers A are the source distributions for model M. Each model M corresponds to a different top signal + background model. Schematically, as on the earlier slides,

p(M|d) ∝ p(d|M) π(M), with likelihood p(d|M) marginalized over the source strengths p.

Slide 17: Example – 2: Top Mass – Run I

[Figure: probability of model M, P(M|d), versus the top quark mass hypothesis, 130 to 230 GeV/c².]

m_top = 173.5 ± 4.5 GeV/c²
s = 33 ± 8 events
b = 50.8 ± 8.3 events
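A sketch of how such a plot is turned into an estimate (this is an assumption about the procedure, not the Run I analysis code; the grid and the stand-in log-evidences are invented): normalize the per-mass-model evidences into probabilities P(M|d), then summarize with the posterior mean and standard deviation.

```python
# Sketch (assumed procedure, toy numbers): from a log-evidence ln p(d|M)
# per top-mass hypothesis M, form normalized model probabilities P(M|d)
# and a mass estimate.
import numpy as np

masses = np.arange(130.0, 231.0, 5.0)                 # GeV/c^2, hypothetical grid
log_evidence = -0.5 * ((masses - 173.5) / 9.0) ** 2   # stand-in numbers

w = np.exp(log_evidence - log_evidence.max())   # stabilize before exponentiating
prob = w / w.sum()                               # P(M|d), assuming equal priors pi(M)

m_hat = np.sum(prob * masses)                    # posterior mean mass
m_err = np.sqrt(np.sum(prob * (masses - m_hat) ** 2))
print(f"m_top = {m_hat:.1f} +/- {m_err:.1f} GeV/c^2")
```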

Slide 18: To Bin Or Not To Bin

Binned – Pros
- Likelihood can be modeled accurately
- Bins with low counts can be handled exactly
- Statistical uncertainties handled exactly

Binned – Cons
- Information loss can be severe
- Suffers from the curse of dimensionality

Slide 19: To Bin Or Not To Bin

December 8, 2006: binned likelihoods do work!

Slide 20: To Bin Or Not To Bin

Un-binned – Pros
- No loss of information (in principle)

Un-binned – Cons
- It can be difficult to model the likelihood accurately; this requires fitting (either parametric or KDE)
- The error in the likelihood grows approximately linearly with the sample size, so at the LHC large sample sizes could become an issue

Slide 21: Un-binned Likelihood Functions

Start with the standard binned likelihood over K bins:

L = Π_{i=1}^{K} (s_i + b_i)^{n_i} exp[−(s_i + b_i)] / n_i!

where n_i is the observed count, and s_i and b_i are the expected signal and background, in bin i.

Slide 22: Un-binned Likelihood Functions

Make the bins smaller and smaller; in the limit of infinitesimal bins the likelihood becomes

L(σ) = exp[−(Aσ + B)] Π_{i=1}^{K} [σ a(x_i) + b(x_i)]

where K is now the number of events, a(x) and b(x) are the effective luminosity and background densities, respectively, and A and B are their integrals.
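Here is a minimal sketch of evaluating this un-binned log-likelihood (the densities a(x), b(x) and all numbers are toy assumptions, not any experiment's model):

```python
# Sketch: evaluate the un-binned (marked Poisson) log-likelihood
#   ln L(sigma) = -(A*sigma + B) + sum_i ln[sigma*a(x_i) + b(x_i)]
import numpy as np

def log_likelihood(sigma, x, a, b, A, B):
    """a(x), b(x): effective-luminosity and background densities;
    A, B: their integrals over the observable range."""
    return -(A * sigma + B) + np.sum(np.log(sigma * a(x) + b(x)))

# Toy model on x in [0, 1]: flat acceptance, linearly falling background.
a = lambda x: 10.0 * np.ones_like(x)   # integral A = 10 (events per unit sigma)
b = lambda x: 40.0 * (1.0 - x)         # integral B = 20 expected background events
A, B = 10.0, 20.0

rng = np.random.default_rng(1)
x_obs = rng.uniform(0.0, 1.0, size=25)   # stand-ins for the observed marks x_i

for sigma in (0.0, 0.5, 1.0, 2.0):
    print(sigma, log_likelihood(sigma, x_obs, a, b, A, B))
```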

Slide 23: Un-binned Likelihood Functions

The un-binned likelihood function is an example of a marked Poisson likelihood: each event is marked by the discriminating variable x_i, which could be multi-dimensional. The various methods for measuring the top cross section and mass differ in the choice of discriminating variables x.

Slide 24: Un-binned Likelihood Functions

Note: since the functions a(x) and b(x) have to be modeled, they will depend on sets of modeling parameters α and β, respectively. Therefore, in general, the un-binned likelihood function is

L(σ, α, β) = exp[−(A(α)σ + B(β))] Π_{i=1}^{K} [σ a(x_i|α) + b(x_i|β)]

which must be combined with a prior density π(σ, α, β) to compute the posterior density for the cross section:

p(σ|data) ∝ ∫ L(σ, α, β) π(σ, α, β) dα dβ

Slide 25: Computing the Un-binned Likelihood Function

If we write s(x) = a(x)σ and S = Aσ, we can re-write the un-binned likelihood function as

L = exp[−(S + B)] Π_{i=1}^{K} [s(x_i) + b(x_i)]

Since a likelihood function is defined only to within a scaling by a parameter-independent quantity, we are free to scale it by, for example, the observed distribution d(x):

L ∝ exp[−(S + B)] Π_{i=1}^{K} [s(x_i) + b(x_i)] / d(x_i)

Slide 26: Computing the Un-binned Likelihood Function

One way to approximate the ratio [s(x) + b(x)] / d(x) is with a neural network trained on an admixture of data, signal, and background in the ratio 2:1:1. If the training can be done accurately enough, the network will approximate

n(x) = [s(x) + b(x)] / [s(x) + b(x) + d(x)]

in which case we can then write

[s(x) + b(x)] / d(x) = n(x) / [1 − n(x)]
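A sketch of using the network output this way (the function n(x) below is a fake stand-in for a trained network, and the numbers are invented):

```python
# Sketch: build the scaled un-binned log-likelihood from a classifier
# output via [s(x)+b(x)]/d(x) = n(x)/(1 - n(x)).
import numpy as np

def n(x):
    """Stand-in for the trained network output approximating
    (s+b)/(s+b+d), for a toy problem where (s(x)+b(x))/d(x) = 1 + 0.5*x."""
    r = 1.0 + 0.5 * x
    return r / (1.0 + r)

def log_likelihood_from_network(x_obs, S, B):
    ratio = n(x_obs) / (1.0 - n(x_obs))   # recovers [s(x)+b(x)]/d(x)
    return -(S + B) + np.sum(np.log(ratio))

rng = np.random.default_rng(0)
x_obs = rng.uniform(0.0, 1.0, size=50)    # stand-ins for the observed events
print(log_likelihood_from_network(x_obs, S=12.0, B=40.0))
```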

Slide 27: Model Selection

Slide 28: Model Selection

Model selection can also be addressed using Bayes' theorem. It requires computing the posterior

p(M|D) = p(D|M) π(M) / p(D)

where the evidence for model M is defined by

p(D|M) = ∫ p(D|θ_M, M) π(θ_M|M) dθ_M

Slide 29: Model Selection

Writing the posterior odds as the Bayes factor times the prior odds,

p(M|D) / p(N|D) = B_MN × π(M) / π(N), where B_MN = p(D|M) / p(D|N),

the Bayes factor B_MN, or any one-to-one function thereof, can be used to choose between two competing models M and N, e.g., signal + background versus background only. However, one must be careful to use proper priors.

Slide 30: Model Selection – Example

Consider the following two prototypical models for an observed count N, with s and b known:

Model 1: P(N|1) = Poisson(N, s + b)  (signal + background)
Model 2: P(N|2) = Poisson(N, b)   (background only)

The Bayes factor for these models is given by

B_12 = P(N|1) / P(N|2) = exp(−s) (1 + s/b)^N
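A two-line sketch of this Bayes factor in code (the counts and expectations below are illustrative only):

```python
# Sketch: Bayes factor for the two prototypical counting models with
# known s and b, B12 = exp(-s) * (1 + s/b)**N (see above).
from math import exp, log

def bayes_factor(N, s, b):
    return exp(-s) * (1.0 + s / b) ** N

# e.g. 25 observed events, expected s = 10 on b = 15:
N, s, b = 25, 10.0, 15.0
B12 = bayes_factor(N, s, b)
print(B12, log(B12))   # B12 > 1 favors the signal+background model
```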

Slide 31: Model Selection – Example: Calibration of Bayes Factors

Consider the Kullback-Leibler divergence

k(2||1) = ∫ p(D|2) ln [p(D|2) / p(D|1)] dD

For the simple Poisson models with known signal and background, it is easy to show that

k(2||1) = s − b ln(1 + s/b)

For s << b, we get k(2||1) ≈ s²/(2b), so that √(2 k(2||1)) ≈ s/√b. That is, roughly speaking, for s << b, √(2 ln B_12) ≈ s/√b, the familiar measure of significance.
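A quick numerical check of this calibration (my own numbers; ln B_12 is evaluated at N = s + b, its mean under model 1):

```python
# Sketch: check sqrt(2*k(2||1)) ~ s/sqrt(b) and sqrt(2*ln B12) ~ s/sqrt(b)
# in the small-signal regime.
from math import log, sqrt

def kl_2_vs_1(s, b):
    """k(2||1) = s - b*ln(1 + s/b) for the two Poisson models."""
    return s - b * log(1.0 + s / b)

s, b = 3.0, 100.0                        # s << b regime
k = kl_2_vs_1(s, b)
lnB12 = (s + b) * log(1.0 + s / b) - s   # ln B12 at N = s + b
print(sqrt(2 * k), sqrt(2 * lnB12), s / sqrt(b))   # all close to 0.3
```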

Slide 32: Summary

Bayesian statistics is a well-founded and general framework for thinking about and solving analysis problems, including:

- Analysis design
- Modeling uncertainty
- Parameter estimation
- Interval estimation (limit setting)
- Model selection
- Signal/background discrimination, etc.

It is well worth learning how to think this way!

