Uniovi1 Some distributions

Distribution/pdf    Example use in HEP
Binomial            Branching ratio
Multinomial         Histogram with fixed N
Poisson             Number of events found
Uniform             Monte Carlo method
Exponential         Decay time
Gaussian            Measurement error
Chi-square          Goodness-of-fit
Cauchy              Mass of resonance
Landau              Ionization energy loss

Uniovi2 Binomial distribution
Consider N independent experiments (Bernoulli trials): the outcome of each is 'success' or 'failure', and the probability of success on any given trial is p.
Define the discrete r.v. n = number of successes (0 ≤ n ≤ N).
The probability of a specific outcome (in order), e.g. 'ssfsf', is
$$p \cdot p \cdot (1-p) \cdot p \cdot (1-p) = p^{n} (1-p)^{N-n}.$$
But order is not important; there are
$$\frac{N!}{n!\,(N-n)!}$$
ways (permutations) to get n successes in N trials, and the total probability for n is the sum of the probabilities for each permutation.

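Uniovi3 Binomial distribution (2)
The binomial distribution is therefore
$$f(n; N, p) = \frac{N!}{n!\,(N-n)!}\, p^{n} (1-p)^{N-n},$$
where n is the random variable and N, p are the parameters.
For the expectation value and variance we find:
$$E[n] = \sum_{n=0}^{N} n\, f(n; N, p) = Np, \qquad V[n] = E[n^2] - (E[n])^2 = Np(1-p).$$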
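The mean and variance quoted above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names and the values N = 10, p = 0.3 are illustrative assumptions.

```python
from math import comb

def binomial_pmf(n: int, N: int, p: float) -> float:
    """Probability of n successes in N Bernoulli trials with success probability p."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)

def binomial_moments(N: int, p: float):
    """Return (total probability, mean, variance) by direct summation over n."""
    probs = [binomial_pmf(n, N, p) for n in range(N + 1)]
    total = sum(probs)
    mean = sum(n * f for n, f in enumerate(probs))
    var = sum((n - mean) ** 2 * f for n, f in enumerate(probs))
    return total, mean, var

# Illustrative (assumed) parameter values
N, p = 10, 0.3
total, mean, var = binomial_moments(N, p)
print("sum of probabilities:", total)
print("mean:", mean, "expected Np:", N * p)
print("variance:", var, "expected Np(1-p):", N * p * (1 - p))
```

The direct sums should agree with Np and Np(1 − p) up to floating-point rounding.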

Uniovi4 Binomial distribution (3)
[Figure: binomial distribution for several values of the parameters.]
Example: observe N decays of W±; the number n of which are W → μν is a binomial r.v., with p = branching ratio.

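Uniovi5 Multinomial distribution
Like the binomial, but now with m outcomes instead of two; the probabilities are
$$\vec{p} = (p_1, \ldots, p_m), \qquad \sum_{i=1}^{m} p_i = 1.$$
For N trials we want the probability to obtain: n_1 of outcome 1, n_2 of outcome 2, ..., n_m of outcome m.
This is the multinomial distribution for $\vec{n} = (n_1, \ldots, n_m)$:
$$f(\vec{n}; N, \vec{p}) = \frac{N!}{n_1!\, n_2! \cdots n_m!}\; p_1^{n_1} p_2^{n_2} \cdots p_m^{n_m}.$$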

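Uniovi6 Multinomial distribution (2)
Now consider outcome i as 'success' and all others as 'failure'.
→ all n_i are individually binomial with parameters N, p_i:
$$E[n_i] = N p_i, \qquad V[n_i] = N p_i (1 - p_i) \quad \text{for all } i.$$
One can also find the covariance to be
$$V_{ij} = \mathrm{cov}[n_i, n_j] = N p_i (\delta_{ij} - p_j), \qquad \delta_{ij} = 1 \text{ for } i = j, \quad \delta_{ij} = 0 \text{ for } i \neq j.$$
Example: $\vec{n}$ represents a histogram with m bins, N total entries, all entries independent.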
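The multinomial covariance cov[n_i, n_j] = N p_i (δ_ij − p_j) can be verified by brute-force enumeration for a small case. This is a sketch under assumed values (N = 5, three bins with probabilities 0.2, 0.3, 0.5); the helper names are hypothetical.

```python
from itertools import product
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Probability of observing the given bin counts for bin probabilities probs."""
    coeff = factorial(sum(counts))
    for n_i in counts:
        coeff //= factorial(n_i)  # stays an exact integer at every step
    return coeff * prod(p**n for p, n in zip(probs, counts))

def multinomial_cov(N, probs, i, j):
    """cov[n_i, n_j] computed by summing over all count vectors with total N."""
    e_i = e_j = e_ij = 0.0
    for counts in product(range(N + 1), repeat=len(probs)):
        if sum(counts) != N:
            continue
        w = multinomial_pmf(counts, probs)
        e_i += counts[i] * w
        e_j += counts[j] * w
        e_ij += counts[i] * counts[j] * w
    return e_ij - e_i * e_j

# Illustrative (assumed) histogram: 3 bins, 5 entries
N, probs = 5, (0.2, 0.3, 0.5)
cov_01 = multinomial_cov(N, probs, 0, 1)
var_0 = multinomial_cov(N, probs, 0, 0)
print("cov[n_0, n_1]:", cov_01, "expected:", -N * probs[0] * probs[1])
print("V[n_0]:", var_0, "expected:", N * probs[0] * (1 - probs[0]))
```

The negative off-diagonal covariance reflects the fixed total N: an entry that lands in one bin cannot land in another.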

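Uniovi7 Poisson distribution
Consider binomial n in the limit
$$N \to \infty, \qquad p \to 0, \qquad E[n] = Np \to \nu.$$
→ n follows the Poisson distribution:
$$f(n; \nu) = \frac{\nu^{n}}{n!}\, e^{-\nu} \quad (n \ge 0), \qquad E[n] = \nu, \quad V[n] = \nu.$$
Example: number of scattering events n with cross section σ found for a fixed integrated luminosity, with
$$\nu = \sigma \int L\, dt.$$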
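The limit can be illustrated numerically by holding ν = Np fixed and increasing N. The sketch below is an illustration only; ν = 2 and the chosen values of N are assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(n, N, p):
    """Binomial probability of n successes in N trials."""
    return comb(N, n) * p**n * (1 - p) ** (N - n)

def poisson_pmf(n, nu):
    """Poisson probability of n events for mean nu."""
    return nu**n * exp(-nu) / factorial(n)

def max_abs_diff(N, nu, n_max=10):
    """Largest |binomial - Poisson| over n = 0..n_max, with p = nu / N."""
    p = nu / N
    return max(abs(binomial_pmf(n, N, p) - poisson_pmf(n, nu))
               for n in range(n_max + 1))

nu = 2.0  # assumed mean
for N in (10, 100, 10000):
    print(N, max_abs_diff(N, nu))
```

The printed differences should shrink roughly in proportion to 1/N.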

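Uniovi8 Uniform distribution
Consider a continuous r.v. x with −∞ < x < ∞. The uniform pdf is:
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha \le x \le \beta \\ 0 & \text{otherwise} \end{cases} \qquad E[x] = \tfrac{1}{2}(\alpha + \beta), \quad V[x] = \tfrac{1}{12}(\beta - \alpha)^2.$$
N.B. For any r.v. x with cumulative distribution F(x), y = F(x) is uniform in [0, 1].
Example: for π⁰ → γγ, E_γ is uniform in [E_min, E_max], with
$$E_{\min} = \tfrac{1}{2} E_{\pi} (1 - \beta), \qquad E_{\max} = \tfrac{1}{2} E_{\pi} (1 + \beta),$$
where β here denotes the pion velocity v/c, not the parameter of the uniform pdf.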

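Uniovi9 Exponential distribution
The exponential pdf for the continuous r.v. x is defined by:
$$f(x; \xi) = \begin{cases} \dfrac{1}{\xi}\, e^{-x/\xi} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad E[x] = \xi, \quad V[x] = \xi^2.$$
Example: proper decay time t of an unstable particle,
$$f(t; \tau) = \frac{1}{\tau}\, e^{-t/\tau} \qquad (\tau = \text{mean lifetime}).$$
Lack of memory (unique to exponential):
$$f(t - t_0 \mid t \ge t_0) = f(t).$$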

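Uniovi10 Gaussian distribution
The Gaussian (normal) pdf for a continuous r.v. x is defined by:
$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{(x - \mu)^2}{2\sigma^2} \right), \qquad E[x] = \mu, \quad V[x] = \sigma^2.$$
(N.B. often μ, σ² denote the mean and variance of any r.v., not only Gaussian.)
Special case: μ = 0, σ² = 1 ('standard Gaussian'):
$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}, \qquad \Phi(x) = \int_{-\infty}^{x} \varphi(x')\, dx'.$$
If y ~ Gaussian with μ, σ², then x = (y − μ)/σ follows φ(x).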

Uniovi11 Gaussian pdf and the Central Limit Theorem
The Gaussian pdf is so useful because almost any random variable that is a sum of a large number of small contributions follows it. This follows from the Central Limit Theorem:
For n independent r.v.s x_i with finite variances σ_i², otherwise arbitrary pdfs, consider the sum
$$y = \sum_{i=1}^{n} x_i.$$
In the limit n → ∞, y is a Gaussian r.v. with
$$E[y] = \sum_{i=1}^{n} \mu_i, \qquad V[y] = \sum_{i=1}^{n} \sigma_i^2.$$
Measurement errors are often the sum of many contributions, so frequently measured values can be treated as Gaussian r.v.s.
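A quick simulation can illustrate the Central Limit Theorem. The sketch below sums n = 12 uniform random numbers (an assumed choice, for which the sum has mean n/2 = 6 and variance n/12 = 1) and compares the sample with what a Gaussian would give; the seed and sample size are arbitrary.

```python
import random
import statistics

def sum_of_uniforms(n, rng):
    """Sum of n independent values uniform in [0, 1)."""
    return sum(rng.random() for _ in range(n))

rng = random.Random(12345)  # fixed seed for reproducibility
n = 12                      # number of terms in each sum (assumed)
samples = [sum_of_uniforms(n, rng) for _ in range(20000)]

mean = statistics.fmean(samples)
var = statistics.variance(samples)
sigma = (n / 12) ** 0.5
# Fraction of sums within one standard deviation of the mean;
# for an exact Gaussian this would be about 0.683.
inside = sum(abs(y - n / 2) < sigma for y in samples) / len(samples)

print("mean:", mean, "expected:", n / 2)
print("variance:", var, "expected:", n / 12)
print("fraction within 1 sigma:", inside)
```

Each uniform term has variance 1/12, so no single term dominates the sum, which is the condition under which the Gaussian approximation works well at finite n.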

Uniovi12 Central Limit Theorem (2)
The CLT can be proved using characteristic functions (Fourier transforms); see, e.g., SDA Chapter 10.
For finite n, the theorem is approximately valid to the extent that the fluctuation of the sum is not dominated by one (or few) terms.
Beware of measurement errors with non-Gaussian tails.
Good example: velocity component v_x of air molecules.
OK example: total deflection due to multiple Coulomb scattering. (Rare large-angle deflections give a non-Gaussian tail.)
Bad example: energy loss of a charged particle traversing a thin gas layer. (Rare collisions make up a large fraction of the energy loss, cf. Landau pdf.)

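Uniovi13 Multivariate Gaussian distribution
Multivariate Gaussian pdf for the vector $\vec{x} = (x_1, \ldots, x_n)$:
$$f(\vec{x}; \vec{\mu}, V) = \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \exp\!\left[ -\tfrac{1}{2} (\vec{x} - \vec{\mu})^{T} V^{-1} (\vec{x} - \vec{\mu}) \right],$$
where $\vec{x}$, $\vec{\mu}$ are column vectors, $\vec{x}^{T}$, $\vec{\mu}^{T}$ are transpose (row) vectors, and
$$E[x_i] = \mu_i, \qquad \mathrm{cov}[x_i, x_j] = V_{ij}.$$
For n = 2 this is
$$f(x_1, x_2; \mu_1, \mu_2, \sigma_1, \sigma_2, \rho) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\!\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left( \frac{x_1 - \mu_1}{\sigma_1} \right)^{2} + \left( \frac{x_2 - \mu_2}{\sigma_2} \right)^{2} - 2\rho \left( \frac{x_1 - \mu_1}{\sigma_1} \right) \left( \frac{x_2 - \mu_2}{\sigma_2} \right) \right] \right\},$$
where ρ = cov[x_1, x_2]/(σ_1 σ_2) is the correlation coefficient.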

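Uniovi14 Chi-square (χ²) distribution
The chi-square pdf for the continuous r.v. z (z ≥ 0) is defined by
$$f(z; n) = \frac{1}{2^{n/2}\, \Gamma(n/2)}\, z^{n/2 - 1} e^{-z/2}, \qquad E[z] = n, \quad V[z] = 2n,$$
where n = 1, 2, ... is the number of 'degrees of freedom' (dof).
For independent Gaussian x_i, i = 1, ..., n, with means μ_i and variances σ_i²,
$$z = \sum_{i=1}^{n} \frac{(x_i - \mu_i)^2}{\sigma_i^2}$$
follows the χ² pdf with n dof.
Example: goodness-of-fit test variable, especially in conjunction with the method of least squares.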
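The statement that a sum of squared, standardized Gaussian variables follows a χ² pdf with n dof can be checked by simulation through its mean (n) and variance (2n). The means, standard deviations, seed and sample size below are made-up illustrative values.

```python
import random
import statistics

def chi2_sample(mus, sigmas, rng):
    """One value of z = sum_i (x_i - mu_i)^2 / sigma_i^2 for Gaussian x_i."""
    return sum(((rng.gauss(mu, s) - mu) / s) ** 2 for mu, s in zip(mus, sigmas))

rng = random.Random(7)                # fixed seed for reproducibility
mus = [0.0, 1.0, -2.0, 5.0, 10.0]     # assumed means
sigmas = [1.0, 0.5, 2.0, 0.1, 3.0]    # assumed standard deviations
n = len(mus)                          # degrees of freedom

zs = [chi2_sample(mus, sigmas, rng) for _ in range(20000)]
z_mean = statistics.fmean(zs)
z_var = statistics.variance(zs)

print("mean of z:", z_mean, "expected n:", n)
print("variance of z:", z_var, "expected 2n:", 2 * n)
```

The result does not depend on the individual μ_i and σ_i, only on how many terms enter the sum.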

Fits and ndof
Suppose we have a set of N independent measurements x_i, assumed to be unbiased measurements of the same unknown quantity μ with a common, but unknown, variance σ². Then
$$\hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \hat{\sigma}^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \hat{\mu})^2$$
are efficient estimators of μ and σ² if the x_i are Gaussian.
Least squares
Consider a set of N independent measurements y_i at known points x_i. The measurement y_i is assumed to be Gaussian distributed with mean F(x_i; θ) and known variance σ_i². The goal is to construct estimators for the unknown parameters θ. The set of parameters that maximizes the likelihood L is the same as the set that minimizes χ²:
$$\chi^2(\theta) = -2 \ln L(\theta) + \text{constant} = \sum_{i=1}^{N} \frac{\left( y_i - F(x_i; \theta) \right)^2}{\sigma_i^2}.$$
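As a concrete instance of least squares, the sketch below fits a straight line F(x; a, b) = a + b x by minimizing χ² in closed form. The choice of a straight line, the function name `fit_line`, and the data points are assumptions made for illustration.

```python
def fit_line(xs, ys, sigmas):
    """Weighted least-squares fit of y = a + b*x.

    Minimizes chi2 = sum_i (y_i - a - b*x_i)^2 / sigma_i^2 analytically
    and returns (a, b, chi2_min).
    """
    w = [1.0 / s**2 for s in sigmas]
    S = sum(w)
    Sx = sum(wi * x for wi, x in zip(w, xs))
    Sy = sum(wi * y for wi, y in zip(w, ys))
    Sxx = sum(wi * x * x for wi, x in zip(w, xs))
    Sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    D = S * Sxx - Sx**2
    a = (Sxx * Sy - Sx * Sxy) / D
    b = (S * Sxy - Sx * Sy) / D
    chi2 = sum(wi * (y - a - b * x) ** 2 for wi, x, y in zip(w, xs, ys))
    return a, b, chi2

# Made-up measurements with a common uncertainty
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
sigmas = [0.2] * len(xs)

a, b, chi2 = fit_line(xs, ys, sigmas)
ndof = len(xs) - 2  # N measurements minus 2 fitted parameters
print("a =", a, "b =", b, "chi2 =", chi2, "ndof =", ndof)
```

Here ndof is the number of measurements minus the number of fitted parameters, which is the quantity against which the minimized χ² is compared.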

Fits and ndof (2)
The χ² test provides a measure of the significance of a discrepancy between the data and the hypothesized functional form F used in the fit.
One expects in a "reasonable" experiment to obtain χ² ≈ ndof. Hence the quantity χ²/ndof is sometimes reported; however, one must report ndof as well in order to determine the p-value (the probability of observing a test statistic at least as extreme as the one obtained).
[Figure: axes p and n; annotation 'Statistically significant'.]
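The p-value for a given χ² and ndof is the upper-tail probability of the χ² pdf. The sketch below computes it with the standard library only, using the recurrence Q(a + 1, y) = Q(a, y) + y^a e^(−y) / Γ(a + 1) for the regularized upper incomplete gamma function; the function name `chi2_pvalue` is hypothetical, and in practice a statistics library would normally be used instead.

```python
from math import erfc, exp, gamma, sqrt

def chi2_pvalue(chi2: float, ndof: int) -> float:
    """Probability that a chi-square variable with ndof dof exceeds chi2."""
    if chi2 < 0 or ndof < 1:
        raise ValueError("need chi2 >= 0 and ndof >= 1")
    y = chi2 / 2.0
    if ndof % 2 == 0:
        a, q = 1.0, exp(-y)         # starting value Q(1, y)
    else:
        a, q = 0.5, erfc(sqrt(y))   # starting value Q(1/2, y)
    # Step a up to ndof/2 using Q(a+1, y) = Q(a, y) + y^a e^{-y} / Gamma(a+1)
    while a < ndof / 2.0:
        q += y**a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q

# Same chi2/ndof = 2, very different p-values (illustrative numbers)
for chi2, ndof in ((2.0, 1), (20.0, 10), (200.0, 100)):
    print(chi2, ndof, chi2_pvalue(chi2, ndof))
```

The loop at the end shows why χ²/ndof alone is not enough: the same ratio corresponds to very different p-values as ndof grows.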

Uniovi17 Cauchy (Breit-Wigner) distribution
The Breit-Wigner pdf for the continuous r.v. x is defined by
$$f(x; \Gamma, x_0) = \frac{1}{\pi}\, \frac{\Gamma/2}{\Gamma^2/4 + (x - x_0)^2}.$$
(Γ = 2, x_0 = 0 is the Cauchy pdf.)
E[x] not well defined, V[x] → ∞.
x_0 = mode (most probable value)
Γ = full width at half maximum
Example: mass of resonance particle, e.g. ρ, K*, φ⁰, ...
Γ = decay rate (inverse of mean lifetime)
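Breit-Wigner values can be generated by inverting the cumulative distribution F(x) = 1/2 + (1/π) arctan(2(x − x_0)/Γ), which gives x = x_0 + (Γ/2) tan(π(r − 1/2)) for r uniform in [0, 1]. The sketch below is illustrative; the values of x_0 and Γ are assumed, not taken from the slides. Because Γ is the full width at half maximum, half of the generated values should fall within x_0 ± Γ/2.

```python
import math
import random
import statistics

def sample_breit_wigner(x0, width, rng):
    """Draw one Breit-Wigner value by inverting the cumulative distribution."""
    r = rng.random()
    return x0 + 0.5 * width * math.tan(math.pi * (r - 0.5))

rng = random.Random(99)   # fixed seed for reproducibility
x0, width = 0.77, 0.15    # assumed mode and full width (arbitrary units)
xs = [sample_breit_wigner(x0, width, rng) for _ in range(20000)]

median = statistics.median(xs)
frac_core = sum(abs(x - x0) < 0.5 * width for x in xs) / len(xs)
print("median:", median, "expected:", x0)
print("fraction within x0 +/- width/2:", frac_core, "expected: 0.5")
```

The median is used rather than the sample mean because E[x] is not well defined for this pdf, so the sample mean does not converge.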

Uniovi18 Landau distribution
For a charged particle with β = v/c traversing a layer of matter of thickness d, the energy loss Δ follows the Landau pdf.
See L. Landau, J. Phys. USSR 8 (1944) 201; see also W. Allison and J. Cobb, Ann. Rev. Nucl. Part. Sci. 30 (1980) 253.
[Figure: particle with velocity β losing energy Δ in a layer of thickness d.]

Uniovi19 Landau distribution (2)
Long 'Landau tail' → all moments ∞.
Mode (most probable value) sensitive to β → particle i.d.

Uniovi20 The Monte Carlo method
What it is: a numerical technique for calculating probabilities and related quantities using sequences of random numbers.
The usual steps:
(1) Generate a sequence r_1, r_2, ..., r_m uniform in [0, 1].
(2) Use this to produce another sequence x_1, x_2, ..., x_n distributed according to some pdf f(x) in which we're interested (x can be a vector).
(3) Use the x values to estimate some property of f(x), e.g., the fraction of x values with a < x < b gives $\int_a^b f(x)\,dx$.
→ MC calculation = integration (at least formally)
MC generated values = 'simulated data' → use for testing statistical procedures
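The three steps can be sketched in a few lines. The example below assumes an exponential pdf with mean ξ = 1 and the interval 0.5 < x < 2, chosen only because the integral is known exactly; Python's `random.expovariate` stands in for steps (1) and (2).

```python
import math
import random

def mc_fraction(sample, a, b, n_events, rng):
    """Fraction of generated x values with a < x < b (estimates the integral of f)."""
    hits = sum(1 for _ in range(n_events) if a < sample(rng) < b)
    return hits / n_events

xi = 1.0                  # assumed mean of the exponential pdf
a, b = 0.5, 2.0           # assumed integration limits
rng = random.Random(2024) # fixed seed for reproducibility

# expovariate takes the rate 1/xi and returns exponentially distributed values
estimate = mc_fraction(lambda g: g.expovariate(1.0 / xi), a, b, 50000, rng)
exact = math.exp(-a / xi) - math.exp(-b / xi)

print("MC estimate:", estimate)
print("exact integral:", exact)
```

The statistical error of the estimate falls as one over the square root of the number of generated values.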

Uniovi21 Monte Carlo event generators
Simple example: e+e− → μ+μ−. Generate cos θ and φ.
Less simple: 'event generators' for a variety of reactions:
e+e− → μ+μ−, hadrons, ...
pp → hadrons, D-Y, SUSY, ...
e.g. PYTHIA, HERWIG, ISAJET, ...
Output = 'events', i.e., for each event we get a list of generated particles and their momentum vectors, types, etc.

Uniovi22 A simulated event
[Figure: PYTHIA Monte Carlo event, pp → gluino-gluino.]

Uniovi23 Monte Carlo detector simulation
Takes as input the particle list and momenta from the generator.
Simulates detector response:
multiple Coulomb scattering (generate scattering angle),
particle decays (generate lifetime),
ionization energy loss (generate Δ),
electromagnetic and hadronic showers,
production of signals, electronics response, ...
Output = simulated raw data → input to reconstruction software: track finding, fitting, etc.
Predict what you should see at 'detector level' given a certain hypothesis for 'generator level'. Compare with the real data.
Estimate 'efficiencies' = # events found / # events generated.
Programming package: GEANT

Uniovi24 Random number generators
Goal: generate uniformly distributed values in [0, 1].
Toss a coin for e.g. a 32-bit number... (too tiring).
→ 'random number generator' = computer algorithm to generate r_1, r_2, ..., r_n.
Example: multiplicative linear congruential generator (MLCG)
$$n_{i+1} = (a\, n_i) \bmod m,$$
where
n_i = integer
a = multiplier
m = modulus
n_0 = seed (initial value)
N.B. mod = modulus (remainder), e.g. 27 mod 5 = 2.
This rule produces a sequence of numbers n_0, n_1, ...
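The MLCG rule translates directly into a generator function. This is a minimal sketch: the function names are hypothetical, the normalization r_i = n_i / m is assumed as the way to map the integers into [0, 1], and the toy values a = 3, m = 7, n_0 = 1 are far too small for any real use.

```python
from itertools import islice

def mlcg(a, m, seed):
    """Yield n_1, n_2, ... from n_{i+1} = (a * n_i) mod m, starting at n_0 = seed."""
    n = seed
    while True:
        n = (a * n) % m
        yield n

def uniform_from_mlcg(a, m, seed):
    """Yield r_i = n_i / m (assumed normalization of the integer sequence)."""
    for n in mlcg(a, m, seed):
        yield n / m

print(27 % 5)  # Python's % operator is the 'mod' used in the rule

# Toy parameters, for illustration only
first_ints = list(islice(mlcg(3, 7, 1), 6))
first_floats = list(islice(uniform_from_mlcg(3, 7, 1), 6))
print(first_ints)
print(first_floats)
```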

Uniovi25 Random number generators (2)
The sequence is (unfortunately) periodic!
Example (see Brandt Ch. 4): a = 3, m = 7, n_0 = 1
n_1 = (3 · 1) mod 7 = 3
n_2 = (3 · 3) mod 7 = 2
n_3 = (3 · 2) mod 7 = 6
n_4 = (3 · 6) mod 7 = 4
n_5 = (3 · 4) mod 7 = 5
n_6 = (3 · 5) mod 7 = 1 ← sequence repeats
Choose a, m to obtain a long period (maximum = m − 1); m is usually close to the largest integer that can be represented in the computer.
Only use a subset of a single period of the sequence.
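The period of a given MLCG can be found by iterating until a value repeats. The helper below is a hypothetical sketch; it confirms that a = 3, m = 7 reaches the maximum period m − 1 = 6, while another multiplier (a = 2, chosen here for contrast) does not.

```python
def mlcg_period(a, m, seed):
    """Length of the cycle eventually entered by n -> (a * n) mod m from seed."""
    seen = {}          # value -> step at which it first appeared
    n, step = seed, 0
    while n not in seen:
        seen[n] = step
        n = (a * n) % m
        step += 1
    return step - seen[n]

period_a3 = mlcg_period(3, 7, 1)   # 1, 3, 2, 6, 4, 5, then back to 1
period_a2 = mlcg_period(2, 7, 1)   # 1, 2, 4, then back to 1
print("a = 3, m = 7: period", period_a3)
print("a = 2, m = 7: period", period_a2)
```

This brute-force search is only practical for small m; it is meant to show that the choice of multiplier matters, not to test production generators.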

Uniovi26 Random number generators (3)
The values r_i = n_i / m are in [0, 1], but are they 'random'?
Choose a, m so that the r_i pass various tests of randomness: uniform distribution in [0, 1], all values independent (no correlations between pairs), ...
For example, L'Ecuyer, Commun. ACM 31 (1988) 742 gives recommended values of a and m.
Far better algorithms are available, e.g. TRandom3 (Mersenne Twister, period 2^19937 − 1).
See F. James, Comp. Phys. Comm. 60 (1990) 111; Brandt Ch. 4.

Uniovi27 The transformation method
Given r_1, r_2, ..., r_n uniform in [0, 1], find x_1, x_2, ..., x_n that follow f(x) by finding a suitable transformation x(r).
Require:
$$P(r \le r') = P(x \le x(r')),$$
i.e.
$$r' = \int_{-\infty}^{x(r')} f(x')\, dx' = F(x(r')).$$
That is, set F(x) = r and solve for x(r).

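Uniovi28 Example of the transformation method
Exponential pdf:
$$f(x; \xi) = \frac{1}{\xi}\, e^{-x/\xi} \qquad (x \ge 0).$$
Set
$$\int_0^{x} \frac{1}{\xi}\, e^{-x'/\xi}\, dx' = 1 - e^{-x/\xi} = r$$
and solve for x(r):
$$x(r) = -\xi \ln(1 - r).$$
(x(r) = −ξ ln r works too, since 1 − r is also uniform in [0, 1].)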
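For the exponential pdf the transformation is x(r) = −ξ ln(1 − r), and it can be tried out directly. In this sketch ξ = 2, the seed and the sample size are arbitrary assumptions; the sample mean and variance are compared with E[x] = ξ and V[x] = ξ².

```python
import math
import random
import statistics

def exponential_from_uniform(r, xi):
    """Transformation method: map r uniform in [0, 1) to x = -xi * ln(1 - r)."""
    return -xi * math.log(1.0 - r)

xi = 2.0                 # assumed mean of the exponential pdf
rng = random.Random(42)  # fixed seed for reproducibility
# rng.random() is in [0, 1), so 1 - r is in (0, 1] and the logarithm is finite
xs = [exponential_from_uniform(rng.random(), xi) for _ in range(50000)]

x_mean = statistics.fmean(xs)
x_var = statistics.variance(xs)
print("mean:", x_mean, "expected xi:", xi)
print("variance:", x_var, "expected xi^2:", xi**2)
```

Using 1 − r rather than r avoids taking the logarithm of zero when the generator can return exactly 0.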

Uniovi29 The acceptance-rejection method
Enclose the pdf in a box:
(1) Generate a random number x, uniform in [x_min, x_max], i.e. x = x_min + r_1 (x_max − x_min), where r_1 is uniform in [0, 1].
(2) Generate a second independent random number u, uniformly distributed between 0 and f_max, i.e. u = r_2 f_max.
(3) If u < f(x), then accept x. If not, reject x and repeat.
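The three steps can be sketched as follows. The pdf f(x) = (3/8)(1 + x²) on [−1, 1], with f_max = 3/4, is an assumed example chosen because its moments are easy to compute (E[x] = 0, E[x²] = 0.4); the function names are hypothetical.

```python
import random
import statistics

def pdf(x):
    """Assumed example pdf: f(x) = (3/8)(1 + x^2) on [-1, 1]."""
    return 0.375 * (1.0 + x * x)

def accept_reject(f, x_min, x_max, f_max, n_wanted, rng):
    """Return (accepted x values, number of trials) for pdf f enclosed in a box."""
    accepted, trials = [], 0
    while len(accepted) < n_wanted:
        trials += 1
        x = x_min + rng.random() * (x_max - x_min)  # step (1)
        u = rng.random() * f_max                    # step (2)
        if u < f(x):                                # step (3)
            accepted.append(x)
    return accepted, trials

rng = random.Random(314)  # fixed seed for reproducibility
xs, trials = accept_reject(pdf, -1.0, 1.0, 0.75, 20000, rng)

x_mean = statistics.fmean(xs)
x2_mean = statistics.fmean(x * x for x in xs)
efficiency = len(xs) / trials
print("mean:", x_mean, "expected: 0")
print("mean of x^2:", x2_mean, "expected: 0.4")
print("efficiency:", efficiency, "expected: 2/3")
```

The efficiency equals the area under the pdf divided by the area of the box, here 1 / (2 × 0.75) = 2/3, so a box that fits the pdf tightly wastes fewer random numbers.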

Uniovi30 Example with acceptance-rejection method
[Figure: random points plotted against the pdf curve.]
If a dot lies below the curve, its x value is entered in the histogram.