2.3 Estimating PDFs and PDF Parameters

Presentation transcript:

2.3 Estimating PDFs and PDF Parameters
Topics: estimating means (discrete and continuous); estimating variance using a known mean; estimating variance with an estimated mean; estimating a discrete pdf; estimating a continuous pdf; estimating a pdf with a known functional form. 2.3 : 1/11

Estimating a Mean with Finite Data
Consider the experiment where two dice are rolled and the blue value is subtracted from the red. The experiment is repeated 10 times, yielding the following data: {2, 0, 1, -4, -3, 0, -3, 3, -2, -1}. Determine the frequency, f(x), of observing each possible outcome: f(-5) = 0, f(-4) = 1, f(-3) = 2, f(-2) = 1, f(-1) = 1, f(0) = 2, f(1) = 1, f(2) = 1, f(3) = 1, f(4) = 0, f(5) = 0. Write the estimated mean, μ', using an estimated probability, p'(x) = f(x)/N, and the expectation value formalism: $\mu' = \sum_x x\,p'(x) = \frac{1}{N}\sum_x x\,f(x) = \frac{(-4)(1) + (-3)(2) + (-2)(1) + (-1)(1) + (0)(2) + (1)(1) + (2)(1) + (3)(1)}{10} = -0.7$. The true mean, μ, is the value of μ' taken in the limit N → ∞. 2.3 : 2/11

The Arithmetic Average
Start with the fraction from the previous page and convert each multiplication into a sum, e.g. (-3)(2) = (-3) + (-3): $\mu' = \frac{(-4) + (-3) + (-3) + (-2) + (-1) + 0 + 0 + 1 + 2 + 3}{10}$. Note that this expression is exactly the same as that obtained from an arithmetic average (data listed in the order measured): $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{2 + 0 + 1 + (-4) + (-3) + 0 + (-3) + 3 + (-2) + (-1)}{10} = -0.7$. With the arithmetic average, the probability of each measured value is estimated as p = 1/N. Taking the average in the limit as N → ∞ is mathematically identical to computing the expectation value of the random variable: $\lim_{N \to \infty} \sum_x x\,\frac{f(x)}{N} = \lim_{N \to \infty} \frac{1}{N}\sum_{i=1}^{N} x_i$. Note that the first sum is over the possible outcomes, while the second sum is over the data set. 2.3 : 3/11
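The equivalence of the two sums can be checked on the ten rolls listed in the transcript. The following is a minimal sketch, assuming exact rational arithmetic via `fractions.Fraction`; the variable names are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction

# The ten (red - blue) results listed in the transcript.
data = [2, 0, 1, -4, -3, 0, -3, 3, -2, -1]
N = len(data)

# Sum over possible outcomes: each outcome x weighted by p'(x) = f(x)/N.
freq = Counter(data)
mean_from_frequencies = sum(Fraction(x * f, N) for x, f in freq.items())

# Sum over the data set: every measurement weighted by 1/N.
mean_arithmetic = Fraction(sum(data), N)

print(mean_from_frequencies, mean_arithmetic)
```

Both expressions are the same sum with its terms grouped differently, so they agree exactly for any data set, not only in the limit.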

Example for Rolling Two Dice
How well does the measured average recover the true mean of the pdf? Two dice are rolled with the blue value subtracted from the red value. What is the pdf for the average when different numbers of rolls are used in the computation (n = 10, 100, 1000; N = 10,000)? The uncertainty in the estimation of μ is given by the width of the pdf. As the number of replicates (rolls) used to compute the average increases, the width of the pdf decreases. Theory states that the width scales as 1/√n, so it should decrease by a factor of √100 = 10 going from 10 rolls to 1000 rolls. This expectation is substantiated by the graphs. 2.3 : 4/11
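Since the graphs are not reproduced in the transcript, the narrowing can be illustrated with a small simulation. This is a sketch under stated assumptions: N is taken to be the number of averages computed for each n, it is reduced from 10,000 to 2,000 here to keep the run short, and the seed and function name are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# All 36 equally likely (red - blue) outcomes for two fair dice.
differences = [red - blue for red in range(1, 7) for blue in range(1, 7)]

def width_of_average(n_rolls, n_averages):
    """Standard deviation of n_averages averages, each taken over n_rolls rolls."""
    averages = [
        sum(rng.choices(differences, k=n_rolls)) / n_rolls
        for _ in range(n_averages)
    ]
    return statistics.pstdev(averages)

N_AVERAGES = 2000  # assumption: smaller than the 10,000 quoted on the slide
widths = {n: width_of_average(n, N_AVERAGES) for n in (10, 100, 1000)}

for n, w in widths.items():
    print(f"n = {n:4d}  width of pdf of the average = {w:.4f}")
print("ratio of widths, 10 rolls vs 1000 rolls:", widths[10] / widths[1000])
```

The printed ratio should come out near 10, with some scatter because the widths are themselves estimated from a finite number of averages.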

Estimating Variance
It is tempting to employ the same strategy with variance that worked in the limit with the mean. Case 1: the mean is known. $\sigma'^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$. This approximation works quite well in the limit. Case 2: the mean is estimated by the arithmetic average. $\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$. This does not work. The result is biased because of the uncertainty in the average. The bias is eliminated by multiplying by N/(N-1),* which gives $s^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \bar{x})^2$. *We will prove this later. 2.3 : 5/11
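The three variance estimates can be compared on the ten dice results from the earlier slide. This is a minimal sketch, assuming fair dice so that the true mean of (red - blue) is 0; exact fractions are used so the N/(N-1) correction is visible without rounding.

```python
from fractions import Fraction

data = [2, 0, 1, -4, -3, 0, -3, 3, -2, -1]
N = len(data)

# Case 1: the mean is known (0 for red - blue with fair dice; an assumption here).
true_mean = 0
var_known_mean = Fraction(sum((x - true_mean) ** 2 for x in data), N)

# Case 2: the mean is estimated by the arithmetic average.
xbar = Fraction(sum(data), N)
var_biased = sum((x - xbar) ** 2 for x in data) / N

# Multiplying by N/(N-1) removes the bias.
var_unbiased = var_biased * Fraction(N, N - 1)

print(float(var_known_mean), float(var_biased), float(var_unbiased))
```

The biased estimate is always the smallest of the two Case 2 values, because deviations measured from the sample's own average are never larger in total than deviations measured from any other fixed point.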

Example for Rolling Two Dice
What is the pdf for the biased and unbiased variance when different numbers of rolls are used in the computation? Note that σ² = 5.83. [Graphs: pdfs of the biased and unbiased variance estimates; horizontal axes labeled "variance".] When the number of rolls is large, the two s² have a similar pdf, i.e. N - 1 ≈ N. 2.3 : 6/11

Estimating a Discrete PDF (1)
A pdf with an unknown functional form can be estimated by performing a large number of measurements and estimating the probability of each expected outcome. How many measurements are necessary? (1) Start by choosing the minimum probability that needs to be estimated and the desired precision of that estimation. (2) Treat the observation of that outcome as a binomial pdf, where p is the probability of observing the outcome and q is the probability of observing all other outcomes (the binomial parameter, n, will be 1). (3) Use the fact that $\sigma_N = \frac{\sigma}{\sqrt{N}} = \sqrt{\frac{pq}{N}} \approx \sqrt{\frac{p}{N}}$, where σ = √(pq) is the standard deviation for one trial of the binomial pdf, and σ_N is the standard deviation after averaging N trials (note that q ≈ 1). 2.3 : 7/11

Estimating a Discrete PDF (2)
An initial guess at the required number of trials might be N = m × (1/p), where m is an integer and p is the minimum probability to be estimated. For the example of rolling two dice, -5 and +5 had the smallest probability, 1/36. Use the equation on the previous page and let N = m(1/p), which gives a standard deviation of approximately p/√m. With m = 1 the standard deviation is equal to p (which is too large an uncertainty). Larger values of m will improve the estimate. For the dice roll example choose m = 100. This means that 100 × 36 = 3,600 trials need to be made. The graph at the right shows this result. [Graph: blue is theory and red is the average of 3,600 trials.] 2.3 : 8/11
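The sizing rule from these two slides can be worked through numerically. This is a sketch, assuming the binomial standard deviation of an estimated probability, sqrt(p*q/N), with N = m/p; the function name is illustrative.

```python
import math
from fractions import Fraction

def std_of_estimated_probability(p, n_trials):
    """Standard deviation of a probability estimated from n_trials binomial trials."""
    p = float(p)
    return math.sqrt(p * (1.0 - p) / n_trials)

p_min = Fraction(1, 36)  # smallest probability in the two-dice example (-5 or +5)

results = {}
for m in (1, 100):
    n_trials = int(m / p_min)                      # N = m * (1/p)
    sigma = std_of_estimated_probability(p_min, n_trials)
    results[m] = (n_trials, sigma, sigma / float(p_min))
    print(f"m = {m:3d}  N = {n_trials:4d}  sigma = {sigma:.5f}  sigma/p = {sigma / float(p_min):.3f}")
```

With m = 1 the uncertainty is about as large as the probability itself, while m = 100 brings the relative uncertainty down to roughly one tenth, consistent with the p/√m scaling.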

Estimating a Continuous PDF (1)
Estimation of the shape of a continuous, unknown pdf requires that the data be binned. To demonstrate this, the graph at the right contains data from an exponential pdf that are not binned. The primary difficulty with a continuous random variable is choosing the bin size. To do this, compute the minimum and maximum observed values; the bin width is then this range divided by the number of bins. For small numbers of events, an initial estimate for the number of bins might be one tenth the number of observations. 2.3 : 9/11
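The binning recipe can be sketched as follows. The data here are simulated, since the slide's own data are not in the transcript: the 5-ns decay constant, the 50 events, the seed, and the helper name `bin_counts` are all assumptions made for illustration.

```python
import random

def bin_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the last bin, so clamp it.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return lo, width, counts

rng = random.Random(1)
TAU_NS = 5.0  # hypothetical decay constant for the simulated data
events = [rng.expovariate(1.0 / TAU_NS) for _ in range(50)]

# Initial guess from the slide: one tenth as many bins as observations.
n_bins = max(1, len(events) // 10)
lo, width, counts = bin_counts(events, n_bins)

for i, c in enumerate(counts):
    print(f"{lo + i * width:6.2f} to {lo + (i + 1) * width:6.2f} ns: {c}")
```

Dividing each count by the number of events and by the bin width would turn the counts into an estimate of the probability density.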

Estimating a Continuous PDF (2)
Bin width will determine the resolution of the estimated pdf. For large numbers of replicates the resolution, Δt, and the precision of the probability, σ_p, are traded against each other. The following graphs show 1,000 events examined with two bin widths. In the left graph the counts are known to good precision (RSD = 6%), but the resolution is poor, Δt = 1.8 ns. In the right graph the resolution of the pdf is higher, Δt = 0.37 ns, but the precision is worse, RSD = 12%. 2.3 : 10/11
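The trade-off can be made concrete with counting statistics. The sketch below rests on several assumptions that the transcript does not state for these graphs: the events follow an exponential pdf with a 5-ns decay constant (borrowed from the later example), the count in a bin has a standard deviation of about the square root of the count (binomial with q ≈ 1), and the quoted RSD refers to the first, most populated bin.

```python
import math

def first_bin_rsd(n_events, bin_width_ns, tau_ns):
    """Relative standard deviation of the expected count in the first bin."""
    # Fraction of an exponential pdf that falls between 0 and bin_width_ns.
    expected_count = n_events * (1.0 - math.exp(-bin_width_ns / tau_ns))
    return 1.0 / math.sqrt(expected_count)  # sigma/count = sqrt(count)/count

N_EVENTS = 1000
TAU_NS = 5.0  # assumption, not given on this slide

rsd_wide = first_bin_rsd(N_EVENTS, 1.8, TAU_NS)
rsd_narrow = first_bin_rsd(N_EVENTS, 0.37, TAU_NS)
print(f"bin width 1.80 ns: RSD = {100 * rsd_wide:.1f}%")
print(f"bin width 0.37 ns: RSD = {100 * rsd_narrow:.1f}%")
```

Under these assumptions the two values come out close to the 6% and 12% quoted on the slide: narrowing the bin by about a factor of five leaves about a fifth of the counts in it, which roughly doubles the relative uncertainty.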

Estimation of a Known Function
When the data come from a random process with a known pdf, the shape of the pdf can be estimated using moments. As an example, suppose 20 photons arise from an exponential decay, $f(t) = \frac{1}{\tau}e^{-t/\tau}$. Because the mean of this pdf equals τ, the average of the observation times for the set of photons can be used as an estimate of τ. The following graph shows a 5-ns decay along with the estimated function using two random sets of 20 photons. [Graph: the red line is the true pdf, while the blue and green lines are computed from the mean.] Even for 20 photons the pdf can be estimated much better with moments than with a histogram. This approach requires knowing the functional form of the pdf! 2.3 : 11/11
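The moment-based estimate can be sketched with simulated photons. The arrival times below are generated rather than taken from the slide, so the seed and names are assumptions; only the 5-ns decay and the set size of 20 come from the text.

```python
import math
import random
import statistics

rng = random.Random(2)
TRUE_TAU_NS = 5.0  # the 5-ns decay from the slide

# Simulated stand-in for one random set of 20 photon arrival times.
photon_times = [rng.expovariate(1.0 / TRUE_TAU_NS) for _ in range(20)]

# First moment: the mean of an exponential pdf equals its decay constant.
tau_hat = statistics.mean(photon_times)

def exponential_pdf(t, tau):
    """Exponential pdf f(t) = exp(-t/tau)/tau for t >= 0."""
    return math.exp(-t / tau) / tau

print(f"estimated tau = {tau_hat:.2f} ns")
for t in (0.0, 5.0, 10.0):
    print(f"t = {t:4.1f} ns  true pdf = {exponential_pdf(t, TRUE_TAU_NS):.4f}  "
          f"estimated pdf = {exponential_pdf(t, tau_hat):.4f}")
```

A single number, the average arrival time, fixes the whole estimated curve, which is why 20 photons go further here than they would in a histogram with several bins.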