1 6. Mean, Variance, Moments and Characteristic Functions

For a r.v X, its p.d.f $f_X(x)$ represents complete information about it, and for any Borel set B on the x-axis,
$$P(X \in B) = \int_B f_X(x)\,dx. \tag{6-1}$$
Note that $f_X(x)$ represents very detailed information, and quite often it is desirable to characterize the r.v in terms of its average behavior. In this context, we will introduce two parameters - mean and variance - that are universally used to represent the overall properties of the r.v and its p.d.f.

2 Mean or the Expected Value of a r.v X is defined as
$$\eta_X = E[X] = \int_{-\infty}^{\infty} x\,f_X(x)\,dx. \tag{6-2}$$
If X is a discrete-type r.v, then using (3-25) we get
$$E[X] = \sum_i x_i\,P(X = x_i) = \sum_i x_i\,p_i. \tag{6-3}$$
The mean represents the average value of the r.v in a very large number of trials. For example, if $X \sim U(a, b)$ is uniform, then using (3-31),
$$E[X] = \int_a^b \frac{x}{b-a}\,dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2} \tag{6-4}$$
is the midpoint of the interval (a, b).
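As a quick numerical illustration of (6-2) and (6-4) (a minimal sketch added here, not part of the original lecture; the interval and grid size are arbitrary choices), the integral $\int x\,f_X(x)\,dx$ can be approximated by a Riemann sum for the uniform p.d.f:

```python
import numpy as np

a, b = 2.0, 5.0
x = np.linspace(a, b, 100_001)
f = np.full_like(x, 1.0 / (b - a))  # uniform p.d.f on (a, b)
dx = x[1] - x[0]

# Riemann-sum approximation of (6-2): E[X] = integral of x * f_X(x) dx
mean_estimate = np.sum(x * f) * dx
print(mean_estimate)  # ~3.5, i.e. (a + b) / 2 as in (6-4)
```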

3 On the other hand, if X is exponential with parameter $\lambda$ as in (3-32), then
$$E[X] = \int_0^{\infty} \frac{x}{\lambda}\,e^{-x/\lambda}\,dx = \lambda \int_0^{\infty} y\,e^{-y}\,dy = \lambda, \tag{6-5}$$
implying that the parameter $\lambda$ in (3-32) represents the mean value of the exponential r.v. Similarly, if X is Poisson with parameter $\lambda$ as in (3-43), using (6-3) we get
$$E[X] = \sum_{k=0}^{\infty} k\,e^{-\lambda}\frac{\lambda^k}{k!} = e^{-\lambda}\sum_{k=1}^{\infty}\frac{\lambda^k}{(k-1)!} = \lambda\,e^{-\lambda}\sum_{m=0}^{\infty}\frac{\lambda^m}{m!} = \lambda\,e^{-\lambda}e^{\lambda} = \lambda. \tag{6-6}$$
Thus the parameter $\lambda$ in (3-43) also represents the mean of the Poisson r.v.

4 In a similar manner, if X is binomial as in (3-42), then its mean is given by
$$E[X] = \sum_{k=0}^{n} k\binom{n}{k}p^k q^{n-k} = np\sum_{i=0}^{n-1}\binom{n-1}{i}p^i q^{n-1-i} = np\,(p+q)^{n-1} = np. \tag{6-7}$$
Thus np represents the mean of the binomial r.v in (3-42). For the normal r.v in (3-29), substituting $y = x - \mu$ and noting that the odd integrand $y\,e^{-y^2/2\sigma^2}$ integrates to zero,
$$E[X] = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} x\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} (y+\mu)\,e^{-y^2/2\sigma^2}\,dy = \mu. \tag{6-8}$$
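A minimal Monte Carlo sanity check of (6-5)-(6-8), assuming NumPy is available (the parameter values and sample size are arbitrary illustrations): the sample means should approach the theoretical means derived above.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
lam, n, p, mu, sigma = 2.5, 20, 0.3, 1.0, 2.0

# Sample means vs. the closed-form means
print(rng.exponential(scale=lam, size=N).mean(), "~", lam)      # (6-5)
print(rng.poisson(lam=lam, size=N).mean(), "~", lam)            # (6-6)
print(rng.binomial(n=n, p=p, size=N).mean(), "~", n * p)        # (6-7)
print(rng.normal(loc=mu, scale=sigma, size=N).mean(), "~", mu)  # (6-8)
```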

5 Thus the first parameter $\mu$ in $X \sim N(\mu, \sigma^2)$ is in fact the mean of the Gaussian r.v X. Given $X \sim f_X(x)$, suppose $Y = g(X)$ defines a new r.v with p.d.f $f_Y(y)$. Then from the previous discussion, the new r.v Y has a mean given by (see (6-2))
$$E[Y] = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy. \tag{6-9}$$
From (6-9), it appears that to determine $E[Y]$ we need to determine $f_Y(y)$. However, this is not the case if only $E[Y]$ is the quantity of interest. Recall that for any y,
$$f_Y(y)\,\Delta y = P(y < Y \le y + \Delta y) = \sum_i P(x_i < X \le x_i + \Delta x_i) = \sum_i f_X(x_i)\,\Delta x_i, \tag{6-10}$$
where the $x_i$ represent the multiple solutions of the equation $y = g(x_i)$. But, since $y = g(x_i)$ for every such solution, (6-10) can be rewritten as
$$y\,f_Y(y)\,\Delta y = \sum_i g(x_i)\,f_X(x_i)\,\Delta x_i, \tag{6-11}$$

6 where the $\Delta x_i$ terms form nonoverlapping intervals. Hence, summing (6-11) over all the intervals $\Delta y$,
$$\sum_{y} y\,f_Y(y)\,\Delta y = \sum_{y}\sum_i g(x_i)\,f_X(x_i)\,\Delta x_i, \tag{6-12}$$
and as $\Delta y$ covers the entire y-axis, the corresponding $\Delta x_i$'s are nonoverlapping and they cover the entire x-axis. Hence, in the limit as $\Delta y \to 0$, integrating both sides of (6-12), we get the useful formula
$$E[Y] = \int_{-\infty}^{\infty} y\,f_Y(y)\,dy = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx = E[g(X)]. \tag{6-13}$$
In the discrete case, (6-13) reduces to
$$E[g(X)] = \sum_i g(x_i)\,P(X = x_i). \tag{6-14}$$
From (6-13)-(6-14), $f_Y(y)$ is not required to evaluate $E[g(X)]$ for $Y = g(X)$. We can use (6-14) to determine the mean of $Y = X^2$, where X is a Poisson r.v. Using (3-43),

7 $$E[X^2] = \sum_{k=0}^{\infty} k^2\,e^{-\lambda}\frac{\lambda^k}{k!} = \sum_{k=0}^{\infty}\bigl(k(k-1)+k\bigr)\,e^{-\lambda}\frac{\lambda^k}{k!} = \lambda^2 + \lambda. \tag{6-15}$$
In general, $E[X^k]$ is known as the kth moment of the r.v X. Thus if $X \sim P(\lambda)$, its second moment is given by (6-15).
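The sum in (6-15) is easy to check numerically; the sketch below (illustrative only; the truncation point of the infinite sum is an arbitrary choice) evaluates (6-14) with $g(k) = k^2$ for a Poisson r.v:

```python
import math

lam = 2.5
term = math.exp(-lam)   # P(X = 0) from (3-43)
second_moment = 0.0

# (6-14) with g(k) = k^2: accumulate k^2 * P(X = k), truncating the tail.
for k in range(1, 200):
    term *= lam / k     # P(X = k) obtained recursively from P(X = k - 1)
    second_moment += k**2 * term

print(second_moment)     # ~8.75
print(lam**2 + lam)      # 8.75, matching (6-15)
```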

8 Mean alone will not be able to truly represent the p.d.f of any r.v. To illustrate this, consider two Gaussian r.vs $X_1 \sim N(\mu, \sigma_1^2)$ and $X_2 \sim N(\mu, \sigma_2^2)$. Both of them have the same mean $\mu$. However, as Fig. 6.1 shows, their p.d.fs are quite different: one is more concentrated around the mean, whereas the other one has a wider spread. Clearly, we need at least one additional parameter to measure this spread around the mean!
[Fig. 6.1: two Gaussian p.d.fs with the same mean but different spreads, panels (a) and (b).]

9 For a r.v X with mean $\mu$, $X - \mu$ represents the deviation of the r.v from its mean. Since this deviation can be either positive or negative, consider the quantity $(X-\mu)^2$; its average value $E[(X-\mu)^2]$ represents the average mean square deviation of X around its mean. Define
$$\sigma_X^2 \triangleq E[(X-\mu)^2] > 0. \tag{6-16}$$
With $g(X) = (X-\mu)^2$ and using (6-13) we get
$$\sigma_X^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f_X(x)\,dx > 0. \tag{6-17}$$
$\sigma_X^2$ is known as the variance of the r.v X, and its square root $\sigma_X = \sqrt{E[(X-\mu)^2]}$ is known as the standard deviation of X. Note that the standard deviation represents the root mean square spread of the r.v X around its mean $\mu$.

10 Expanding (6-17) and using the linearity of the integrals, we get
$$\mathrm{Var}(X) = \sigma_X^2 = \int_{-\infty}^{\infty}\bigl(x^2 - 2x\mu + \mu^2\bigr)f_X(x)\,dx = E[X^2] - \mu^2 = E[X^2] - (E[X])^2. \tag{6-18}$$
Alternatively, we can use (6-18) to compute $\sigma_X^2$. Thus, for example, returning to the Poisson r.v in (3-43), using (6-6) and (6-15) we get
$$\sigma_X^2 = E[X^2] - (E[X])^2 = \lambda^2 + \lambda - \lambda^2 = \lambda. \tag{6-19}$$
Thus for a Poisson r.v, mean and variance are both equal to its parameter $\lambda$.
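A brief simulation check of (6-18)-(6-19), again a sketch assuming NumPy (parameter and sample size are arbitrary): for a Poisson r.v the sample mean and sample variance should both approach $\lambda$.

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 2.5
x = rng.poisson(lam=lam, size=1_000_000)

print(x.mean())                      # ~2.5
print(x.var())                       # ~2.5, same as the mean
print((x**2).mean() - x.mean()**2)   # the same variance via (6-18)
```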

11 To determine the variance of the normal r.v $N(\mu, \sigma^2)$, we can use (6-16). Thus from (3-29),
$$\mathrm{Var}(X) = E[(X-\mu)^2] = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty}(x-\mu)^2 e^{-(x-\mu)^2/2\sigma^2}\,dx. \tag{6-20}$$
To simplify (6-20), we can make use of the identity
$$\int_{-\infty}^{\infty} f_X(x)\,dx = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx = 1 \tag{6-21}$$
for a normal p.d.f. This gives $\int_{-\infty}^{\infty} e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi}\,\sigma$. Differentiating both sides of this relation with respect to $\sigma$, we get
$$\int_{-\infty}^{\infty}\frac{(x-\mu)^2}{\sigma^3}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = \sqrt{2\pi}, \quad\text{or}\quad \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty}(x-\mu)^2 e^{-(x-\mu)^2/2\sigma^2}\,dx = \sigma^2. \tag{6-22}$$
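The differentiation trick in (6-21)-(6-22) can also be bypassed symbolically; a minimal SymPy sketch (an aid added here for illustration, not part of the lecture) evaluates the integral in (6-20) directly after the substitution $y = x - \mu$:

```python
import sympy as sp

y = sp.symbols('y', real=True)              # y = x - mu
sigma = sp.symbols('sigma', positive=True)

# Normal p.d.f (3-29) expressed in the shifted variable y
gaussian = sp.exp(-y**2 / (2 * sigma**2)) / sp.sqrt(2 * sp.pi * sigma**2)

# Direct evaluation of (6-20): E[(X - mu)^2]
var = sp.integrate(y**2 * gaussian, (y, -sp.oo, sp.oo))
print(sp.simplify(var))   # sigma**2, confirming (6-22)
```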

12 which represents the $\mathrm{Var}(X)$ in (6-20). Thus for a normal r.v as in (3-29),
$$\mathrm{Var}(X) = \sigma^2, \tag{6-23}$$
and the second parameter in $N(\mu, \sigma^2)$ in fact represents the variance of the Gaussian r.v. As Fig. 6.1 shows, the larger the $\sigma$, the larger the spread of the p.d.f around its mean. Thus as the variance of a r.v tends to zero, it will begin to concentrate more and more around the mean, ultimately behaving like a constant. Moments: As remarked earlier, in general
$$m_k \triangleq E[X^k], \quad k \ge 1, \tag{6-24}$$
are known as the moments of the r.v X, and

13 $$\mu_k \triangleq E[(X-\mu)^k] \tag{6-25}$$
are known as the central moments of X. Clearly, the mean $\mu = m_1$ and the variance $\sigma^2 = \mu_2$. It is easy to relate $m_k$ and $\mu_k$. In fact, expanding $(X-\mu)^k$ binomially,
$$\mu_k = E[(X-\mu)^k] = \sum_{i=0}^{k}\binom{k}{i}\,m_i\,(-\mu)^{k-i}. \tag{6-26}$$
In general, the quantities
$$E[(X-a)^k] \tag{6-27}$$
are known as the generalized moments of X about a, and
$$E[|X|^k] \tag{6-28}$$
are known as the absolute moments of X.

14 For example, if $X \sim N(0, \sigma^2)$, then it can be shown that
$$E[X^k] = \begin{cases} 0, & k \text{ odd}, \\ 1 \cdot 3 \cdots (k-1)\,\sigma^k, & k \text{ even}. \end{cases} \tag{6-29}$$
Direct use of (6-2), (6-13) or (6-14) is often a tedious procedure to compute the mean and variance, and in this context, the notion of the characteristic function can be quite helpful. Characteristic Function: The characteristic function of a r.v X is defined as
$$\Phi_X(\omega) \triangleq E\bigl[e^{j\omega X}\bigr]. \tag{6-30}$$

15 Thus
$$\Phi_X(\omega) = \int_{-\infty}^{\infty} e^{j\omega x}\,f_X(x)\,dx, \tag{6-31}$$
and $\Phi_X(0) = 1$, $|\Phi_X(\omega)| \le 1$ for all $\omega$. For discrete r.vs the characteristic function reduces to
$$\Phi_X(\omega) = \sum_k e^{j\omega k}\,P(X = k). \tag{6-32}$$
Thus, for example, if $X \sim P(\lambda)$ as in (3-43), then its characteristic function is given by
$$\Phi_X(\omega) = \sum_{k=0}^{\infty} e^{j\omega k}\,e^{-\lambda}\frac{\lambda^k}{k!} = e^{-\lambda}\sum_{k=0}^{\infty}\frac{\bigl(\lambda e^{j\omega}\bigr)^k}{k!} = e^{-\lambda}\,e^{\lambda e^{j\omega}} = e^{\lambda(e^{j\omega}-1)}. \tag{6-33}$$
Similarly, if X is a binomial r.v as in (3-42), its characteristic function is given by
$$\Phi_X(\omega) = \sum_{k=0}^{n} e^{j\omega k}\binom{n}{k}p^k q^{n-k} = \bigl(p\,e^{j\omega} + q\bigr)^n. \tag{6-34}$$
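The closed form (6-33) is easy to corroborate by estimating $E[e^{j\omega X}]$ from samples; a minimal sketch assuming NumPy (seed, $\lambda$ and $\omega$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
lam, w = 2.0, 0.7

# Empirical characteristic function E[e^{j w X}] from Poisson samples
x = rng.poisson(lam=lam, size=1_000_000)
empirical = np.exp(1j * w * x).mean()

closed_form = np.exp(lam * (np.exp(1j * w) - 1))  # (6-33)
print(empirical)     # close to closed_form
print(closed_form)
```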

16 To illustrate the usefulness of the characteristic function of a r.v in computing its moments, first it is necessary to derive the relationship between them. Towards this, from (6-31),
$$\Phi_X(\omega) = E\bigl[e^{j\omega X}\bigr] = E\!\left[\sum_{k=0}^{\infty}\frac{(j\omega X)^k}{k!}\right] = 1 + j\omega\,E[X] + \frac{(j\omega)^2}{2!}\,E[X^2] + \cdots. \tag{6-35}$$
Taking the first derivative of (6-35) with respect to $\omega$, and then setting $\omega = 0$, we get
$$\left.\frac{\partial\Phi_X(\omega)}{\partial\omega}\right|_{\omega=0} = j\,E[X], \quad\text{or}\quad E[X] = \frac{1}{j}\left.\frac{\partial\Phi_X(\omega)}{\partial\omega}\right|_{\omega=0}. \tag{6-36}$$
Similarly, the second derivative of (6-35) gives
$$E[X^2] = \frac{1}{j^2}\left.\frac{\partial^2\Phi_X(\omega)}{\partial\omega^2}\right|_{\omega=0}, \tag{6-37}$$

17 and repeating this procedure k times, we obtain the kth moment of X to be
$$E[X^k] = \frac{1}{j^k}\left.\frac{\partial^k\Phi_X(\omega)}{\partial\omega^k}\right|_{\omega=0}, \quad k \ge 1. \tag{6-38}$$
We can use (6-36)-(6-38) to compute the mean, variance and other higher order moments of any r.v X. For example, if $X \sim P(\lambda)$, then from (6-33),
$$\frac{\partial\Phi_X(\omega)}{\partial\omega} = \lambda\,j\,e^{j\omega}\,e^{\lambda(e^{j\omega}-1)}, \tag{6-39}$$
so that from (6-36),
$$E[X] = \lambda, \tag{6-40}$$
which agrees with (6-6). Differentiating (6-39) one more time, we get

18 $$\frac{\partial^2\Phi_X(\omega)}{\partial\omega^2} = \lambda\,j^2\,e^{j\omega}\,e^{\lambda(e^{j\omega}-1)}\bigl(1 + \lambda\,e^{j\omega}\bigr), \tag{6-41}$$
so that from (6-37),
$$E[X^2] = \lambda^2 + \lambda, \tag{6-42}$$
which again agrees with (6-15). Notice that compared to the tedious calculations in (6-6) and (6-15), the efforts involved in (6-39) and (6-41) are very minimal. We can use the characteristic function of the binomial r.v B(n, p) in (6-34) to obtain its variance. Direct differentiation of (6-34) gives
$$\frac{\partial\Phi_X(\omega)}{\partial\omega} = n\,j\,p\,e^{j\omega}\bigl(p\,e^{j\omega}+q\bigr)^{n-1}, \tag{6-43}$$
so that from (6-36), $E[X] = np$, as in (6-7).
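The differentiations in (6-39)-(6-43) can be automated; a minimal SymPy sketch (added here for illustration) reproduces the Poisson moments from its characteristic function:

```python
import sympy as sp

w = sp.symbols('omega', real=True)
lam = sp.symbols('lambda', positive=True)
j = sp.I

Phi = sp.exp(lam * (sp.exp(j * w) - 1))                  # Poisson c.f. (6-33)

m1 = sp.simplify(sp.diff(Phi, w).subs(w, 0) / j)         # (6-36): E[X]
m2 = sp.simplify(sp.diff(Phi, w, 2).subs(w, 0) / j**2)   # (6-37): E[X^2]
print(m1)                       # lambda, as in (6-40)
print(sp.expand(m2))            # lambda**2 + lambda, as in (6-42)
print(sp.simplify(m2 - m1**2))  # lambda: the variance, as in (6-19)
```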

19 One more differentiation of (6-43) yields
$$\frac{\partial^2\Phi_X(\omega)}{\partial\omega^2} = n\,j^2\,p\,e^{j\omega}\bigl(p\,e^{j\omega}+q\bigr)^{n-2}\bigl(n\,p\,e^{j\omega}+q\bigr), \tag{6-44}$$
and using (6-37), we obtain the second moment of the binomial r.v to be
$$E[X^2] = np\,(np + q) = n^2p^2 + npq. \tag{6-45}$$
Together with (6-7), (6-18) and (6-45), we obtain the variance of the binomial r.v to be
$$\sigma_X^2 = E[X^2] - (E[X])^2 = n^2p^2 + npq - n^2p^2 = npq. \tag{6-46}$$
To obtain the characteristic function of the Gaussian r.v, we can make use of (6-31). Thus if $X \sim N(\mu, \sigma^2)$, then

20 $$\Phi_X(\omega) = \frac{1}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{\infty} e^{j\omega x}\,e^{-(x-\mu)^2/2\sigma^2}\,dx = e^{\,j\mu\omega - \sigma^2\omega^2/2}. \tag{6-47}$$
Notice that the characteristic function of a Gaussian r.v itself has the "Gaussian" bell shape. Thus if $X \sim N(0, \sigma^2)$, then
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-x^2/2\sigma^2} \tag{6-48}$$
and
$$\Phi_X(\omega) = e^{-\sigma^2\omega^2/2}. \tag{6-49}$$
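As with the Poisson case, (6-47) can be corroborated empirically; a short sketch assuming NumPy (the parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, w = 1.0, 2.0, 0.5

# Empirical c.f. of Gaussian samples vs the closed form (6-47)
x = rng.normal(loc=mu, scale=sigma, size=1_000_000)
print(np.exp(1j * w * x).mean())                   # sample estimate
print(np.exp(1j * mu * w - sigma**2 * w**2 / 2))   # (6-47)
```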

21 [Fig. 6.2: (a) the Gaussian p.d.f $f_X(x)$ and (b) its characteristic function $\Phi_X(\omega)$.]
From Fig. 6.2, the reverse roles of $\sigma^2$ in $f_X(x)$ and $\Phi_X(\omega)$ are noteworthy: the wider the p.d.f, the narrower its characteristic function, and vice versa. In some cases, mean and variance may not exist. For example, consider the Cauchy r.v defined in (3-38), with
$$f_X(x) = \frac{\alpha/\pi}{\alpha^2 + x^2}, \quad -\infty < x < \infty.$$
Clearly
$$E[X^2] = \frac{\alpha}{\pi}\int_{-\infty}^{\infty}\frac{x^2}{\alpha^2 + x^2}\,dx \tag{6-50}$$
diverges to infinity. Similarly, for the mean,
$$E[X] = \frac{\alpha}{\pi}\int_{-\infty}^{\infty}\frac{x}{\alpha^2 + x^2}\,dx. \tag{6-51}$$

22 To compute (6-51), let us examine its one sided factor
$$\int_0^{\infty}\frac{x}{\alpha^2 + x^2}\,dx. \tag{6-52}$$
With
$$\int_0^{\infty}\frac{x}{\alpha^2 + x^2}\,dx = \frac{1}{2}\ln\bigl(\alpha^2 + x^2\bigr)\Big|_0^{\infty} = \infty,$$
the double sided integral in (6-51) is of the indeterminate form $\infty - \infty$; it does not converge and is undefined. From (6-50)-(6-52), the mean and variance of a Cauchy r.v are undefined.
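A short simulation makes this non-convergence vivid (an illustrative sketch; seed and sample sizes are arbitrary): the sample mean of Cauchy draws never settles down as n grows, unlike for r.vs with a finite mean.

```python
import numpy as np

rng = np.random.default_rng(4)

# Sample means of a standard Cauchy r.v do not stabilize as n grows,
# reflecting that E[X] in (6-51) is undefined.
for n in (10**3, 10**5, 10**7):
    print(n, rng.standard_cauchy(size=n).mean())
```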

23 We conclude this section with a bound that estimates the dispersion of the r.v beyond a certain interval centered around its mean. Since $\sigma^2$ measures the dispersion of the r.v X around its mean $\mu$, we expect this bound to depend on $\sigma^2$ as well. Chebyshev Inequality: Consider an interval of width $2\varepsilon$ symmetrically centered around the mean $\mu$, as in Fig. 6.3. What is the probability that X falls outside this interval? We need to bound
$$P\bigl(|X - \mu| \ge \varepsilon\bigr). \tag{6-53}$$
[Fig. 6.3: the interval $(\mu - \varepsilon, \mu + \varepsilon)$ of width $2\varepsilon$ centered at the mean $\mu$.]

24 To compute this probability, we can start with the definition of $\sigma^2$:
$$\sigma^2 = \int_{-\infty}^{\infty}(x-\mu)^2 f_X(x)\,dx \ge \int_{|x-\mu|\ge\varepsilon}(x-\mu)^2 f_X(x)\,dx \ge \varepsilon^2\!\int_{|x-\mu|\ge\varepsilon} f_X(x)\,dx = \varepsilon^2\,P\bigl(|X-\mu|\ge\varepsilon\bigr). \tag{6-54}$$
From (6-54), we obtain the desired probability bound
$$P\bigl(|X-\mu|\ge\varepsilon\bigr) \le \frac{\sigma^2}{\varepsilon^2}, \tag{6-55}$$
and (6-55) is known as the Chebyshev inequality. Interestingly, to compute the above probability bound the knowledge of $f_X(x)$ is not necessary; we only need $\sigma^2$, the variance of the r.v. In particular, with $\varepsilon = k\sigma$ in (6-55) we obtain
$$P\bigl(|X-\mu|\ge k\sigma\bigr) \le \frac{1}{k^2}. \tag{6-56}$$

25 Thus with $k = 3$, we get the probability of X being outside the $3\sigma$ interval around its mean to be at most $1/9 \approx 0.111$ for any r.v. Obviously this cannot be a tight bound, as it must hold for all r.vs. For example, in the case of a Gaussian r.v, from Table 4.1,
$$P\bigl(|X-\mu|\ge 3\sigma\bigr) = 2\bigl(1 - G(3)\bigr) \approx 0.0027, \tag{6-57}$$
where $G(\cdot)$ denotes the standard normal c.d.f; this is much tighter than the value given by (6-56). The Chebyshev inequality is conservative: it never underestimates the exact probability.
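To see the gap numerically, a minimal check using SciPy's standard normal c.d.f (an illustration added here, not part of the lecture):

```python
from scipy.stats import norm

k = 3
chebyshev_bound = 1 / k**2               # (6-56): valid for every r.v
gaussian_exact = 2 * (1 - norm.cdf(k))   # exact Gaussian 3-sigma tail, (6-57)
print(chebyshev_bound)   # 0.111...
print(gaussian_exact)    # ~0.0027
```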