1 Multivariate Distributions ch4

2 Multivariable Distributions  It is often useful to take more than one measurement in a random experiment. –The data may then be collected in pairs (x i, y i ).  Def.4.1-1: Let X & Y be two discrete R.V.s defined over the support S. The probability that X=x and Y=y is denoted f(x,y)=P(X=x, Y=y); f(x,y) is the joint probability mass function (joint p.m.f.) of X and Y: –0≤f(x,y)≤1; ΣΣ (x,y) ∈ S f(x,y)=1; P[(X,Y) ∈ A]=ΣΣ (x,y) ∈ A f(x,y), A ⊆ S.

3 Illustration Example  Ex.4.1-3: Roll a pair of dice: X is the smaller and Y is the larger outcome. –The outcome (3, 2) or (2, 3) gives X=2 & Y=3, with probability 2/36. –The outcome (2, 2) gives X=2 & Y=2, with probability 1/36. –Thus, the joint p.m.f. of X and Y is f(x,y)=1/36 if x=y and f(x,y)=2/36 if x<y, for 1≤x≤y≤6. –The marginal p.m.f.s (the row and column totals of the table) are f 1 (x)=(13-2x)/36, x=1,…,6 (i.e. 11/36, 9/36, 7/36, 5/36, 3/36, 1/36), and f 2 (y)=(2y-1)/36, y=1,…,6 (i.e. 1/36, 3/36, 5/36, 7/36, 9/36, 11/36).
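The table can be rebuilt by enumeration. Below is a small Python sketch (an added illustration, not part of the slides) that constructs this joint p.m.f. and recovers both marginals:

```python
from fractions import Fraction
from collections import defaultdict

# Joint p.m.f. of X = smaller and Y = larger value when rolling a pair of fair dice.
f = defaultdict(Fraction)
for d1 in range(1, 7):
    for d2 in range(1, 7):
        x, y = min(d1, d2), max(d1, d2)
        f[(x, y)] += Fraction(1, 36)

# Marginal p.m.f.s: sum out the other variable.
f1 = {x: sum(p for (a, _), p in f.items() if a == x) for x in range(1, 7)}
f2 = {y: sum(p for (_, b), p in f.items() if b == y) for y in range(1, 7)}

assert sum(f.values()) == 1
print(f[(2, 3)], f[(2, 2)])   # 1/18 (= 2/36) and 1/36
print(f1)                     # 11/36, 9/36, 7/36, 5/36, 3/36, 1/36
print(f2)                     # 1/36, 3/36, 5/36, 7/36, 9/36, 11/36
```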

4 Marginal Probability and Independence  Def.4.1-2: X and Y have the joint p.m.f. f(x,y) with space S. –The marginal p.m.f. of X is f 1 (x)=Σ y f(x,y)=P(X=x), x ∈ S 1. –The marginal p.m.f. of Y is f 2 (y)=Σ x f(x,y)=P(Y=y), y ∈ S 2.  X and Y are independent iff P(X=x, Y=y)=P(X=x)P(Y=y), namely, f(x,y)=f 1 (x)f 2 (y), x ∈ S 1, y ∈ S 2. –Otherwise, X and Y are dependent.  X and Y in Ex4.1-3 are dependent: 1/36=f(1,1) ≠ f 1 (1)f 2 (1)=11/36*1/36.  Ex4.1-4: The joint p.m.f. is f(x,y)=(x+y)/21, x=1,2,3, y=1,2. –Then, f 1 (x)=Σ y=1~2 (x+y)/21=(2x+3)/21, x=1,2,3. –Likewise, f 2 (y)=Σ x=1~3 (x+y)/21=(3y+6)/21, y=1,2. –Since f(x,y)≠f 1 (x)f 2 (y), X and Y are dependent, as the sketch below confirms.  Ex4.1-6: f(x,y)=xy 2 /13, (x,y)=(1,1),(1,2),(2,2).
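A mechanical check of Ex4.1-4 (an added illustration): compute both marginals and test whether f(x,y)=f 1 (x)f 2 (y) everywhere.

```python
from fractions import Fraction

f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}

f1 = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}   # (2x+3)/21
f2 = {y: sum(f[(x, y)] for x in (1, 2, 3)) for y in (1, 2)}   # (3y+6)/21

independent = all(f[(x, y)] == f1[x] * f2[y] for (x, y) in f)
print(independent)   # False -> X and Y are dependent
```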

5 Quick Dependence Checks  Practically, dependence can be recognized quickly if –the support of X and Y is NOT rectangular, so that S is not the product set {(x,y): x ∈ S 1, y ∈ S 2 }, as in Ex; or –f(x,y) cannot be factored (separated) into the product of an x-alone expression and a y-alone expression.  In Ex4.1-4, f(x,y) is a sum, not a product, of x-alone and y-alone functions.  Ex4.1-7: [Probability Histogram for a joint p.m.f.]

6 Mathematical Expectation  If u(X 1,X 2 ) is a function of two R.V.s X 1 & X 2, then E[u(X 1,X 2 )]=ΣΣ (x1,x2) ∈ S u(x 1,x 2 )f(x 1,x 2 ), if it exists, is called the mathematical expectation (or expected value) of u(X 1,X 2 ). –The mean of X i, i=1,2: μ i =E(X i )=ΣΣ (x1,x2) ∈ S x i f(x 1,x 2 ). –The variance of X i : σ i 2 =E[(X i -μ i ) 2 ].  Ex4.1-8: A player selects a chip from a bowl having 8 chips: 3 marked (0,0), 2 marked (1,0), 2 marked (0,1), and 1 marked (1,1); expectations for this joint p.m.f. are computed in the sketch below.
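For the chip example, E[u(X 1,X 2 )] is just a probability-weighted sum over the four chip labels. The Python sketch below is an added illustration; the particular choices of u (the coordinates and their sum) are mine, since the transcript does not say which u the slide evaluates.

```python
from fractions import Fraction

# Joint p.m.f. from the bowl: 3 chips (0,0), 2 chips (1,0), 2 chips (0,1), 1 chip (1,1).
f = {(0, 0): Fraction(3, 8), (1, 0): Fraction(2, 8),
     (0, 1): Fraction(2, 8), (1, 1): Fraction(1, 8)}

def E(u):
    """E[u(X1, X2)] = sum over the support of u(x1, x2) * f(x1, x2)."""
    return sum(u(x1, x2) * p for (x1, x2), p in f.items())

mu1 = E(lambda x1, x2: x1)                   # 3/8
mu2 = E(lambda x1, x2: x2)                   # 3/8
print(mu1, mu2, E(lambda x1, x2: x1 + x2))   # 3/8, 3/8, 3/4
```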

7 Joint Probability Density Function  The joint probability density function (joint p.d.f.) of two continuous-type R.V.s X & Y is an integrable function f(x,y) satisfying: –f(x,y)≥0; ∫ y=-∞~∞ ∫ x=-∞~∞ f(x,y)dxdy=1; –P[(X,Y) ∈ A]=∫∫ A f(x,y)dxdy, for an event A.  Ex4.1-9: X and Y have a given joint p.d.f., and A={(x,y): 0<x<1, 0<y<x}; P[(X,Y) ∈ A] is obtained by integrating f over A. –The respective marginal p.d.f.s are obtained by integrating out the other variable; here they factor, so X and Y are independent!

8 Independence of Continuous-Type R.V.s  Two continuous-type R.V.s X and Y are independent iff the joint p.d.f. factors into the product of their marginal p.d.f.s.  Ex4.1-10: X and Y have a joint p.d.f. whose support S={(x,y): 0≤x≤y≤1} is bounded by the lines x=0, y=1, and x=y. –The marginal p.d.f.s are found by integrating over y ∈ [x,1] and over x ∈ [0,y], respectively; various expected values follow in the same way. –Since the support is not rectangular, the joint p.d.f. cannot factor into the marginals: X and Y are dependent!
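The same checks can be done numerically. The sketch below assumes the density is the constant f(x,y)=2 on this triangle (the value used for the same region in Ex4.4-2 later; the transcript omits the formula here), and uses scipy to recover the marginals and confirm dependence:

```python
from scipy import integrate

f = lambda y, x: 2.0          # assumed joint p.d.f. on the triangle 0 <= x <= y <= 1

# Total probability: integrate y from x to 1, then x from 0 to 1.
total, _ = integrate.dblquad(f, 0, 1, lambda x: x, lambda x: 1.0)
print(total)                  # 1.0

# Marginal of X: f1(x) = integral of 2 dy over (x, 1) = 2(1 - x).
f1 = lambda x: integrate.quad(lambda y: 2.0, x, 1)[0]
# Marginal of Y: f2(y) = integral of 2 dx over (0, y) = 2y.
f2 = lambda y: integrate.quad(lambda x: 2.0, 0, y)[0]

# At an interior point (x, y) = (0.2, 0.5): f = 2 but f1(x)*f2(y) = 1.6 -> dependent.
print(f1(0.2) * f2(0.5))      # 1.6, not 2.0
```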

9 Multivariate Hypergeometric Distribution  Ex4.1-11: Of 200 students, 40 have As, 60 have Bs, and 100 have Cs, Ds, or Fs. –A sample of size 25 is taken at random without replacement.  X 1 is the number of A students, X 2 is the number of B students, and 25 – X 1 – X 2 is the number of the other students. –The space S = {(x 1,x 2 ): x 1,x 2 ≥0, x 1 +x 2 ≤25}. –The joint p.m.f. is f(x 1,x 2 )=C(40,x 1 )C(60,x 2 )C(100,25-x 1 -x 2 )/C(200,25), (x 1,x 2 ) ∈ S. –The marginal p.m.f. of X 1 can also be obtained, either by summing out x 2 or directly from the knowledge of the model, as f 1 (x 1 )=C(40,x 1 )C(160,25-x 1 )/C(200,25). –Since f(x 1,x 2 )≠f 1 (x 1 )f 2 (x 2 ), X 1 and X 2 are dependent!
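A sketch of this p.m.f. using only binomial coefficients (an added illustration); it also verifies numerically that summing out x 2 reproduces the hypergeometric marginal of X 1 :

```python
from math import comb

def f(x1, x2, n=25, NA=40, NB=60, NC=100):
    """Multivariate hypergeometric p.m.f.: x1 A-students and x2 B-students in the sample."""
    x3 = n - x1 - x2
    if x1 < 0 or x2 < 0 or x3 < 0:
        return 0.0
    return comb(NA, x1) * comb(NB, x2) * comb(NC, x3) / comb(NA + NB + NC, n)

# Marginal of X1 by summing out X2, compared with the direct hypergeometric formula.
x1 = 5
by_summation = sum(f(x1, x2) for x2 in range(0, 26 - x1))
direct = comb(40, x1) * comb(160, 25 - x1) / comb(200, 25)
print(by_summation, direct)   # the two numbers agree
```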

10 Binomial ⇒ Trinomial Distribution  Trinomial Distribution: The experiment is repeated n times. –The probabilities are p 1 : perfect, p 2 : second, p 3 : defective, with p 3 =1-p 1 -p 2. –X 1 : the number of perfect items, X 2 of seconds, X 3 of defectives. –The joint p.m.f. is f(x 1,x 2 )=[n!/(x 1 !x 2 !x 3 !)] p 1 ^x 1 p 2 ^x 2 p 3 ^x 3, where x 3 =n-x 1 -x 2. –Marginally, X 1 is b(n,p 1 ) and X 2 is b(n,p 2 ); X 1 and X 2 are dependent.  Ex4.1-13: In manufacturing a certain item, –95% of the items are good, 4% are “seconds”, and 1% are defective. –An inspector observes n=20 items selected at random, counting the number X of seconds and the number Y of defectives. –The probability that at least 2 seconds or at least 2 defective items are found, i.e. of A={(x,y): x≥2 or y≥2}, can be computed through the complement 1-P(X≤1, Y≤1); see the sketch below.
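A sketch of the complement calculation for Ex4.1-13 (added illustration; no figure is quoted from the slides, the printed value is whatever the sum produces):

```python
from math import comb

def trinomial(x, y, n=20, p_second=0.04, p_defect=0.01):
    """P(X = x seconds, Y = y defectives) among n inspected items; the rest are good."""
    z = n - x - y
    return comb(n, x) * comb(n - x, y) * p_second**x * p_defect**y * (1 - p_second - p_defect)**z

# P(at least 2 seconds or at least 2 defectives) = 1 - P(X <= 1 and Y <= 1).
p = 1 - sum(trinomial(x, y) for x in (0, 1) for y in (0, 1))
print(round(p, 4))
```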

11 Correlation Coefficient  For two R.V.s X 1 & X 2 : –The mean of X i, i=1,2: μ i =E(X i ). –The variance of X i : σ i 2 =E[(X i -μ i ) 2 ]. –The covariance of X 1 & X 2 is Cov(X 1,X 2 )=σ 12 =E[(X 1 -μ 1 )(X 2 -μ 2 )]=E(X 1 X 2 )-μ 1 μ 2. –The correlation coefficient of X 1 & X 2 is ρ=Cov(X 1,X 2 )/(σ 1 σ 2 ).  Ex4.2-1: X 1 & X 2 have a joint p.m.f. that is not a product of an x 1 -alone and an x 2 -alone factor ⇒ Dependent!

12 Insights into the Meaning of ρ  Among all points in S, ρ tends to be positive if, with larger probability, points lie simultaneously above or simultaneously below their respective means.  The least-squares regression line is the line through (μ x,μ y ) with the best slope b, chosen so that K(b)=E{[(Y-μ y )-b(X-μ x )] 2 } is minimized. –K(b) is the expected squared vertical distance from a point to the line. –ρ= ±1: K(b)=0 ⇒ all the points lie on the least-squares regression line. –ρ= 0: K(b)=σ y 2 and the line is y=μ y ; X and Y could be independent!!  ρ measures the amount of linearity in the probability distribution.
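For completeness, a short added derivation of the best slope (standard algebra, not shown in the transcript):

```latex
K(b) = E\{[(Y-\mu_Y)-b(X-\mu_X)]^2\}
     = \sigma_Y^2 - 2b\,\rho\,\sigma_X\sigma_Y + b^2\sigma_X^2,
\qquad K'(b)=0 \;\Rightarrow\; b=\rho\,\frac{\sigma_Y}{\sigma_X},
\qquad K\!\Big(\rho\,\frac{\sigma_Y}{\sigma_X}\Big)=\sigma_Y^2(1-\rho^2).
```

This confirms the bullets above: ρ=±1 gives K=0 (all mass on the line), and ρ=0 gives K=σ y 2.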

13 Example  Ex4.2-2: Roll a pair of fair 4-sided dice: X is the number of ones, Y is the number of twos and threes. –The joint p.m.f. is trinomial with n=2, p 1 =1/4, p 2 =1/2. –The means, variances, correlation coefficient, and line of best fit are computed in the sketch below.
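Since the table and the fitted line are missing from the transcript, the sketch below rebuilds them by enumerating the 16 equally likely rolls (an added reconstruction, not the original figure):

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# Roll a pair of fair 4-sided dice; X counts ones, Y counts twos-or-threes.
f = {}
for roll in product(range(1, 5), repeat=2):
    x = sum(d == 1 for d in roll)
    y = sum(d in (2, 3) for d in roll)
    f[(x, y)] = f.get((x, y), Fraction(0)) + Fraction(1, 16)

E = lambda u: sum(u(x, y) * p for (x, y), p in f.items())
mx, my = E(lambda x, y: x), E(lambda x, y: y)
vx = E(lambda x, y: (x - mx) ** 2)
vy = E(lambda x, y: (y - my) ** 2)
cov = E(lambda x, y: (x - mx) * (y - my))
rho = float(cov) / sqrt(float(vx) * float(vy))

# Least-squares line for Y on X: y = my + rho*(sy/sx)*(x - mx) = my + (cov/vx)*(x - mx).
slope = cov / vx
print(mx, my, rho, slope)   # 1/2, 1, rho = -1/sqrt(3) ~ -0.577, slope = -2/3
```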

14 Independence ⇒ ρ=0  If X and Y are independent, then E(XY)=E(X)E(Y), so Cov(X,Y)=0 and ρ=0; the converse is not necessarily true!  Ex4.2-3: The joint p.m.f. of X and Y is f(x,y)=1/3, (x,y)=(0,1), (1,0), (2,1). –The support is not rectangular, so X and Y are dependent, yet ρ=0 (see the sketch below).  Empirical Data: from n bivariate observations (x i,y i ), i=1..n, –we can compute the sample mean and variance for each variate, –and also the sample correlation coefficient and the sample least-squares regression line. (Ref. p.241)
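A quick check (added illustration) that the covariance really is zero even though X and Y are dependent:

```python
from fractions import Fraction

f = {(0, 1): Fraction(1, 3), (1, 0): Fraction(1, 3), (2, 1): Fraction(1, 3)}

E = lambda u: sum(u(x, y) * p for (x, y), p in f.items())
mx, my = E(lambda x, y: x), E(lambda x, y: y)
print(E(lambda x, y: (x - mx) * (y - my)))   # 0 -> rho = 0

# Yet X and Y are dependent: f(0,1) != f1(0) * f2(1) = (1/3)(2/3).
f1_0 = sum(p for (x, _), p in f.items() if x == 0)
f2_1 = sum(p for (_, y), p in f.items() if y == 1)
print(f[(0, 1)] == f1_0 * f2_1)              # False
```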

15 Conditional Distributions  Def.4.3-1: The conditional probability mass function of X, given that Y=y, is defined by g(x|y)=f(x,y)/f 2 (y), if f 2 (y)>0. –Likewise, h(y|x)=f(x,y)/f 1 (x), if f 1 (x)>0.  Ex.4.3-1: X and Y have the joint p.m.f. f(x,y)=(x+y)/21, x=1,2,3; y=1,2. –f 1 (x)=(2x+3)/21, x=1,2,3; f 2 (y)=(3y+6)/21, y=1,2. –Thus, given Y=y, the conditional p.m.f. of X is g(x|y)=(x+y)/(3y+6), x=1,2,3. –When y=1, g(x|1)=(x+1)/9, x=1,2,3; g(1|1):g(2|1):g(3|1)=2:3:4. –When y=2, g(x|2)=(x+2)/12, x=1,2,3; g(1|2):g(2|2):g(3|2)=3:4:5. –Similar relationships hold for h(y|x). Since g(x|y) depends on y, X and Y are dependent!

16 Conditional Mean and Variance  The conditional mean of Y, given X=x, is E(Y|x)=Σ y y·h(y|x).  The conditional variance of Y, given X=x, is Var(Y|x)=Σ y [y-E(Y|x)] 2 h(y|x)=E(Y 2 |x)-[E(Y|x)] 2.  Ex.4.3-2: [from Ex.4.3-1] X and Y have the joint p.m.f. f(x,y)=(x+y)/21, x=1,2,3; y=1,2; the conditional means and variances are computed in the sketch below.
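A sketch (added illustration) that computes E(Y|x) and Var(Y|x) for this joint p.m.f.:

```python
from fractions import Fraction

f = {(x, y): Fraction(x + y, 21) for x in (1, 2, 3) for y in (1, 2)}
f1 = {x: sum(f[(x, y)] for y in (1, 2)) for x in (1, 2, 3)}   # (2x+3)/21

def h(y, x):
    """Conditional p.m.f. h(y|x) = f(x, y) / f1(x)."""
    return f[(x, y)] / f1[x]

for x in (1, 2, 3):
    mean = sum(y * h(y, x) for y in (1, 2))
    var = sum((y - mean) ** 2 * h(y, x) for y in (1, 2))
    print(x, mean, var)
# x=1: E(Y|1) = 8/5,  Var(Y|1) = 6/25
# x=2: E(Y|2) = 11/7, Var(Y|2) = 12/49
# x=3: E(Y|3) = 14/9, Var(Y|3) = 20/81
```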

17 Relationship about Conditional Means  When the conditional means are linear in the conditioning value, they are E(Y|x)=μ Y +ρ(σ Y /σ X )(x-μ X ) and E(X|y)=μ X +ρ(σ X /σ Y )(y-μ Y ).  The point (μ X,μ Y ) lies on both lines; it is their intersection.  The product of the two slopes is ρ 2.  The ratio of the slopes is σ Y 2 /σ X 2.  These relations can be used to derive an unknown quantity from the ones that are known.

18 Example  Ex.4.3-3: X and Y have the trinomial p.m.f. with parameters n, p 1, p 2, p 3 =1-p 1 -p 2. –They have the marginal p.m.f.s b(n, p 1 ) and b(n, p 2 ), so μ X =np 1, σ X 2 =np 1 (1-p 1 ), μ Y =np 2, and σ Y 2 =np 2 (1-p 2 ). –Given X=x, Y is b(n-x, p 2 /(1-p 1 )), so the conditional mean E(Y|x)=(n-x)p 2 /(1-p 1 ) is a linear function of x.

19 Example for Continuous-Type R.V.s  Ex4.3-5: [From Ex4.1-10] The conditional distribution of Y, given X=x, is U(x,1). –[U(a,b) has mean (a+b)/2 and variance (b-a) 2 /12.] –Hence E(Y|x)=(x+1)/2 and Var(Y|x)=(1-x) 2 /12.

20 Transformations of R.V.s  In Section 3.5, a single variable X with distribution f(x) is transformed to Y=v(X), an increasing or decreasing function, as follows: –Continuous type: g(y)=f(v -1 (y))·|d v -1 (y)/dy| on the mapped support. –Discrete type: g(y)=f(v -1 (y)) on the mapped support.  Ex.4.4-1: X is b(n,p) and Y=X 2 ; if n=3, p=1/4, then g(y)=P(X=√y)=C(3,√y)(1/4)^√y (3/4)^(3-√y), y=0,1,4,9. –What transformation u(X/n) leads to a variance that is (approximately) free of p? Taylor's expansion of u(X/n) about p gives Var[u(X/n)] ≈ [u'(p)] 2 p(1-p)/n, so the variance is constant, or free of p, when u'(p) is proportional to 1/√(p(1-p)); see the sketch below. –Ex: compare X: b(100,1/4) with X: b(100,9/10).
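Solving u'(p) ∝ 1/√(p(1-p)) gives the classical variance-stabilizing choice u(x)=arcsin(√x), whose variance is approximately 1/(4n) for any p. This answer and the simulation below are additions for illustration, not quoted from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Var[arcsin(sqrt(X/n))] should be close to 1/(4n) = 0.0025 for both values of p.
for p in (0.25, 0.90):
    x = rng.binomial(n, p, size=200_000)
    y = np.arcsin(np.sqrt(x / n))
    print(p, y.var(), 1 / (4 * n))
```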

21 Multivariate Transformations  When the function Y=u(X) does not have a single-valued inverse, the possible inverse functions must be considered individually. –Each range is delimited to match the right inverse.  For multivariate transformations, the derivative is replaced by the Jacobian. –Continuous R.V.s X 1 and X 2 have the joint p.d.f. f(x 1, x 2 ). –If Y 1 =u 1 (X 1,X 2 ), Y 2 =u 2 (X 1,X 2 ) has the single-valued inverse X 1 =v 1 (Y 1,Y 2 ), X 2 =v 2 (Y 1,Y 2 ), then the joint p.d.f. of Y 1 and Y 2 is g(y 1,y 2 )=f(v 1 (y 1,y 2 ), v 2 (y 1,y 2 ))·|J|, where J is the Jacobian determinant det[∂x i /∂y j ]. –[Most difficult] The mapping of the supports must be considered.

22 Transformation to the Independent  Ex4.4-2: X 1 and X 2 have the joint p.d.f. f(x 1, x 2 )=2, 0<x 1 <x 2 <1. –Consider Y 1 =X 1 /X 2, Y 2 =X 2, i.e. X 1 =Y 1 Y 2, X 2 =Y 2, with Jacobian J=y 2. –The support maps onto 0<y 1 <1, 0<y 2 <1, and there g(y 1,y 2 )=2y 2. –The marginal p.d.f.s are g 1 (y 1 )=1, 0<y 1 <1, and g 2 (y 2 )=2y 2, 0<y 2 <1. – ∵ g(y 1,y 2 )=g 1 (y 1 )g 2 (y 2 ) ∴ Y 1,Y 2 are independent (verified symbolically in the sketch below).
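The change-of-variables step can be checked symbolically; the sympy sketch below (an added illustration) computes the Jacobian and both marginals for Y 1 =X 1 /X 2, Y 2 =X 2 :

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2', positive=True)

# Inverse transformation: x1 = y1*y2, x2 = y2.
x1, x2 = y1 * y2, y2
J = sp.Matrix([[sp.diff(x1, y1), sp.diff(x1, y2)],
               [sp.diff(x2, y1), sp.diff(x2, y2)]]).det()
print(J)                           # y2

g = 2 * sp.Abs(J)                  # joint p.d.f. g(y1, y2) = 2*y2 on 0 < y1 < 1, 0 < y2 < 1
g1 = sp.integrate(g, (y2, 0, 1))   # marginal of Y1: 1
g2 = sp.integrate(g, (y1, 0, 1))   # marginal of Y2: 2*y2
print(g1, g2, sp.simplify(g - g1 * g2) == 0)   # 1, 2*y2, True -> independent
```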

23 Transformation to the Dependent  Ex4.4-3: X 1 and X 2 are independent, each with p.d.f. f(x)=e -x, 0<x<∞. –Their joint p.d.f. is f(x 1, x 2 )=e -x1 e -x2, 0<x 1 <∞, 0<x 2 <∞. –Consider Y 1 =X 1 -X 2, Y 2 =X 1 +X 2, i.e. X 1 =(Y 1 +Y 2 )/2, X 2 =(Y 2 -Y 1 )/2, with |J|=1/2. –The support maps onto -y 2 <y 1 <y 2, 0<y 2 <∞, and there g(y 1,y 2 )=(1/2)e -y2. –The marginal p.d.f. of Y 1 is g 1 (y 1 )=(1/2)e -|y1|, -∞<y 1 <∞, the double exponential p.d.f. – ∵ g(y 1,y 2 ) ≠g 1 (y 1 )g 2 (y 2 ) ∴ Y 1,Y 2 are dependent.

24 Beta Distribution  Ex4.4-4: X 1 and X 2 have independent gamma distributions with parameters α,θ and β,θ. Their joint p.d.f. is f(x 1,x 2 )=x 1 ^(α-1) x 2 ^(β-1) e^(-(x 1 +x 2 )/θ) / [Γ(α)Γ(β)θ^(α+β)], 0<x 1 <∞, 0<x 2 <∞. –Consider Y 1 =X 1 /(X 1 +X 2 ), Y 2 =X 1 +X 2 : i.e., X 1 =Y 1 Y 2, X 2 =Y 2 -Y 1 Y 2, with |J|=y 2. –The marginal p.d.f.s are g 1 (y 1 )=[Γ(α+β)/(Γ(α)Γ(β))] y 1 ^(α-1) (1-y 1 )^(β-1), 0<y 1 <1 (a beta p.d.f.), and g 2 (y 2 )=y 2 ^(α+β-1) e^(-y 2 /θ) / [Γ(α+β)θ^(α+β)], 0<y 2 <∞ (a gamma p.d.f.). – ∵ g(y 1,y 2 )=g 1 (y 1 )g 2 (y 2 ) ∴ Y 1,Y 2 are independent.
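A Monte Carlo sketch of this result (added illustration; the parameter values α=2, β=3, θ=1 are my own choices): Y 1 =X 1 /(X 1 +X 2 ) behaves like a Beta(α,β) variate and is uncorrelated with Y 2 =X 1 +X 2, consistent with independence.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, beta, theta = 2.0, 3.0, 1.0

x1 = rng.gamma(alpha, theta, size=200_000)
x2 = rng.gamma(beta, theta, size=200_000)
y1, y2 = x1 / (x1 + x2), x1 + x2

# Beta(a, b) has mean a/(a+b) and variance ab/((a+b)^2 (a+b+1)).
print(y1.mean(), alpha / (alpha + beta))
print(y1.var(), alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1)))
print(np.corrcoef(y1, y2)[0, 1])   # close to 0
```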

25 Another Example  Ex.4.4-5: U: χ 2 (r 1 ) and V: χ 2 (r 2 ) are independent. –The joint p.d.f. of Z and U is obtained with the same transformation technique; integrating out the unwanted variable involves a kernel of the χ 2 (r 1 +r 2 ) form.  Knowledge of familiar distributions and their associated integration identities is useful for deriving the distributions of new random variables.

26  Two independent R.V.s have a joint p.m.f. equal to the product of the individual p.m.f.s. –Ex: X 1 is the number of spots on a fair die: f 1 (x 1 )=1/6, x 1 =1,2,3,4,5,6.  X 2 is the number of heads on 4 independent tosses of a fair coin: f 2 (x 2 )=C(4,x 2 )(1/2) 4, x 2 =0,1,2,3,4. –If X 1 and X 2 are independent, the joint p.m.f. is f 1 (x 1 )f 2 (x 2 ), so, e.g., P(X 1 =1 or 2, and X 2 =3 or 4)=(2/6)(5/16)=5/48.  If X 1 and X 2 have the same p.m.f. f(x), their joint p.m.f. is f(x 1 )f(x 2 ). –This collection of X 1 and X 2 is a random sample of size n=2 from f(x).

27 Linear Functions of Indep. R.V.s  Suppose Y=X 1 +X 2, with S 1 ={1,2,3,4,5,6} and S 2 ={0,1,2,3,4}. –Y has the support S={1,2, …,9,10}. –The p.m.f. g(y) of Y is the convolution g(y)=Σ x1 f 1 (x 1 )f 2 (y-x 1 ), summed over the x 1 with y-x 1 ∈ S 2.  The mathematical expectation (or expected value) of a function Y=u(X 1,X 2 ) is E(Y)=ΣΣ u(x 1,x 2 )f(x 1,x 2 ). –If X 1 and X 2 are independent, E[u 1 (X 1 )u 2 (X 2 )]=E[u 1 (X 1 )]E[u 2 (X 2 )].

28 Example  Ex4.5-1: X 1 and X 2 are two independent R.V.s from casting a die twice. –E(X 1 )=E(X 2 )=3.5; Var(X 1 )=Var(X 2 )=35/12; E(X 1 X 2 )=E(X 1 )E(X 2 )=12.25; –E[(X 1 -3.5)(X 2 -3.5)]=E(X 1 -3.5)E(X 2 -3.5)=0. –Y=X 1 +X 2 → E(Y)=E(X 1 )+E(X 2 )=7; –Var(Y)=E[(X 1 +X 2 -7) 2 ]=Var(X 1 )+Var(X 2 )=35/6.  The p.m.f. g(y) of Y, with S={2,3,4, …,12}, is g(y)=(6-|7-y|)/36.

29 General Cases  If X 1, …,X n are independent, their joint p.d.f. is f 1 (x 1 ) … f n (x n ). –The expected value of the product u 1 (X 1 ) … u n (X n ) is the product of the expected values of u 1 (X 1 ), …, u n (X n ).  If all these n distributions are the same, the collection of n independent and identically distributed (iid) random variables X 1, …,X n is a random sample of size n from that common distribution.  Ex4.5-2: X 1, X 2, X 3 are a random sample from a distribution with p.d.f. f(x)=e -x, 0<x<∞. –The joint p.d.f. is f(x 1 )f(x 2 )f(x 3 )=e -(x1+x2+x3), so, e.g., P(0 < X 1 < 1, 2 < X 2 < 4, 3 < X 3 < 7)=(1-e -1 )(e -2 -e -4 )(e -3 -e -7 ).

30 Distributions of Sums of Indep. R.V.s  The joint distribution of independent R.V.s is straightforward: it is simply the product of the individual p.m.f.s or p.d.f.s.  The distribution of their SUM, however, takes more work: –the joint p.m.f. or p.d.f. is still a simple product, –but through summation the R.V.s interfere with each other: some values of the sum occur in more ways, and therefore more frequently, than others.  Sampling distribution theory derives the distributions of functions of R.V.s (random variables). –The sample mean and sample variance are well-known such functions. –The sum of R.V.s is another example.

31 Example  Ex: X 1 and X 2 are two independent R.V.s from casting a 4-sided die twice. –The p.m.f. is f(x)=1/4, x=1,2,3,4. –The p.m.f. of Y=X 1 +X 2, with S={2,3,4,5,6,7,8}, is given by the convolution formula g(y)=Σ x f(x)f(y-x): g(2)=f(1)f(1)=1/16, g(3)=f(1)f(2)+f(2)f(1)=2/16, and in general g(y)=(4-|y-5|)/16, y=2, …,8.
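The convolution can be carried out directly; the sketch below (added illustration) recovers g(y) for this example:

```python
from fractions import Fraction

f = {x: Fraction(1, 4) for x in (1, 2, 3, 4)}   # p.m.f. of one fair 4-sided die

# Convolution formula: g(y) = sum over x of f(x) * f(y - x).
g = {y: sum(f[x] * f.get(y - x, Fraction(0)) for x in f) for y in range(2, 9)}
print(g)
# {2: 1/16, 3: 1/8, 4: 3/16, 5: 1/4, 6: 3/16, 7: 1/8, 8: 1/16}
```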

32 Theorems  Thm4.5-1: X 1, …,X n are independent with joint p.m.f. f 1 (x 1 ) … f n (x n ), and Y=u(X 1, …,X n ) has p.m.f. g(y). –Then E(Y)=Σ y y·g(y)=Σ…Σ u(x 1, …,x n )f 1 (x 1 ) … f n (x n ), if the summations exist. –For continuous-type R.V.s, integrals replace the summations.  Thm4.5-2: If X 1, …,X n are independent and the relevant expectations exist, then E[u 1 (X 1 )u 2 (X 2 ) … u n (X n )]=E[u 1 (X 1 )]E[u 2 (X 2 )] … E[u n (X n )].  Thm4.6-1: If X 1, …,X n are independent with means μ 1, …,μ n and variances σ 1 2, …,σ n 2, then Y=a 1 X 1 + … +a n X n, where the a i are real constants, has mean μ Y =a 1 μ 1 + … +a n μ n and variance σ Y 2 =a 1 2 σ 1 2 + … +a n 2 σ n 2.  Ex4.6-1: X 1 & X 2 are independent with μ 1 = -4, μ 2 =3 and σ 1 2 =4, σ 2 2 =9. –Y=3X 1 -2X 2 has μ Y =3(-4)-2(3)= -18 and σ Y 2 =3 2 (4)+(-2) 2 (9)=72.

33 Moment-generating Functions  Ex4.6-2: X 1, …,X n are a random sample of size n from a distribution with mean μ and variance σ 2. –The sample mean X̄=(1/n)(X 1 + … +X n ) then has E(X̄)=μ and Var(X̄)=σ 2 /n.  Thm4.6-2: If X 1, …,X n are independent R.V.s with moment-generating functions M i (t), i=1..n, then Y=a 1 X 1 + … +a n X n has the moment-generating function M Y (t)=Π i=1..n M i (a i t).  Cor4.6-1: If X 1, …,X n are independent R.V.s, each with m.g.f. M(t), –then Y=X 1 + … +X n has M Y (t)=[M(t)] n, –and X̄=(X 1 + … +X n )/n has M X̄ (t)=[M(t/n)] n.

34 Examples  Ex4.6-3: X 1, …,X n are the outcomes of n Bernoulli trials. –The moment-generating function of each X i is M(t)=q+pe t. –Then Y=X 1 + … +X n has M Y (t)=[q+pe t ] n, which is the m.g.f. of b(n,p).  Ex4.6-4: X 1,X 2,X 3 are the outcomes of a random sample of size n=3 from the exponential distribution with mean θ and M(t)=1/(1-θt), t<1/θ. –Then Y=X 1 +X 2 +X 3 has M Y (t)=[1/(1-θt)] 3 =(1-θt) -3, which is the m.g.f. of a gamma distribution with α=3 and θ. –X̄=Y/3 has a gamma distribution with α=3 and θ/3.
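A symbolic check of Ex4.6-4 (added illustration): multiply the three exponential m.g.f.s and compare with the gamma m.g.f.s for Y and for the sample mean.

```python
import sympy as sp

t, theta = sp.symbols('t theta', positive=True)

M_exp = 1 / (1 - theta * t)            # m.g.f. of an exponential with mean theta
M_Y = M_exp ** 3                       # m.g.f. of Y = X1 + X2 + X3
print(sp.simplify(M_Y - (1 - theta * t) ** (-3)) == 0)            # True: gamma, alpha=3, theta

# Sample mean Y/3: M(t/3)^3 = (1 - (theta/3) t)^(-3), i.e. gamma with alpha=3 and theta/3.
M_mean = (M_exp.subs(t, t / 3)) ** 3
print(sp.simplify(M_mean - (1 - theta * t / 3) ** (-3)) == 0)     # True
```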

35  Thm4.6-3: X 1, …,X n are independent and have χ 2 (r 1 ), …, χ 2 (r n ) distributions, respectively. Then Y=X 1 + … +X n is χ 2 (r 1 + … +r n ).  Pf: M Y (t)=Π i=1..n (1-2t) -ri/2 =(1-2t) -(r1+ … +rn)/2, t<1/2, which is the m.g.f. of χ 2 (r 1 + … +r n ).