
1 Expectation for multivariate distributions

2 Definition Let X1, X2, …, Xn denote n jointly distributed random variables with joint density function f(x1, x2, …, xn); then the expectation of any function g(X1, …, Xn) is obtained by integrating (or summing) g against f, as reconstructed below.
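The defining formula did not survive the transcript; a standard reconstruction (continuous case) is

\[
E\big[g(X_1,\dots,X_n)\big]
  = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty}
    g(x_1,\dots,x_n)\, f(x_1,\dots,x_n)\, dx_1\cdots dx_n ,
\]

with the integrals replaced by sums over the joint probability function in the discrete case.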

3 Example Let X, Y, Z denote 3 jointly distributed random variables with joint density function f(x, y, z). Determine E[XYZ].

4 Solution:
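The density and the worked integral from the original slides did not survive; purely as an illustration, assume the hypothetical density f(x, y, z) = 8xyz for 0 ≤ x, y, z ≤ 1 (and 0 otherwise). Then

\[
E[XYZ] = \int_0^1\!\!\int_0^1\!\!\int_0^1 xyz \cdot 8xyz \; dx\,dy\,dz
       = 8\left(\int_0^1 x^2\,dx\right)^{\!3} = 8\cdot\tfrac{1}{27} = \tfrac{8}{27}.
\]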

5 Some Rules for Expectation

6 Thus you can calculate E[Xi] either from the joint distribution of X1, …, Xn or from the marginal distribution of Xi. Proof:
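The rule itself (lost from this slide) is presumably that the two calculations agree:

\[
E[X_i] = \int_{-\infty}^{\infty} x_i\, f_i(x_i)\, dx_i
       = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} x_i\, f(x_1,\dots,x_n)\, dx_1\cdots dx_n ,
\]

where fi is the marginal density of Xi; the proof integrates out the remaining variables to obtain fi.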

7 The Linearity property (see the reconstruction below). Proof:
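The statement lost from this slide is presumably the standard linearity rule:

\[
E[a_1 X_1 + \dots + a_n X_n] = a_1 E[X_1] + \dots + a_n E[X_n]
\]

for any constants a1, …, an, proved by expanding the defining integral and applying the previous rule term by term.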

8 3. (The Multiplicative property) Suppose X1, …, Xq are independent of Xq+1, …, Xk; then the expectation of a product of a function of the first group and a function of the second group factors into the product of the expectations (reconstructed below). In the simple case when k = 2: if X and Y are independent, then E[XY] = E[X]E[Y].
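A reconstruction of the general form of the rule:

\[
E\big[g(X_1,\dots,X_q)\,h(X_{q+1},\dots,X_k)\big]
 = E\big[g(X_1,\dots,X_q)\big]\; E\big[h(X_{q+1},\dots,X_k)\big],
\]

which reduces to E[XY] = E[X]E[Y] when k = 2 and X, Y are independent.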

9 Proof:

10

11 Some Rules for Variance

12 Proof Thus

13 Note: If X and Y are independent, then Cov(X, Y) = 0, and hence Var(X + Y) = Var(X) + Var(Y).
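The formulas lost from slides 11–13 are presumably the standard variance rules:

\[
\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X), \qquad
\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y),
\]

where Cov(X, Y) = E[(X − μX)(Y − μY)] = E[XY] − μX μY.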

14 Definition: For any two random variables X and Y, define the correlation coefficient ρXY as reconstructed below; if X and Y are independent, then ρXY = 0.
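A reconstruction of the definition, which is standard:

\[
\rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X\, \sigma_Y}
          = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} ,
\]

so that independence, which makes Cov(X, Y) = 0, gives ρXY = 0 as the slide notes.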

15 Properties of the correlation coefficient ρXY: if X and Y are independent, then ρXY = 0. The converse is not necessarily true; i.e. ρXY = 0 does not imply that X and Y are independent.

16 More properties of the correlation coefficient ρXY: −1 ≤ ρXY ≤ 1, and ρXY = ±1 if there exist a and b such that Y = a + bX, where ρXY = +1 if b > 0 and ρXY = −1 if b < 0. Proof: Let g(b) = E[{(Y − μY) − b(X − μX)}²] ≥ 0 for all b. Consider choosing b to minimize g(b).

17 Consider choosing b to minimize g(b). Since g(b) ≥ 0 for all b, then g(bmin) ≥ 0.

18 Hence g(bmin) ≥ 0, and hence Cov(X, Y)² ≤ Var(X)Var(Y).

19 or −1 ≤ ρXY ≤ 1. Note: ρXY = ±1 if and only if g(bmin) = 0. This will be true if Y − μY = bmin(X − μX) with probability 1, i.e. Y = a + bmin X with a = μY − bmin μX.
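A compact reconstruction of the argument on slides 16–19:

\[
g(b) = E\big[\{(Y-\mu_Y) - b(X-\mu_X)\}^2\big]
     = \operatorname{Var}(Y) - 2b\operatorname{Cov}(X,Y) + b^2\operatorname{Var}(X) \;\ge\; 0,
\]
\[
b_{\min} = \frac{\operatorname{Cov}(X,Y)}{\operatorname{Var}(X)}, \qquad
g(b_{\min}) = \operatorname{Var}(Y) - \frac{\operatorname{Cov}(X,Y)^2}{\operatorname{Var}(X)} \ge 0
\;\Longrightarrow\; \rho_{XY}^2 \le 1,
\]

with equality exactly when Y is, with probability 1, a linear function a + bX of X.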

20 Summary: −1 ≤ ρXY ≤ 1, with ρXY = ±1 if there exist a and b such that Y = a + bX, where the sign of ρXY is the sign of b.

21 Proof Thus

22

23 Some Applications (Rules of Expectation & Variance). Let x̄ = (X1 + … + Xn)/n, where X1, …, Xn are n mutually independent random variables each having mean μ and standard deviation σ (variance σ²). Then E[x̄] = μ.

24 Also Var(x̄) = σ²/n, or σx̄ = σ/√n. Thus Var(x̄) → 0 as n increases. Hence the distribution of x̄ is centered at μ and becomes more and more compact about μ as n increases.
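A reconstruction of the computations on slides 23–24, using the rules of expectation and variance:

\[
E[\bar{x}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{n\mu}{n} = \mu, \qquad
\operatorname{Var}(\bar{x}) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n},
\]

using linearity of expectation and the fact that the variance of a sum of independent random variables is the sum of the variances.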

25 Tchebychev’s Inequality

26 Let X denote a random variable with mean μ = E(X) and variance Var(X) = E[(X − μ)²] = σ². Then, for any k > 0, the probability that X deviates from μ by at least k standard deviations is at most 1/k² (see the reconstruction below). Note: σ = √Var(X) is called the standard deviation of X.
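The inequality itself, lost from the slide, is standard:

\[
P\big[\,|X - \mu| \ge k\sigma\,\big] \le \frac{1}{k^2},
\quad\text{equivalently}\quad
P\big[\,|X - \mu| < k\sigma\,\big] \ge 1 - \frac{1}{k^2},
\qquad k > 0 .
\]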

27 Proof:

28
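The argument that presumably filled slides 27–28 is the standard proof, sketched here for the continuous case:

\[
\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx
        \;\ge\; \int_{|x-\mu| \ge k\sigma} (x-\mu)^2 f(x)\,dx
        \;\ge\; (k\sigma)^2 \int_{|x-\mu| \ge k\sigma} f(x)\,dx
        = k^2\sigma^2\, P\big[\,|X-\mu| \ge k\sigma\,\big],
\]

and dividing through by k²σ² gives the inequality.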

29 Tchebychev’s inequality is very conservative: for k = 1 the bound is 1, for k = 2 it is 1/4, and for k = 3 it is 1/9.

30 The Law of Large Numbers

31 Let x̄ = (X1 + … + Xn)/n, where X1, …, Xn are n mutually independent random variables each having mean μ (and finite variance σ²). Then for any δ > 0 (no matter how small), the probability that x̄ deviates from μ by more than δ tends to 0 as n grows, as reconstructed below.
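A reconstruction of the statement (δ is the arbitrary tolerance used above):

\[
\lim_{n\to\infty} P\big[\,|\bar{x} - \mu| > \delta\,\big] = 0
\qquad\text{for every } \delta > 0 .
\]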

32 Proof: Recall that x̄ has mean μ and standard deviation σ/√n. We will use Tchebychev’s inequality, which states that for any random variable X, P[|X − μX| ≥ kσX] ≤ 1/k².

33 Thus, applying the inequality to x̄ (as sketched below), the probability of a deviation larger than δ tends to 0 as n → ∞.
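A sketch of the missing step: apply Tchebychev’s inequality to x̄ (mean μ, standard deviation σ/√n) with kσ/√n = δ, i.e. k = δ√n/σ, so that

\[
P\big[\,|\bar{x} - \mu| > \delta\,\big] \le \frac{1}{k^2} = \frac{\sigma^2}{n\delta^2} \;\longrightarrow\; 0
\quad\text{as } n \to \infty .
\]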

34 Thus the Law of Large Numbers states that x̄ converges (in probability) to μ. A special case: let X1, …, Xn be n mutually independent random variables, each having a Bernoulli distribution with parameter p, so that x̄ is the proportion of successes in the n trials.

35 Thus the Law of Large Numbers states that the proportion of successes converges (in probability) to the probability of success p. Some people misinterpret this to mean that if the proportion of successes is currently lower than p, then the proportion of successes in the future will have to be larger than p to counter this and ensure that the Law of Large Numbers holds true. Of course, if in the infinite future the proportion of successes is p, then this is enough to ensure that the Law of Large Numbers holds true.

36 Some more applications: Rules of Expectation and Rules of Variance

37 The mean and variance of a Binomial random variable. We have already computed these by other methods: 1. Using the probability function p(x). 2. Using the moment generating function mX(t). Suppose that we have observed n independent repetitions of a Bernoulli trial. Let X1, …, Xn be n mutually independent random variables, each having a Bernoulli distribution with parameter p, defined by Xi = 1 if the i-th trial is a success and Xi = 0 otherwise.

38 Now X = X1 + … + Xn has a Binomial distribution with parameters n and p; X is the total number of successes in the n repetitions.
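A reconstruction of the computation: each indicator Xi has E[Xi] = p and Var(Xi) = p(1 − p) = pq, so by the rules above

\[
E[X] = \sum_{i=1}^{n} E[X_i] = np, \qquad
\operatorname{Var}(X) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = np(1-p) = npq .
\]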

39 The mean and variance of a Hypergeometric distribution. The hypergeometric distribution arises when we sample without replacement n objects from a population of N = a + b objects. The population is divided into two groups (group A and group B); group A contains a objects while group B contains b objects. Let X denote the number of objects in the sample of n that come from group A. The probability function of X is p(x) = (a choose x)(b choose n − x) / (N choose n).

40 Then the mean and variance of X are as given on slide 48 below. Proof: Let X1, …, Xn be n random variables defined by Xi = 1 if the i-th object selected comes from group A and Xi = 0 otherwise, so that X = X1 + … + Xn.

41 and Therefore

42 Thus

43 Also, we need to calculate E[XiXj] for i ≠ j (equivalently Cov(Xi, Xj)). Note:

44 and Thus Note:

45 and Thus

46 with Thus and

47 Thus

48 Thus if X has a hypergeometric distribution with parameters a, b and n, then its mean and variance are as reconstructed below.
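The results lost from slides 40–48 are presumably the standard hypergeometric mean and variance. A sketch of the reconstruction: P[Xi = 1] = a/N for each i, and for i ≠ j, P[Xi = 1, Xj = 1] = a(a − 1)/(N(N − 1)), so

\[
E[X_i] = \frac{a}{N},\qquad
\operatorname{Var}(X_i) = \frac{a}{N}\Big(1-\frac{a}{N}\Big),\qquad
\operatorname{Cov}(X_i, X_j) = \frac{a(a-1)}{N(N-1)} - \frac{a^2}{N^2} = -\frac{ab}{N^2(N-1)},
\]
\[
E[X] = n\,\frac{a}{N}, \qquad
\operatorname{Var}(X) = n\operatorname{Var}(X_i) + n(n-1)\operatorname{Cov}(X_i,X_j)
 = n\,\frac{a}{N}\,\frac{b}{N}\,\frac{N-n}{N-1},
\]

i.e. the binomial mean and variance with p = a/N, multiplied by the finite-population correction factor (N − n)/(N − 1).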

49 The mean and variance of a Negative Binomial distribution. The Negative Binomial distribution arises when we repeat a Bernoulli trial until k successes (S) occur. Then X = the trial on which the k-th success occurred. The probability function of X is p(x) = (x − 1 choose k − 1) p^k (1 − p)^(x − k) for x = k, k + 1, k + 2, …. Let X1 = the number of the trial on which the 1st success occurred, and Xi = the number of trials after the (i − 1)st success on which the i-th success occurred (i ≥ 2).

50 The Xi each have a geometric distribution with parameter p. Then X = X1 + … + Xk, and X1, …, Xk are mutually independent.

51 Thus if X has a negative binomial distribution with parameters k and p, then its mean and variance follow from those of the geometric distribution, as reconstructed below.
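A reconstruction of the result: each geometric Xi (counting trials until a success) has E[Xi] = 1/p and Var(Xi) = (1 − p)/p², so by independence

\[
E[X] = \sum_{i=1}^{k} E[X_i] = \frac{k}{p}, \qquad
\operatorname{Var}(X) = \sum_{i=1}^{k} \operatorname{Var}(X_i) = \frac{k(1-p)}{p^2} .
\]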

52 Multivariate Moments Non-central and Central

53 Definition Let X1 and X2 be jointly distributed random variables (discrete or continuous); then for any pair of positive integers (k1, k2) the joint moment of (X1, X2) of order (k1, k2) is defined to be:

54 Definition Let X1 and X2 be jointly distributed random variables (discrete or continuous); then for any pair of positive integers (k1, k2) the joint central moment of (X1, X2) of order (k1, k2) is defined to be the expression reconstructed below, where μ1 = E[X1] and μ2 = E[X2].
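Reconstructions of the two (standard) definitions:

\[
\mu'_{k_1,k_2} = E\big[X_1^{k_1} X_2^{k_2}\big], \qquad
\mu_{k_1,k_2} = E\big[(X_1 - \mu_1)^{k_1} (X_2 - \mu_2)^{k_2}\big],
\]

computed as double sums over the joint probability function in the discrete case and as double integrals against the joint density in the continuous case.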

55 Note: the joint central moment of order (1, 1), μ1,1 = E[(X1 − μ1)(X2 − μ2)], is the covariance of X1 and X2. Definition: For any two random variables X and Y, define the correlation coefficient ρXY = Cov(X, Y)/(σX σY).

56 Properties of the correlation coefficient ρXY: if X and Y are independent, then ρXY = 0. The converse is not necessarily true; i.e. ρXY = 0 does not imply that X and Y are independent.

57 More properties of the correlation coefficient: −1 ≤ ρXY ≤ 1, and ρXY = ±1 if there exist a and b such that Y = a + bX, where ρXY = +1 if b > 0 and ρXY = −1 if b < 0.

58 Some Rules for Expectation

59 Thus you can calculate E[Xi] either from the joint distribution of X1, …, Xn or from the marginal distribution of Xi. The Linearity property: E[a1X1 + … + anXn] = a1E[X1] + … + anE[Xn].

60 3. (The Multiplicative property) Suppose X1, …, Xq are independent of Xq+1, …, Xk; then the expectation of a product of a function of the first group and a function of the second group factors into the product of the expectations. In the simple case when k = 2: if X and Y are independent, then E[XY] = E[X]E[Y].

61 Some Rules for Variance

62 Note: If X and Y are independent, then Cov(X, Y) = 0, and hence Var(X + Y) = Var(X) + Var(Y).

63 Definition: For any two random variables X and Y, define the correlation coefficient ρXY = Cov(X, Y)/(σX σY); if X and Y are independent, then ρXY = 0.

64 Proof Thus

65

66 Distribution functions, Moments, Moment generating functions in the Multivariate case

67 The distribution function F(x). This is defined for any random variable X: F(x) = P[X ≤ x]. Properties: 1. F(−∞) = 0 and F(∞) = 1. 2. F(x) is non-decreasing (i.e. if x1 < x2 then F(x1) ≤ F(x2)). 3. F(b) − F(a) = P[a < X ≤ b].

68 4. Discrete Random Variables: F(x) is a non-decreasing step function, with a jump of height p(x) at each point x. [Figure: plots of F(x) and p(x).]

69 5. Continuous Random Variables: F(x) is a non-decreasing continuous function whose slope at x is the density f(x). To find the probability density function f(x), one first finds F(x); then f(x) = F′(x). [Figure: plot of F(x) with slope f(x) at x.]

70 The joint distribution function F(x1, x2, …, xk) is defined for k random variables X1, X2, …, Xk by F(x1, x2, …, xk) = P[X1 ≤ x1, X2 ≤ x2, …, Xk ≤ xk]. For k = 2, F(x1, x2) = P[X1 ≤ x1, X2 ≤ x2]. [Figure: the point (x1, x2) and the corresponding region in the (x1, x2)-plane.]

71 Properties: 1. F(x1, −∞) = F(−∞, x2) = F(−∞, −∞) = 0. 2. F(x1, ∞) = P[X1 ≤ x1, X2 ≤ ∞] = P[X1 ≤ x1] = F1(x1), the marginal cumulative distribution function of X1; F(∞, x2) = P[X1 ≤ ∞, X2 ≤ x2] = P[X2 ≤ x2] = F2(x2), the marginal cumulative distribution function of X2; and F(∞, ∞) = P[X1 ≤ ∞, X2 ≤ ∞] = 1.

72 3. F(x1, x2) is non-decreasing in both the x1 direction and the x2 direction, i.e. if a1 < b1 and a2 < b2 then: i. F(a1, x2) ≤ F(b1, x2); ii. F(x1, a2) ≤ F(x1, b2); iii. F(a1, a2) ≤ F(b1, b2). [Figure: the rectangle with corners (a1, a2), (b1, a2), (a1, b2), (b1, b2) in the (x1, x2)-plane.]

73 4. P[a < X1 ≤ b, c < X2 ≤ d] = F(b, d) − F(a, d) − F(b, c) + F(a, c). [Figure: the rectangle with corners (a, c), (b, c), (a, d), (b, d) in the (x1, x2)-plane.]

74 5. Discrete Random Variables: F(x1, x2) is a step surface. [Figure: the step surface F(x1, x2) over the (x1, x2)-plane.]

75 6. Continuous Random Variables: F(x1, x2) is a continuous surface. [Figure: the surface F(x1, x2) over the (x1, x2)-plane.]

76 Multivariate Moments Non-central and Central

77 Definition Let X1 and X2 be jointly distributed random variables (discrete or continuous); then for any pair of positive integers (k1, k2) the joint moment of (X1, X2) of order (k1, k2) is defined to be:

78 Definition Let X1 and X2 be jointly distributed random variables (discrete or continuous); then for any pair of positive integers (k1, k2) the joint central moment of (X1, X2) of order (k1, k2) is defined as on slide 54 above, where μ1 = E[X1] and μ2 = E[X2].

79 Note: the joint central moment of order (1, 1), μ1,1 = E[(X1 − μ1)(X2 − μ2)], is the covariance of X1 and X2.

80 Multivariate Moment Generating functions

81 Recall the moment generating function of a single random variable, reconstructed below.
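A reconstruction of the univariate definition being recalled:

\[
m_X(t) = E\big[e^{tX}\big]
       = \begin{cases} \displaystyle\sum_x e^{tx}\, p(x) & \text{(discrete case)} \\[6pt]
         \displaystyle\int_{-\infty}^{\infty} e^{tx} f(x)\, dx & \text{(continuous case),} \end{cases}
\]

with the property that the k-th derivative at t = 0 gives the k-th moment, m_X^{(k)}(0) = E[X^k].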

82 Definition Let X1, X2, …, Xk be jointly distributed random variables (discrete or continuous); then the joint moment generating function is defined to be:

83 Definition Let X1, X2, …, Xk be jointly distributed random variables (discrete or continuous); then the joint moment generating function is defined to be the expression reconstructed below.
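A reconstruction of the definition:

\[
m_{X_1,\dots,X_k}(t_1,\dots,t_k) = E\big[e^{t_1 X_1 + t_2 X_2 + \dots + t_k X_k}\big],
\]

evaluated as a multiple sum against the joint probability function in the discrete case and as a multiple integral against the joint density in the continuous case.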

84 Power series expansion of the joint moment generating function (k = 2), reconstructed below.
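A reconstruction of the expansion, in terms of the joint (non-central) moments defined earlier:

\[
m_{X_1,X_2}(t_1,t_2) = E\big[e^{t_1 X_1 + t_2 X_2}\big]
 = \sum_{j=0}^{\infty}\sum_{k=0}^{\infty} \mu'_{j,k}\, \frac{t_1^{\,j}\, t_2^{\,k}}{j!\,k!},
\qquad \mu'_{j,k} = E\big[X_1^{\,j} X_2^{\,k}\big].
\]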

