1 Variance of RVs Supplementary Notes Prepared by Raymond Wong Presented by Raymond Wong.

Presentation transcript:

1 Variance of RVs Supplementary Notes Prepared by Raymond Wong Presented by Raymond Wong

2 e.g.1 (Page 3) The distribution function D of a random variable X with finitely many values is the function on the values of X defined by D(x) = P(X = x). In other words, the distribution function assigns to each value x of X the probability that X takes that value.

3 We can visualize the distribution function using a diagram called a histogram: a graph that shows, for each integer value x of X, a rectangle of width 1 centered at x, whose height (and thus area) is proportional to the probability P(X = x).

4 Let X be a random variable denoting a number equal to 0, 1, 2, or 3. The sample space where we consider random variable X consists of the points X = 0, X = 1, X = 2, X = 3, with P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, P(X = 3) = 1/8. The histogram plots P(X = x) against x, with the vertical axis marked at 1/8, 2/8, 3/8.

5 Expected Value Consider that we flip a coin n times. Say we flip the coin 100 times. The expected number of heads is 50. To what extent do we expect to see 50 heads? Is it surprising to see 55, 60 or 65 heads instead? General question: how much do we expect a random variable to deviate from its expected value?

6 [Histogram: 10 flips] The area of the rectangles with bases from x = a to x = b is the probability that X is between a and b. Cumulative distribution function D: D(a, b) = P(a ≤ X ≤ b), e.g., D(2, 3) = P(2 ≤ X ≤ 3) = P(X = 2) + P(X = 3).

7 [Histograms: 10 flips; 25 flips, spread ~7] We observe that the results are not spread as broadly (relatively speaking) for 25 flips as for 10 flips. Virtually all results for 25 flips lie between 5 and 20.

8 [Histograms: 25 flips, spread ~7; 100 flips, centered at 50, from 35 to 65, spread ~15] We observe that the spread has only doubled even though the number of trials has quadrupled: 100 = 4 × 25, while 15 ≈ 2 × 7.

9 [Histograms: 100 flips, centered at 50; 400 flips] We observe that the spread has only doubled even though the number of trials has quadrupled: the number of trials is 400 = 4 × 100, while the spread is only 2 × 30.

10 [Histogram: coin flips] The curve is quite similar to the bell-shaped curve, called the normal curve. We want to study the "spread" mathematically.
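The observation above (quadrupling the number of flips roughly doubles the spread) can be checked numerically. The following is only a sketch: it assumes a fair coin, and it measures "spread" as the smallest half-width around the expected value that captures 99% of the probability. That 99% threshold and the helper names binomial_pmf and spread are choices made here for illustration, not something the slides define.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for n independent trials with success probability p."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def spread(n, p, coverage=0.99):
    """Smallest integer w such that P(|X - n*p| <= w) >= coverage (an illustrative definition)."""
    pmf = binomial_pmf(n, p)
    mean = n * p
    for w in range(n + 1):
        mass = sum(q for k, q in enumerate(pmf) if abs(k - mean) <= w)
        if mass >= coverage:
            return w
    return n

# Each row quadruples the number of flips; compare how the spread grows.
for n in (25, 100, 400):
    print(n, "flips: spread about", spread(n, 0.5))
```

Calling spread(n, 0.8) instead gives the corresponding numbers for the exam scenario with success probability 0.8.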

11 We have seen the scenario of flipping a fair coin n times. Consider another scenario in which a student answers n questions in an exam, where he answers each question correctly with probability 0.8.

12 [Histograms: 10 questions, centered at 8; 25 questions, spread ~5] We observe that the results are not spread as broadly (relatively speaking) for 25 questions as for 10 questions. Virtually all results for 25 questions lie between 14 and 25.

13 [Histograms: 25 questions, spread ~5; 100 questions, centered at 80, from 69 to 91, spread ~11] We observe that the spread has only doubled even though the number of trials has quadrupled: 100 = 4 × 25, while 11 ≈ 2 × 5.

14 [Histograms: 100 questions, centered at 80; 400 questions] We observe that the spread has only doubled even though the number of trials has quadrupled: the number of trials is 400 = 4 × 100, while the spread is only 2 × 22.

15 [Histograms: number of correct answers, including the 10-question case] The curve is quite similar to the bell-shaped curve, called the normal curve. We want to study the "spread" mathematically. However, the curves might show asymmetry, as in the 10-question case.

16 e.g.2 (Page 12) Illustration of Lemma 5.26: if X is a random variable that always takes on the value 5, then E(X) = 5. (The sample space has the single point X = 5, with P(X = 5) = 1.) Lemma 5.26 If X is a random variable that always takes on the value c, then E(X) = c. Why is it correct?

17 e.g.2 Lemma 5.26 If X is a random variable that always takes on the value c, then E(X) = c. Why is it correct? The sample space has the single point X = c with P(X = c) = 1 (e.g., X = 5 with P(X = 5) = 1), so E(X) = c × P(X = c) = c × 1 = c.

18 e.g.3 (Page 12) Corollary 5.27 Let X be a random variable on a sample space. Then E(E(X)) = E(X). Why is it correct? Note that E(X) is a value (or a constant), e.g., the expected number of heads when flipping coins. Let E(X) be μ, where μ is a value. Consider E(E(X)) = E(μ) = μ (by Lemma 5.26) = E(X).

19 e.g.4 (Page 13) We would like to have some way of measuring the deviation of X from E(X). That is, how does Y = X - E(X) behave? Our first attempt is to look at the expectation of Y. Consider E(X - E(X)) = E(X) - E(E(X)) = E(X) - E(X) = 0. Thus, E(Y) is identically zero, and is not a useful measure of how close a random variable is to its expectation. How about E(Y²)?

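20 e.g.5 (Page 14) Variance Let X be a random variable. We define the variance of X, denoted by V(X), to be the expected value of (X - E(X))², i.e., V(X) = E((X - E(X))²). That is, V(X) = Σ_x (x - E(X))² × P(X = x), summing over the values x of X (the sample space for random variable X), or equivalently V(X) = Σ_s (X(s) - E(X))² × P(s), summing over the outcomes s (the sample space for the original application).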

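21 e.g.5 Variance of a Random Variable X
a. V(X) = E((X - E(X))²)
b. Based on X: the sample space consists of the values x1, x2, x3, …, xn of X (numbers, e.g., 2), and V(X) = Σ_i (xi - E(X))² × P(X = xi).
c. Based on outcome: the sample space consists of the outcomes s1, s2, s3, …, sn of the original application (e.g., HTH, where X(HTH) = 2), and V(X) = Σ_i (X(si) - E(X))² × P(si).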

22 e.g.6 (Page 14) Suppose that we want to flip 4 coins. Let X be the random variable denoting the number of heads. (a) What is E(X)? (b) What is V(X)? The sample space of flipping 4 coins has 16 outcomes, each with probability 1/16: TTTT (4 tails); TTTH, TTHT, THTT, HTTT (3 tails); HTTH, HTHT, HHTT, TTHH, THTH, THHT (2 tails); THHH, HTHH, HHTH, HHHT (1 tail); HHHH (0 tails). We are interested in the sample space where each point is in the format of "X = ?".


24 e.g.6 X: random variable denoting the number of heads when we flip 4 coins. (a) What is E(X)? (b) What is V(X)? The sample space of flipping 4 coins has 16 outcomes, each with probability 1/16. The sample space where we are interested in X is X = 0, X = 1, X = 2, X = 3, X = 4, with probabilities 1/16, 4/16 (or 1/4), 6/16 (or 3/8), 4/16 (or 1/4), 1/16, respectively.


26 e.g.6 X: random variable denoting the number of heads when we flip 4 coins. The sample space where we are interested in X is X = 0, X = 1, X = 2, X = 3, X = 4, with probabilities 1/16, 4/16 (or 1/4), 6/16 (or 3/8), 4/16 (or 1/4), 1/16, respectively. (a) What is E(X)? (b) What is V(X)?
(a) E(X) = 0 × 1/16 + 1 × 1/4 + 2 × 3/8 + 3 × 1/4 + 4 × 1/16 = 2
(b) V(X) = E((X - E(X))²) = E((X - 2)²) = (0-2)² × 1/16 + (1-2)² × 1/4 + (2-2)² × 3/8 + (3-2)² × 1/4 + (4-2)² × 1/16 = 1
For later use: V(Y) = 1, where Y is the random variable denoting the number of heads when we flip 4 coins.
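The calculation in e.g.6 can be reproduced mechanically from the distribution of X. The sketch below uses exact fractions; the function names expectation and variance are illustrative choices, and the distribution is the one listed on the slide.

```python
from fractions import Fraction

def expectation(dist):
    """E(X) for a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """V(X) = E((X - E(X))^2), computed directly from the definition."""
    mu = expectation(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Number of heads when flipping 4 fair coins (probabilities from e.g.6).
heads_in_4_flips = {
    0: Fraction(1, 16),
    1: Fraction(4, 16),
    2: Fraction(6, 16),
    3: Fraction(4, 16),
    4: Fraction(1, 16),
}

print("E(X) =", expectation(heads_in_4_flips))
print("V(X) =", variance(heads_in_4_flips))
```

Using Fraction rather than floating point keeps the results exact, so they can be compared with the hand calculation without rounding concerns.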

27 e.g.7 (Page 15) Calculating variances from scratch is very time-consuming. Many random variables X, such as one with a binomial distribution, can actually be built as the sum of simpler random variables (e.g., X = X1 + X2 + X3). We know that E(X) = E(X1) + E(X2) + E(X3). Do you think that V(X) = V(X1) + V(X2) + V(X3)? We will answer this question a few slides later. Recall: V(Y) = 1, where Y is the random variable denoting the number of heads when we flip 4 coins.

28 e.g.8 (Page 16) Suppose that we want to flip ONE coin. Let X be the random variable denoting the number of heads. (a) What is E(X)? (b) What is V(X)? The sample space of flipping ONE coin is T (0 heads, probability 1/2) and H (1 head, probability 1/2). We are interested in the sample space where each point is in the format of "X = ?". Recall: V(Y) = 1, where Y is the random variable denoting the number of heads when we flip 4 coins.


30 e.g.8 X: random variable denoting the number of heads when we flip ONE coin. (a) What is E(X)? (b) What is V(X)? The sample space of flipping ONE coin is T (0 heads, probability 1/2) and H (1 head, probability 1/2). The sample space where we are interested in X is X = 0 and X = 1, each with probability 1/2.


32 e.g.8 X: random variable denoting the number of heads when we flip ONE coin. The sample space where we are interested in X is X = 0 and X = 1, each with probability 1/2. (a) What is E(X)? (b) What is V(X)?
(a) E(X) = 0 × 1/2 + 1 × 1/2 = 1/2
(b) V(X) = E((X - E(X))²) = E((X - 1/2)²) = (0 - 1/2)² × 1/2 + (1 - 1/2)² × 1/2 = 1/4
Now consider flipping FOUR coins. Let Xi be the random variable denoting the number of heads for the i-th flip, and let Y = X1 + X2 + X3 + X4 be the number of heads when we flip 4 coins, so V(Y) = 1. From this example, we observe that V(Y) = V(X1) + V(X2) + V(X3) + V(X4) (i.e., V(Y) = 4V(X)).

33 e.g.9 (Page 17) Suppose that a student answers 5 questions in an exam. Suppose that he can answer a question correctly with probability 0.8. Let X be the random variable denoting the number of questions he answers correctly. (a) What is E(X)? (b) What is V(X)? The sample space of answering 5 questions consists of sequences of R (answered correctly) and W (answered wrongly), e.g., RRRRR (5 correct) and RWRRR, RRWRR, RRRWR, … (4 correct). We are interested in the sample space where each point is in the format of "X = ?".


35 e.g.9 X: random variable denoting the number of questions he answers correctly for 5 Qs. (a) What is E(X)? (b) What is V(X)? The sample space where we are interested in X is X = 0, X = 1, X = 2, X = 3, X = 4, X = 5. Note that answering 5 questions is a Bernoulli trial process with success probability 0.8.


37 e.g.9 X: random variable denoting the number of questions he answers correctly for 5 Qs. The sample space where we are interested in X is X = 0, X = 1, X = 2, X = 3, X = 4, X = 5. (a) What is E(X)? (b) What is V(X)?
(a) E(X) = 0 × P(X=0) + 1 × P(X=1) + 2 × P(X=2) + 3 × P(X=3) + 4 × P(X=4) + 5 × P(X=5) = 4
OR: we know that this is a Bernoulli trial process with 5 trials and success prob. = 0.8, so E(X) = 5 × 0.8 = 4.
(b) V(X) = (0-4)² × P(X=0) + (1-4)² × P(X=1) + (2-4)² × P(X=2) + (3-4)² × P(X=3) + (4-4)² × P(X=4) + (5-4)² × P(X=5) = 0.8
So V(X) = 0.8.

38 e.g.9 X: random variable denoting the number of questions he answers correctly for 5 Qs. V(X) = 0.8. Xi: random variable denoting the number of questions he answers correctly for the i-th question. (a) What is E(Xi)? (b) What is V(Xi)? The sample space of answering the i-th question is W (0 questions correct, probability 0.2) and R (1 question correct, probability 0.8). The sample space where we are interested in Xi is Xi = 0 (probability 0.2) and Xi = 1 (probability 0.8).
(a) E(Xi) = 0.2 × 0 + 0.8 × 1 = 0.8
OR: we know that this is a Bernoulli trial process with 1 trial and success prob. = 0.8, so E(Xi) = 1 × 0.8 = 0.8.
(b) V(Xi) = (0-0.8)² × 0.2 + (1-0.8)² × 0.8 = 0.16
From this example, we observe that V(X) = V(X1) + V(X2) + V(X3) + V(X4) + V(X5) (i.e., V(X) = 5V(Xi)).
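The numbers in e.g.9 can be checked by building the distribution of X explicitly. This is a sketch under the assumption that the 5 answers are independent Bernoulli trials with success probability 0.8 (written as the exact fraction 4/5); the helper names binomial_dist and mean_and_variance are illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_dist(n, p):
    """{k: P(X = k)} for the number of successes in n independent trials."""
    return {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

def mean_and_variance(dist):
    """Return (E(X), V(X)) using V(X) = E((X - E(X))^2)."""
    mu = sum(k * q for k, q in dist.items())
    var = sum((k - mu) ** 2 * q for k, q in dist.items())
    return mu, var

p = Fraction(4, 5)                                     # success probability 0.8
e_x, v_x = mean_and_variance(binomial_dist(5, p))      # five questions
e_xi, v_xi = mean_and_variance(binomial_dist(1, p))    # a single question

print("E(X) =", e_x, " V(X) =", v_x)
print("E(Xi) =", e_xi, " V(Xi) =", v_xi, " 5*V(Xi) =", 5 * v_xi)
```

The last line lets the reader compare V(X) with 5V(Xi) directly.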

39 e.g.10 (Page 18) Suppose that the bag contains two coins (a $1 coin and a $5 coin). (a) Suppose that we withdraw one coin from the bag. Let X1 be the random variable denoting the amount of money we obtain. (i) What is E(X1)? (ii) What is V(X1)?

40 e.g.10 Bag: {$1, $5}. X1: random variable denoting the amount of money we obtain. (a) (i) What is E(X1)? (ii) What is V(X1)? The sample space of withdrawing a coin is $1 and $5, each with probability 0.5. The sample space where we are interested in X1 is X1 = 1 and X1 = 5, each with probability 0.5.
(a) (i) E(X1) = 0.5 × 1 + 0.5 × 5 = 3
(ii) V(X1) = (1-3)² × 0.5 + (5-3)² × 0.5 = 4
So E(X1) = 3 and V(X1) = 4.

41 e.g.10 Bag: {$1, $5}. From part (a), for a single draw, E(X1) = 3 and V(X1) = 4. (b) Suppose that we withdraw two coins from the bag (one after the other), without replacement. Let X1 be the random variable denoting the amount of money we obtain for the 1st draw. Let X2 be the random variable denoting the amount of money we obtain for the 2nd draw. Let X be the random variable denoting the amount of money we obtain after these two draws. (NOTE: X = X1 + X2) (i) What is E(X1)? (ii) What is V(X1)? (iii) What is E(X2)? (iv) What is V(X2)? (v) What is E(X)? (vi) What is V(X)?

42 e.g.10 Bag: {$1, $5}. X1: amount of money we obtain for the first draw. X2: amount of money we obtain for the second draw. X: amount of money we obtain after these two draws (NOTE: X = X1 + X2).
(i) E(X1) = 3
(ii) V(X1) = 4
(iii) E(X2) = 3
(iv) V(X2) = 4
(v) There are two cases. Case 1: $1 → $5, X = 1 + 5 = 6, P(Case 1) = 0.5. Case 2: $5 → $1, X = 5 + 1 = 6, P(Case 2) = 0.5. E(X) = 6 × 0.5 + 6 × 0.5 = 6
(vi) V(X) = (6-6)² × 0.5 + (6-6)² × 0.5 = 0
From this example, we observe that V(X) ≠ V(X1) + V(X2) (i.e., V(X) ≠ 2V(Xi)).
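The bag example can be verified by enumerating the two equally likely orders of drawing. The sketch below is illustrative only; the helper var takes a function mapping a draw sequence to a number, which is a convenience introduced here and not notation from the slides.

```python
from fractions import Fraction
from itertools import permutations

coins = [1, 5]
# Drawing two coins without replacement: the ordered outcomes (1, 5) and (5, 1).
draws = list(permutations(coins, 2))
prob = Fraction(1, len(draws))  # each order is equally likely

def var(f):
    """Variance of the random variable f over the equally likely draw sequences."""
    mu = sum(f(d) * prob for d in draws)
    return sum((f(d) - mu) ** 2 * prob for d in draws)

v_x1 = var(lambda d: d[0])            # first draw
v_x2 = var(lambda d: d[1])            # second draw
v_total = var(lambda d: d[0] + d[1])  # X = X1 + X2

print("V(X1) =", v_x1, " V(X2) =", v_x2, " V(X1 + X2) =", v_total)
```

Because the second draw is completely determined by the first, the sum is constant and its variance differs from V(X1) + V(X2).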

43 e.g.11 (Page 20) We know that E(X1 + X2) = E(X1) + E(X2). In previous examples, we observe that in some cases V(X1 + X2) = V(X1) + V(X2); there, X1 and X2 are "independent". In other cases, V(X1 + X2) ≠ V(X1) + V(X2); there, X1 and X2 are not "independent".

44 e.g.12 (Page 21) Let X be a random variable. Let Y be a random variable. X and Y are independent when "X has value x" is independent of "Y has value y", regardless of the choice of x and y. Formally, X and Y are independent if and only if for all values x, y, P((X = x) ∧ (Y = y)) = P(X = x) × P(Y = y).

45 e.g.13 (Page 21) E.g., suppose that we roll two dice. X is the random variable denoting the amount rolled on the first die. Y is the random variable denoting the amount rolled on the second die. X and Y are independent because for every 1 ≤ i, j ≤ 6, P((X = i) ∧ (Y = j)) = P(X = i) × P(Y = j).
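The definition of independence can be tested exhaustively for the two-dice example. This is a sketch assuming two fair dice with 36 equally likely outcomes; is_independent and the example random variables are names chosen here for illustration. The sum of the two dice is included as a contrasting case that is not independent of the first die.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_independent(f, g):
    """Check P(f = a and g = b) == P(f = a) * P(g = b) for all values a, b."""
    f_values = {f(o) for o in outcomes}
    g_values = {g(o) for o in outcomes}
    return all(
        prob(lambda o: f(o) == a and g(o) == b)
        == prob(lambda o: f(o) == a) * prob(lambda o: g(o) == b)
        for a in f_values
        for b in g_values
    )

first = lambda o: o[0]         # X: amount on the first die
second = lambda o: o[1]        # Y: amount on the second die
total = lambda o: o[0] + o[1]  # sum of both dice (depends on X)

print(is_independent(first, second))
print(is_independent(first, total))
```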

46 e.g.14 (Page 22) We want to show the following: let X1 be a random variable and let X2 be a random variable. If X1 and X2 are independent, then V(X1 + X2) = V(X1) + V(X2). Before we show the above statement, we need to illustrate some other statements.

47 e.g.14 Lemma 5.28 If X and Y are independent random variables on sample space S with values x1, x2, …, xk and y1, y2, …, ym, respectively, then E(XY) = E(X)E(Y). Why is it correct? The answer can be found in the appendix.

48 e.g.22 (Page 25) Suppose that we flip two fair coins. We have two random variables X and Y. X = 1 if head for coin 1, 0 if tail for coin 1. Y = 0 if head for coin 2, 1 if tail for coin 2. (a) What is E(X)? (b) What is E(Y)? (c) What is E(XY)? (d) Is "E(XY) = E(X)E(Y)"?
(a) E(X) = 1 × 1/2 + 0 × 1/2 = 1/2
(b) E(Y) = 0 × 1/2 + 1 × 1/2 = 1/2
(c) There are 4 cases, each with probability 1/4.
Case 1: HH, X = 1, Y = 0, XY = 0
Case 2: HT, X = 1, Y = 1, XY = 1
Case 3: TH, X = 0, Y = 0, XY = 0
Case 4: TT, X = 0, Y = 1, XY = 0
E(XY) = 0 × 1/4 + 1 × 1/4 + 0 × 1/4 + 0 × 1/4 = 1/4
(d) Yes. E(X)E(Y) = 1/4 = E(XY).

49 e.g.22 Conclusion (two fair coins, X defined on coin 1 and Y on coin 2): if X and Y are independent, E(XY) = E(X)E(Y).

50 e.g.23 (Page 25) Suppose that we flip one fair coin. We have two random variables X and Z. X = 1 if head, 0 if tail. Z = 1 - X (i.e., Z = 0 if X = 1 (head), 1 if X = 0 (tail)). (a) What is E(X)? (b) What is E(Z)? (c) What is E(XZ)? (d) Is "E(XZ) = E(X)E(Z)"?
(a) E(X) = 1 × 1/2 + 0 × 1/2 = 1/2
(b) E(Z) = 0 × 1/2 + 1 × 1/2 = 1/2
(c) There are 2 cases, each with probability 1/2.
Case 1: H, X = 1, Z = 0, XZ = 0
Case 2: T, X = 0, Z = 1, XZ = 0
E(XZ) = 0 × 1/2 + 0 × 1/2 = 0
(d) No. E(X)E(Z) = 1/4 and E(XZ) = 0. Thus, E(X)E(Z) ≠ E(XZ).

51 e.g.23 Conclusion (one fair coin, Z = 1 - X): if X and Z are NOT independent, E(XZ) need not equal E(X)E(Z). In this example, E(XZ) = 0 ≠ 1/4 = E(X)E(Z).
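Both e.g.22 and e.g.23 can be replayed by enumerating outcomes. The sketch below assumes fair coins, so every outcome is equally likely; the helper expect and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def expect(outcomes, f):
    """E(f) over equally likely outcomes."""
    p = Fraction(1, len(outcomes))
    return sum(f(o) * p for o in outcomes)

# e.g.22: two fair coins, X read from coin 1 and Y from coin 2.
two_coins = list(product("HT", repeat=2))
X = lambda o: 1 if o[0] == "H" else 0
Y = lambda o: 0 if o[1] == "H" else 1
e_xy = expect(two_coins, lambda o: X(o) * Y(o))
e_x_times_e_y = expect(two_coins, X) * expect(two_coins, Y)

# e.g.23: one fair coin, Z = 1 - X.
one_coin = ["H", "T"]
X1 = lambda o: 1 if o == "H" else 0
Z = lambda o: 1 - X1(o)
e_xz = expect(one_coin, lambda o: X1(o) * Z(o))
e_x_times_e_z = expect(one_coin, X1) * expect(one_coin, Z)

print("e.g.22:", e_xy, "vs", e_x_times_e_y)
print("e.g.23:", e_xz, "vs", e_x_times_e_z)
```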

52 e.g.24 (Page 26) Theorem 5.29 If X and Y are independent random variables, then V(X+Y) = V(X) + V(Y). Why is it correct? Recall that V(X) = E((X - E(X))²).
V(X+Y) = E(((X+Y) - E(X+Y))²)
= E(((X+Y) - [E(X) + E(Y)])²) (by linearity of expectation)
= E((X + Y - E(X) - E(Y))²)
= E((X - E(X) + Y - E(Y))²)
= E(([X - E(X)] + [Y - E(Y)])²)
= E([X - E(X)]² + 2[X - E(X)][Y - E(Y)] + [Y - E(Y)]²)
= E([X - E(X)]²) + E(2[X - E(X)][Y - E(Y)]) + E([Y - E(Y)]²) (by linearity of expectation)
= E([X - E(X)]²) + 2E([X - E(X)][Y - E(Y)]) + E([Y - E(Y)]²)
= V(X) + 2E([X - E(X)][Y - E(Y)]) + V(Y)
= V(X) + V(Y) + 2E([X - E(X)][Y - E(Y)])
If we can prove that "E([X - E(X)][Y - E(Y)]) = 0", then "V(X+Y) = V(X) + V(Y)". In the following, we will prove "E([X - E(X)][Y - E(Y)]) = 0".

53 e.g.24 In the following, we will prove "E([X - E(X)][Y - E(Y)]) = 0". We use two facts: if c is a constant and X is a random variable, E(X × c) = cE(X); and if c1 and c2 are constants, E(c1 × c2) = c1 × c2.
E([X - E(X)][Y - E(Y)])
= E(XY - X × E(Y) - E(X) × Y + E(X)E(Y))
= E(XY) - E(X × E(Y)) - E(E(X) × Y) + E(E(X)E(Y)) (by linearity of expectation)
= E(XY) - E(Y) × E(X) - E(X) × E(Y) + E(E(X)E(Y)) (since E(X × c) = cE(X))
= E(XY) - E(Y) × E(X) - E(X) × E(Y) + E(X)E(Y) (since E(c1 × c2) = c1 × c2)
= E(X)E(Y) - E(Y) × E(X) - E(X) × E(Y) + E(X)E(Y) (by Lemma 5.28, i.e., E(XY) = E(X)E(Y), since X and Y are independent)
= 0
Done! From the previous slide, V(X+Y) = V(X) + V(Y) + 2E([X - E(X)][Y - E(Y)]). Thus, V(X+Y) = V(X) + V(Y).
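The identity V(X+Y) = V(X) + V(Y) + 2E([X - E(X)][Y - E(Y)]) holds whether or not X and Y are independent; independence only makes the cross term vanish. The sketch below checks this on two small joint distributions taken from the earlier coin examples. Representing a joint distribution as a dictionary {(x, y): probability} is an assumption made for this illustration.

```python
from fractions import Fraction

def expect(joint, f):
    """E(f(X, Y)) for a joint distribution {(x, y): probability}."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def variance_terms(joint):
    """Return (V(X+Y), V(X), V(Y), E([X - E(X)][Y - E(Y)]))."""
    ex = expect(joint, lambda x, y: x)
    ey = expect(joint, lambda x, y: y)
    vx = expect(joint, lambda x, y: (x - ex) ** 2)
    vy = expect(joint, lambda x, y: (y - ey) ** 2)
    cross = expect(joint, lambda x, y: (x - ex) * (y - ey))
    v_sum = expect(joint, lambda x, y: (x + y - ex - ey) ** 2)
    return v_sum, vx, vy, cross

quarter, half = Fraction(1, 4), Fraction(1, 2)
# Independent: X and Y come from two different fair coins (as in e.g.22).
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
# Dependent: Y = 1 - X on a single fair coin (as in e.g.23).
dependent = {(1, 0): half, (0, 1): half}

for name, joint in (("independent", independent), ("dependent", dependent)):
    v_sum, vx, vy, cross = variance_terms(joint)
    print(name, "V(X+Y) =", v_sum, " V(X)+V(Y) =", vx + vy, " cross term =", cross)
```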

54 e.g.25 (Page 29) Suppose that we flip one fair coin. We have a random variable X. X = 1 if head, 0 if tail. From some examples we illustrated before, we know that V(X) = 1/4. Suppose that we flip one fair coin repeatedly. We have random variables X1, X2, …, where Xi = 1 if head for the i-th flip, 0 if tail for the i-th flip. (a) According to Theorem 5.29, what is V(X1 + X2 + … + X10)? (b) According to Theorem 5.29, what is V(X1 + X2 + … + X100)? (c) According to Theorem 5.29, what is V(X1 + X2 + … + X400)?
(a) V(X1 + X2 + … + X10) = V(X1) + V(X2) + … + V(X10) = 1/4 × 10 = 5/2
(b) V(X1 + X2 + … + X100) = V(X1) + V(X2) + … + V(X100) = 1/4 × 100 = 25
(c) V(X1 + X2 + … + X400) = V(X1) + V(X2) + … + V(X400) = 1/4 × 400 = 100

55 e.g.25 Conclusion: Flipping the coin 10 times → variance = 5/2. Flipping the coin 100 times → variance = 25. Flipping the coin 400 times → variance = 100.

56 e.g.26 (Page 30) Theorem 5.x In a Bernoulli trials process with n trials in which each experiment has two outcomes and probability p of success, the variance of the number of successes is np(1-p). Why is it correct?
Let Xi = 1 if success for the i-th trial (probability p), 0 if fail for the i-th trial (probability 1-p).
E(Xi) = 1 × p + 0 × (1-p) = p
V(Xi) = (1 - E(Xi))² × p + (0 - E(Xi))² × (1-p)
= (1-p)² × p + (0-p)² × (1-p)
= (1-p)² × p + p² × (1-p)
= (1-p) × (1-p) × p + p × p × (1-p)
= (1-p) × p × ((1-p) + p)
= (1-p) × p × 1
= (1-p) × p
Let X be the number of successes (i.e., X = X1 + X2 + … + Xn). Since the trials are independent, by Theorem 5.29 we know
V(X) = V(X1 + X2 + … + Xn)
= V(X1) + V(X2) + … + V(Xn)
= (1-p) × p + (1-p) × p + … + (1-p) × p
= n(1-p) × p
= np(1-p)
Done!
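The formula np(1-p) can be compared against a brute-force computation over every sequence of n trial results. This sketch assumes independent trials, so a sequence with k successes has probability p^k (1-p)^(n-k); the function name variance_by_enumeration is an illustrative choice, and the approach is only practical for small n because there are 2^n sequences.

```python
from fractions import Fraction
from itertools import product

def variance_by_enumeration(n, p):
    """V(number of successes) from the definition, summing over all 2**n outcome sequences."""
    weighted = []
    for outcome in product((0, 1), repeat=n):  # 1 = success, 0 = fail
        k = sum(outcome)
        weighted.append((k, p**k * (1 - p) ** (n - k)))
    mean = sum(k * q for k, q in weighted)
    return sum((k - mean) ** 2 * q for k, q in weighted)

for n, p in ((4, Fraction(1, 2)), (5, Fraction(4, 5)), (6, Fraction(1, 3))):
    print(n, p, variance_by_enumeration(n, p), n * p * (1 - p))
```

The first two parameter pairs correspond to the coin example (4 flips) and the exam example (5 questions); the third is an arbitrary extra case.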

57 e.g.27 (Page 31) Let X be a random variable. The standard deviation of X, denoted by σ(X), is defined to be σ(X) = √V(X). Sometimes, we write σ only.

58 e.g.28 (Page 32) Conclusion: Flipping the coin 100 times → variance = 25 (so σ = 5 and 3σ = 15). Flipping the coin 400 times → variance = 100 (so σ = 10 and 3σ = 30). [Histograms: 100 flips; 400 flips, each marked at 3σ on either side of the expected value] This means that "most" data can be found within ±3 standard deviations of the expected value.

59 e.g.28 Conclusion: Flipping the coin 100 times → variance = 25. Flipping the coin 400 times → variance = 100. This means that "most" data can be found within ±3 standard deviations of the expected value. How about flipping the coin 25 times? Since flipping the coin 25 times (with success prob. = 0.5) is a Bernoulli trial process, we know that the variance is np(1-p) = 25 × 1/2 × (1 - 1/2) = 25/4, so σ = 5/2 and 3σ = 15/2. We know that "most" data can be found within ±15/2 of the expected value.

60 e.g.29 (Page 33) Central Limit Theorem [Plots of P(X = x) against x, marked at σ and at 2σ on either side of the expected value] Within σ: about 68% of the data can be found within 1 standard deviation of the expected value. Within 2σ: about 95.5% of the data can be found within 2 standard deviations of the expected value.

61 e.g.29 Central Limit Theorem [Plot of P(X = x) against x, marked at 3σ on either side of the expected value] Within 3σ: about 99.7% of the data can be found within 3 standard deviations of the expected value.

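62 e.g.29 Central Limit Theorem [Plot of P(X = x) against x, with the region between a and b shaded] P(a ≤ X ≤ b) is the area under the curve between a and b. This curve is called a normal distribution.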
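The percentages quoted for 1, 2 and 3 standard deviations can be recovered from the normal curve itself. The sketch below assumes X follows the normal distribution exactly (for coin flips this is only an approximation) and uses the standard identity P(|X - E(X)| ≤ kσ) = erf(k/√2); the function name within is an illustrative choice.

```python
from math import erf, sqrt

def within(k):
    """Probability that a normally distributed value lies within k standard deviations of its mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {within(k):.4f}")
```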

63 e.g.30 (Page 36) Suppose that we want to be 95% sure that the number of heads in n coin flips is within ±1% of the expected value. How big does n have to be? X: number of heads in n coin flips. This is a Bernoulli trial process with n trials and success probability 0.5, so E(X) = np = n × 1/2 = n/2 and Variance = np(1-p) = n × 1/2 × (1 - 1/2) = n × 1/2 × 1/2 = n/4. By the Central Limit Theorem, about 95.5% of the data can be found within 2 standard deviations of the expected value. Thus, we want 2σ = 0.01 × E(X), i.e., 2 × √(n/4) = 0.01 × n/2.


65 e.g.30 Suppose that we want to be 95% sure that the number of heads in n coin flips is within ±1% of the expected value. How big does n have to be? X: number of heads in n coin flips.
2 × √(n/4) = 0.01 × n/2
√n = 0.005 × n
1 = 0.005 × n/√n
1 = 0.005 × √n
√n = 1/0.005
n = (1/0.005)²
n = 40000
Therefore, if we flip the coin 40000 times, then we are 95% sure that the number of heads is within ±1% of the expected value.
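The same algebra works for any success probability p and any relative tolerance r: solving k × √(np(1-p)) = r × np for n gives n = k²(1-p) / (r²p), where k is the number of standard deviations (2 in e.g.30). The sketch below packages this; required_trials and its parameter names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

def required_trials(p, rel_tol, num_sd=2):
    """Solve num_sd * sqrt(n*p*(1-p)) = rel_tol * n*p for n."""
    return num_sd**2 * (1 - p) / (rel_tol**2 * p)

# e.g.30: fair coin, within 1% of the expected value, using 2 standard deviations.
n = required_trials(Fraction(1, 2), Fraction(1, 100))
print(n)
```

Exact fractions are used so that the result can be compared with the hand derivation without floating-point rounding.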

66 Appendix

67 e.g.14 Outline: In the following, we want to prove the following lemma, with some explanations. Lemma 5.28 If X and Y are independent random variables on sample space S with values x1, x2, …, xk and y1, y2, …, ym, respectively, then E(XY) = E(X)E(Y).

68 e.g.15 (Page 23) Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. Given a particular xi, what does [expression not preserved] mean?

69 e.g.16 (Page 23) Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. What does [expression not preserved] mean?

70 e.g.17 (Page 23) Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. What is the set H of all possible values of x × y where x ∈ X and y ∈ Y?
x = 4, y = 6: x × y = 24
x = 4, y = 9: x × y = 36
x = 5, y = 6: x × y = 30
x = 5, y = 9: x × y = 45
x = 6, y = 6: x × y = 36
x = 6, y = 9: x × y = 54
The set of all possible values is H = {24, 36, 30, 45, 54}.
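The set H, and the fact that one product value can arise from more than one pair, can be reproduced in a few lines. The sketch below uses the sets from e.g.17; the name pairs_by_product is an illustrative choice.

```python
from collections import defaultdict

X = [4, 5, 6]
Y = [6, 9]

# Group every pair (x, y) by the value of its product x * y.
pairs_by_product = defaultdict(list)
for x in X:
    for y in Y:
        pairs_by_product[x * y].append((x, y))

H = set(pairs_by_product)  # all possible values of x * y
print(sorted(H))
for z in sorted(H):
    print(z, pairs_by_product[z])
```

There are 6 pairs but only 5 distinct products, which is why sums over the values of XY have to collect all pairs that give the same product.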

71 e.g.18 (Page 23) Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. H: the set of all possible values of x × y where x ∈ X and y ∈ Y; H = {24, 36, 30, 45, 54}. What does [expression not preserved] mean? Show that it equals [expression not preserved].


73 e.g.18 Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. H = {24, 36, 30, 45, 54}. Note that x1 × y2 = 4 × 9 = 36 and x3 × y1 = 6 × 6 = 36, so the value 36 arises from two different pairs.


75 e.g.18 Let X be a set = {4, 5, 6}, with elements x1, x2, x3. Let Y be a set = {6, 9}, with elements y1, y2. H = {24, 36, 30, 45, 54}. In the expression considered, z is a value of XY.

76 e.g.19 (Page 23) We know that [equation not preserved], where z is a value of XY and H = {24, 36, 30, 45, 54} is the set of all possible values of x × y with x ∈ X and y ∈ Y. Let X be a random variable and Y be a random variable. Do you think that the following is correct? [equation not preserved], where z is a value of XY. Yes.

77 e.g.20 (Page 23) Let X be a random variable with values x1, x2, …, xk and Y be a random variable with values y1, y2, …, ym. Do you think that the following is correct? [equation not preserved] Yes.

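78 e.g.21 (Page 23) Lemma 5.28 If X and Y are independent random variables on sample space S with values x1, x2, …, xk and y1, y2, …, ym, respectively, then E(XY) = E(X)E(Y). Why is it correct? Consider
E(X)E(Y) = (Σ_i xi × P(X = xi)) × (Σ_j yj × P(Y = yj))
= Σ_i Σ_j xi × yj × P(X = xi) × P(Y = yj)
= Σ_z z × Σ_{(i,j): xi × yj = z} P(X = xi) × P(Y = yj), where z ranges over the values of XY (by the equation on the previous slide)
= Σ_z z × Σ_{(i,j): xi × yj = z} P((X = xi) ∧ (Y = yj)) (because X and Y are independent)
= Σ_z z × P(XY = z)
= E(XY)
Done!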