4.2 Random Variables and Their Probability Distributions
Streamlining Probability: Probability Distribution, Expected Value, and Standard Deviation of a Random Variable
Graphically and Numerically Summarize a Random Experiment Principal vehicle by which we do this: random variables
Random Variables Definition: A random variable is a numerical-valued variable whose value is based on the outcome of a random event. Denoted by upper-case letters X, Y, etc. 2 types of random variables: Discrete: possible outcomes are a set of separate values, “the number of …” Continuous: possible outcomes are an infinite continuum
Examples
X = payout by insurance company on an iPhone6 damage protection policy. Possible values of X are x = $0, $250, $500.
Y = score on the 13th hole (par 5) at Augusta National golf course for a randomly selected golfer on day 1 of the 2015 Masters. Possible values of Y are y = 3, 4, 5, 6, 7.
Random Variables and Probability Distributions A probability distribution lists the possible values of a random variable and the probability that each value will occur. Random variables are unknown chance outcomes. Probability distributions tell us what is likely to happen. Data variables are known outcomes. Data distributions tell us what happened.
Probability Histogram: Probability Distribution of Payout by Insurance Company on an iPhone6 Damage Protection Policy
Policy payouts are based on estimates of damaged/ruined cellphones.
x      0      250    500
p(x)   0.67   0.13   0.20
Probability Histogram: Probability Distribution of Score on the 13th Hole (par 5) at Augusta National Golf Course on Day 1 of the 2015 Masters
y      3       4       5       6       7
p(y)   0.040   0.414   0.465   0.051   0.030
Probability distributions: discrete random variables
Requirements
1. 0 ≤ p(x) ≤ 1 for all values x of X
2. Σ p(x) = 1, where the sum is over all values x of X
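The two requirements can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper name `is_valid_distribution` and the tolerance `tol` are assumptions made for illustration, and the two tables are the payout and golf-score distributions from the earlier examples.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Check both requirements for a discrete probability distribution.

    probs maps each possible value x to p(x).
    tol is a hypothetical tolerance for floating-point rounding.
    """
    # Requirement 1: every probability lies between 0 and 1.
    each_in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    # Requirement 2: the probabilities sum to 1 (up to rounding).
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

# Payout X in dollars on the damage protection policy.
payout = {0: 0.67, 250: 0.13, 500: 0.20}
# Score Y on the 13th hole.
score = {3: 0.040, 4: 0.414, 5: 0.465, 6: 0.051, 7: 0.030}

print(is_valid_distribution(payout))
print(is_valid_distribution(score))
```

A table such as {0: 0.5, 1: 0.6} fails the second requirement, so the helper returns False for it.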
Probability distributions: continuous random variables
Probability distribution graphs for continuous random variables come in many shapes. The shape depends on the probability distribution of the continuous random variable that the graph represents.
Example graphs of probability distribution functions of continuous random variables [figure: three example curves, each with vertical axis f(x)]
Probabilities: area under graph
P(a < X < b) = area under the graph between a and b.
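To make "probability = area under the graph" concrete, here is a small numerical sketch. The density used, f(x) = 2x on the interval from 0 to 1, is a hypothetical example chosen because its areas are easy to verify by hand; it does not come from the slides. The helper `area_under` is likewise an illustrative name, and it uses a simple midpoint rule rather than any particular library routine.

```python
def area_under(f, a, b, n=10_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

def density(x):
    # Hypothetical density for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

# The total area under a density curve should be 1.
total_area = area_under(density, 0.0, 1.0)

# P(0.25 < X < 0.75) is the area between a = 0.25 and b = 0.75.
prob = area_under(density, 0.25, 0.75)

print(total_area, prob)
```

For this density the exact answers are 1 for the total area and 0.75² - 0.25² = 0.5 for the probability, which the approximation should match closely.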
Probability distribution function of a continuous random variable: the graph lies on or above the horizontal axis (f(x) ≥ 0) and the total area under the graph = 1. Compare the discrete case: 0 ≤ p(x) ≤ 1 and Σ p(x) = 1. Think of p(x) as the area of the rectangle above x, so the sum of all the areas is 1.
Expected Value of a Random Variable A measure of the “middle” of the values of a random variable
The mean of the probability distribution is the expected value of X, denoted E(X) E(X) is also denoted by the Greek letter µ (mu)
Mean or Expected Value
E(X) = µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk) (a weighted mean)
k = the number of possible values of the random variable
x      0      250    500
p(x)   0.67   0.13   0.20
y      3       4       5       6       7
p(y)   0.040   0.414   0.465   0.051   0.030
Mean or Expected Value k = the number of outcomes µ = x1·p(x1) + x2·p(x2) + x3·p(x3) + ... + xk·p(xk) Weighted mean Each outcome is weighted by its probability
Other Weighted Means GPA A=4, B=3, C=2, D=1, F=0 Course grade: tests 40%, final exam 25%, quizzes 25%, homework 10% "Average" ticket prices
Mean or Expected Value
x      0      250    500
p(x)   0.67   0.13   0.20
E(X) = µ = 0(0.67) + 250(0.13) + 500(0.20) = 0 + 32.5 + 100 = 132.5 dollars
y      3       4       5       6       7
p(y)   0.040   0.414   0.465   0.051   0.030
E(Y) = µ = 3(0.040) + 4(0.414) + 5(0.465) + 6(0.051) + 7(0.030) = 4.617 strokes
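The weighted-mean arithmetic for both examples can be reproduced in a few lines. This is a minimal sketch; the function name `expected_value` is an illustrative choice, and the dictionaries simply restate the two probability tables.

```python
def expected_value(dist):
    """Weighted mean: each value x is weighted by its probability p(x)."""
    return sum(x * p for x, p in dist.items())

# Payout X in dollars.
payout = {0: 0.67, 250: 0.13, 500: 0.20}
# Score Y in strokes.
score = {3: 0.040, 4: 0.414, 5: 0.465, 6: 0.051, 7: 0.030}

mean_payout = expected_value(payout)
mean_score = expected_value(score)

print(round(mean_payout, 1))  # 132.5
print(round(mean_score, 3))   # 4.617
```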
Mean or Expected Value
E(Y) = µ = 3(0.040) + 4(0.414) + 5(0.465) + 6(0.051) + 7(0.030) = 4.617 strokes
Interpretation of E(X)
E(X) is a “long run” average. The expected value of a random variable is equal to the average value of the random variable if the chance process were repeated an infinite number of times. In reality, if the chance process is continually repeated, the sample mean x̄ of the observed values will get closer to E(X) as you observe more and more values of the random variable X.
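The long-run interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the golf-score distribution is taken from the earlier example, while the seed, the sample sizes, and the helper name `simulated_average` are arbitrary choices made here for illustration.

```python
import random

# Score Y on the 13th hole, from the earlier example.
score = {3: 0.040, 4: 0.414, 5: 0.465, 6: 0.051, 7: 0.030}

def simulated_average(dist, n, rng):
    """Draw n values from dist and return their sample mean."""
    values = list(dist.keys())
    weights = list(dist.values())
    draws = rng.choices(values, weights=weights, k=n)
    return sum(draws) / n

rng = random.Random(2015)  # fixed seed (arbitrary) so runs are repeatable

# The sample mean tends to settle near E(Y) = 4.617 as n grows.
for n in (10, 1_000, 100_000):
    print(n, simulated_average(score, n, rng))
```

With only 10 draws the average can be noticeably off; with 100,000 draws it is typically within a few thousandths of 4.617.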
Example
Let X = number of heads in 3 tosses of a fair coin. So the probability distribution of X is:
x      0     1     2     3
p(x)   1/8   3/8   3/8   1/8
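The coin-toss table can be derived by listing all 8 equally likely outcomes and counting heads. The sketch below does that and then computes the expected value; the variable names are illustrative, and exact fractions are used so the results match the table exactly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 = 8 equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
counts = Counter(outcome.count("H") for outcome in outcomes)

# p(x) = (number of outcomes with x heads) / 8.
dist = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# E(X) = sum of x * p(x).
expected_heads = sum(x * p for x, p in dist.items())

print(dist)
print(expected_heads)  # 3/2
```

The expected value 3/2 = 1.5 is not itself a possible value of X; it is the long-run average number of heads per set of three tosses.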
US Roulette Wheel and Table
American Roulette: 0 and 00 (The European version has only one 0.)
The roulette wheel has alternating black and red slots numbered 1 through 36. There are also 2 green slots numbered 0 and 00. A bet on any one of the 38 numbers (1-36, 0, or 00) pays odds of 35:1; that is, if you bet $1 on the winning number, you receive $36, so your winnings are $35.
US Roulette Wheel: Expected Value of a $1 bet on a single number
Let x be your winnings resulting from a $1 bet on a single number; x has 2 possible values:
x      -1      35
p(x)   37/38   1/38
E(x) = -1(37/38) + 35(1/38) = -2/38 ≈ -0.05
So on average the house wins about 5 cents on every such bet. A “fair” game would have E(x) = 0. The roulette wheels are spinning 24/7, winning big $$ for the house, resulting in …
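The roulette calculation can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative, and the two-value table is the one given above.

```python
from fractions import Fraction

# Winnings on a $1 single-number bet: lose $1 on 37 slots, win $35 on 1 slot.
winnings = {-1: Fraction(37, 38), 35: Fraction(1, 38)}

# E(x) = sum of x * p(x), kept as an exact fraction.
expected_winnings = sum(x * p for x, p in winnings.items())

print(expected_winnings)                   # -1/19
print(round(float(expected_winnings), 4))  # -0.0526
```

The exact value is -1/19, about -5.26 cents per $1 bet, which rounds to the 5 cents quoted above.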
Standard Deviation of a Random Variable First center (expected value) Now - spread
Standard Deviation of a Random Variable Measures how “spread out” the random variable is
Summarizing data and probability
Data: histogram; measure of the center: sample mean x̄; measure of spread: sample standard deviation s; these are statistics.
Random variable: probability histogram; measure of the center: population mean µ; measure of spread: population standard deviation σ; these are parameters.
Variance: measure of spread
The deviations of the outcomes from the mean of the probability distribution are xi - µ. σ² (sigma squared) is the variance of the probability distribution [the variance is also denoted Var(X)].
Variance: measure of spread
Variance of random variable X: σ² = Var(X) = (x1-µ)² · p(x1) + (x2-µ)² · p(x2) + ... + (xk-µ)² · p(xk)
Variance Var(X): Example
x      0      250    500
p(x)   0.67   0.13   0.20
Recall: µ = E(X) = 132.5
Var(X) = (x1-µ)² · P(X=x1) + (x2-µ)² · P(X=x2) + (x3-µ)² · P(X=x3)
= (0-132.5)² · 0.67 + (250-132.5)² · 0.13 + (500-132.5)² · 0.20
= 40,568.75
P. 207, Handout 4.1, P. 4
Standard Deviation: of More Interest than the Variance
Standard Deviation
σ² = 40,568.75, so σ = √40,568.75 ≈ 201.42 dollars. σ, or SD(X), is the standard deviation of the random variable X; it is the square root of the variance.
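The variance and standard deviation of the payout can be reproduced as follows. This is a minimal sketch using the payout table from the earlier slides; the function names are illustrative rather than taken from any particular library.

```python
import math

def expected_value(dist):
    """E(X): each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X): squared deviations from the mean, weighted by probability."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

def standard_deviation(dist):
    """SD(X): square root of the variance, in the same units as X."""
    return math.sqrt(variance(dist))

# Payout X in dollars.
payout = {0: 0.67, 250: 0.13, 500: 0.20}

var_payout = variance(payout)
sd_payout = standard_deviation(payout)

print(round(var_payout, 2))  # 40568.75
print(round(sd_payout, 2))   # 201.42
```

The variance is in squared dollars, which is hard to interpret; the standard deviation of about $201.42 is in dollars, which is one reason it is usually of more interest.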