# Chapter 6: Discrete Probability Distributions

- 6.1 Discrete Random Variables
- 6.2 The Binomial Probability Distribution
- 6.3 The Poisson Probability Distribution

November 12, 2008

Discrete Random Variables
Consider the probability experiment of flipping a coin two times. The set of possible outcomes of this experiment, the sample space, is S = {HH, HT, TH, TT}, where H is a “head” and T is a “tail.” We call the outcome of such an experiment a random event since we are never sure of its value in advance. Mathematicians and statisticians like to work with numbers or things that can be represented by numbers. For this reason, we try to assign the random event a number or a set of numbers. (Section 6.1)

Random Variable
Definition: A random variable is a numerical measure or representation of a random event in a probability experiment. Remark: Since a random variable comes from something that is random (a random event), it may take several numerical values. We denote a random variable by a capital letter, e.g., X, and its values by the corresponding lowercase letter, e.g., x. Example: Flipping a coin twice, S = {HH, HT, TH, TT}. Let X be the random variable that represents the number of tails. Then the values assumed by X are x = 0, 1, 2.

Examples
Example: Consider the probability experiment of measuring your systolic blood pressure. The measurement is a random variable X and its range of values is 50 to 230 mm Hg. For example, it is normal if x = […]. We call X a continuous random variable since it can take any value in the interval [50,230]. Example: Consider the probability experiment of rolling a single die once. The random variable X in this case is the number that comes up. The possible values for this random variable are x = 1, 2, 3, 4, 5 or 6. We call X a discrete random variable since it can take only discrete values.

Random Variable Types
A discrete random variable has a finite or countable number of values. That is, the values can be put into a 1-1 correspondence with the integers or a subset of the integers. A continuous random variable has infinitely many possible values, which are in 1-1 correspondence with the numbers in an interval [a,b].

Examples
Let X be the random variable for the number of students in Math 127A who will receive A’s in the course. Here, x = 0, 1, 2, …, 132. This is a discrete random variable. Let X be the random variable for the weight of a student in Math 127A. Here, x > 0 and this is a continuous random variable.

Discrete Probability Distributions
Definition: Let X be a discrete random variable. The discrete probability distribution of X, which we denote by P, is a function that maps each value x of the discrete random variable X to a probability P(x) in the interval [0,1]. P is often represented by a graph or by a table that lists all of the possible values x of X and the corresponding probabilities P(x). Sometimes it is given as a mathematical formula.

Example
Suppose that we perform a probability experiment of flipping a coin 3 times in a row. Let X be the random variable for the number of times that a head appears. The possible values for X are x = 0, 1, 2, 3. Using a simulation of flipping the coin 3 times, we find the following:

| x | P(x) |
|---|------|
| 0 | 0.10 |
| 1 | 0.01 |
| 2 | 0.51 |
| 3 | 0.38 |

This is a discrete probability distribution for X. Notice that 0 ≤ P(x) ≤ 1 for every x, and P(0) + P(1) + P(2) + P(3) = 1.

Rules for Discrete Probability Distributions
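Let P be a discrete probability distribution for X. Then (1) 0 ≤ P(x) ≤ 1 for every value x of X, and (2) the sum of P(x) over all values x of X is 1. Remark: If either of the above conditions is violated, then P is not a discrete probability distribution.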

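Examples
Which of the following are discrete probability distributions?

| x | First P(x) | Second P(x) | Third P(x) |
|---|------------|-------------|------------|
| 1 | 0.40 | 0.40 | 0.40 |
| 2 | 0.35 | 0.35 | 0.35 |
| 3 | 0.12 | 0.12 | 0.12 |
| 4 | -0.07 | 0.01 | 0.01 |
| 5 | 0.20 | 0.20 | 0.12 |

Answers: No (P(4) is negative). No (the probabilities sum to 1.08). Yes (every probability lies in [0,1] and they sum to 1).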
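The two rules can also be checked mechanically. The following is a minimal sketch in Python; the function name `is_discrete_distribution` is made up for illustration, and the last entry of the third table is assumed to be 0.12, the only value consistent with the answer "Yes."

```python
import math

def is_discrete_distribution(probs, tol=1e-9):
    """Return True when every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

table_1 = [0.40, 0.35, 0.12, -0.07, 0.20]  # contains a negative entry
table_2 = [0.40, 0.35, 0.12, 0.01, 0.20]   # sums to 1.08
table_3 = [0.40, 0.35, 0.12, 0.01, 0.12]   # last entry is an assumption

results = [is_discrete_distribution(t) for t in (table_1, table_2, table_3)]
print(results)
```

Note that the first table fails even though its entries sum to 1, because a probability cannot be negative.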

Example
Consider the following discrete probability distribution for the number of home runs in a single game by the Boston Red Sox. Let X be the random variable for the number of home runs (HR) hit in a single game.

| x (# of HR) | P(x) |
|-------------|------|
| 0 | 0.23 |
| 1 | 0.38 |
| 2 | 0.22 |
| 3 | 0.13 |
| 4 | 0.03 |
| 5 | 0.01 |
| 6 or more | 0.00 |

Question: What is the probability that the team will hit three or more homeruns in a single game?
Let A = event that they will hit 3 HR in a game. Let B = event that they will hit 4 HR in a game. Let C = event that they will hit 5 HR in a game. Let D = event that they will hit 6 or more HR in a game. Then P(A or B or C or D) = P(A) + P(B) + P(C) + P(D) = 0.13 + 0.03 + 0.01 + 0.00 = 0.17 (17%). Note: The probabilities can be added because A, B, C, D are disjoint events.


Graphical Representations of Discrete Probability Distributions
If a discrete probability distribution is given as a probability table of values x and probabilities P(x), then we can construct a histogram from this table, using the probabilities as relative frequencies. This is called a probability histogram.

Example
In the 2004 baseball season, Ichiro Suzuki of the Seattle Mariners set the record for the most hits in a season with a total of 262 hits. Let X be the discrete random variable for the number of hits per game. The following table gives the probabilities of the number of hits per game by Suzuki. Here the number of hits per game is the random variable and the sum of the probabilities is 1. Construct the probability histogram.

| x | P(x) |
|---|------|
| 0 | 0.1677 |
| 1 | 0.3354 |
| 2 | 0.2857 |
| 3 | 0.1491 |
| 4 | 0.0373 |
| 5 or more | 0.0248 |
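One way to see the shape of this probability histogram without plotting software is to draw it with text bars. This is only a rough sketch; the helper `text_histogram` and the bar width of 50 characters are illustrative choices, not part of the original example.

```python
# Probabilities of hits per game, copied from the table above.
hits = ["0", "1", "2", "3", "4", "5 or more"]
probs = [0.1677, 0.3354, 0.2857, 0.1491, 0.0373, 0.0248]

def text_histogram(labels, probabilities, width=50):
    """Return one text bar per value; bar length is proportional to P(x)."""
    rows = []
    for label, p in zip(labels, probabilities):
        bar = "#" * round(p * width)
        rows.append(f"{label:>9} | {bar} {p:.4f}")
    return rows

rows = text_histogram(hits, probs)
print("\n".join(rows))
```

The tallest bar belongs to x = 1, and the bars shrink quickly for 3 or more hits.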

Parameters of a Discrete Probability Distribution
Recall: A parameter of a population is a numerical characteristic of the population (e.g., mean, standard deviation, etc.). Definition: A parameter of a discrete probability distribution is a number that summarizes a characteristic of the distribution. Parameters: mean and standard deviation.

The Mean of a Discrete Probability Distribution
| x | P(x) |
|---|------|
| 0 | 0.10 |
| 1 | 0.01 |
| 2 | 0.51 |
| 3 | 0.38 |
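The mean of a discrete probability distribution is the sum of each value multiplied by its probability. The symbol $\mu_X$ is the notation assumed here for the mean of X. Applied to the table above:

$$\mu_X = \sum_i x_i \, P(x_i)$$

$$\mu_X = 0(0.10) + 1(0.01) + 2(0.51) + 3(0.38) = 2.17$$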

Mean - Mean? Recall that we had defined the mean of a set of numbers (population) in a different way. Question: Is the mean of a population and the mean of a discrete probability distribution really the same thing? We consider the following example to show that they are the same.

Example
Suppose that we consider a group of people and we ask them how many hours of TV they have watched during the past 24 hours. Suppose that their responses are summarized in the list {2,4,6,6,4,4,2,3,5,5}. The mean of this set (population) is (2 + 4 + 6 + 6 + 4 + 4 + 2 + 3 + 5 + 5)/10 = 41/10 = 4.1. We can set up a probability table from this data by taking the random variable X to be the number of hours of TV watched. Hence, x = 2, 3, 4, 5, 6.

| x | P(x) |
|---|------|
| 2 | 2/10 |
| 3 | 1/10 |
| 4 | 3/10 |
| 5 | 2/10 |
| 6 | 2/10 |

The mean of this discrete probability distribution is $\mu_X$ = 2(0.2) + 3(0.1) + 4(0.3) + 5(0.2) + 6(0.2) = 4.1 and hence, we compute the same value! This is not by “chance.”
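The agreement between the two means can be verified directly. The sketch below uses exact fractions so that no rounding is involved; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

responses = [2, 4, 6, 6, 4, 4, 2, 3, 5, 5]

# Population mean: add the responses and divide by how many there are.
population_mean = Fraction(sum(responses), len(responses))

# Probability table: P(x) is the relative frequency of each distinct value x.
counts = Counter(responses)
distribution = {x: Fraction(c, len(responses)) for x, c in counts.items()}

# Distribution mean: sum of x * P(x) over the distinct values.
distribution_mean = sum(x * p for x, p in distribution.items())

print(float(population_mean), float(distribution_mean))
```

Grouping equal responses and weighting each distinct value by its relative frequency only reorders the same sum, which is why the two results must agree.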

Expected Value of X
The mean of a discrete probability distribution for X is a weighted average of the values $x_i$, where the probabilities $P(x_i)$ are the weights. In a way, we can think of it as the center of the distribution P. This mean is the probability-weighted average of all of the possible outcomes of X. For this reason it is also called the expected value of X. We can think of $\mu_X$ as the mean outcome of all of the events in X. That is, if we were to repeat the probability experiment many, many times, then the average of the outcomes would approach the expected value of X.

Example
Find the expected value of X, where X is the random variable for the number of home runs in a single game by the Boston Red Sox baseball team.

| x (# of HR) | P(x) |
|-------------|------|
| 0 | 0.23 |
| 1 | 0.38 |
| 2 | 0.22 |
| 3 | 0.13 |
| 4 | 0.03 |
| 5 | 0.01 |
| 6 or more | 0.00 |

$\mu_X$ = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) + 6(0.00) = 1.38
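The long-run interpretation of the expected value can be illustrated with a small simulation. This is a sketch under stated assumptions: the value 6 stands in for "6 or more," the seed 2008 and the 100,000 simulated games are arbitrary choices, and the simulated mean will differ slightly from run to run if the seed is changed.

```python
import random

values = [0, 1, 2, 3, 4, 5, 6]  # 6 stands in for "6 or more"
probs = [0.23, 0.38, 0.22, 0.13, 0.03, 0.01, 0.00]

# Exact expected value: sum of x * P(x).
expected_value = sum(x * p for x, p in zip(values, probs))

# Simulate many games and average the simulated home-run counts.
rng = random.Random(2008)  # fixed seed so the run is repeatable
games = rng.choices(values, weights=probs, k=100_000)
simulated_mean = sum(games) / len(games)

print(f"expected value: {expected_value:.2f}")
print(f"simulated mean: {simulated_mean:.3f}")
```

With this many simulated games the average should land close to the expected value of 1.38.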

Example
Suppose that you want to invest \$100 in the stock market. Let X be the random variable for the result of your \$100 investment. For simplicity, suppose that there are two possible outcomes, x = \$0 or x = \$1000, such that P(0) = 0.50 and P(1000) = 0.50. What is the expected value of your investment? $\mu_X$ = 0(0.5) + 1000(0.5) = 500, i.e., we have an expected return of \$500.

Standard Deviation of a Discrete Probability Distribution
The standard deviation of a probability distribution is a measure of its spread. In other words, it is the variation in xi from its mean, weighted by the probabilities.
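Written as a formula, with $\mu_X$ for the mean and $\sigma_X$ for the standard deviation (notation assumed here), the weighted spread takes either of two equivalent forms. The second is usually easier for hand computation:

$$\sigma_X = \sqrt{\sum_i (x_i - \mu_X)^2 \, P(x_i)} = \sqrt{\sum_i x_i^2 \, P(x_i) - \mu_X^2}$$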

Example
Find the mean and standard deviation for the following discrete probability distribution.

| x | P(x) |
|---|------|
| 0 | 0.25 |
| 1 | 0.25 |
| 2 | 0.10 |
| 4 | 0.40 |

$\mu_X = 0(0.25) + 1(0.25) + 2(0.10) + 4(0.40) = 2.05$

$\sigma_X = [0^2(0.25) + 1^2(0.25) + 2^2(0.10) + 4^2(0.40) - (2.05)^2]^{1/2} = 1.69$
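The same computation can be sketched in a few lines of Python. It uses the shortcut form of the variance (the sum of x² P(x) minus the squared mean), and takes P(1) = 0.25 from the worked calculation.

```python
import math

values = [0, 1, 2, 4]
probs = [0.25, 0.25, 0.10, 0.40]

# Mean: sum of x * P(x).
mean = sum(x * p for x, p in zip(values, probs))

# Variance via the shortcut: sum of x^2 * P(x) minus the squared mean.
variance = sum(x**2 * p for x, p in zip(values, probs)) - mean**2
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, standard deviation = {std_dev:.2f}")
```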

Comparing Discrete Probability Distributions
 =  = 0.64  =  = 1.12

The Binomial Probability Distribution
The binomial probability distribution is a discrete probability distribution that is used to determine probabilities of events in a probability experiment in which only two mutually exclusive (disjoint) outcomes are possible. For example, the probability experiment might be flipping a coin. The two mutually exclusive outcomes are “heads” or “tails.” As another example, the probability experiment might be asking an individual whether he or she has a driver’s license. He or she “does” or “doesn’t.” (Section 6.2)

Binomial Probability Experiment
1. The experiment is performed n times, with each repetition of the experiment being a trial.
2. The trials are independent.
3. Each trial results in one of two mutually exclusive outcomes (success or failure).
4. The probability of a success is p and the probability of failure is 1 - p, e.g., p = 0.5.

A random variable for this type of probability experiment is the number of successes in the n trials.

The Binomial Random Variable
We consider a binomial probability experiment. For each of the n trials of a binomial probability experiment, the probability of “success” is p. For each of the n trials of a binomial probability experiment, the probability of “failure” is 1 - p. Each of the n trials is independent, i.e., the outcome of one trial does not depend on any other trial. Let X be the number of successes in the n trials. We call X a binomial random variable.

Binomial Probability Distribution Function
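For a binomial random variable X with n trials and success probability p, the standard form of the binomial probability distribution function is stated below. The binomial coefficient counts the number of ways to place x successes among the n trials:

$$P(x) = \binom{n}{x} \, p^{x} (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n, \qquad \binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$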

Example
According to the Uniform Crime Report, 2003, 66.9% of murders in the U.S. were committed with a firearm. (a) If 100 murders are randomly selected, how many would you expect to have been committed with a firearm? (b) What is the probability that exactly 75 of the 100 randomly selected murders were committed with a firearm?
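A possible way to work this example numerically is sketched below, assuming the selections are independent so that the count is binomial with n = 100 and p = 0.669. The helper `binomial_pmf` is a name introduced here for illustration.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 100, 0.669

# (a) Expected number of firearm murders in the sample: n * p.
expected_count = n * p

# (b) Probability that exactly 75 of the 100 involve a firearm.
prob_exactly_75 = binomial_pmf(n, p, 75)

print(f"expected count: {expected_count:.1f}")
print(f"P(X = 75): {prob_exactly_75:.4f}")
```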

Example Probability Experiment: Does a person have ESP?
A person in one room picks one of the integers (1-5) at random and thinks about this particular number for 1 minute. In another room, the person who claims to have ESP tries to identify the number that was chosen by the person in the first room. This is done 3 times (3 trials). We assume that each trial is independent. The ESP claimant has the correct answer twice, i.e., he or she is correct in two out of the three trials. Question: Does the person have ESP? Answer: If the person has ESP, then their success rate would be greater than that of guessing the number on each trial. Hence, we must compute the probability that one could guess 2 out of the 3 answers correctly.

Possible Outcomes of Guessing Three Times (a compound event)
S = success and F = failure to guess correctly. The probability of guessing the correct number is 1/5 = 0.2. Hence, P(S) = 0.2 and P(F) = 1 - P(S) = 0.8. The different possible outcomes are:

| Outcome | Probability |
|---------|-------------|
| SSS | (0.2)(0.2)(0.2) = 0.008 |
| SSF | (0.2)(0.2)(0.8) = 0.032 |
| SFS | (0.2)(0.8)(0.2) = 0.032 |
| FSS | (0.8)(0.2)(0.2) = 0.032 |
| SFF | (0.2)(0.8)(0.8) = 0.128 |
| FSF | (0.8)(0.2)(0.8) = 0.128 |
| FFS | (0.8)(0.8)(0.2) = 0.128 |
| FFF | (0.8)(0.8)(0.8) = 0.512 |

The probability of having exactly two successes is found as follows: If A = SSF, B = SFS, C = FSS, then P(A or B or C) = P(A) + P(B) + P(C) = 3(0.032) = 0.096, since A, B and C are disjoint events. Hence, this will happen approximately 10% of the time. Hence, the ESP claimant has a success rate that is much higher than one would expect from guessing.

Calculation using Binomial Distribution Function
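The enumeration of outcomes and the binomial formula should give the same answer for two correct guesses in three trials. The sketch below checks this by brute force; it assumes P(S) = 0.2 as in the ESP example, and the variable names are illustrative.

```python
from itertools import product
from math import comb

p_success = 0.2
prob = {"S": p_success, "F": 1 - p_success}

# Brute force: add the probabilities of all outcomes with exactly two S's.
two_correct = 0.0
for outcome in product("SF", repeat=3):
    if outcome.count("S") == 2:
        two_correct += prob[outcome[0]] * prob[outcome[1]] * prob[outcome[2]]

# Binomial formula with n = 3, x = 2: C(3,2) * p^2 * (1-p)^1.
formula = comb(3, 2) * p_success**2 * (1 - p_success) ** 1

print(f"enumeration: {two_correct:.3f}, formula: {formula:.3f}")
```

The coefficient C(3,2) = 3 in the formula is exactly the count of the outcomes SSF, SFS and FSS.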

Binomial Distributions for n = 10 and Different p

Mean and Standard Deviation of Binomial Distributions
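For a binomial random variable with n trials and success probability p, the mean and standard deviation have the following standard closed forms (the symbols $\mu_X$ and $\sigma_X$ are notation assumed here). As a quick illustration with n = 10 and p = 0.5:

$$\mu_X = np, \qquad \sigma_X = \sqrt{np(1-p)}$$

$$\mu_X = (10)(0.5) = 5, \qquad \sigma_X = \sqrt{(10)(0.5)(0.5)} = \sqrt{2.5} \approx 1.58$$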

Example
Suppose that a random variable is distributed according to the binomial distribution with n = 12 and p = […]. Calculate the probability that x = 10. Find the mean and standard deviation.

Calculation on TI-83
Press 2nd VARS (DISTR), select binompdf( and press [ENTER]. Complete the entry, e.g., binompdf(n,p,x), and press [ENTER]. Remark: In binompdf, if x is omitted, then it calculates the binomial distribution for all of the possible values of x. The CDF of the binomial distribution can be calculated with binomcdf(n,p,{xk,xk+1,…,xm}).
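For readers without a calculator, the two calculator functions can be imitated in Python. This is only a sketch: the functions below borrow the calculator names and argument order for convenience, they are not a library API, and the value p = 0.5 in the demonstration is an arbitrary assumption.

```python
from math import comb

def binompdf(n, p, x):
    """P(X = x) for a binomial random variable (mirrors the calculator's argument order)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomcdf(n, p, x):
    """P(X <= x): add the binomial probabilities for 0, 1, ..., x."""
    return sum(binompdf(n, p, k) for k in range(x + 1))

# Demonstration with n = 12 and an assumed p = 0.5.
n, p = 12, 0.5
prob_10 = binompdf(n, p, 10)
prob_at_most_10 = binomcdf(n, p, 10)
full_distribution = [binompdf(n, p, x) for x in range(n + 1)]

print(prob_10, prob_at_most_10)
```

Probabilities such as P(a ≤ X ≤ b) can then be found as `binomcdf(n, p, b) - binomcdf(n, p, a - 1)`.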

Example
Question: Are women fairly treated in the selection for managerial training? Situation: A pool of 1,000 employees from which 10 will be selected. Of the 1,000 employees, 50% are women and 50% are men. Result: None of the 10 employees selected for management training were women. Question: Does this show bias against women? Analysis: Assuming that the 1,000 employees are equally qualified for training and that there is an equal chance of selecting a woman or a man, what is the probability of selecting 10 male trainees? In the long run, we would expect that for a sample of 10, 5 would be women and 5 would be men. We will assume that the number of women among the 10 selected employees is distributed according to a binomial distribution (n = 10, p = 0.5). What does the probability distribution look like? Let “success” be the outcome of picking a female employee.

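| x | P(x) |
|---|------|
| 0 | 0.0010 |
| 1 | 0.0098 |
| 2 | 0.0439 |
| 3 | 0.1172 |
| 4 | 0.2051 |
| 5 | 0.2461 |
| 6 | 0.2051 |
| 7 | 0.1172 |
| 8 | 0.0439 |
| 9 | 0.0098 |
| 10 | 0.0010 |

P(0) = (0.5)^10 ≈ 0.001, which is very small and hence picking no women is very unlikely. Note that the highest probability occurs at x = 5, i.e., P(5) ≈ 0.246. The expected value is (n)(p) = (10)(0.5) = 5.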

Example Question: Can we check racial profiling by the police?
Situation: Statistics from the Philadelphia Police Department in 1997 cover 262 car stops. Result: 207 of the 262 stops involved African Americans, i.e., 79% of the total number of stops. Analysis: Assume that the car stops are binomially distributed with n = 262. In Philadelphia, African Americans (AA) make up 42.2% of the population. Hence, we would expect that approximately 4 out of 10 car stops would involve an AA, and we choose p = 0.422. Let X be the binomial random variable for the number of stops involving an AA. For the binary outcome, “success” is the stopping of a car driven by an AA. Mean: $\mu_X = np = (262)(0.422) = 110.564$. Standard deviation: $\sigma_X = [np(1-p)]^{1/2} = [(262)(0.422)(0.578)]^{1/2} \approx 7.994$.

P(207) = $3.9 \times 10^{-34}$. This is the probability of stopping exactly 207 AA out of the 262 stops. Using the expected value, we would expect approximately 111 such stops. Hence, 207 out of 262 is very, very unlikely.
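The figures in this example can be recomputed from the binomial formulas. The following is a sketch assuming n = 262 and p = 0.422 as in the analysis; the variable names are illustrative.

```python
from math import comb, sqrt

n, p, observed = 262, 0.422, 207

# Mean and standard deviation of a binomial random variable.
mean = n * p
std_dev = sqrt(n * p * (1 - p))

# Probability of exactly 207 successes in 262 trials.
prob_observed = comb(n, observed) * p**observed * (1 - p) ** (n - observed)

print(f"mean = {mean:.3f}, standard deviation = {std_dev:.3f}")
print(f"P(X = {observed}) = {prob_observed:.2e}")
```

The observed count of 207 lies roughly 96 stops above the mean, which is many standard deviations away.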

Bell-shaped Binomial Distributions
Remark: If a probability distribution is bell-shaped, then the Empirical Rule applies. Observation: If a binomial distribution is bell-shaped, then the Empirical Rule applies.

When is a Binomial Distribution Bell-shaped?

Example
According to the Higher Education Research Institute, 55% of college freshmen in 4-year colleges and universities during the academic year were female. Suppose 12 freshmen are randomly selected and the number of females is recorded. Find the following probabilities: (a) Exactly 7 of the 12 are female. (b) Five or more are female. (c) Fewer than five are female. (d) Between 7 and 10, inclusive, are female.
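One way to organize the four parts is to compute the whole distribution once and then add the relevant entries. This is a sketch assuming a binomial model with n = 12 and p = 0.55; `binomial_pmf` is a helper introduced for illustration.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 12, 0.55
pmf = [binomial_pmf(n, p, x) for x in range(n + 1)]  # pmf[x] = P(X = x)

exactly_7 = pmf[7]                # (a) P(X = 7)
five_or_more = sum(pmf[5:])       # (b) P(X >= 5)
fewer_than_5 = sum(pmf[:5])       # (c) P(X < 5) = P(X <= 4)
between_7_and_10 = sum(pmf[7:11]) # (d) P(7 <= X <= 10)

for label, value in [("(a)", exactly_7), ("(b)", five_or_more),
                     ("(c)", fewer_than_5), ("(d)", between_7_and_10)]:
    print(label, round(value, 4))
```

Parts (b) and (c) are complementary events, so their probabilities add to 1, which gives a quick check on the arithmetic.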

The Poisson Probability Distribution
We now introduce a new discrete random variable that counts the number of times that a particular event occurs in a particular interval (of time or space). For example:

- the number of bacteria per unit volume of a fluid
- the number of customers at McDonald’s who order Big Macs between 4:00 and 5:00 PM
- the number of machines in a factory that break down per day

Definition: Let X be a random variable and suppose that the possible values of X are x = 0, 1, 2, 3, … and that these values count the occurrences of an event in a fixed interval I (e.g., an interval of time). We say that the random variable follows a Poisson process if all of the following conditions hold:

1. The probability of 2 or more occurrences of the event in any sufficiently small subinterval of I is zero.
2. If we look at the random variable on two subintervals, I1 and I2, the probability of an occurrence is the same, provided the two subintervals have the same length.
3. For any two non-overlapping subintervals, I1 and I2, the number of occurrences of the event in I1 is independent of the number of occurrences in I2.

(Section 6.3)

Poisson Probability Distribution Function
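The standard form of the Poisson probability distribution function is given below. The notation is an assumption made here: $\lambda$ is the average number of occurrences per unit of the interval and $t$ is the length of the interval, so that the mean number of occurrences is $\mu_X = \lambda t$:

$$P(x) = \frac{(\lambda t)^{x}}{x!} \, e^{-\lambda t}, \qquad x = 0, 1, 2, 3, \ldots$$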

Graphs of the Poisson Distribution Function

Example The phone calls to a computer software help desk occur at the rate of 2.1 per minute between 11:00 AM and noon. Find the following probabilities for calls between 11:15-11:20 AM: There will be exactly 8 calls. There will be fewer than 8 calls. There will be at least 8 calls.
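A sketch of how the three probabilities might be computed, assuming the calls follow a Poisson process so that the 5-minute window from 11:15 to 11:20 has mean (2.1)(5) = 10.5 calls. The helper `poisson_pmf` is introduced for illustration.

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

rate_per_minute = 2.1
minutes = 5  # 11:15 to 11:20 AM
mean_calls = rate_per_minute * minutes

exactly_8 = poisson_pmf(mean_calls, 8)
fewer_than_8 = sum(poisson_pmf(mean_calls, x) for x in range(8))  # x = 0..7
at_least_8 = 1 - fewer_than_8  # complement of "fewer than 8"

print(f"P(X = 8)  = {exactly_8:.4f}")
print(f"P(X < 8)  = {fewer_than_8:.4f}")
print(f"P(X >= 8) = {at_least_8:.4f}")
```

Because a Poisson variable has no upper limit, "at least 8" is computed through its complement rather than by summing infinitely many terms.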

Example From 1900 to 2003 (104 years), the state of Florida suffered 24 major hurricanes (category 3 to 5). What is the probability that in the year 2009, the state will see 3 major hurricanes? What is the probability that it will see at most 3 major hurricanes?
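For this example, a reasonable setup (an assumption, since the original solution is not shown) is to treat major hurricanes as a Poisson process with a mean of 24/104 per year, estimated from the historical record.

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

# Estimated mean number of major hurricanes in one year.
mean_per_year = 24 / 104

exactly_3 = poisson_pmf(mean_per_year, 3)
at_most_3 = sum(poisson_pmf(mean_per_year, x) for x in range(4))  # x = 0..3

print(f"P(X = 3)  = {exactly_3:.5f}")
print(f"P(X <= 3) = {at_most_3:.5f}")
```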

Example The Vanderbilt Printing division duplicates documents for distribution in the university. For every 500 documents printed, one document is not acceptable for distribution. Documents are delivered in boxes with each box containing 100 documents. Assuming that the number of defective documents in a box is distributed according to a Poisson Distribution and we reject any box with two or more defective documents, what percent of the boxes are rejected?
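A sketch of one way to answer this: one defective document per 500 corresponds to a mean of 100/500 = 0.2 defective documents per box of 100, and a box is rejected when it contains 2 or more. The helper name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

defect_rate = 1 / 500          # defective documents per document printed
box_size = 100
mean_defects = defect_rate * box_size  # expected defectives per box

# Reject when X >= 2, i.e., the complement of X = 0 or X = 1.
prob_reject = 1 - poisson_pmf(mean_defects, 0) - poisson_pmf(mean_defects, 1)

print(f"mean defects per box: {mean_defects:.1f}")
print(f"percent of boxes rejected: {100 * prob_reject:.2f}%")
```

Under these assumptions, a little under 2% of the boxes are rejected.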