# Discrete Random Variables


Discrete Random Variables

Probability Mass Function

Probability Mass Function Pr(RI)

Parameters of a distribution

Cumulative Distribution Function

Cumulative Distribution Function CDF(RI)

CDF (cont’d)

Expectation Value
For discrete outcomes, like rolling dice, the average (expected) value is the sum of each outcome weighted by its probability:

$$E(X) = \sum_x x \Pr(X = x)$$
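As a small illustration of the expectation of a discrete random variable, the R sketch below computes the average value of a single roll of a die as a probability-weighted sum. It assumes a fair six-sided die; the variable names are illustrative and do not come from the slides.

```r
# Outcomes of one roll of a six-sided die (assumed fair)
outcomes <- 1:6
probs <- rep(1 / 6, 6)

# Expectation: each outcome weighted by its probability, then summed
expected_value <- sum(outcomes * probs)
print(expected_value)
```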

Chebychev’s Inequality
In words: if X is any random variable with mean μ = E(X) and standard deviation σ, then the probability that it falls k or more standard deviations from the mean is less than or equal to 1/k²:

$$\Pr(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$$

Chebychev’s Inequality
Simulation: a random sample of 1000 measurements, with μ = 0 and σ = 2.7, compared against the bound 1/k² for k = 1, 2 and 3. Chebychev’s inequality held up under this simulation.
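A simulation of this kind might be sketched in R as follows. The slides do not say which distribution the 1000 measurements were drawn from, so a normal distribution is assumed here purely for illustration, and the seed is arbitrary.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
mu <- 0
sigma <- 2.7
x <- rnorm(1000, mean = mu, sd = sigma)   # assumption: normally distributed data

k <- 1:3
# Observed fraction of measurements at least k standard deviations from the mean
observed <- sapply(k, function(kk) mean(abs(x - mu) >= kk * sigma))
# Chebychev's upper bound on that fraction
bound <- 1 / k^2

print(data.frame(k = k, observed = observed, chebychev_bound = bound))
```

For well-behaved data such as this, the observed fractions are typically far below the bound, since the inequality has to hold for every distribution with a finite variance.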

Binomial Probability Distribution
A binomial random variable X results from the following type of experiment:

- There is a fixed number, n, of trials.
- Each trial results in one of two possible outcomes, usually called ‘success’ or ‘failure’.
- The trials are independent of one another.
- The probability, p, of success is the same for each trial.

(This means we have a sequence of n Bernoulli trials.) Then X is the number of successes, and X has probability mass function

$$\Pr(X = x) = \binom{n}{x} p^x (1 - p)^{n - x} \quad \text{for } x = 0, 1, 2, \ldots, n$$

The binomial distribution is implemented in R as dbinom(x, n, p).

Binomial Probability Distribution
A binomially distributed random variable X, with parameters n (number of trials) and p (probability of a “success”), has

- Expected value: E(X) = np
- Variance: Var(X) = np(1 – p)

For example, a binomial with n = 15 and p = 0.3 has an expected value of 15*0.3 = 4.5 “successes” and a standard deviation of sqrt(15*0.3*0.7) = 1.77 “successes”.
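The mean and standard deviation for n = 15 and p = 0.3 can be checked numerically by summing over the probability mass function. This is a minimal sketch; the variable names are illustrative.

```r
n <- 15
p <- 0.3
x <- 0:n

# Probability of each possible number of successes
pmf <- dbinom(x, n, p)

# Mean and variance computed directly from the PMF
mean_from_pmf <- sum(x * pmf)
var_from_pmf <- sum((x - mean_from_pmf)^2 * pmf)
sd_from_pmf <- sqrt(var_from_pmf)

# Compare with the closed forms n*p and sqrt(n*p*(1-p))
print(c(mean_from_pmf, n * p))
print(c(sd_from_pmf, sqrt(n * p * (1 - p))))
```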

Binomial distribution in R

Example
A particular tennis racquet comes in a midsize version and an oversize version. 60% of customers at Shenkin Sports want the oversize version. Among 10 randomly selected customers who want this racquet, what is the probability that at least 6 want the oversize version?
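One way to work the racquet example in R, treating the 10 customers as independent trials with p = 0.6, is sketched below. "At least 6" is the complement of "at most 5", so the CDF pbinom() can be used; summing the PMF over 6 to 10 gives the same answer.

```r
n <- 10
p <- 0.6

# Pr(X >= 6) = 1 - Pr(X <= 5)
p_at_least_6 <- 1 - pbinom(5, n, p)

# Cross-check by adding up the individual probabilities for 6, 7, ..., 10
p_check <- sum(dbinom(6:10, n, p))

print(round(p_at_least_6, 4))   # 0.6331
```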

Example
Suppose an army sniper hits his target 20% of the time at a distance of 1000 yards. Due to variations in wind speed and direction, time of day, and cloud cover, successive shots may be considered to be independent of one another. Assuming the sniper takes 15 shots on a certain day, how likely is it that he hits his target 0, 1, 2, 3, …, 15 times?

This is a binomial problem: n = 15 trials, probability of success on each trial p = 0.2, independence between trials, and interest is in the total number of hits. X = number of hits in 15 shots.

Calculations with the binomial distribution using R
Note that dbinom() is the probability mass function of the binomial. All code from file binomialExampleB(15,.2).R
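The original file is not reproduced in this transcript. A sketch of what such a calculation could look like for the sniper example (n = 15, p = 0.2) is shown below; it is an illustration and not the contents of that file.

```r
n <- 15
p <- 0.2
hits <- 0:n

# Probability of exactly 0, 1, ..., 15 hits
pmf <- dbinom(hits, n, p)
print(round(data.frame(hits = hits, probability = pmf), 4))

# The most likely number of hits is the value with the largest probability
most_likely <- hits[which.max(pmf)]
print(most_likely)
```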

Calculations with the binomial distribution using R-Continued
Note that pbinom() is the cumulative distribution function of the binomial
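The relationship between pbinom() and dbinom() can be illustrated with the same sniper setup (n = 15, p = 0.2). The threshold of 3 hits used below is an arbitrary choice for illustration and is not taken from the slides.

```r
n <- 15
p <- 0.2

# CDF values Pr(X <= x) for x = 0, ..., 15
cdf <- pbinom(0:n, n, p)

# The CDF is the running sum of the PMF
cdf_from_pmf <- cumsum(dbinom(0:n, n, p))

# Lower and upper tails around an illustrative threshold of 3 hits
p_at_most_3 <- pbinom(3, n, p)
p_more_than_3 <- pbinom(3, n, p, lower.tail = FALSE)

print(c(p_at_most_3, p_more_than_3))
```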

Calculations with the binomial distribution using R-Continued

Hypergeometric distribution
The binomial probability distribution is not technically applicable in a situation of sampling without replacement. The hypergeometric distribution counts “successes” like the binomial distribution, but allows sampling without replacement.

Consider a bucket of N objects. There are M “successes” and N – M “failures” in the bucket. Select n objects from the bucket without replacement. The probability of drawing x “successes” in n tries is:

$$\Pr(X = x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}} \quad \text{for } x = 0, 1, 2, \ldots, n$$

The probability is zero whenever x > M or n – x > N – M.

Hypergeometric Distribution
A hypergeometrically distributed random variable X, with parameters N (bucket size), M (number of “successes” in the bucket) and n (number of trials), has

- Expected value: E(X) = np, where p = M/N
- Variance:

$$\mathrm{Var}(X) = np(1 - p)\,\frac{N - n}{N - 1}$$

Hypergeometric distribution 2
R provides the following built-in functions to calculate hypergeometric probabilities:

- dhyper is the probability mass function defined on the previous slide.
- phyper is the CDF.
- qhyper is the quantile function.
- rhyper generates random hypergeometric observations.

Their arguments are:

- x, q: the number of “successes” drawn
- m: the total number of “successes” in the bucket
- n: the total number of “failures” in the bucket
- k: the number of objects drawn from the bucket
- p: a probability (the argument of qhyper)

Note that R’s n is the number of “failures” (N – M in the earlier notation) and R’s k is the number of objects drawn (n in the earlier notation).
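To make the argument mapping concrete, the sketch below compares dhyper() with the counting formula for the hypergeometric probability. The bucket sizes (N = 20, M = 8, 5 draws) are hypothetical values chosen only for illustration.

```r
# Hypothetical bucket: 20 objects, 8 of them "successes", 5 drawn
N <- 20
M <- 8
n_draws <- 5
x <- 0:n_draws

# Counting formula: choose(M, x) * choose(N - M, n - x) / choose(N, n)
pmf_formula <- choose(M, x) * choose(N - M, n_draws - x) / choose(N, n_draws)

# Same probabilities from R: m = successes, n = failures, k = number drawn
pmf_r <- dhyper(x, m = M, n = N - M, k = n_draws)

print(cbind(x, pmf_formula, pmf_r))
```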

Hypergeometric Properties
Each of 12 refrigerators of a certain type has been returned to the distributor because of a loud noise. Suppose that 7 of these have a defective compressor and 5 have a broken ice cube tray. The refrigerators will be examined one by one in random order. Let X be the number among the first 6 examined that have a defective compressor. Calculate:

- Pr(X = 5)
- Pr(X <= 4)
- The expected value of X

Hypergeometric Properties
In the refrigerator example, N = 12, M = 7 and n = 6, so the expected value is E(X) = np = n(M/N) = 6(7/12) = 3.5 units on average.
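The three quantities in the refrigerator example can be computed with R’s hypergeometric functions, as in the sketch below. Here a “success” is a defective compressor, so m = 7, n = 5 (the broken ice cube trays) and k = 6; the variable names are illustrative.

```r
compressors <- 7   # "successes" in the bucket
trays <- 5         # "failures" in the bucket
examined <- 6      # number drawn without replacement

# Pr(X = 5): exactly 5 of the first 6 have a defective compressor
p_exactly_5 <- dhyper(5, compressors, trays, examined)

# Pr(X <= 4): at most 4 of the first 6 have a defective compressor
p_at_most_4 <- phyper(4, compressors, trays, examined)

# E(X) = n * M / N
expected <- examined * compressors / (compressors + trays)

print(c(p_exactly_5, p_at_most_4, expected))
```

By direct counting, Pr(X = 5) = 105/924 and Pr(X <= 4) = 812/924, which is what the two function calls should reproduce.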

Hypergeometric Example
A college professor is teaching two sections of Math 301. Section 1 has 24 students registered and section 2 has 19 students registered. Each student hands in a term project. The professor grades these in a random order. What is the probability that exactly 4 of the first 10 graded papers are from section 1? What is the probability that 4 or fewer of the first 10 papers are from section 1?

In R’s parameterization, m = 24, n = 19 and k = 10.

- Exactly 4: dhyper(4, 24, 19, 10)
- 4 or fewer: phyper(4, 24, 19, 10)
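A sketch of the Math 301 calculation in R follows, with a cross-check of the “exactly 4” probability against the counting formula. Papers from section 1 are treated as “successes”; the variable names are illustrative.

```r
section1 <- 24   # "successes": papers from section 1
section2 <- 19   # "failures": papers from section 2
graded <- 10     # papers graded first

# Exactly 4 of the first 10 papers are from section 1
p_exactly_4 <- dhyper(4, section1, section2, graded)

# 4 or fewer of the first 10 papers are from section 1
p_at_most_4 <- phyper(4, section1, section2, graded)

# Cross-check of the first answer by direct counting
p_formula <- choose(24, 4) * choose(19, 6) / choose(43, 10)

print(c(p_exactly_4, p_at_most_4))
```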

Geometric Distribution
Suppose we have a sequence of independent Bernoulli trials with probability of success p. Let X be the random variable representing the number of failures until the 1st success. Then X is a geometric random variable and the probability mass function of X is:

$$\Pr(X = x) = (1 - p)^x p \quad \text{for } x = 0, 1, 2, \ldots$$

Note that the R function dgeom(x, p) gives the probability of the number of failures until the 1st success, which is always one less than the number of trials until the 1st success.

Example
Mr. S has 14 keys on his keychain. He is looking for the one key that opens his office. He randomly picks a key, tries it and replaces it until he gets his door open. What is the probability that he picks the correct key on the 6th try?
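Because each key is replaced after it is tried, every attempt succeeds with probability p = 1/14, and success on the 6th try means 5 failures first. A minimal R sketch of this calculation, with illustrative variable names:

```r
p <- 1 / 14        # chance of picking the right key on any one try
failures <- 5      # success on the 6th try means 5 failures first

# dgeom() counts failures before the first success
p_sixth_try <- dgeom(failures, p)

# Same value from the formula (1 - p)^5 * p
p_formula <- (13 / 14)^5 * (1 / 14)

print(p_sixth_try)
```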

Geometric Distribution Example
A fair die is tossed until the first ‘3’ appears. What is the probability that the first ‘3’ appears on the 5th toss? On the rth toss?

Solution: The first ‘3’ appears on the 5th toss only if the first 4 tosses are failures (not a ‘3’) and the 5th toss is a success. This probability is:

$$\left(\frac{5}{6}\right)^4 \cdot \frac{1}{6} \approx 0.0804$$

In R we have dgeom(4, 1/6). Note the 4 representing the failure count before the 1st success.

In general, if the success probability is p, then the probability of the 1st success on the rth trial is:

$$(1 - p)^{r - 1} p$$
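The die calculation and its generalisation to the rth toss can be sketched in R as below. The helper prob_first_success_on() is a hypothetical name introduced here for illustration; it simply shifts the trial number by one to obtain the failure count that dgeom() expects.

```r
# Probability that the first success occurs on trial r
# (hypothetical helper: dgeom() takes the number of failures, r - 1)
prob_first_success_on <- function(r, p) {
  dgeom(r - 1, p)
}

p <- 1 / 6                                   # chance of a '3' on a fair die
p_fifth_toss <- prob_first_success_on(5, p)
print(round(p_fifth_toss, 4))                # 0.0804

# Probabilities for the first '3' on toss 1, 2, ..., 10
print(round(prob_first_success_on(1:10, p), 4))
```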

Negative Binomial Distribution
A negative binomial random variable X is the number of “failures” x that precede the rth “success”. The probability of a “success” is p. In contrast to the binomial random variable, the number of successes is fixed and the number of trials is random.

Probability mass function:

$$\Pr(X = x) = \binom{x + r - 1}{x} p^r (1 - p)^x \quad \text{for } x = 0, 1, 2, 3, \ldots$$

Expectation:

$$E(X) = \frac{r(1 - p)}{p}$$

Variance:

$$\mathrm{Var}(X) = \frac{r(1 - p)}{p^2}$$

Negative Binomial (cont’d)

Example
If the probability of getting a head when flipping a coin is p = 0.3, what is the probability of getting 7 tails before the 3rd head?
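In R, dnbinom(x, size, prob) gives the probability of x failures before the size-th success, so the coin example maps to x = 7 tails, size = 3 heads and prob = 0.3. A minimal sketch with a cross-check against the counting formula:

```r
tails <- 7     # failures
heads <- 3     # required number of successes
p <- 0.3       # probability of a head

# Probability of exactly 7 tails before the 3rd head
p_seven_tails <- dnbinom(tails, size = heads, prob = p)

# Same value from choose(x + r - 1, x) * p^r * (1 - p)^x
p_formula <- choose(tails + heads - 1, tails) * p^heads * (1 - p)^tails

print(round(p_seven_tails, 4))   # 0.08
```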

Example
The probability of successfully hailing a cab in the rain is p = 0.2. What is the probability that 15 hails must be attempted before 5 hails result in a driver pulling over? What is the probability that at most 10 hails must be attempted before 5 hails result in a driver pulling over?
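The wording of the cab example leaves some room for interpretation. The sketch below assumes that “15 hails must be attempted” means the 5th success happens exactly on the 15th attempt (so 10 failures), and that “at most 10 hails” means the 5th success happens on or before the 10th attempt (so at most 5 failures). Under a different reading the failure counts would change.

```r
p <- 0.2        # probability that a hail succeeds
needed <- 5     # successes required

# Assumed reading: 5th success on exactly the 15th attempt = 10 failures first
p_exactly_15 <- dnbinom(15 - needed, size = needed, prob = p)

# Assumed reading: 5th success within 10 attempts = at most 5 failures
p_within_10 <- pnbinom(10 - needed, size = needed, prob = p)

# Equivalent binomial view: at least 5 successes in 10 attempts
p_within_10_binom <- 1 - pbinom(needed - 1, 10, p)

print(c(p_exactly_15, p_within_10))
```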

Poisson process

Poisson distribution

Example During the Sandy disaster, NYC received about 20,000 calls/hr at its peak. What is the probability of exactly 3 calls in 1 hour? What is the probability of 3 or fewer calls in one hour? What is the probability of more than 3 calls in one hour? What is the probability of 7 calls in 2 hours?

Example During the Sandy disaster, NYC received about 20,000 calls/hr at its peak. What is the probability of exactly 20,165 calls in 1 hour?
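If the number of calls in an hour is modelled as Poisson with mean 20,000 (the modelling assumption behind this example), the probability of exactly 20,165 calls can be obtained with dpois(). The sketch below also cross-checks the result on the log scale, since the individual terms are far too large to compute directly.

```r
lambda <- 20000    # average calls per hour, taken from the example
calls <- 20165

# Pr(X = 20165) under a Poisson model with mean 20000
p_exact <- dpois(calls, lambda)

# Cross-check: log Pr(X = x) = x * log(lambda) - lambda - log(x!)
log_p_manual <- calls * log(lambda) - lambda - lgamma(calls + 1)

print(p_exact)
print(exp(log_p_manual))
```

For a window of t hours, the same approach applies with the mean replaced by 20,000 times t.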

Example During the Sandy disaster, NYC received about 20,000 calls/hr at its peak. What is the probability of or fewer calls in one hour?

Example During the Sandy disaster, NYC received about 20,000 calls/hr at its peak. What is the probability of more than calls in one hour?

Example During the Sandy disaster, NYC received about 20,000 calls/hr at its peak. What is the probability of calls in 2 hours?