Probability Distributions

Presentation on theme: "Probability Distributions"— Presentation transcript:

1 Probability Distributions

2 Week 2 Random Variables
A random variable X associates a unique numerical value with every outcome of an experiment. The value of the random variable varies from trial to trial as the experiment is repeated. There are two types of random variable: discrete and continuous. A discrete random variable has an associated probability mass function (PMF); a continuous random variable has an associated probability density function (PDF).

3 Random variables can be discrete or continuous
Discrete random variables have a countable number of outcomes. Examples: dead/alive, treatment/placebo, dice, counts, etc.
Continuous random variables have an infinite continuum of possible values. Examples: blood pressure, concentration, weight, the speed of a car, the real numbers from 1 to 6.

4 What is a probability distribution?
The set of probabilities for the possible outcomes of a random variable is called a “probability distribution.” A probability distribution is a statistical function that describes all the possible values that a random variable can take within a given range, together with how likely each value is.

5 What is a probability distribution?
Let X be the number obtained in throwing a fair die. The corresponding probability function is $f(x) = \frac{1}{6}$ for $x = 1, \dots, 6$, with $\sum_x P(X = x) = 1$.
[Figure: bar chart of $f(x)$ against $x$; the bar heights are actual probabilities.]

6 What is a probability distribution?
Let X be the sum of the two numbers obtained in throwing two fair dice. The corresponding probability mass function $f(x)$ satisfies $\sum_x P(X = x) = 1$.
[Figure: bar chart of $f(x)$ against $x$; the bar heights are actual probabilities.]
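The PMF for the two-dice sum can be tabulated by counting outcomes. The following is a minimal sketch, assuming all 36 ordered pairs of faces are equally likely; the name `two_dice_pmf` is illustrative and not from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# PMF of the sum: f(x) = (number of ways to roll x) / 36.
two_dice_pmf = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

for x, prob in two_dice_pmf.items():
    print(x, prob)

# The probabilities over all possible sums add up to exactly 1.
print(sum(two_dice_pmf.values()))
```

Exact fractions are used so that the check that the probabilities sum to 1 is not affected by floating-point rounding.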

7 Probability Mass Function
Definition: $p(x)$ is a probability mass function for the discrete random variable X if, for all $x$, $p(x) \ge 0$ and $\sum_x p(x) = 1$.

8 The Mean or Expected Value of a PMF
Probability mass functions have means given by: $\mu = E[X] = \sum_i x_i f(x_i)$. For example, for the single die: $\mu = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$. Expected value = long-range average.

9 Variance of a discrete distribution
$\sigma^2 = \mathrm{Var}(X) = \sum_i (x_i - \mu)^2 f(x_i)$
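As a worked check of the mean and variance sums, here is a small sketch for the single fair die. The helper names `mean` and `variance` are illustrative, and the PMF is represented as a plain dictionary from value to probability.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value: sum of x * f(x) over all values x."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Variance: sum of (x - mu)^2 * f(x) over all values x."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(mean(die_pmf))      # 7/2
print(variance(die_pmf))  # 35/12
```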

10 Cumulative Distribution Function (CDF)
CDF – cumulative (probability) distribution function; assigns to each value x the sum of the probabilities of all outcomes less than or equal to x, i.e. $F(x) = P(X \le x)$.

11 Cumulative distribution function (CDF)
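[Figure: step plot of the cumulative probability for a fair die, with x = 1 to 6 on the horizontal axis and steps at 1/6, 1/3, 1/2, 2/3, 5/6 and 1.0.]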

12 Cumulative distribution function
x | P(X ≤ x)
1 | P(X ≤ 1) = 1/6
2 | P(X ≤ 2) = 2/6
3 | P(X ≤ 3) = 3/6
4 | P(X ≤ 4) = 4/6
5 | P(X ≤ 5) = 5/6
6 | P(X ≤ 6) = 6/6
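The cumulative table for the fair die is just a running sum of the PMF. A minimal sketch, with illustrative variable names:

```python
from fractions import Fraction
from itertools import accumulate

values = list(range(1, 7))
pmf = [Fraction(1, 6)] * 6

# Running sum of the PMF gives P(X <= x) for each x.
cdf = dict(zip(values, accumulate(pmf)))

for x, prob in cdf.items():
    print(x, prob)
```

Note that `Fraction` prints values in lowest terms, so 2/6 appears as 1/3 and 6/6 as 1.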

13 Discrete Distributions
Bernoulli Distribution: yes-no responses
Binomial Distribution: sums of Bernoulli responses
Geometric Distribution: number of trials until first success
Poisson Distribution: points in time or space

14 Bernoulli Distribution
A Bernoulli event is one for which the probability the event occurs is p and the probability the event does not occur is 1-p; i.e., the event has two possible outcomes, usually viewed as success or failure. A Bernoulli trial is an instantiation of a Bernoulli event. So long as the probability of success remains the same from trial to trial and each trial is independent of the others, a sequence of Bernoulli trials is called a Bernoulli process.

15 Bernoulli Distribution
A Bernoulli distribution is the pair of probabilities of a Bernoulli event, where success (1) and failure (0) have probabilities $P(X = 1) = p$ and $P(X = 0) = 1 - p = q$.
Expectation: $E[X] = p$
Variance: $\mathrm{Var}(X) = p(1 - p) = pq$

16 Binomial Distribution
Let x be the number of “successes” in n trials. x is said to be binomially distributed provided:
1. The trials are identical and independent.
2. The number of trials, n, is fixed.
3. Each trial results in one of two possible outcomes, success or failure.
4. The probability of success on a single trial is p, and is constant from trial to trial.
This is written x ~ B(n, p).

17 Binomial Distribution
The binomial distribution is just n independent Bernoullis added up. It is the number of “successes” in n trials. If $Z_1, Z_2, \dots, Z_n$ are independent Bernoulli variables with the same success probability p, then $X = Z_1 + Z_2 + \dots + Z_n$ is binomial.

18 Binomial Distribution
For the case when n = 1, the binomial distribution is called the Bernoulli distribution: x ~ B(1, p).

19 Binomial Distribution
There are n independent trials of the experiment. Let p denote the probability of success; then 1 – p is the probability of failure. Let x denote the number of successes in n independent trials of the experiment, so 0 ≤ x ≤ n.

20 Binomial PMF vs CDF
The abbreviation for the binomial distribution is B(n, p). A binomial PMF gives the probability of a random variable equaling a particular value, e.g., P(x = 2). A binomial CDF gives the probability of a random variable being less than or equal to that value, e.g., P(x ≤ 2):
P(x ≤ 2) = P(x = 0) + P(x = 1) + P(x = 2)

21 Binomial PMF
The probability of obtaining x successes in n independent trials of a binomial experiment, where the probability of success is p, is given by:
$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$
$\binom{n}{x}$ is also called a binomial coefficient and is the number of combinations of n items taken x at a time.
n = number of trials
x = number of successes (the x axis of the distribution)
p = probability of success
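The binomial PMF and CDF can be sketched in a few lines, assuming the standard form $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$. The function names `binom_pmf` and `binom_cdf` are illustrative, and the values n = 5, p = 0.5 are chosen only as a demonstration.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x): the PMF summed from 0 up to and including x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Illustrative case: 5 trials with success probability 0.5.
print(binom_pmf(2, 5, 0.5))  # 0.3125
print(binom_cdf(2, 5, 0.5))  # 0.5
```

`math.comb` computes the binomial coefficient exactly as an integer, which avoids overflow or rounding in the factorials for moderate n.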

22 These are all equivalent

23 Examples
1. Suppose you independently flip a coin 4 times. What is the probability of obtaining exactly 2 tails? Number of trials = 4, x = 2, p = 0.5.
2. A roulette wheel has 38 slots (US version). What is the probability of winning twice in 50 spins?
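Both examples can be evaluated with the binomial PMF. This is a sketch only: the slide does not say what counts as a "win" at roulette, so the code assumes a bet on a single number (p = 1/38 per spin), and that assumption drives the second result.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example 1: exactly 2 tails in 4 independent fair coin flips.
p_two_tails = binom_pmf(2, 4, 0.5)
print(p_two_tails)  # 0.375

# Example 2: ASSUMPTION (not stated on the slide): a "win" is a bet on a
# single number, so the success probability is 1/38 on every spin.
p_two_wins = binom_pmf(2, 50, 1 / 38)
print(round(p_two_wins, 4))
```

A different betting rule (for example red/black, with p = 18/38) would only change the value of `p` passed to the function.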

24 What does the Binomial Distribution Look Like?

25 Mean and Variance for the Binomial Distribution
Mean or expected value: $\mu = np$
Variance: $\sigma^2 = npq$, where $q = 1 - p$

26 Exercise
Plot various binomial distributions for:
n = 5 and p = 0.1, 0.2, 0.45, 0.8, 0.9
n = 50 and p = 0.1, 0.2, 0.45, 0.8, 0.9
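One possible way to approach the exercise without a plotting library is a text bar chart. The sketch below is an assumption about how the plots might be produced, not the method used in the course; `text_plot` is a hypothetical helper that scales each bar to the largest probability.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def text_plot(n, p, width=40):
    """Return one line per x, with a bar proportional to P(X = x)."""
    probs = [binom_pmf(x, n, p) for x in range(n + 1)]
    peak = max(probs)
    return [f"{x:3d} {prob:.4f} " + "#" * round(width * prob / peak)
            for x, prob in enumerate(probs)]

for n in (5, 50):
    for p in (0.1, 0.2, 0.45, 0.8, 0.9):
        print(f"B({n}, {p})")
        print("\n".join(text_plot(n, p)))
```

Comparing the n = 5 and n = 50 outputs shows the distribution becoming more symmetric and bell-shaped as n grows, even for p far from 0.5.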

27 Binomial Distribution Experiment
Basic Experiment: 5 fair coins are tossed. Event of interest: total number of heads. The probability of heads coming up (a success) is equal to 0.5, so the number of heads in the five coins is a binomial random variable with n=5 and p=0.5. The experiment is repeated 50 times. Each group (or pair of people) throws the coins twice and collates the data on the board.

28 Binomial Distribution Example
Basic Experiment: 5 fair coins are tossed. Event of interest: total number of heads. The probability of heads coming up (a success) is equal to 0.5, so the number of heads in the five coins is a binomial random variable with n=5 and p=0.5. The experiment is repeated 50 times.
[Figure: bar chart comparing observed and theoretical frequencies (vertical axis 0 to 20) against the number of heads.]
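The classroom experiment can be imitated with a short simulation. This is a sketch under stated assumptions: the seed value is arbitrary, the simulated counts are not the class data, and the theoretical column is 50 times the binomial PMF for n = 5, p = 0.5.

```python
import random
from collections import Counter
from math import comb

N_COINS, P_HEADS, REPEATS = 5, 0.5, 50

# Theoretical frequency of each head count over 50 repetitions: 50 * P(X = k).
theoretical = {
    k: REPEATS * comb(N_COINS, k) * P_HEADS**k * (1 - P_HEADS) ** (N_COINS - k)
    for k in range(N_COINS + 1)
}

rng = random.Random(0)  # fixed seed (arbitrary choice) so the run is repeatable

# Each repetition tosses 5 coins and records the number of heads.
observed = Counter(
    sum(rng.random() < P_HEADS for _ in range(N_COINS)) for _ in range(REPEATS)
)

print("heads observed theoretical")
for k in range(N_COINS + 1):
    print(k, observed[k], round(theoretical[k], 2))
```

With only 50 repetitions the observed counts will scatter around the theoretical ones; increasing `REPEATS` should bring the two columns closer in proportion.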

29 Discrete Distributions
Week 3 Discrete Distributions
Bernoulli Distribution: yes-no responses
Binomial Distribution: sums of Bernoulli responses
Geometric Distribution: number of trials until first success
Poisson Distribution: points in given time or space

30 Geometric Distribution
The probability of transfecting a cell line is 30%, so the probability of failure is 70%. Let us calculate the probability that the first success occurs at a given experiment in a series of transfections.
Experiment | Probability of Success
1 | 0.3 = 0.3
2 | 0.7 x 0.3 = 0.21
3 | 0.7 x 0.7 x 0.3 = 0.147
4 | 0.7 x 0.7 x 0.7 x 0.3 = 0.1029
…
N | (0.7)^(N-1) x 0.3

31 Geometric Distribution – what does it mean?
Experiment | Probability of Success
1 | 0.3 = 0.3
2 | 0.7 x 0.3 = 0.21
3 | 0.7 x 0.7 x 0.3 = 0.147
4 | 0.7 x 0.7 x 0.7 x 0.3 = 0.1029
…
N | (0.7)^(N-1) x 0.3 (tends to zero for large N)
At first glance it might appear that the more attempts you make at a transfection, the less likely it is to work. This is NOT what the table says, and in fact that statement makes no sense: every attempt succeeds with the same probability. What the distribution tells us is the probability that the first successful transfection occurs on the kth experiment. In other words, the first success is less and less likely to be delayed until a later experiment, because that requires every earlier experiment to have failed.

32 Geometric Distribution
If a single event or trial has two possible outcomes (say X can be 0 or 1 with P(X=1)=p), the probability that the first "one" appears on trial k is given by the geometric distribution.
Before we can succeed at trial k, we must first have had k-1 failures. Each failure occurs with probability q = 1 - p, so there is a term $q^{k-1}$.
Finally, a single success occurs with probability p, so there is a term $p^1$.
Together: $P(\text{first success on trial } k) = q^{k-1} p$.

33 Geometric Distribution
Mean: $\mu = \frac{1}{p}$
Variance: $\sigma^2 = \frac{1 - p}{p^2} = \frac{q}{p^2}$
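The geometric mean and variance can be checked numerically. The sketch below assumes the "number of trials until first success" form of the distribution, with the standard results mean = 1/p and variance = (1 - p)/p², and uses the transfection value p = 0.3; the truncation at 500 terms is an arbitrary choice that makes the neglected tail negligible.

```python
def geom_pmf(k, p):
    """P(first success on trial k) = (1 - p)**(k - 1) * p, for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p

p = 0.3
ks = range(1, 500)  # truncated sum; the tail beyond 500 trials is negligible

mean = sum(k * geom_pmf(k, p) for k in ks)
var = sum((k - mean) ** 2 * geom_pmf(k, p) for k in ks)

print(mean, 1 / p)            # numerical sum vs closed form 1/p
print(var, (1 - p) / p**2)    # numerical sum vs closed form (1 - p)/p^2
```

For p = 0.3 this says that, on average, a little over three transfection attempts are needed to see the first success.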

34 Geometric Distribution
[Figure: geometric distribution for p = 3/10; x represents the number of trials required to get a success.]

35 Geometric Distribution – CDF
Experiment | Probability of Success | Probability of Success at This Trial or Earlier (CDF)
1 | 0.3 = 0.3 | 0.3
2 | 0.7 x 0.3 = 0.21 | 0.51
3 | 0.7 x 0.7 x 0.3 = 0.147 | 0.657
4 | 0.7 x 0.7 x 0.7 x 0.3 = 0.1029 | 0.7599
…
N | (0.7)^(N-1) x 0.3 (tends to zero for large N) | tends to 1.0
Another way of looking at this is to look at the probability of success at a given trial N, or earlier. To do this we must compute the cumulative distribution function, CDF. Without providing a proof, the CDF for the geometric distribution is given by:
$$F(N) = P(X \le N) = 1 - (1 - p)^N$$
For example, $F(4) = 1 - (0.7)^4 = 0.7599$: there is a 76% chance that we will have succeeded within 4 attempts.
How many experiments must we do so that there is a 99% chance we will succeed at transfecting the cells? Solve $0.99 = 1 - (0.7)^N$ for N: $N = \ln(0.01) / \ln(0.7) \approx 12.9$ experiments, so in practice 13 experiments are needed.
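The "how many experiments" question amounts to inverting the geometric CDF, $1 - (1 - p)^N$. A minimal sketch, with illustrative function names `geom_cdf` and `trials_needed`:

```python
from math import ceil, log

def geom_cdf(n, p):
    """P(first success occurs on or before trial n) = 1 - (1 - p)**n."""
    return 1 - (1 - p) ** n

def trials_needed(p, target):
    """Smallest whole number of trials N with geom_cdf(N, p) >= target."""
    return ceil(log(1 - target) / log(1 - p))

print(round(geom_cdf(4, 0.3), 4))           # 0.7599
print(log(1 - 0.99) / log(1 - 0.3))         # about 12.9, before rounding up
print(trials_needed(0.3, 0.99))             # 13
```

The same `trials_needed` function can be applied to other success rates, for example a much smaller p and the same 99% target.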

36 Class Problem
Many of the pharmaceuticals on the market today were found using high-throughput screening assays. In these assays, large numbers of random molecules are tested, and of these only a few show appreciable activity. If we assume that the success rate in these screens is one in ten thousand (p=0.0001), how large a library do we need to be 99% sure that we will find at least one active molecule?

37 Poisson Distribution Week 3

38 Poisson Distribution
The Poisson distribution arises in two important instances:
1) It is an approximation to the binomial distribution when n is large and p is small.
2) It describes the number of events that will occur in a given time period when the events occur randomly and are independent of one another. Similarly, the Poisson distribution describes the number of events in a given area when the presence or absence of a point is independent of occurrences at other points.
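The first instance, Poisson as an approximation to the binomial, can be illustrated numerically. The sketch below assumes the standard Poisson PMF $e^{-\lambda}\lambda^x / x!$ with $\lambda = np$; the values n = 1000 and p = 0.003 are illustrative choices, not taken from the slides.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

n, p = 1000, 0.003  # illustrative values: large n, small p
lam = n * p         # matching Poisson mean

print("x binomial poisson")
for x in range(7):
    print(x, round(binom_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))
```

The two columns should agree closely; the agreement degrades if p is made larger while n is kept fixed.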

39 Poisson Distribution Examples
Events per unit time: telephone calls received in an hour; articles received in a day at an airline’s lost and found; car accidents in a month at a busy intersection; deaths per month due to a rare disease.
Events per unit distance: defects occurring in 50 meters of insulated wire; deaths per 10,000 passenger miles.
Events per unit area: bacteria per square centimeter of culture plate.
Events per unit volume: white blood cells in a cubic millimeter of blood; hydrogen atoms per cubic light-year in intergalactic space.

40 Poisson Distribution
X = number of occurrences of an event in a given time, distance, area, volume, etc. The probability that an event occurs in the interval is proportional to the length of the interval. An infinite number of occurrences are possible. Events occur independently at a rate $\lambda$.

41 Poisson Distribution - Application
The Poisson distribution is applied where random events in space or time are expected to occur. Deviation from the Poisson distribution may indicate some degree of non-randomness in the events under study, and investigation of the cause may be of interest.

42 Poisson Distribution Source:

43 Poisson Distribution
The Poisson distribution has one parameter: $\lambda$
Mean: $\mu = \lambda$
Variance: $\sigma^2 = \lambda$
Standard deviation: $\sigma = \sqrt{\lambda}$

44 Poisson Distribution - Example
In a small US town the average number of accidents per year is 2.4.
a) What is the probability that in any particular year there will be no accidents?
b) What is the probability that in any particular year there will be 5 accidents?
$\lambda$ = 2.4 accidents per year
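Parts a) and b) can be evaluated directly from the Poisson PMF, $P(X = x) = e^{-\lambda}\lambda^x / x!$, the same form used in the alpha-particle slides. A minimal sketch with an illustrative function name:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.4  # average accidents per year

p_none = poisson_pmf(0, lam)  # a) no accidents in a year
p_five = poisson_pmf(5, lam)  # b) exactly 5 accidents in a year

print(round(p_none, 4))  # 0.0907
print(round(p_five, 4))  # 0.0602
```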

45 Poisson Distribution
[Figure: Poisson distribution with $\lambda$ = 2.4.]

46 Poisson Distribution - Example
What is the probability that there will be more than 4 accidents in a year? $\lambda$ = 2.4 accidents per year
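Because the Poisson distribution has no upper limit, "more than 4" is most easily computed as a complement: $P(X > 4) = 1 - P(X \le 4)$. A short sketch, assuming the standard Poisson PMF:

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.4  # average accidents per year

# P(X <= 4) is a finite sum; the complement gives P(X > 4).
p_at_most_4 = sum(poisson_pmf(x, lam) for x in range(5))
p_more_than_4 = 1 - p_at_most_4

print(round(p_more_than_4, 4))  # 0.0959
```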

47 The Poisson Distribution Emission of α-particles
Rutherford, Geiger, and Bateman (1910) counted the number of α-particles emitted by a film of polonium in 2608 successive intervals of one-eighth of a minute. They counted 10,097 alpha particles. Do their data follow a Poisson distribution?

48 The Poisson Distribution Emission of α-particles
Calculation of $\lambda$: $\lambda$ = number of particles per interval = 10097/2608 = 3.87
Expected values: $2608 \times P(x) = 2608 \times \frac{e^{-3.87}\,(3.87)^x}{x!}$
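The expected number of intervals containing x particles can be tabulated from the two totals quoted in the slides. This sketch computes only the Poisson expectations; the observed counts from the original experiment are not reproduced here, and the cutoff at x = 14 is an arbitrary choice beyond which the expected counts are negligible.

```python
from math import exp, factorial

TOTAL_PARTICLES = 10097
INTERVALS = 2608

lam = TOTAL_PARTICLES / INTERVALS  # mean particles per interval, about 3.87

# Expected number of intervals with exactly x particles: 2608 * P(x).
expected = {
    x: INTERVALS * exp(-lam) * lam**x / factorial(x) for x in range(15)
}

print(round(lam, 2))
for x, count in expected.items():
    print(x, round(count, 1))
```

These expected values are what the observed interval counts would be compared against to judge whether the emissions follow a Poisson distribution.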

49 The Poisson Distribution Emission of α-particles
The fact that the observed and expected align very closely tells us what?

50 The Poisson Distribution Emission of α-particles
[Figure: three panels illustrating random events, regular events, and clumped events.]

51 The Poisson Distribution
If there are $3 \times 10^9$ base pairs in the human genome and the mutation rate per generation per base pair is $10^{-9}$, what is the mean number of new mutations that a child (=genome) will have, what is the variance in this number, and what will the distribution look like?
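One way to approach this question is the Poisson approximation to the binomial (n large, p small), under which the mean and the variance both equal $\lambda = np$. The sketch below rests on that assumption and on treating each base pair as an independent trial; the constant names are illustrative.

```python
from math import exp, factorial

BASE_PAIRS = 3e9       # n: number of "trials" (base pairs)
MUTATION_RATE = 1e-9   # p: mutation probability per base pair per generation

# Under the Poisson approximation, mean and variance are both lam = n * p.
lam = BASE_PAIRS * MUTATION_RATE

# Shape of the distribution: P(X = x) for x = 0..10 new mutations.
pmf = [exp(-lam) * lam**x / factorial(x) for x in range(11)]

print(lam)
for x, prob in enumerate(pmf):
    print(x, round(prob, 4))
```

The printed probabilities peak around 2 to 3 mutations and fall off quickly, giving a right-skewed distribution.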

52 The Poisson Distribution

53 Vocabulary
Discrete – data that can take on only a set number of values
Continuous – quantitative data that can take on any value between the minimum and maximum, and any value between two other values
Trial – each repetition of an experiment
Success – one assigned result of a binomial experiment
Failure – the other result of a binomial experiment
PDF – probability distribution function; assigns a probability to each value of X (for a discrete X this is the PMF)
CDF – cumulative (probability) distribution function; assigns to each value the sum of the probabilities of all values less than or equal to it
Binomial Coefficient – number of combinations of k successes in n trials

