# Introducing probability PSLS chapter 9 © 2009 W. H. Freeman and Company.


Objectives (PSLS chapter 9): Introducing probability

- Randomness and probability
- Probability models
- Discrete vs. continuous sample spaces
- Probability rules
- Random variables
- Meaning of a probability
- Risk and odds

Randomness and probability

A phenomenon is random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes over a large number of repetitions.

The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions, i.e., its long-run relative frequency.

Example: Coin toss

Probability of heads is 0.5 = proportion of times you get heads in many repeated trials.

[Figure: proportion of heads over a first series of tosses and over a second series.]
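The long-run relative frequency idea can be illustrated with a small simulation. This is a minimal sketch, assuming a fair coin modeled with Python's pseudo-random generator; the function name `proportion_of_heads` and the seed values are illustrative choices, not part of the original slides.

```python
import random


def proportion_of_heads(n_tosses, seed=0):
    """Simulate n_tosses fair coin tosses; return the proportion of heads."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# Two separate series of tosses (different seeds), at increasing lengths.
for n in (10, 100, 1_000, 100_000):
    first = proportion_of_heads(n, seed=1)
    second = proportion_of_heads(n, seed=2)
    print(f"{n:>7} tosses: first series {first:.3f}, second series {second:.3f}")
```

Short series can differ noticeably from each other, while long series should both settle close to 0.5.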

Probability models

Probability models mathematically describe the outcomes of random processes. They consist of three parts:

1. S = sample space: the set, or list, of all possible outcomes of a random process.
2. Events: an event is a subset of the sample space.
3. A probability for each possible event in the sample space S.

Example: Probability model for a coin toss

- S = {Head, Tail}
- Events: e.g., {Head}, {Tail}
- Probability of heads = P{Head} = 0.5
- Probability of tails = P{Tail} = 0.5

Sample space

Important: It's the question that determines the sample space.

A. A couple wants 3 children. What are the possible sequences of boys (B) and girls (G)?

S = {BBB, BBG, BGB, BGG, GBB, GBG, GGB, GGG}

Note: 8 elements, 2^3.

[Figure: tree diagram branching into B or G at each birth, ending in the sequences BBB, BBG, BGB, BGG, …]

B. A couple wants 3 children. What is the number of girls they could have?

S = {0, 1, 2, 3}

C. A researcher designs a new maze for lab rats. What are the possible outcomes for the time to finish the maze (in minutes)?

S = (0, ∞) (all numbers > 0)
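The sample spaces for questions A and B can be enumerated directly. The sketch below uses `itertools.product` from the Python standard library; the variable names are illustrative.

```python
from itertools import product

# Question A: ordered sequences of boys (B) and girls (G) for 3 children.
sequences = ["".join(s) for s in product("BG", repeat=3)]

# Question B: same births, different question, so a different sample space.
girl_counts = sorted({s.count("G") for s in sequences})

print(sequences)
print(len(sequences))  # 2 ** 3 sequences
print(girl_counts)
```

The same underlying process gives eight outcomes under question A and four under question B, which is the point of "the question determines the sample space".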

Discrete vs. continuous sample spaces

Finite sample spaces deal with discrete variables that can take on only certain values (e.g., a whole number such as a count, or a qualitative category). Discrete variables contrast with continuous variables, which can take on any one of an infinite number of possible values over an interval.

Blood types: for a random person, S = {O+, O-, A+, A-, B+, B-, AB+, AB-}, and the probability of each event reflects, indeed is, the population relative frequency or proportion.

Probability = population proportion

Discrete vs. continuous sample spaces

- Categorical variables are necessarily discrete. E.g., the variable Color (red, green, …) has events such as {red}, {green}, …, also written {Color = red}, {Color = green}, …, or {C = red}, {C = green}, …
- Quantitative variables can be discrete or continuous.
- Discrete quantitative variables have only a finite or countably infinite set of values, usually counts, e.g., X = number of children in a family (0, 1, 2, …), with events of interest such as {X = 0}, {X = 1}, {X > 0}, {0 < X < 10}, …
- Continuous quantitative variables have an uncountably infinite set of possible values and can take on any value in an interval, e.g., Y = age (years), W = weight (g). Events of interest are always intervals, e.g., {Y < 17.5}, {17.5 < Y < 21.5}, {Y > 21.5}. Singleton events such as {Y = 17.5} and {Y = 21.5} are never of interest, because recorded values of continuous variables are always rounded off.

Continuous sample spaces contain an uncountably infinite number of events. We use density curves to model continuous probability distributions. Density curves come in all imaginable shapes. Some are well known mathematically and others aren’t.

Events are defined over intervals of values. Probabilities are computed as areas under the corresponding density curve. The total area under a density curve represents the whole population (sample space) and equals 1 (100%).

Shaded area = P(x₁ < X < x₂) = probability of drawing 1 individual at random with a value between x₁ and x₂.

Shaded area = proportion (%) of individuals in the population with values of X such that x₁ < X < x₂.

Probability = relative frequency in population = population proportion

[Figure: density curves with the area between x₁ and x₂ shaded.]

For the uniform distribution on the interval from 0 to 1 (density curve of height 1):

- P(0 ≤ Y ≤ 0.5) = (0.5 − 0) × 1 = 0.5
- P(0 < Y < 0.5) = (0.5 − 0) × 1 = 0.5
- P(0 ≤ Y < 0.5) = (0.5 − 0) × 1 = 0.5

The probability of a single-point event is zero: P(Y = 1) = (1 − 1) × 1 = 0. This is why including or excluding the endpoints above does not change the result, and why single-point events such as {Y = 1} are not useful in a continuous sample space. Only intervals, e.g., {0 < Y < 0.5}, can have a non-zero probability, represented by the area under the density curve for that interval.
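The area calculation for a uniform density can be written as a small helper. This is a sketch under the assumption that the distribution is uniform on the interval from 0 to 1 (height 1); the function name `uniform_prob` and its parameters are illustrative.

```python
def uniform_prob(a, b, low=0.0, high=1.0):
    """P(a <= Y <= b) for Y uniform on [low, high].

    The probability is the area of a rectangle: the width of the part of
    [a, b] that lies inside [low, high], times the density height.
    """
    height = 1.0 / (high - low)
    left, right = max(a, low), min(b, high)
    width = max(right - left, 0.0)  # no overlap means zero width
    return width * height


print(uniform_prob(0, 0.5))  # interval event
print(uniform_prob(1, 1))    # single-point event: zero width, zero area
```

Because a single point has zero width, the helper gives the same answer whether or not the endpoints are counted as part of the interval.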

Probability rules

- Probabilities range from 0 (no chance of the event) to 1 (the event has to happen). For any event A, 0 ≤ P(A) ≤ 1. Example: probability of being type O+, P(O+) = 0.38.
- The probability of the complete sample space must equal 1: P(sample space) = 1. Example: P(all blood types) = 0.38 + 0.07 + 0.34 + 0.06 + 0.09 + 0.02 + 0.03 + 0.01 = 1.
- The probability of an event not occurring is 1 minus the probability that it does occur: P(not A) = 1 − P(A). Example: P(not A+) = 1 − P(A+) = 1 − 0.34 = 0.66.
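These three rules can be checked against the blood-type figures. In the sketch below, the pairing of each percentage with a blood type follows the order in which the types and probabilities are listed in the slides; that pairing is an assumption for every type not named explicitly. Exact fractions are used so the sums are not affected by floating-point rounding.

```python
from fractions import Fraction

# Percentages paired with blood types in listing order (an assumption).
blood_type_pct = {
    "O+": 38, "O-": 7, "A+": 34, "A-": 6,
    "B+": 9, "B-": 2, "AB+": 3, "AB-": 1,
}
P = {blood_type: Fraction(pct, 100) for blood_type, pct in blood_type_pct.items()}

# Rule 1: every probability lies between 0 and 1.
all_in_range = all(0 <= p <= 1 for p in P.values())

# Rule 2: the probabilities over the whole sample space sum to 1.
total = sum(P.values())

# Rule 3 (complement): P(not A+) = 1 - P(A+).
p_not_a_pos = 1 - P["A+"]

print(all_in_range, total, float(p_not_a_pos))
```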

Two events are disjoint if they have no outcomes in common and can never happen together.

People may have black, brown, red, or blond hair, but a single individual can (naturally) have only one hair color. The events {black}, {brown}, {red}, and {blond} are all disjoint.

A person can have both brown hair and blue eyes, or brown hair and brown eyes. Hair-color and eye-color events are NOT disjoint.

[Figure: two diagrams, one with events A and B disjoint, one with events A and B NOT disjoint.]

When two events A and B are disjoint, the probability that A OR B occurs is the sum of their individual probabilities:

P(A or B) = P(A ∪ B) = P(A) + P(B)

This is the addition rule for disjoint events.

The probability that a random person is “blood group O” is P(O) = P(O+ or O-) = P(O+) + P(O-) = 0.38 + 0.07 = 0.45.

The probability that a random person is “rhesus neg” is P(O- or A- or B- or AB-) = 0.07 + 0.06 + 0.02 + 0.01 = 0.16.

General addition rule for any two events A and B: the probability that A occurs, or B occurs, or both events occur is

P(A or B) = P(A) + P(B) − P(A and B)

The probability that a random person is either “blood group O” or “rhesus neg” is

P(O or neg) = P(O) + P(neg) − P(O-) = 0.45 + 0.16 − 0.07 = 0.54

(blood group and rhesus are NOT disjoint).
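Both addition rules can be verified by treating events as sets of blood types. This is a sketch: the helper `prob` and the event names are illustrative, and the pairing of percentages with blood types follows the listing order in the slides (an assumption for types not named explicitly).

```python
from fractions import Fraction

# Percentages paired with blood types in listing order (an assumption).
P = {t: Fraction(pct, 100) for t, pct in {
    "O+": 38, "O-": 7, "A+": 34, "A-": 6,
    "B+": 9, "B-": 2, "AB+": 3, "AB-": 1,
}.items()}


def prob(event):
    """Probability of an event, given as a set of blood types."""
    return sum((P[t] for t in event), Fraction(0))


group_o = {"O+", "O-"}
rh_neg = {"O-", "A-", "B-", "AB-"}

# Disjoint events: {O+} and {O-} share no outcome, so probabilities add.
p_group_o = prob({"O+"}) + prob({"O-"})

# General rule: subtract the overlap {O-}, which would be counted twice.
p_union_rule = prob(group_o) + prob(rh_neg) - prob(group_o & rh_neg)
p_union_direct = prob(group_o | rh_neg)  # same event, summed outcome by outcome

print(float(p_group_o), float(prob(rh_neg)), float(p_union_rule))
```

Summing over the union directly and applying the general addition rule give the same value, which is a useful sanity check on the subtraction of P(A and B).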

A couple wants 3 children.

- What are the arrangements (ordered sequences) of boys (B) and girls (G)? Genetics tells us that the probability that a baby is a boy or a girl is the same, 0.5.
  - Sample space: {BBB, BBG, BGB, GBB, GGB, GBG, BGG, GGG}
  - All eight outcomes in the sample space are equally likely (treating successive births as independent).
  - The probability of each is thus 1/8.
- What are the numbers (X) of girls they could have? The same genetic laws apply. We can use the probabilities above to calculate the probability for each possible number of girls.
  - Sample space: {0, 1, 2, 3}
  - P(X = 0) = P(BBB) = 1/8
  - P(X = 1) = P(BBG or BGB or GBB) = P(BBG) + P(BGB) + P(GBB) = 3/8
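The full distribution of X can be built by counting girls in each equally likely sequence and adding the probabilities of the disjoint sequences. A minimal sketch, with illustrative variable names:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Eight equally likely ordered sequences, each with probability 1/8.
outcomes = ["".join(s) for s in product("BG", repeat=3)]
p_outcome = Fraction(1, len(outcomes))

# X = number of girls. Sequences are disjoint, so their probabilities add.
sequences_per_count = Counter(s.count("G") for s in outcomes)
distribution = {x: n * p_outcome for x, n in sorted(sequences_per_count.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

The same counting gives the remaining values, P(X = 2) and P(X = 3), and the four probabilities sum to 1 as required for a complete sample space.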

We generate two random numbers between 0 and 1 and take Y to be their sum. Y can take any value between 0 and 2. The density curve for Y is a triangle over the interval from 0 to 2, with its peak at Y = 1.

Height = 1 because the base = 2 and the area under the curve has to equal 1 by definition (area of a triangle = ½ × base × height).

- Probability that Y is greater than 1? P(Y > 1) = 0.5
- Probability that Y is less than 0.5? P(Y < 0.5) = ½ × 0.5 × 0.5 = 0.125
- Probability that Y is either less than 0.5 or greater than 1? The two events are disjoint, so P(Y < 0.5 or Y > 1) = 0.125 + 0.5 = 0.625
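The triangle areas can be computed from the cumulative area under the density curve and cross-checked by simulation. This sketch assumes the two random numbers are independent and uniform on the interval from 0 to 1; the names `triangle_cdf` and `simulate_either` are illustrative.

```python
import random


def triangle_cdf(y):
    """Area under the triangular density of Y to the left of y."""
    if y <= 0:
        return 0.0
    if y <= 1:
        return 0.5 * y * y  # small triangle with base y and height y
    if y < 2:
        return 1.0 - 0.5 * (2 - y) ** 2  # total area minus right-hand triangle
    return 1.0


p_greater_1 = 1 - triangle_cdf(1)
p_less_half = triangle_cdf(0.5)
p_either = p_less_half + p_greater_1  # disjoint events, so areas add


def simulate_either(n=100_000, seed=1):
    """Estimate P(Y < 0.5 or Y > 1) from n simulated sums of two random numbers."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        y = rng.random() + rng.random()
        if y < 0.5 or y > 1:
            hits += 1
    return hits / n


print(p_greater_1, p_less_half, p_either, simulate_either())
```

The simulated proportion should land close to the value obtained from the areas, tying the density-curve calculation back to long-run relative frequency.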

Meaning of a probability

- Theoretical probability
  - From understanding the phenomenon and symmetries in the problem
  - Six-sided fair die: each side has the same chance of turning up; therefore, each has a probability of 1/6.
  - Genetic laws of inheritance based on the process of meiosis.
- Empirical probability
  - From our knowledge of numerous similar past events
  - Mendel discovered the probabilities of inheritance of a given trait from experiments on peas, without knowing about genes or DNA.
  - Predicting the weather: a 30% chance of rain today means that it rained on 30% of all days with similar atmospheric conditions.
- Personal (subjective) probability
  - From subjective considerations, typically about unique events
  - Probability of a large meteorite hitting the Earth. Probability of life on Mars. These do not make sense in terms of frequency. A personal probability represents an individual’s personal degree of belief based on prior knowledge.
  - We may say “there is a 40% chance of life on Mars.” In fact, either there is or there isn’t life on Mars. The 40% probability is our degree of belief: how confident we are about the presence of life on Mars based on what we know about life requirements, pictures of Mars, and the probes we have sent. Personal probabilities may be based on personal experience; for instance, a long-time resident of a town may state that the probability of snow is 20% based on his or her long-time observations.

Risk and odds

In the health sciences, probability concepts are often expressed in terms of risk and odds.

- The risk of an undesirable outcome of a random phenomenon is the probability of that undesirable outcome: risk(event A) = P(event A)
- The odds of any outcome of a random phenomenon is the ratio of the probability of that outcome to the probability of that outcome not occurring: odds(A) = P(event A) / [1 − P(event A)]

Sickle-cell anemia is a serious, inherited blood disease affecting the shape of red blood cells. Individuals carrying only one copy of the defective gene (“sickle-cell trait”) are generally healthy but may pass on the gene to their offspring.

If a couple learns from prenatal tests that they both carry the sickle-cell trait, genetic laws of inheritance tell us that there is a 25% chance that they could conceive a child who will suffer from sickle-cell anemia. What are the corresponding risk and odds?

The risk of conceiving a child who will suffer from sickle-cell anemia is the probability, so risk of sickle-cell anemia = 0.25.

The odds is the ratio of two probabilities, so odds of sickle-cell anemia = 0.25/(1 − 0.25) = 0.333, which can also be written as odds of 1 to 3 (1:3).
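The conversion between risk and odds is a one-line formula in each direction. The sketch below uses exact fractions so the 1:3 form appears directly; the function names are illustrative, and the reverse conversion `odds_to_prob` is an addition that the slides do not state.

```python
from fractions import Fraction


def odds(p):
    """Odds of an outcome with probability p: p / (1 - p). Undefined for p = 1."""
    return p / (1 - p)


def odds_to_prob(o):
    """Inverse conversion (not in the slides): probability from odds o."""
    return o / (1 + o)


risk = Fraction(1, 4)       # 25% chance of sickle-cell anemia
sickle_odds = odds(risk)    # ratio of 1/4 to 3/4

print(f"risk = {float(risk)}")
print(f"odds = {sickle_odds.numerator}:{sickle_odds.denominator}"
      f" (about {float(sickle_odds):.3f})")
```

Risk and odds are close for rare outcomes but diverge as the probability grows, which is why the two should not be used interchangeably.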