Probability. I am offered two lotto cards: –Card 1: has numbers –Card 2: has numbers Which card should I take so that I have the greatest chance of winning.

Presentation transcript:

Probability

I am offered two lotto cards: –Card 1: has numbers –Card 2: has numbers Which card should I take so that I have the greatest chance of winning lotto? Lotto

In the casino I wait at the roulette wheel until I see a run of at least five reds in a row. I then bet heavily on a black. I am now more likely to win. Roulette

Coin Tossing I am about to toss a coin 20 times. What do you expect to happen? Suppose that the first four tosses have been heads and there are no tails so far. What do you expect will have happened by the end of the 20 tosses ?

Coin Tossing Option A –Still expect to get 10 heads and 10 tails. Since there are already 4 heads, now expect to get 6 heads from the remaining 16 tosses. In the next few tosses, expect to get more tails than heads. Option B –There are 16 tosses to go. For these 16 tosses I expect 8 heads and 8 tails. Now expect to get 12 heads and 8 tails for the 20 throws.
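One way to compare Options A and B is to compute the expected number of heads directly. The sketch below assumes a fair coin and independent tosses; the function name is illustrative and does not come from the slides.

```python
from fractions import Fraction
from math import comb


def expected_total_heads(heads_so_far: int, tosses_left: int) -> Fraction:
    """Expected heads after all tosses, given the heads already observed.

    Assumes a fair coin and independent tosses.
    """
    # Expected heads in the remaining tosses: sum over k of k * P(exactly k heads).
    expected_remaining = sum(
        Fraction(k * comb(tosses_left, k), 2 ** tosses_left)
        for k in range(tosses_left + 1)
    )
    return heads_so_far + expected_remaining


print(expected_total_heads(4, 16))  # 12
```

Under these assumptions the coin has no memory of the first four heads, so the remaining 16 tosses contribute 8 expected heads, which matches Option B (12 heads and 8 tails).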

In a TV game show, a car will be given away. –3 keys are put on the table, with only one of them being the right key. The 3 finalists are given a chance to choose one key and the one who chooses the right key will take the car. –If you were one of the finalists, would you prefer to be the 1st, 2nd or last to choose a key? TV Game Show

Let’s Make a Deal Game Show You pick one of three doors –two have booby prizes behind them –one has lots of money behind it The game show host then shows you a booby prize behind one of the other doors Then he asks you “Do you want to change doors?” –Should you??! (Does it matter??!) See the following website:

Game Show Dilemma
Suppose you choose door A. Monty Hall will then show you either door B or door C, depending on what is behind each.
No Switch Strategy ~ here is what happens:

Result   A      B      C
Win      Car    Goat   Goat
Lose     Goat   Car    Goat
Lose     Goat   Goat   Car

P(WIN) = 1/3

Game Show Dilemma
Suppose you choose door A, but ultimately switch. Again Monty Hall will show you either door B or door C, depending on what is behind each.
Switch Strategy ~ here is what happens:

Result   A      B      C
Lose     Car    Goat   Goat    Monty will show either B or C. You switch to the one not shown and lose.
Win      Goat   Car    Goat    Monty will show door C, you switch to B and win.
Win      Goat   Goat   Car     Monty will show door B, you switch to C and win.

P(WIN) = 2/3 !!!!
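The two strategy tables can be checked by enumerating the three equally likely car positions. This is a minimal sketch, assuming the player always picks door A first and Monty always opens a goat door other than the player's pick; the names `DOORS` and `win_probability` are illustrative.

```python
from fractions import Fraction

DOORS = ("A", "B", "C")


def win_probability(switch: bool, pick: str = "A") -> Fraction:
    """P(win the car) when the car is equally likely to be behind each door."""
    wins = 0
    for car in DOORS:
        # Monty opens a door that is neither the player's pick nor the car.
        # When he has two goat doors to choose from (pick == car), the result
        # is the same whichever he opens, so taking the first one is enough.
        opened = next(d for d in DOORS if d != pick and d != car)
        if switch:
            # Switch to the one door that is neither picked nor opened.
            final = next(d for d in DOORS if d != pick and d != opened)
        else:
            final = pick
        wins += final == car
    return Fraction(wins, len(DOORS))


print(win_probability(switch=False))  # 1/3
print(win_probability(switch=True))   # 2/3
```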

Matching Birthdays
In a room with 23 people, what is the probability that at least two of them will have the same birthday?
Answer: .5073 or 50.73% chance!!!!!
How about 30? .7063 or 71% chance!
How about 40? .8912 or 89% chance!
How about 50? .9704 or 97% chance!
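These figures can be reproduced by computing the chance that all birthdays are different and subtracting from 1. The sketch assumes 365 equally likely birthdays and ignores leap years; the function name is illustrative.

```python
def prob_shared_birthday(n: int, days: int = 365) -> float:
    """P(at least two of n people share a birthday), equally likely birthdays assumed."""
    p_all_distinct = 1.0
    for i in range(n):
        # Person i+1 must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (23, 30, 40, 50):
    print(n, round(prob_shared_birthday(n), 4))
```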

Probability
What is Chapter 6 trying to do?
–Introduce us to basic ideas about probabilities: what they are and where they come from; simple probability models (genetics); conditional probabilities; independent events; Bayes' Rule.
–Teach us how to calculate probabilities: tables of counts and using properties of probabilities such as independence.

Probability I toss a fair coin (where fair means ‘equally likely outcomes’) What are the possible outcomes? Head and tail ~ This is called a “dichotomous experiment” because it has only two possible outcomes. S = {H,T}. What is the probability it will turn up heads? 1/2 I choose a patient at random and observe whether they are successfully treated. What are the possible outcomes? “Success” and “Failure” What is the probability of successful treatment? ????? What factors influence this probability? ?????

What are Probabilities? A probability is a number between 0 & 1 that quantifies uncertainty. A probability of 0 identifies impossibility A probability of 1 identifies certainty

Where do probabilities come from? Probabilities from models: The probability of getting a four when a fair die is rolled is 1/6 (or 16.7% chance).

Probabilities from data or Empirical probabilities What is the probability that a randomly selected patient is successfully treated? –In a clinical trial n = 67 patients are “randomly” selected. –40 of these patients are successfully treated. –The estimated probability that a randomly chosen patient will have a successful outcome is 40/67 (0.597 or 59.7% chance) Where do probabilities come from?

Where do probabilities come from? Subjective Probabilities
–The probability that there will be another outbreak of Ebola in Africa within the next year is 0.1.
–The probability of rain in the next 24 hours is very high. Perhaps the weather forecaster might say there is a 70% chance of rain.
–A doctor may state your chance of successful treatment.

Simple Probability Models
“The probability that an event A occurs” is written in shorthand as P(A).
For equally likely outcomes, and a given event A:
P(A) = (Number of outcomes in A) / (Total number of outcomes)

1. Heart Disease In 1996, 6631 Minnesotans died from coronary heart disease. The numbers of deaths classified by age and gender are: Sex AgeMaleFemaleTotal < > Total

1. Heart Disease
Let A be the event of being under 45,
B be the event of being male,
C be the event of being over 64.

1. Heart Disease
Find the probability that a randomly chosen member of this population at the time of death was:
a) under 45
P(A) = 92/6631 = .0139

Conditional Probability
We wish to find the probability of an event occurring given information about the occurrence of another event. For example, what is the probability of developing lung cancer given that we know the person smoked a pack of cigarettes a day for the past 30 years?
Key words that indicate conditional probability are: “given that”, “of those”, “if …”, “assuming that”

“The probability of event A occurring given that event B has already occurred” is written in shorthand as P(A|B) Conditional Probability

Conditional Probability and Independence
P(A|B) = P(A and B) / P(B), P(B) > 0
Two events A and B are said to be independent if P(A|B) = P(A) and P(B|A) = P(B), i.e. knowing the occurrence of one of the events tells you nothing about the occurrence of the other.

1. Heart Disease Sex AgeMaleFemaleTotal < > Total Find the probability that a randomly chosen member of this population at the time of death was: b)male assuming that the person was younger than 45.

1. Heart Disease
Find the probability that a randomly chosen member of this population at the time of death was:
b) male given that the person was younger than 45.
P(B|A) = P(A and B)/P(A) = (79/6631)/(92/6631) = 79/92 = .859
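The calculation of P(B|A) can be sketched in code using only the three counts quoted on the slide (6631 deaths in total, 92 under 45, 79 of those male). The helper name `conditional` is illustrative.

```python
from fractions import Fraction


def conditional(n_both: int, n_given: int) -> Fraction:
    """P(X | Y) from counts: outcomes in both X and Y over outcomes in Y."""
    if n_given <= 0:
        raise ValueError("the conditioning event must contain at least one outcome")
    return Fraction(n_both, n_given)


total = 6631         # all deaths in the table
under_45 = 92        # event A
male_under_45 = 79   # event A and B

# Definition: P(B|A) = P(A and B) / P(A); the table total cancels out.
via_definition = Fraction(male_under_45, total) / Fraction(under_45, total)
via_counts = conditional(male_under_45, under_45)

print(via_counts, round(float(via_counts), 4))
```

Both routes give 79/92, which is why conditional probabilities from a table of counts can be read off by restricting attention to the row or column being conditioned on.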

1. Heart Disease
Find the probability that a randomly chosen member of this population at the time of death was:
c) male and was over 64.
P(B and C) = ( )/6631 = 2876/6631 = .434

Sex AgeMaleFemaleTotal < > Total Find the probability that a randomly chosen member of this population at the time of death was: d) over 64 given they were female (not B). 1. Heart Disease

Sex AgeMaleFemaleTotal < > Total P(C|not B) = ( )/2904 = Heart Disease Find the probability that a randomly chosen member of this population at the time of death was: d) over 64 given they were female (not B).

2. Hodgkin’s Disease Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment

2. Hodgkin’s Disease

Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment a)Had positive response to treatment P(pos) = 314/538 =.584 or 58.4% chance

2. Hodgkin’s Disease Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment b)Had at least some response to treatment P(par or pos) = ( )/538 = 412/538 =.766 or 76.6% chance

2. Hodgkin’s Disease Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment c)Had LP and positive response to treatment P(LP and pos) = 74/538 =.138 or 13.8%

2. Hodgkin’s Disease Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment d) Had LP or NS as their histological type. P(LP or NS) = ( )/538 = .372 or 37.2% chance

2. Hodgkin’s Disease Type NonePartialPositive Row Totals LD LP MC NS Column Totals n = 538 Response to Treatment What conditional probabilities would be of interest? EXAMPLES IN NOTES

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)

                 Died within 30 days?
Catheter?        YES     NO      Row Totals
RHC              830     1354    2184
No RHC           1088    2463    3551
Column Totals    1918    3817    5735

RHC = patient had catheter put in
No RHC = patient did not have catheter
YES = Died within 30 days
NO = Survived 30 days

What is the probability that a heart patient in this study died?
P(YES) = 1918 / 5735 = .3344 or 33.44%

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)
What is the probability that a heart patient had a right heart catheter put in during treatment?
P(RHC) = 2184 / 5735 = .3808 or 38.08%

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)
What is the probability that a patient would die within 30 days given that they had a right heart catheter put in?
P(YES | RHC) = 830 / 2184 = .3800 or 38.00%

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)
What is the probability that a patient would die within 30 days given that they did not have a right heart catheter put in?
P(YES | No RHC) = 1088 / 3551 = .3064 or 30.64%

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)
How many times more likely is a patient who had a right heart catheter put in to die within 30 days than a patient who did not have a Swan-Ganz line put in?
P(YES | RHC) = .3800
P(YES | No RHC) = .3064
.3800/.3064 = 1.24 times more likely.
This is called the relative risk or risk ratio (denoted RR). Risk of death is 24% greater for those that had a Swan-Ganz line put in.
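The relative risk can be computed straight from the four counts quoted on the preceding slides (830 deaths among 2184 RHC patients, 1088 deaths among 3551 patients without RHC). The function name and parameter names below are illustrative.

```python
def relative_risk(events_exposed: int, n_exposed: int,
                  events_unexposed: int, n_unexposed: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed


# Exposed = had a right heart catheter (RHC); event = died within 30 days.
rr_rhc = relative_risk(830, 2184, 1088, 3551)
print(round(rr_rhc, 2))  # 1.24
```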

3. Right Heart Catheterization and 30-day Mortality (Conners, et al. 1996)
The shading for 30-day mortality is 1.24 times higher for the RHC group than for the No RHC group (recall RR = 1.24). Patients having a Swan-Ganz line put in have 1.24 times higher risk of death within 30 days of initial treatment.

Building a Contingency Table from a Story
4. HIV Example
A European study on the transmission of the HIV virus involved 470 heterosexual couples. Originally only one of the partners in each couple was infected with the virus. There were 293 couples that always used condoms. From this group, 3 of the non-infected partners became infected with the virus. Of the 177 couples who did not always use a condom, 20 of the non-infected partners became infected with the virus.

4. HIV Example
Let C be the event that the couple always used condoms (NC is the complement).
Let I be the event that the non-infected partner became infected (NI is the complement).
Table layout: Condom Usage (C, NC, Total) in columns; Infection Status (I, NI, Total) in rows.

A European study on the transmission of the HIV virus involved 470 heterosexual couples. Originally only one of the partners in each couple was infected with the virus. There were 293 couples that always used condoms. From this group, 3 of the non-infected partners became infected with the virus. CNC NI I 4. HIV Example Total Condom Usage Infection Status

Of the 177 couples who did not always use a condom, 20 of the non-infected partners became infected with the virus. CNC NI I 4. HIV Example Total Condom Usage Infection Status

4. HIV Example
a) What proportion of the couples in this study always used condoms?
P(C) = 293/470 = 0.623

4. HIV Example
b) If a non-infected partner became infected, what is the probability that he/she was one of a couple that always used condoms?
P(C|I) = 3/23 = 0.130

4. HIV Example
c) In what percentage of couples did the non-HIV partner become infected amongst those that did not use condoms?
P(I | NC) = 20/177 = .113 or 11.3%
Amongst those that did use condoms?
P(I | C) = 3/293 = .0102 or 1.02%
What is the relative risk of infection associated with not wearing a condom?
RR = P(I | NC) / P(I | C) = .113/.0102, or about 11 times more likely to become infected.

4. HIV Example The percentage of couples where the non-HIV partner became infected in the non-condom user group is 11 times higher than that for condom group.
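The whole HIV example can be rebuilt from the five counts given in the story. This is a sketch only; the dictionary layout and variable names are illustrative choices, not part of the slides.

```python
# Counts stated in the study description.
couples_total = 470
always_condom = 293        # event C
infected_condom = 3        # I within C
not_always_condom = 177    # event NC
infected_no_condom = 20    # I within NC

# Fill in the 2 x 2 table; the NI cells follow by subtraction.
table = {
    "C": {"I": infected_condom, "NI": always_condom - infected_condom},
    "NC": {"I": infected_no_condom, "NI": not_always_condom - infected_no_condom},
}

# Sanity check: the four cells must add up to the number of couples.
cell_sum = sum(sum(row.values()) for row in table.values())

infected_total = table["C"]["I"] + table["NC"]["I"]

p_c = always_condom / couples_total               # a) P(C)
p_c_given_i = table["C"]["I"] / infected_total    # b) P(C | I)
p_i_given_nc = table["NC"]["I"] / not_always_condom
p_i_given_c = table["C"]["I"] / always_condom
rr = p_i_given_nc / p_i_given_c                   # c) relative risk

print(round(p_c, 3), round(p_c_given_i, 3), round(rr, 2))
```

Working from unrounded proportions gives a relative risk of roughly 11.04; dividing the rounded values .113 and .0102 gives about 11.08. Either way the slide's "11 times higher" holds.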

Relative Risk (RR) and Odds Ratio (OR) Example: Age at First Pregnancy and Cervical Cancer A case-control study was conducted to determine whether there was increased risk of cervical cancer amongst women who had their first child before age 25. A sample of 49 women with cervical cancer was taken of which 42 had their first child before the age of 25. From a sample of 317 “similar” women without cervical cancer it was found that 203 of them had their first child before age 25. Q: Do these data suggest that having a child at or before age 25 increases risk of cervical cancer?

Relative Risk (RR) and Odds Ratio (OR)
The ODDS for an event A are defined as
Odds for A = P(A) / (1 – P(A))
For example, suppose we roll a single die. The odds for a 3 are:
Odds for 3 = P(3)/(1 – P(3)) = (1/6)/(1 – (1/6)) = 1/5
That is, 1 three for every 5 rolls that don’t result in a three. (Odds for a 3 are 1:5 and odds against are 5:1.)
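The odds definition translates directly into a short function. The sketch below uses exact fractions so the die example comes out as 1/5; the function names are illustrative, and the inverse conversion is included only as a convenience.

```python
from fractions import Fraction


def odds_for(p: Fraction) -> Fraction:
    """Odds for an event with probability p: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("odds are defined here only for 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(odds: Fraction) -> Fraction:
    """Inverse conversion: p = odds / (1 + odds)."""
    return odds / (1 + odds)


p_three = Fraction(1, 6)             # rolling a 3 with a fair die
print(odds_for(p_three))             # 1/5
print(probability_from_odds(Fraction(1, 5)))  # 1/6
```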

Relative Risk (RR) and Odds Ratio (OR)
The Odds Ratio (OR) for a disease associated with a risk factor is the ratio of the odds for disease for those with the risk factor to the odds for disease for those without the risk factor:

OR = (Odds for disease amongst those with risk factor present) / (Odds for disease amongst those without the risk factor)
   = [P(Disease|Risk Factor) / (1 – P(Disease|Risk Factor))] / [P(Disease|No Risk Factor) / (1 – P(Disease|No Risk Factor))]

The Odds Ratio gives us the multiplicative increase in odds associated with having the “risk factor”.

Relative Risk (RR) and Odds Ratio (OR)

                          Cervical Cancer
Age at 1st Pregnancy      Case    Control   Row Totals
Age < 25                  42      203       245
Age > 25                  7       114       121
Column Totals             49      317       n = 366

a) Why can’t we calculate P(Cervical Cancer | Age < 25)?
Because the number of women with disease was fixed in advance and therefore NOT RANDOM!

Relative Risk (RR) and Odds Ratio (OR) Age at 1 st Pregnancy CaseControl Row Totals Age < Age > Column Totals49317 n = 366 Cervical Cancer b) What is P(risk factor|disease status) for each group? P(Age < 25|Case) = 42/49 =.857 or 85.7% P(Age < 25|Control) = 203/317 =.640 or 64.0%

Relative Risk (RR) and Odds Ratio (OR) Age at 1 st Pregnancy CaseControl Row Totals Age < Age > Column Totals49317 n = 366 Cervical Cancer c) What are the odds for the risk factor amongst the cases? Amongst the controls? Odds for risk factor cases =.857/(1-.857) = 5.99 Odds for risk factor controls =.64/(1-.64) = 1.78

Relative Risk (RR) and Odds Ratio (OR) Age at 1 st Pregnancy CaseControl Row Totals Age < Age > Column Totals49317 n = 366 Cervical Cancer d) What is the odds ratio for the risk factor associated with being a case? Odds Ratio (OR) = 5.99/1.78 = 3.37, the odds for having 1 st child on or before age 25 are 3.37 times higher for women who currently have cervical cancer versus those that do not have cervical cancer.

Relative Risk (RR) and Odds Ratio (OR) Odds Ratio The ratio of dark to light shading is 3.37 times larger for the cervical cancer group than it is for the control group.

Relative Risk (RR) and Odds Ratio (OR)
e) Even though it is inappropriate to do so, calculate P(disease|risk status).
P(case|Age<25) = 42/245 = .171 or 17.1%
P(case|Age>25) = 7/121 = .058 or 5.8%
Now calculate the odds for disease given the risk factor status.
Odds for Disease for 1st Preg. Age < 25 = .171/(1 – .171) = .207
Odds for Disease for 1st Preg. Age > 25 = .058/(1 – .058) = .061

f) Finally calculate the odds ratio for disease associated with 1 st pregnancy age < 25 years of age. Odds Ratio =.207/.061 = 3.37 This is exactly the same as the odds ratio for having the risk factor (Age < 25) associated with being in the cervical cancer group!!!! Relative Risk (RR) and Odds Ratio (OR) Final Conclusion: Women who have their first child at or before age 25 have 3.37 times the odds of developing cervical cancer when compared to women who had their first child after the age of 25.

Relative Risk (RR) and Odds Ratio (OR)

                        Disease Status
Risk Factor Status      Case    Control
Risk Factor Present     a       b
Risk Factor Absent      c       d

OR = (a X d) / (b X c)
Much easier computational formula!!!

Relative Risk (RR) and Odds Ratio (OR)
When the disease is fairly rare, i.e. P(disease) < .10 or 10%, one can show that the odds ratio and relative risk are similar. OR is approximately equal to RR when P(disease) < .10 or 10% chance. In these cases we can use the phrase “… times more likely” when interpreting the OR.

Relative Risk (RR) and Odds Ratio (OR)

Age at 1st Pregnancy      Case      Control
Age < 25                  a = 42    b = 203
Age > 25                  c = 7     d = 114
Column Totals             49        317       n = 366

OR = (42 X 114)/(7 X 203) = 3.37
Because less than 10% of the population of women develop cervical cancer, we can say women who have their first child at or before age 25 are 3.37 times more likely to develop cervical cancer than women who have their first child after age 25.
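The claim that the odds ratio is the same whichever way it is computed can be checked with exact arithmetic on the four cell counts (a = 42, b = 203, c = 7, d = 114). The helper `odds` and the variable names are illustrative.

```python
from fractions import Fraction

a, b = 42, 203   # first child before 25: cases, controls
c, d = 7, 114    # first child after 25: cases, controls


def odds(count_yes: int, count_no: int) -> Fraction:
    """Odds expressed as a ratio of counts: yes / no."""
    return Fraction(count_yes, count_no)


# Odds of having the risk factor, cases versus controls.
or_exposure = odds(a, c) / odds(b, d)
# Odds of being a case, risk factor present versus absent.
or_disease = odds(a, b) / odds(c, d)
# Cross-product shortcut.
or_cross = Fraction(a * d, b * c)

print(or_exposure, or_disease, or_cross)
print(round(float(or_cross), 2))  # 3.37
```

All three expressions reduce to the same fraction, which is why the odds ratio remains usable in a case-control design even though P(disease | risk factor) itself cannot be estimated.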

More About RR and OR
The most commonly cited advantage of the RR over the OR is that the former has the more natural interpretation. The relative risk comes closer to what most people think of when they compare the relative likelihood of events. E.g. suppose there are two groups, one with a 25% chance of mortality and the other with a 50% chance of mortality. Most people would say that the latter group has it twice as bad. But the odds ratio is 3, which seems too big.
RR = .50/.25 = 2.00
OR = [P(death|high mortality)/P(survive|high mortality)] / [P(death|low mortality)/P(survive|low mortality)]
   = [.50/(1 – .50)] / [.25/(1 – .25)] = 1.00/.333 = 3.00

More About RR and OR Even more extreme examples are possible. A change from 25% to 75% mortality represents a relative risk of 3, but an odds ratio of 9. A change from 10% to 90% mortality represents a relative risk of 9 but an odds ratio of 81.
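The three comparisons above, plus a rare-outcome case illustrating why OR and RR nearly agree when the risk is small, can be verified with a short function. The 1% versus 2% example is an added illustration, not taken from the slides, and the function name is hypothetical.

```python
from fractions import Fraction


def rr_and_or(p_low: Fraction, p_high: Fraction) -> tuple[Fraction, Fraction]:
    """Return (relative risk, odds ratio) comparing p_high against p_low."""
    rr = p_high / p_low
    odds_ratio = (p_high / (1 - p_high)) / (p_low / (1 - p_low))
    return rr, odds_ratio


examples = [
    (Fraction(25, 100), Fraction(50, 100)),  # RR 2, OR 3
    (Fraction(25, 100), Fraction(75, 100)),  # RR 3, OR 9
    (Fraction(10, 100), Fraction(90, 100)),  # RR 9, OR 81
    (Fraction(1, 100), Fraction(2, 100)),    # rare outcome (illustrative): OR close to RR
]

for p_low, p_high in examples:
    rr, odds_ratio = rr_and_or(p_low, p_high)
    print(float(p_low), float(p_high), float(rr), round(float(odds_ratio), 3))
```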

More About RR and OR
ORs arise as part of logistic regression, which we will study later in the course. Despite their pitfalls, ORs are really the only option when case-control studies are used. Any study of risk needs to adjust for potential confounding factors, which is typically done using logistic regression.