Relationships Between Categorical Variables Thought Questions 1. Suppose a news article claimed that drinking coffee doubled your risk of developing a.

Slides:



Advertisements
Similar presentations
How would you explain the smoking paradox. Smokers fair better after an infarction in hospital than non-smokers. This apparently disagrees with the view.
Advertisements

Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. Statistical Significance for 2 x 2 Tables Chapter 13.
Three or more categorical variables
Comparing Two Proportions (p1 vs. p2)
M2 Medical Epidemiology
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Categorical Variables Chapter 6.
Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Categorical Variables Chapter 12.
Section 1.3 Experimental Design © 2012 Pearson Education, Inc. All rights reserved. 1 of 61.
Section 1.3 Experimental Design.
CHAPTER 23: Two Categorical Variables The Chi-Square Test ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture.
AP Statistics Section 4.2 Relationships Between Categorical Variables.
Chapter 4: Designing Studies
Chapter 13: Inference for Distributions of Categorical Data
Not in FPP Exploratory data analysis with two qualitative variables 1.
Analysis of frequency counts with Chi square
What type of study is this?
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 3 Association: Contingency, Correlation, and Regression Section 3.4 Cautions in Analyzing.
Extension Article by Dr Tim Kenny
Copyright ©2011 Brooks/Cole, Cengage Learning More about Inference for Categorical Variables Chapter 15 1.
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Categorical Variables Chapter 15.
We’re ready to TEST our Research Questions! In science, how do we usually test a hypothesis?
Chapter 2: Looking at Data - Relationships /true-fact-the-lack-of-pirates-is-causing-global-warming/
Risk and Relative Risk. Suppose a news article claimed that drinking coffee doubled your risk of developing a certain disease. Assume the statistic was.
Statistics 303 Chapter 9 Two-Way Tables. Relationships Between Two Categorical Variables Relationships between two categorical variables –Depending on.
Journal Club Alcohol, Other Drugs, and Health: Current Evidence January–February 2011.
Statistical Thinking Experiments in the Real World
1 Chapter 20 Two Categorical Variables: The Chi-Square Test.
Categorical Variables, Relative Risk, Odds Ratios STA 220 – Lecture #8 1.
Multiple Choice Questions for discussion
8.1 Inference for a Single Proportion
Chapter 171 Thinking About Chance. Chapter 172 Thought Question 1 Here are two very different probability questions: If you roll a 6-sided die and do.
Experimental Design 1 Section 1.3. Section 1.3 Objectives 2 Discuss how to design a statistical study Discuss data collection techniques Discuss how to.
Please turn off cell phones, pagers, etc. The lecture will begin shortly.
Measures of Association
Objective: Use experimental methods to design our own experiments Homework: Read section 6.1 and 6.2 (for vocab and examples) and then complete exercises:
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Categorical Variables Chapter 6.
Introductory Statistics Week 4 Lecture slides Exploring Time Series –CAST chapter 4 Relationships between Categorical Variables –Text sections.
Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Between Categorical Variables Chapter 12.
Chapter 2 Notes Math Math 1680 Assignments Look over Chapter 1 and 2 before Wednesday Assignment #2: Chapter 2 Exercise Set A (all, but #7, 8, and.
R Programming Risk & Relative Risk 1. Session 2 Overview 1.Risk 2.Relative Risk 3.Percent Increase/Decrease Risk 2.
+ Chi Square Test Homogeneity or Independence( Association)
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Please turn off cell phones, pagers, etc. The lecture will begin shortly.
Statistical Significance for a two-way table Inference for a two-way table We often gather data and arrange them in a two-way table to see if two categorical.
Lecture: Forensic Evidence and Probability Characteristics of evidence Class characteristics Individual characteristics  features that place the item.
1 Confidence Intervals for Two Proportions Section 6.1.
1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?
Feb. 13 Chapter 12, Try 1-9 Read Ch. 15 for next Monday No meeting Friday.
DO NOW: Oatmeal and cholesterol Does eating oatmeal reduce cholesterol
Measures of Disease Frequency
CHAPTER 9 Producing Data: Experiments BPS - 5TH ED.CHAPTER 9 1.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 10 Comparing Two Groups Section 10.1 Categorical Response: Comparing Two Proportions.
Section 1.3 Experimental Design.
Copyright ©2011 Brooks/Cole, Cengage Learning Relationships Between Categorical Variables – Risk Class 26 1.
Copyright © 2016 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. D ESIGN OF E XPERIMENTS Section 1.3.
Statistics 100 Lecture Set 4. Lecture Set 4 Chapter 5 and 6 … please read Read Chapter 7 … you are responsible for all of this chapter Some suggested.
Experiments Textbook 4.2. Observational Study vs. Experiment Observational Studies observes individuals and measures variables of interest, but does not.
Review Design of experiments, histograms, average and standard deviation, normal approximation, measurement error, and probability.
Transportation-related Injuries among US Immigrants: Findings from National Health Interview Survey.
Chapter 2. **The frequency distribution is a table which displays how many people fall into each category of a variable such as age, income level, or.
Unit 2 Exploring Data: Comparisons and Relationships Topic 7 Comparing Distributions II: Categorical Variables (page 137)
Copyright ©2011 Brooks/Cole, Cengage Learning Relationships Between Categorical Variables – Simpson’s Paradox Class 27 1.
Copyright ©2005 Brooks/Cole, a division of Thomson Learning, Inc. Statistical Significance for 2 x 2 Tables Chapter 13.
Lecture #8 Thursday, September 15, 2016 Textbook: Section 4.4
CHAPTER 4 Designing Studies
Statistics 200 Lecture #7 Tuesday, September 13, 2016
Objective: Use experimental methods to design our own experiments
Chapter 10 Analyzing the Association Between Categorical Variables
Evaluating Effect Measure Modification
Presentation transcript:

Relationships Between Categorical Variables Thought Questions 1. Suppose a news article claimed that drinking coffee doubled your risk of developing a certain disease. Assume the statistic was based on legitimate, well-conducted research. What additional information would you want about the risk before deciding whether to quit drinking coffee? (Hint: Does this statistic provide any information on your actual risk?) 2. A recent study estimated that the “relative risk” of a woman developing lung cancer if she smoked was What do you think is meant by the term relative risk?

Relationships Between Categorical Variables Thought Questions 3. A study classified pregnant women according to whether they smoked and whether they were able to get pregnant during the first cycle in which they tried to do so. What do you think is the question of interest? Attempt to answer it. Here are the results:

Relationships Between Categorical Variables Displaying Relationships Between Categorical Variables: Contingency Tables Count the number of individuals who fall into each combination of categories. Present counts in table = contingency table. Each row and column combination = cell. Row = explanatory variable. Column = response variable. Example 1: Aspirin and Heart Attacks Variable A = explanatory variable = aspirin or placebo Variable B = response variable = heart attack or no heart attack Contingency Table with explanatory as row variable, response as column variable, four cells.

Relationships Between Categorical Variables Conditional Percentages and Rates Example: Find the Conditional (Row) Percentages Aspirin Group: Percentage who had heart attacks = 104/11,037 = or 0.94% Placebo Group: Percentage who had heart attacks = 189/11,034 = or 1.71% Rate: the number of individuals per 1000 or per 10,000 or per 100,000. Percentage: rate per 100 Example : Percentage and Rate Added

Relationships Between Categorical Variables Example: Ease of Pregnancy for Smokers and Nonsmokers- Retrospective Observational Study Variable A = explanatory variable = smoker or nonsmoker Variable B = response variable = pregnant in first cycle or not Time to Pregnancy for Smokers and Nonsmokers Much higher percentage of nonsmokers than smokers were able to get pregnant during first cycle, but we cannot conclude that smoking caused a delay in getting pregnant.

Relationships Between Categorical Variables Relative Risk, Increased Risk, and Odds Forty percent (40%) of all individuals carry the gene. The proportion who carry the gene is The probability that someone carries the gene is.40. The risk of carrying the gene is The odds of carrying the gene are 4 to 6 (or 2 to 3, or 2/3 to 1). A population contains 1000 individuals, of which 400 carry the gene for a disease. Equivalent ways to express this proportion: Risk, Probability, and Odds Percentage with trait = (number with trait/total)×100% Proportion with trait = number with trait/total Probability of having trait = number with trait/total Risk of having trait = number with trait/total Odds of having trait = (number with trait/number without trait) to 1

Relationships Between Categorical Variables Baseline Risk and Relative Risk Baseline Risk: risk without treatment or behavior Can be difficult to find. Example: Risk of getting lung cancer if you don’t smoke. If placebo included, baseline risk = risk for placebo group. Relative Risk: of outcome for two categories of explanatory variable is ratio of risks for each category. Relative risk of 3: risk of developing disease for one group is 3 times what it is for another group. Relative risk of 1: risk is same for both categories of the explanatory variable (or both groups).

Relationships Between Categorical Variables Swedish Study: Effectiveness of Population-Based Service Screening With Mammography for Women Ages 40 to 49 Years RESULTS: During the study period, there were 803 breast cancer deaths in the study group (7.3 million person-years) and 1238 breast cancer deaths in the control group(8.8 million person-years). The estimated RR ( crude ) for women aged was 0.79 (95% CI, ) Study Group: 803/ 7,261,415 = Control Group: 1238/8,843,852 = Relative Risk = / = 0.79 Relative Risk = / = 1.27 Breast Cancer Death Rates : Study Group vs Control Group 11 per 100,000 person-years versus 14 per 100,000 person-years Person-Years Definition: The product of the number of years times the number of members of a population who have been affected by a certain condition (years of treatment with a given drug).

Relationships Between Categorical Variables Example : Relative Risk of Developing Breast Cancer Risk of breast cancer for women having first child at 25 or older = 31/1628 = Risk of breast cancer for women having first child before 25 = 65/4540 = Relative risk = / = 1.33 What doe this RR mean? Increased Risk = (change in risk/baseline risk)×100% Baseline risk for those who had child before age 25 = Risk for women having first child at 25 or older = Change in risk = ( – ) = Increased risk = (0.0047/0.0143) = or 32.9% What doe this increased risk mean?

Relationships Between Categorical Variables Odds Ratio Odds Ratio: ratio of the odds of getting the disease to the odds of not getting the disease. If the risk is small, about the same as the Relative Risk. Odds for women having first child at age 25 or older = with breast cancer/without breast cancer= 31/1597 = Odds for women having first child before age 25 = 65/4475 = Odds ratio = / = 1.34 Example: Odds Ratio for Breast Cancer

Relationships Between Categorical Variables STUDY: ASSOCIATION BETWEEN CELLULAR-TELEPHONE CALLS AND MOTOR VEHICLE COLLISIONS Background Because of a belief that the use of cellular telephones while driving may cause collisions, several countries have restricted their use in motor vehicles, and others are considering such regulations. Methods The study was conducted in Toronto, an urban region with no regulations against using a cellular telephone while driving. Persons who came to the North York Collision Reporting Centre between July 1, 1994, and August 31, 1995, during peak hours (10 a.m. to 6 p.m.) on Monday through Friday were included in the study if they had been in a collision with substantial property damage (as judged by the police). We studied 699 drivers who had cellular telephones and who were involved in motor vehicle collisions resulting in substantial property damage but no personal injury. Each person’s cellular-telephone calls on the day of the collision and during the previous week were analyzed through the use of detailed billing records.

Relationships Between Categorical Variables STUDY: ASSOCIATION BETWEEN CELLULAR-TELEPHONE CALLS AND MOTOR VEHICLE COLLISIONS Time of the Motor Vehicle Collision The time of each collision was estimated from the subject’s statement, police records, and telephone listings of calls to emergency services. Analytic Method We used case–crossover analysis, a technique for assessing the brief change in risk associated with a transient exposure. According to this method, each person serves as his or her own control; confounding due to age, sex, visual acuity, training, personality, driving record, and other fixed characteristics is thereby eliminated. We used the pair-matched analytic approach to contrast a time period on the day of the collision with a comparable period on a day preceding the collision. In this instance, case–crossover analysis would identify an increase in risk if there were more telephone calls immediately before the collision than would be expected solely as a result of chance.

Relationships Between Categorical Variables Results A total of 26,798 cellular-telephone calls were made during the 14-month study period. The risk of a collision when using a cellular telephone was four times higher than the risk when a cellular telephone was not being used (relative risk, 4.3; 95 percent confidence interval, 3.0 to 6.5). The relative risk was similar for drivers who differed in personal characteristics such as age and driving experience; calls close to the time of the collision were particularly hazardous (relative risk, 4.8 for calls placed within 5 minutes of the collision, as compared with 1.3 for calls placed more than 15 minutes before the collision; P = 0.001); Units that allowed the hands to be free (relative risk, 5.9) offered no safety advantage over hand- held units (relative risk, 3.9). STUDY: ASSOCIATION BETWEEN CELLULAR-TELEPHONE CALLS AND MOTOR VEHICLE COLLISIONS

Relationships Between Categorical Variables STUDY: ASSOCIATION BETWEEN CELLULAR-TELEPHONE CALLS AND MOTOR VEHICLE COLLISIONS Weaknesses of the Study studied only drivers who consented to participate. people vary in their driving behavior from day to day — a fact that makes the selection of a control period problematic. case–crossover analysis does not eliminate all forms of confounding. Imbalances in some temporary conditions related to the driver, the vehicle, or the environment are possible.

Relationships Between Categorical Variables Misleading Statistics about Risk Common ways the media misrepresent statistics about risk: 1.The baseline risk is missing. 2.The time period of the risk is not identified. 3.The reported risk is not necessarily your risk. Missing Baseline Risk Reported men who drank 500 ounces or more of beer a month (about 16 ounces a day) were three times more likely to develop cancer of the rectum than nondrinkers. “Evidence of new cancer-beer connection” Sacramento Bee, March 8, 1984

Relationships Between Categorical Variables Risk over What Time Period? If 1 in 9 women get breast cancer, does it mean if a women eats above diet, chances of breast cancer are 1 in 3? What other information do we need to know? “Italian scientists report that a diet rich in animal protein and fat—cheeseburgers, french fries, and ice cream, for example—increases a woman’s risk of breast cancer threefold,” Prevention Magazine’s Giant Book of Health Facts (1991, p. 122) Reported Risk versus Your Risk What factors determine which cars are stolen? Reported among the 20 most popular auto models stolen [in California] last year, 17 were at least 10 years old.” “Older cars stolen more often than new ones” Davis (CA) Enterprise, 15 April 1994

Relationships Between Categorical Variables Simpson’s Paradox: The Missing Third Variable Can be dangerous to summarize information over groups. Example: Simpson’s Paradox for Hospital Patients Survival Rates for Standard and New Treatments Risk Compared for Standard and New Treatments Looks like new treatment is a success at both hospitals, especially at Hospital B. More serious cases were treated at Hospital A (famous research hospital); More serious cases were also more likely to die, no matter what. higher proportion of patients at Hospital A received the new treatment. Estimating the Overall Reduction in Risk