Unit 2 Exploring Data: Comparisons and Relationships Topic 7 Comparing Distributions II: Categorical Variables (page 137)


OVERVIEW In the previous topic you encountered the notion of a statistical tendency and studied techniques for comparing distributions of quantitative variables. In this topic you will study some basic techniques for comparing distributions of categorical variables.

PK Note: This topic is on categorical variables; therefore, no lists need to be downloaded. Remember that a categorical variable is one that records simply the category into which a person or thing falls on the characteristic in question.

OVERVIEW These techniques involve the analysis of two-way tables of counts. You will use no more complicated mathematical operations than addition and calculation of proportions, but you will acquire some very powerful analytical tools.

Do the Preliminaries (page 138 & 139) Definitions

#10 - Political Views
Liberal: open to new behavior or opinions and willing to discard traditional values.
Moderate: not radical or excessively right or left wing.
Conservative: holding to traditional attitudes and values and cautious about change or innovation.

Essential Question How is a two-way table used to summarize information from a pair of categorical variables? How can you compare and contrast categorical variables from two or more groups?

Activity 7-1: Suitability for Politics (pages 139 & 140)

(a) political inclination: categorical
suitability opinion: categorical (binary)

Often we are interested in considering one variable, the response (dependent or y) variable, as being affected or predicted by the other variable, the explanatory (independent or x) variable.

(b) explanatory variable: ______________________________
response variable: ______________________________

(c) This table is called a two-way table since it classifies each person according to two variables. The explanatory variable should be in columns and the response variable should be in rows.

                           political inclination
suitability opinion        liberal    moderate    conservative
agree with statement
disagree with statement

Essential Questions How can a segmented bar graph be used to represent information in two-way tables? How can you compare and contrast categorical variables from two or more groups?

Activity 7-2: Age and Political Interest (pages 140 to 144) A two-way table classifies each case (or observational unit) according to 2 variables.

a) 30% (385 ÷ 1265 = …) of the survey respondents were between 18 and 35 years old.
b) 42% (531 ÷ 1265 = …) of the survey respondents were between 36 and 55 years old.
c) 28% (349 ÷ 1265 = …) of the survey respondents were over 55 years old.

Marginal distributions are calculated for one variable. We used bar graphs to represent marginal distributions graphically. To study possible relationships between two categorical variables, one examines conditional distributions. Conditional distributions are distributions of one variable for given categories of the other variable.
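The marginal distribution of age in parts (a) to (c) can be checked with a few lines of Python. This is only a sketch: the function name and the category labels are chosen here for illustration, while the three counts are the age-group totals used in the divisions above.

```python
# Age-group totals from Activity 7-2, parts (a) to (c).
age_totals = {"18-35": 385, "36-55": 531, "over 55": 349}

def marginal_distribution(counts):
    """Return each category's proportion of the grand total."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}

age_marginal = marginal_distribution(age_totals)
for category, proportion in age_marginal.items():
    # Print each proportion as a whole-number percentage.
    print(f"{category}: {proportion:.0%}")
```

The proportions of a marginal distribution always sum to 1, which is a quick check that no category has been left out.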

d) 38% (146 ÷ 385 = …) of the young respondents classify themselves as having not much interest in politics.
e) 50% (192 ÷ 385 = …) of the young respondents classify themselves as somewhat interested in politics.
f) 12% (47 ÷ 385 = …) of the young respondents classify themselves as having very much interest in politics.
g) Table headings: not much, somewhat, very much, total.
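The conditional distribution of political interest among the young respondents, parts (d) to (f), can be sketched the same way. The variable and function names are illustrative; the counts 146, 192, and 47 are the ones used in the divisions above.

```python
# Counts of 18-to-35-year-olds at each level of political interest.
young_counts = {"not much": 146, "somewhat": 192, "very much": 47}

def conditional_distribution(counts_in_group):
    """Proportions of one variable within a single category of the other."""
    group_total = sum(counts_in_group.values())
    return {level: n / group_total for level, n in counts_in_group.items()}

young_conditional = conditional_distribution(young_counts)
for level, proportion in young_conditional.items():
    print(f"{level}: {proportion:.0%}")
```

The denominator is the size of the group being conditioned on (385 young respondents), not the 1265 respondents in the whole survey.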

(h) [Segmented bar graph: percent (0% to 100%) by age category, with segments for not much, somewhat, and very much political interest.]

Conditional distributions can be represented visually with segmented bar graphs.

[Segmented bar graph: percent (0% to 100%) by age category, with segments for not much, somewhat, and very much political interest.]

(i) Does the distribution of political interest seem to differ among the 3 age groups? YES
If so, describe the key features: The younger age group does not have much interest in politics, but the older group tends to have very much interest in politics.

j) 27% (146 ÷ 531 = …) of respondents aged classify themselves as not much interest in politics.
k) 38% (146 ÷ 381 = …) of those with not much interest in politics are of age
l) 12% (146 ÷ 1265 = …) of the 1265 respondents identified themselves as being both between ages and having not much political interest.

When dealing with conditional proportions, it is very important to keep straight which category is the one being conditioned on.
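The point about which category is conditioned on can be illustrated by dividing one count by three different totals. The sketch below assumes that the 146 in parts (k) and (l) is the same count of 18-to-35-year-olds with not much interest used in part (d), and that 381 is the total number of respondents with not much interest; the age ranges are missing from the transcript, so both readings are assumptions. The variable names are illustrative.

```python
young_not_much = 146   # assumed: respondents aged 18-35 with not much interest
young_total = 385      # all respondents aged 18-35
not_much_total = 381   # assumed: all respondents with not much interest
grand_total = 1265     # all survey respondents

# Conditioned on age: of the young, what proportion have not much interest?
p_not_much_given_young = young_not_much / young_total

# Conditioned on interest: of those with not much interest, what proportion are young?
p_young_given_not_much = young_not_much / not_much_total

# Not conditioned at all: proportion of everyone who is both young and not much interested.
p_young_and_not_much = young_not_much / grand_total

print(f"{p_not_much_given_young:.3f} {p_young_given_not_much:.3f} {p_young_and_not_much:.3f}")
```

The numerator never changes; only the denominator does, and the denominator is what the wording "of the ..." identifies.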

Assignment Activity 7-8: Gender-Stereotypical Toy Advertising (page 151) How can you compare and contrast categorical variables from two or more groups?

Essential Question What is the concept of relative risk? How can you compare and contrast categorical variables from two or more groups?

Activity 7-3: Pregnancy, AZT, & HIV (pages 144 & 145)

a) ? is my estimate for AZT-receiving women with HIV-positive babies.
? is my estimate for placebo-receiving women with HIV-positive babies.

b) 8% (13 ÷ 164 = …) is the actual proportion for AZT-receiving women with HIV-positive babies.
25% (40 ÷ 160 = …) is the actual proportion for placebo-receiving women with HIV-positive babies.
c) The proportion of HIV-positive babies among placebo-receiving mothers is about 3.1 times (0.25 ÷ 0.08 = 3.125) greater than the proportion of HIV-positive babies among AZT-receiving mothers.

You have calculated the relative risk of having an HIV-positive baby between the AZT and placebo groups. If the response variable categories are incidence and non-incidence of a disease, then the relative risk is the ratio of the proportions having the disease between the two groups of the explanatory variable. Definition: in ⋅ ci ⋅ dence [in-si-duhns] noun the rate or range of occurrence or influence of something, especially of something unwanted … the high incidence of heart disease in men over 40.
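As a rough check of the relative risk in Activity 7-3, the ratio can be computed from the unrounded counts. The function name is illustrative; the counts (13 of 164 in the AZT group, 40 of 160 in the placebo group) are the ones used above.

```python
def relative_risk(cases_a, total_a, cases_b, total_b):
    """Ratio of the incidence proportion in group A to that in group B."""
    return (cases_a / total_a) / (cases_b / total_b)

# Placebo group compared with the AZT group.
placebo_vs_azt = relative_risk(40, 160, 13, 164)
print(f"relative risk: {placebo_vs_azt:.2f}")
```

Working from the counts gives a slightly different value than dividing the rounded proportions 0.25 and 0.08, so it is worth stating which version is being reported.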

d) [comment] The difference between the 2 groups appears to be important. Based on these results, it appears that the drug AZT should be used for HIV-positive pregnant women. Please note that the placebo does nothing. It is a “sugar pill”; any effect it has is psychological.

Assignment Activity 7-13: Driver Safety (continued) (page 153) Show car accident PowerPoint! How can you compare and contrast categorical variables from two or more groups?

Essential Question What is Simpson’s Paradox? How can you compare and contrast categorical variables from two or more groups?

Activity 7-4: Hypothetical Hospital Recovery Rates (pages 146 & 147)

(a) 80% of Hospital A’s patients survived.
90% of Hospital B’s patients survived.
Hospital B saved the higher percentage of its patients.
(b) Are you convinced? Yes (Check the totals first.)

(c) Patients in FAIR condition:
98% of Hospital A’s patients survived.
97% of Hospital B’s patients survived.
Hospital A saved the greater percentage of its patients.
(d) Patients in POOR condition:
53% of Hospital A’s patients survived.
30% of Hospital B’s patients survived.
Hospital A saved the greater percentage of its patients.

This phenomenon is called Simpson’s paradox. This refers to the fact that aggregate (whole) proportions can reverse the direction of the relationship seen in the individual pieces.

(e) [comment] Hospital A treats more patients in “poor” condition. These patients are more likely to die than those in “fair” condition. Therefore, Hospital A’s overall survival rate is lower than Hospital B’s rate despite being higher for each type of patient.
(f) I would prefer to go to Hospital A, because it has a higher survival rate regardless of one’s condition.
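The reversal can be reproduced with a small sketch. The patient counts below are hypothetical: they are not given in the slides and were chosen here only so that the survival rates come out close to the percentages in parts (a), (c), and (d). The names are illustrative as well.

```python
# Hypothetical (survived, total) counts per hospital and condition, NOT from the slides.
patients = {
    "A": {"fair": (588, 600), "poor": (212, 400)},
    "B": {"fair": (870, 900), "poor": (30, 100)},
}

def survival_rate(groups):
    """Pooled survival rate over a list of (survived, total) pairs."""
    survived = sum(s for s, _ in groups)
    total = sum(t for _, t in groups)
    return survived / total

# Aggregate rate for each hospital, ignoring patient condition.
overall = {h: survival_rate(list(c.values())) for h, c in patients.items()}

# Rate for each hospital within each condition.
by_condition = {
    cond: {h: survival_rate([patients[h][cond]]) for h in patients}
    for cond in ("fair", "poor")
}

print("overall:", overall)
for cond, rates in by_condition.items():
    print(cond, rates)
```

With these counts Hospital A is ahead within each condition yet behind overall, because 400 of its 1000 patients arrive in poor condition compared with 100 of Hospital B's 1000.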

Assignment Activity 7-20: Graduate Admissions Discrimination (pages 155 & 156)
Assignment Activity 7-21: Softball Batting Averages (pages 156 & 157)
Quiz on Topic 7: Comparing Distributions II: Categorical Variables
Activity 7-6: Suitability for Politics (continued)
How can you compare and contrast categorical variables from two or more groups?

Essential Questions What is the concept of independence? How can you compare and contrast categorical variables from two or more groups?

Activity 7-5: Women Senators (pages 148 & 149)

(a)
                Republicans    Democrats    row total
men                                         91
women                                       9
column total

(b) Republicans have 95% men and 5% women.
Democrats have 87% men and 13% women.
The Democratic party has the higher proportion of women among its Senators.

Republicans have 95% men and 5% women. Democrats have 87% men and 13% women.

[Segmented bar graph: percent of men and women (0% to 100%) by party, Republicans and Democrats.]

d) There were more Republicans than Democrats in the 1999 U.S. Senate. There were more total female Senators in the Democratic party than in the Republican party. Consequently, the Democrats had a higher proportion of women among their Senators.

Two categorical variables are said to be independent if the conditional distributions of one variable are identical for every category of the other variable.

e) The variables gender and party are not independent among members of the 1999 U.S. Senate because there is a difference in the gender breakdown between the two parties.

(f)
                Republicans    Democrats    row total
men             48             32           80
women           12             8            20
column total    60             40           100

These numbers achieve the same gender percentages in both groups: Women: 20% and Men: 80%

total women: 20 ÷ 100 = .20 and total men: 80 ÷ 100 = .80
∴ Republican women: .20 × 60 = 12 and Republican men: .80 × 60 = 48
and Democrat women: .20 × 40 = 8 and Democrat men: .80 × 40 = 32
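The calculation in part (f) amounts to multiplying each row total by each column total and dividing by the grand total. A minimal sketch, using the row totals (80 men, 20 women) and party sizes (60 Republicans, 40 Democrats) from the computation above, with illustrative names:

```python
row_totals = {"men": 80, "women": 20}
col_totals = {"Republicans": 60, "Democrats": 40}
grand_total = sum(row_totals.values())

# Cell counts that make gender and party independent.
independent_counts = {
    (gender, party): row_totals[gender] * col_totals[party] / grand_total
    for gender in row_totals
    for party in col_totals
}

# Under independence, the proportion of women is the same in each party.
women_share = {
    party: independent_counts[("women", party)] / col_totals[party]
    for party in col_totals
}

print(independent_counts)
print(women_share)
```

If the proportion of women differed between the two parties, the conditional distributions would not be identical and the variables would not be independent.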

Women: 20% and Men: 80%

[Segmented bar graph titled “Independent Variables”: percent of men and women (0% to 100%) by party, Republicans and Democrats.]

WRAP-UP With this topic we have concluded our investigation of distributions of data. This topic has differed from earlier ones in that it has dealt exclusively with categorical variables. The most important technique that this topic has covered has involved interpreting information presented in two-way tables.

You have encountered the ideas of marginal distributions and conditional distributions, and you have learned to draw bar graphs and segmented bar graphs to display these distributions. You have explored the notion of relative risk, and you have discovered and explained the phenomenon known as Simpson’s paradox, which raises interesting issues with regard to analyzing two-way tables.

Comparing distributions of categorical variables can also be thought of as exploring relationships between those variables. The next unit will be devoted to exploring relationships between quantitative variables. You will find that our approach will again involve starting with graphical displays (scatterplots) and proceeding to numerical summaries (correlation). You will then study a technique for predicting one variable from the value of another (regression).

Assignment Activity 7-25: Politics and Ice Cream (pages 158 & 159)
Prepare for Quiz on Topic 7: Comparing Distributions II: Categorical Variables
Activity 7-18: Lifetime Achievements (continued)
Assignment Activity 7-23: Hypothetical Employee Retention Predictions (pages 157 & 158)
How can you compare and contrast categorical variables from two or more groups?

Your topic is due! Prepare for a Test on Topics 6 & 7 Quiz on Topic 7: Comparing Distributions II: Categorical Variables Activity 7-18: Lifetime Achievements (continued) How can you compare and contrast categorical variables from two or more groups?