Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 3- 1.


2 Chapter 3: Displaying and Describing Categorical Data

3 The Three Rules of Data Analysis
The three rules of data analysis won’t be difficult to remember:
1. Make a picture: things may be revealed that are not obvious in the raw data. These will be things to think about.
2. Make a picture: important features of and patterns in the data will show up. You may also see things that you did not expect.
3. Make a picture: the best way to tell others about your data is with a well-chosen picture.

4 Frequency Tables: Making Piles
We can “pile” the data by counting the number of data values in each category of interest. We can organize these counts into a frequency table, which records the category names and the total count for each category.

5 Frequency Tables: Making Piles (cont.)
A relative frequency table is similar, but gives the percentages (instead of counts) for each category.
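The "piling" described above can be sketched in a few lines of Python. This is only an illustration: the sample list is made up for the example and is not data from the slides, and the function names are hypothetical.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: count} for a list of categorical values."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Return {category: percentage of all values}, rounded to 2 decimals."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: round(100 * n / total, 2) for cat, n in counts.items()}

# Hypothetical sample (NOT from the slides): ticket class of ten made-up people.
sample = ["Crew", "Third", "Crew", "First", "Third",
          "Second", "Crew", "Third", "Crew", "First"]

counts = frequency_table(sample)
percents = relative_frequency_table(sample)

# Print the two tables side by side: category, count, percentage.
for category in counts:
    print(f"{category:<8}{counts[category]:>4}{percents[category]:>8.1f}%")
```

The relative frequency table is just the frequency table divided by the total, so the percentages always sum to 100 (up to rounding).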

6 What’s Wrong With This Picture?
You might think that a good way to show the Titanic data is with this display:

7 The Area Principle
The ship display makes it look like most of the people on the Titanic were crew members, with a few passengers along for the ride. When we look at each ship, we see the area taken up by the ship, instead of the length of the ship. The ship display violates the area principle: the area occupied by a part of the graph should correspond to the magnitude of the value it represents.

8 Bar Charts
A bar chart displays the distribution of a categorical variable, showing the counts for each category next to each other for easy comparison. A bar chart stays true to the area principle. Thus, a better display for the ship data is:

9 Bar Charts (cont.)
A relative frequency bar chart displays the proportion of cases in each category instead of the counts. A relative frequency bar chart also stays true to the area principle. Replacing counts with percentages in the ship data:

10 Bar Charts (cont.)
Used for categorical data
Bars do not touch
Categorical variable is typically on the horizontal axis
To describe: comment on which category occurred most often or least often
May make a double bar graph or segmented bar graph for bivariate categorical data sets
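One way to see why a bar chart respects the area principle is to draw one in plain text: every bar has the same height, so its area grows only with its length, and the length is set proportional to the count. The sketch below assumes made-up counts (not from the slides) and a hypothetical helper name.

```python
def text_bar_chart(counts, width=40, symbol="#"):
    """Return one line per category; bar length is proportional to the count."""
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    for label, n in counts.items():
        # Scale so the largest category fills the full width.
        bar = symbol * round(width * n / largest)
        lines.append(f"{label:<{label_width}} | {bar} {n}")
    return lines

# Hypothetical counts, for illustration only.
class_counts = {"Freshman": 10, "Sophomore": 20, "Junior": 40, "Senior": 30}

chart = text_bar_chart(class_counts)
print("\n".join(chart))
```

Because each bar is one character tall, a category with twice the count gets exactly twice the ink, which is the property the ship display lacks.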

11 Example: Creating Bar Graphs
1. Graph gender.
2. Graph birth month for the class.
Remember, bars do not touch!

12 Pie Charts
When you are interested in parts of the whole, a pie chart might be your display of choice. Pie charts show the whole group of cases as a circle. They slice the circle into pieces whose size is proportional to the fraction of the whole in each category.

13 Creating Pie Charts
Used for categorical data
To make: central angle of each slice = proportion × 360°
Using a protractor, mark off each part
To describe: comment on which category occurred most often or least often
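The "proportion × 360°" rule can be checked with a short calculation. The categories and counts below are placeholders invented for the example, and the function name is hypothetical.

```python
def pie_angles(counts):
    """Return {category: central angle in degrees} using proportion * 360."""
    total = sum(counts.values())
    return {label: 360 * n / total for label, n in counts.items()}

# Hypothetical counts, for illustration only (total = 90).
votes = {"A": 30, "B": 45, "C": 15}

angles = pie_angles(votes)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles always add up to 360°, which is a quick check that no category was left out before reaching for the protractor.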

14 Example: Creating Pie Charts
The table below shows a summary of all data collected from the BHS stats classes about political affiliation. Create a pie chart to display this data. Describe the resulting pie chart.
Political Affiliation    Count
Democrat
Republican
Independent

15 Contingency Tables
A contingency table allows us to look at two categorical variables together. It shows how individuals are distributed along each variable, contingent on the value of the other variable. Example: we can examine the class of ticket and whether a person survived the Titanic:

16 Contingency Tables (cont.)
The margins of the table, both on the right and on the bottom, give totals and the frequency distributions for each of the variables. Each frequency distribution is called a marginal distribution of its respective variable. The marginal distribution of Survival is:
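Marginal distributions are simply the row and column totals of the contingency table. The sketch below assumes the commonly published Titanic counts by class and survival; that table is not reproduced in this transcript, so the numbers are an assumption (they do agree with the 673 crew deaths quoted in these slides). Function names are hypothetical.

```python
# ASSUMPTION: commonly published Titanic counts; the table is not in this transcript.
# Rows: survival status. Columns: ticket class.
titanic = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

def row_totals(table):
    """Right-hand margin: one total per row."""
    return {row: sum(cells.values()) for row, cells in table.items()}

def column_totals(table):
    """Bottom margin: one total per column."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return totals

def as_percentages(totals):
    """Convert a margin of counts into percentages of the grand total."""
    grand = sum(totals.values())
    return {k: round(100 * n / grand, 1) for k, n in totals.items()}

survival_margin = row_totals(titanic)     # marginal distribution of Survival
class_margin = column_totals(titanic)     # marginal distribution of Class
survival_percent = as_percentages(survival_margin)

print(survival_margin, survival_percent)
print(class_margin)
```

Both margins sum to the same grand total, since each person is counted once in each variable.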

17 Contingency Tables (cont.)
Each cell of the table gives the count for a combination of values of the two variables. For example, the second cell in the crew column tells us that 673 crew members died when the Titanic sank.

18 Conditional Distributions
A conditional distribution shows the distribution of one variable for just the individuals who satisfy some condition on another variable. The following is the conditional distribution of ticket Class, conditional on having survived:

19 Conditional Distributions (cont.)
The following is the conditional distribution of ticket Class, conditional on having perished:

20 Conditional Distributions (cont.)
The conditional distributions tell us that there is a difference in class for those who survived and those who perished. This is better shown with pie charts of the two distributions:

21 Conditional Distributions (cont.)
We see that the distribution of Class for the survivors is different from that of the nonsurvivors. This leads us to believe that Class and Survival are associated, that they are not independent. The variables would be considered independent when the distribution of one variable in a contingency table is the same for all categories of the other variable.
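A conditional distribution is a single row of the contingency table divided by that row's total, and independence would mean every row gives the same percentages. The sketch below again assumes the commonly published Titanic counts, which are not reproduced in this transcript; the helper names are hypothetical, and the "largest gap" is only an informal way to compare rows, not a formal test.

```python
# ASSUMPTION: commonly published Titanic counts; the table is not in this transcript.
titanic = {
    "Alive": {"First": 203, "Second": 118, "Third": 178, "Crew": 212},
    "Dead":  {"First": 122, "Second": 167, "Third": 528, "Crew": 673},
}

def conditional_distribution(table, condition):
    """Distribution of the column variable among rows matching `condition` (in %)."""
    cells = table[condition]
    total = sum(cells.values())
    return {col: 100 * n / total for col, n in cells.items()}

def largest_gap(dist_a, dist_b):
    """Largest difference, in percentage points, between two distributions."""
    return max(abs(dist_a[k] - dist_b[k]) for k in dist_a)

class_given_alive = conditional_distribution(titanic, "Alive")
class_given_dead = conditional_distribution(titanic, "Dead")
gap = largest_gap(class_given_alive, class_given_dead)

for col in class_given_alive:
    print(f"{col:<7}{class_given_alive[col]:6.1f}%{class_given_dead[col]:8.1f}%")
print(f"largest gap: {gap:.1f} percentage points")
```

If Class and Survival were independent, the two printed columns would match and the gap would be close to zero; a gap of many percentage points points to an association.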

22 Segmented Bar Charts
A segmented bar chart displays the same information as a pie chart, but in the form of bars instead of circles. Here is the segmented bar chart for ticket Class by Survival status:

23 What Can Go Wrong?
Don’t violate the area principle. While some people might like the pie chart on the left better, it makes it harder to compare fractions of the whole, which is exactly what a well-done pie chart lets you do.

24 What Can Go Wrong? (cont.)
Keep it honest: make sure your display shows what it says it shows. This plot of the percentage of high-school students who engage in specified dangerous behaviors has a problem. Can you see it?

25 What Can Go Wrong? (cont.)
Don’t confuse similar-sounding percentages: pay particular attention to the wording of the context.
Don’t forget to look at the variables separately too: examine the marginal distributions, since it is important to know how many cases are in each category.

26 What Can Go Wrong? (cont.)
Be sure to use enough individuals! Do not make a report like “We found that 66.67% of the rats improved their performance with training. The other rat died.”

27 What Can Go Wrong? (cont.)
Don’t overstate your case: don’t claim something you can’t.
Don’t use unfair or silly averages: this could lead to Simpson’s Paradox, so be careful when you average one variable across different levels of a second variable.
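Simpson’s Paradox is easiest to believe after seeing the arithmetic. The sketch below uses invented on-time records for two hypothetical pilots, split by time of day; none of these numbers come from the slides.

```python
# HYPOTHETICAL data: (on-time flights, total flights) per pilot and time of day.
flights = {
    "Moe":  {"day": (90, 100), "night": (10, 20)},
    "Jill": {"day": (19, 20),  "night": (75, 100)},
}

def rate(on_time, total):
    """On-time percentage."""
    return 100 * on_time / total

def overall_rate(record):
    """Pool all flights for one pilot, ignoring time of day."""
    on_time = sum(o for o, _ in record.values())
    total = sum(t for _, t in record.values())
    return rate(on_time, total)

# Jill is better within each level (day and night)...
for period in ("day", "night"):
    print(period, rate(*flights["Moe"][period]), rate(*flights["Jill"][period]))

# ...yet Moe looks better once the levels are averaged together.
moe_overall = overall_rate(flights["Moe"])
jill_overall = overall_rate(flights["Jill"])
print("overall", round(moe_overall, 1), round(jill_overall, 1))
```

The reversal happens because the two pilots fly very different mixes of day and night flights, so the pooled average compares unlike workloads. Comparing within each level is the fair comparison here.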

28 Example, pg. 39 #22
A survey of autos parked in student and staff lots at a large university classified the brands by country of origin as seen in the table.
            Student   Staff
American    107       105
European    33        12
Asian       55        47
a. What percent of all the cars surveyed were foreign?
b. What percent of the American cars were owned by students?
c. What percent of the students owned American cars?
d. What is the marginal distribution of origin?
e. What are the conditional distributions of origin by driver classification?
f. Do you think that origin of the car is independent of the type of driver? Explain.

29 Example, pg. 39 #22
a. What percent of all the cars surveyed were foreign? 147/359 = 40.95%
b. What percent of the American cars were owned by students? 107/212 = 50.47%
c. What percent of the students owned American cars? 107/195 = 54.87%
d. What is the marginal distribution of origin?
Origin      Totals   Percentages
American    212      59.05%
European    45       12.53%
Asian       102      28.41%

30 Example, pg. 39 #22
e. What are the conditional distributions of origin by driver classification?
Origin      Student   Percentage
American    107       54.87%
European    33        16.92%
Asian       55        28.21%
Totals      195       100.00%

Origin      Staff     Percentage
American    105       64.02%
European    12        7.32%
Asian       47        28.66%
Totals      164       100.00%
f. Do you think that origin of the car is independent of the type of driver? Explain. No. The conditional distributions look slightly different. A bar or pie chart could be used to compare.
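The percentages in this exercise can be re-derived directly from the six counts in the survey table. The sketch below uses only those counts; the variable and function names are hypothetical.

```python
# Counts from the survey table in the exercise.
cars = {
    "American": {"Student": 107, "Staff": 105},
    "European": {"Student": 33,  "Staff": 12},
    "Asian":    {"Student": 55,  "Staff": 47},
}

def pct(part, whole):
    """Percentage rounded to two decimals, as in the worked answers."""
    return round(100 * part / whole, 2)

origin_totals = {origin: sum(row.values()) for origin, row in cars.items()}
driver_totals = {
    driver: sum(row[driver] for row in cars.values())
    for driver in ("Student", "Staff")
}
grand_total = sum(origin_totals.values())

# a. Foreign cars (European + Asian) among all cars.
answer_a = pct(grand_total - origin_totals["American"], grand_total)
# b. Students among American cars (condition on the row).
answer_b = pct(cars["American"]["Student"], origin_totals["American"])
# c. American cars among students (condition on the column).
answer_c = pct(cars["American"]["Student"], driver_totals["Student"])
# d. Marginal distribution of origin.
marginal = {origin: pct(n, grand_total) for origin, n in origin_totals.items()}
# e. Conditional distributions of origin for each driver type.
conditional = {
    driver: {origin: pct(cars[origin][driver], total) for origin in cars}
    for driver, total in driver_totals.items()
}

print(answer_a, answer_b, answer_c)
print(marginal)
print(conditional)
```

Parts b and c use the same cell (107) but different denominators, which is why they sound alike yet give different answers.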

31 Example pg. 40 #26
A study by the University of Texas Southwestern Medical Center examined 626 people to see if there was an increased risk of contracting hepatitis C associated with having a tattoo. If the subject had a tattoo, researchers asked whether it had been done in a commercial tattoo parlor or elsewhere. Write a brief description of the association between tattooing and hepatitis C, including an appropriate graphical display.

32 Example pg. 40 #26
                   Tattoo - commercial   Tattoo - elsewhere   No tattoo
Has Hepatitis C    17                    8                    18
No Hepatitis C     35                    53                   495
Think – Show – Tell
Create marginal distribution table. What kind of graph would be best?

33 Example pg. 40 #26
There seems to be a strong association between having a tattoo done and contracting hepatitis C. There appears to be a much greater risk for tattoos done in commercial parlors, less for tattoos done elsewhere, and very little risk for those with no tattoos.
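The conclusion above rests on comparing the hepatitis C rate within each tattoo group, that is, the conditional distribution of hepatitis status given tattoo status. A minimal sketch using the counts from the exercise table (names are hypothetical):

```python
# Counts from the exercise table: (has hepatitis C, no hepatitis C) per group.
study = {
    "Tattoo - commercial": (17, 35),
    "Tattoo - elsewhere":  (8, 53),
    "No tattoo":           (18, 495),
}

def hepatitis_rate(has, has_not):
    """Percent of a group with hepatitis C, rounded to two decimals."""
    return round(100 * has / (has + has_not), 2)

rates = {group: hepatitis_rate(*cells) for group, cells in study.items()}
total_subjects = sum(has + has_not for has, has_not in study.values())

for group, r in rates.items():
    print(f"{group:<20}{r:6.2f}%")
print("subjects:", total_subjects)
```

Rates rather than raw counts are needed here because the groups differ greatly in size: the no-tattoo group has the most hepatitis C cases by count but the lowest rate.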

34 What have we learned?
We can summarize categorical data by counting the number of cases in each category (expressing these as counts or percents). We can display the distribution in a bar chart or pie chart. And, we can examine two-way tables called contingency tables, examining marginal and/or conditional distributions of the variables.

