Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations, Part B: Exploratory Data Analysis, Crosstabulations. Slides © 2006 Thomson/South-Western.


Slide 1. Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations, Part B
Topics: Exploratory Data Analysis; Crosstabulations and Scatter Diagrams.

Slide 2. Exploratory Data Analysis
The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly. One such technique is the stem-and-leaf display.

Slide 3. Stem-and-Leaf Display
A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
The first digits of each data item are arranged to the left of a vertical line.
To the right of the vertical line we record the last digit for each item in rank order.
Each line in the display is referred to as a stem.
Each digit on a stem is a leaf.

Slide 4. Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Slide 5. Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups [table of the 50 parts costs]

Slide 6. Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9

Each row is a stem; each digit to the right of the vertical line is a leaf.
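As a minimal sketch of how such a display can be built, the Python below splits each cost into a stem (leading digits) and a leaf (last digit). The 50 values are read back from the display itself, because the data table is not reproduced in this transcript; the names `parts_cost`, `stem_and_leaf`, and `render` are illustrative and not from the slides.

```python
from collections import defaultdict

# Parts costs in dollars, read back from the stem-and-leaf display (rank order).
parts_cost = [
    52, 57,
    62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
    71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
    80, 80, 82, 83, 85, 88, 89,
    91, 93, 97, 97, 97, 98, 99,
    101, 104, 105, 105, 109,
]


def stem_and_leaf(values):
    """Return {stem: [leaves]}; leaf = last digit, stem = leading digits."""
    display = defaultdict(list)
    for value in sorted(values):          # sorting puts leaves in rank order
        stem, leaf = divmod(value, 10)    # e.g. 104 -> stem 10, leaf 4
        display[stem].append(leaf)
    return dict(display)


def render(display):
    """Format the display as text, one stem per line."""
    lines = []
    for stem in sorted(display):
        leaves = " ".join(str(leaf) for leaf in display[stem])
        lines.append(f"{stem:>2} | {leaves}")
    return "\n".join(lines)


display = stem_and_leaf(parts_cost)
print(render(display))
```

This sketch assumes non-negative whole-dollar values and prints only stems that contain at least one value.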

Slide 7. Stretched Stem-and-Leaf Display
If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 to 4, and the second value corresponds to leaf values of 5 to 9.

Slide 8. Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
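The stretching rule can be sketched in a few lines of Python: every stem is written twice, the first row taking leaves 0 to 4 and the second taking leaves 5 to 9. The dictionary `original` holds the leaves of the unstretched Hudson Auto display; the function name `stretch` is an illustrative choice, not something the slides define.

```python
# Leaves for each stem of the unstretched display (leaf unit = 1).
original = {
    5: "27",
    6: "2222567888999",
    7: "1122344555678999",
    8: "0023589",
    9: "1377789",
    10: "14559",
}


def stretch(display):
    """Split each stem into a low row (leaves 0-4) and a high row (leaves 5-9)."""
    rows = []
    for stem in sorted(display):
        leaves = sorted(int(ch) for ch in display[stem])
        low = [leaf for leaf in leaves if leaf <= 4]
        high = [leaf for leaf in leaves if leaf >= 5]
        rows.append((stem, low))    # first occurrence of the stem
        rows.append((stem, high))   # second occurrence of the stem
    return rows


stretched = stretch(original)
for stem, leaves in stretched:
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```

Each stem always produces two rows here, even if one of them is empty, so the shape of the distribution stays readable.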

Slide 9. Stem-and-Leaf Display: Leaf Units
A single digit is used to define each leaf.
In the preceding example, the leaf unit was 1.
Leaf units may be 100, 10, 1, 0.1, and so on.
Where the leaf unit is not shown, it is assumed to equal 1.

Slide 10. Example: Leaf Unit = 0.1
If we have data with values such as

8.6  11.7  9.4  9.1  10.2  11.0  8.8

a stem-and-leaf display of these data will be

Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Slide 11. Example: Leaf Unit = 10
If we have data with values such as

1806  1717  1974  1791  1682  1910  1838

a stem-and-leaf display of these data will be

Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

The 82 in 1682 is rounded down to 80 and is represented as an 8.
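Both leaf-unit examples follow the same rule: express each value in whole leaf units, drop anything smaller, then take the last digit as the leaf. The sketch below is one way to do that in Python. The `leaf_unit` parameter and the function names are assumptions made for illustration, and `Decimal` is used only to sidestep floating-point artifacts (for instance, `8.6 / 0.1` does not evaluate to exactly 86 in binary floating point).

```python
from decimal import Decimal


def split_value(value, leaf_unit):
    """Return (stem, leaf) for one value, dropping digits below the leaf unit."""
    # int() truncates, so 1682 with leaf unit 10 becomes 168 -> stem 16, leaf 8.
    units = int(Decimal(str(value)) / Decimal(str(leaf_unit)))
    return divmod(units, 10)


def stem_and_leaf(values, leaf_unit=1):
    """Return {stem: [leaves]} in rank order; assumes non-negative values."""
    display = {}
    for value in sorted(values):
        stem, leaf = split_value(value, leaf_unit)
        display.setdefault(stem, []).append(leaf)
    return display


tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)

print(tenths)  # {8: [6, 8], 9: [1, 4], 10: [2], 11: [0, 7]}
print(tens)    # {16: [8], 17: [1, 9], 18: [0, 3], 19: [1, 7]}
```

Truncating rather than rounding to the nearest unit matches the slide's note that the 82 in 1682 is rounded down to 80.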

Slide 12. Crosstabulations and Scatter Diagrams
Thus far we have focused on methods that are used to summarize the data for one variable at a time.
Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Slide 13. Crosstabulation
A crosstabulation is a tabular summary of data for two variables.
Crosstabulation can be used when: one variable is qualitative and the other is quantitative, both variables are qualitative, or both variables are quantitative.
The left and top margin labels define the classes for the two variables.
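To make the definition concrete, here is a small Python sketch that tallies paired observations into a crosstabulation. The six records are hypothetical, invented purely for illustration, and the helper names `price_class` and `crosstab` are likewise assumptions. A quantitative variable (price) is first mapped to classes so that it can label the rows.

```python
from collections import Counter

# Hypothetical (price, style) records, invented for illustration only.
sales = [
    (85_000, "Colonial"), (120_000, "Log"), (99_000, "Split"),
    (150_000, "Colonial"), (72_000, "A-Frame"), (101_500, "Split"),
]


def price_class(price):
    """Map a quantitative price to one of two row classes."""
    return "<= $99,000" if price <= 99_000 else "> $99,000"


def crosstab(records):
    """Count the observations in every (row class, column class) cell."""
    return Counter((price_class(price), style) for price, style in records)


table = crosstab(sales)
for (row, column), count in sorted(table.items()):
    print(f"{row:<11} {column:<9} {count}")
```

A `Counter` returns 0 for cells that never occur, which is convenient when printing a full grid of classes.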

Slide 14. Crosstabulation
Example: Finger Lakes Homes
The number of Finger Lakes homes sold for each style and price for the past two years is shown below. Price range is the quantitative variable; home style is the qualitative variable.

                        Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
≤ $99,000            18      6       19         12       55
> $99,000            12     14       16          3       45
Total                30     20       35         15      100

Slide 15. Crosstabulation: Insights Gained from Preceding Crosstabulation
The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Slide 16. Crosstabulation

                        Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
≤ $99,000            18      6       19         12       55
> $99,000            12     14       16          3       45
Total                30     20       35         15      100

The Total column is the frequency distribution for the price variable. The Total row is the frequency distribution for the home style variable.
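The two marginal frequency distributions are just the row sums and column sums of the cell counts. A short Python sketch, using the Finger Lakes counts from the table and illustrative variable names:

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Cell counts from the Finger Lakes Homes crosstabulation.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

# Row sums: frequency distribution for the price variable.
row_totals = {price: sum(row) for price, row in counts.items()}

# Column sums: frequency distribution for the home style variable.
column_totals = {
    style: sum(column) for style, column in zip(styles, zip(*counts.values()))
}

grand_total = sum(row_totals.values())

print(row_totals)     # {'<= $99,000': 55, '> $99,000': 45}
print(column_totals)  # {'Colonial': 30, 'Log': 20, 'Split': 35, 'A-Frame': 15}
print(grand_total)    # 100
```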

Slide 17. Crosstabulation: Row or Column Percentages
Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Slide 18. Crosstabulation: Row Percentages

                        Home Style
Price Range    Colonial      Log    Split    A-Frame    Total
≤ $99,000         32.73    10.91    34.55      21.82      100
> $99,000         26.67    31.11    35.56       6.67      100

Note: row totals are actually 100.01 due to rounding.
Example entry: (Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67
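A row percentage divides each cell by its row total. The sketch below reproduces the table's entries from the original counts; the function name `row_percentages` is illustrative, and rounding to two decimals is assumed because that is the precision the slide shows.

```python
# Cell counts, columns in the order Colonial, Log, Split, A-Frame.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}


def row_percentages(table):
    """Express every cell as a percentage of its row total (two decimals)."""
    result = {}
    for label, row in table.items():
        total = sum(row)
        result[label] = [round(100 * cell / total, 2) for cell in row]
    return result


row_pct = row_percentages(counts)
for label, row in row_pct.items():
    print(label, row, "sum =", round(sum(row), 2))
```

Summing the rounded entries gives 100.01 for each row, which is the rounding effect the slide's note describes.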

Slide 19. Crosstabulation: Column Percentages

                        Home Style
Price Range    Colonial      Log    Split    A-Frame
≤ $99,000         60.00    30.00    54.29      80.00
> $99,000         40.00    70.00    45.71      20.00
Total               100      100      100        100

Example entry: (Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100 = 40.00
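Column percentages work the same way as row percentages, except that each cell is divided by its column total. A minimal Python sketch with the same counts (the function name `column_percentages` is an illustrative choice):

```python
# Cell counts, columns in the order Colonial, Log, Split, A-Frame.
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}


def column_percentages(table):
    """Express every cell as a percentage of its column total (two decimals)."""
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        label: [
            round(100 * cell / total, 2)
            for cell, total in zip(row, column_totals)
        ]
        for label, row in table.items()
    }


col_pct = column_percentages(counts)
print(col_pct["<= $99,000"])  # [60.0, 30.0, 54.29, 80.0]
print(col_pct["> $99,000"])   # [40.0, 70.0, 45.71, 20.0]
```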

Slide 20. Crosstabulation: Simpson's Paradox
Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation.
We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation.
Simpson's Paradox: In some cases the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data.
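The reversal is easiest to see with numbers. The Python sketch below uses hypothetical counts, invented for illustration and not taken from the slides: option A has the higher success rate inside each group, yet option B has the higher rate once the two groups are added together. All names (`groups`, `rate`, `aggregate`) are assumptions.

```python
# Hypothetical (successes, trials) counts, invented for illustration only.
groups = {
    "Group 1": {"A": (8, 10), "B": (70, 100)},
    "Group 2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


def aggregate(tables):
    """Add the unaggregated tables cell by cell into one summary table."""
    totals = {}
    for table in tables.values():
        for option, (successes, trials) in table.items():
            prev_s, prev_n = totals.get(option, (0, 0))
            totals[option] = (prev_s + successes, prev_n + trials)
    return totals


overall = aggregate(groups)

for name, table in groups.items():
    print(name, {option: round(rate(*cell), 3) for option, cell in table.items()})
print("Aggregated", {option: round(rate(*cell), 3) for option, cell in overall.items()})
```

The reversal happens because the two options are spread very unevenly across the groups: most of A's trials fall in the group where success is rare, and most of B's fall in the group where it is common.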

Slide 21. Scatter Diagram and Trendline
A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
The general pattern of the plotted points suggests the overall relationship between the variables.
A trendline is an approximation of the relationship.

Slide 22. Scatter Diagram: A Positive Relationship [figure: y plotted against x]

Slide 23. Scatter Diagram: A Negative Relationship [figure: y plotted against x]

Slide 24. Scatter Diagram: No Apparent Relationship [figure: y plotted against x]

Slide 25. Example: Panthers Football Team
Scatter Diagram
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30

Slide 26. Scatter Diagram [figure: Number of Points Scored (y, axis from 0 to 35) plotted against Number of Interceptions (x, axis from 0 to 4)]

Slide 27. Example: Panthers Football Team
Insights Gained from the Preceding Scatter Diagram
The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
Higher points scored are associated with a higher number of interceptions.
The relationship is not perfect; the plotted points do not all fall on a straight line.
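One common way to draw a trendline through such points is an ordinary least-squares line. The slides do not say how their trendline is fitted, so treat the Python below as a sketch under that assumption; the function name `least_squares` is illustrative. It uses the five Panthers observations.

```python
# Panthers data: interceptions made (x) and points scored (y) in five games.
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 30]


def least_squares(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(interceptions, points)
print(f"trendline: y = {intercept:.2f} + {slope:.2f} x")
```

For these data the slope works out to 5.75 and the intercept to 9.1. The positive slope agrees with the positive relationship read off the scatter diagram, and the scatter of points around the line reflects that the relationship is not perfect.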

Slide 28. Tabular and Graphical Procedures

Data
  Qualitative Data
    Tabular Methods: Frequency Distribution; Rel. Freq. Dist.; Percent Freq. Distribution; Crosstabulation
    Graphical Methods: Bar Graph; Pie Chart
  Quantitative Data
    Tabular Methods: Frequency Distribution; Rel. Freq. Dist.; Cum. Freq. Dist.; Cum. Rel. Freq. Distribution; Stem-and-Leaf Display; Crosstabulation
    Graphical Methods: Dot Plot; Histogram; Ogive; Scatter Diagram

Slide 29. End of Chapter 2, Part B

