Fu Jen Catholic University

Slides:



Advertisements
Similar presentations
X y Exploratory data analysis Cross tabulations and scatter diagrams.
Advertisements

1 1 Slide © 2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
1 1 Slide © 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
1 1 Slide © University of Minnesota-Duluth, Summer 2009-Econ-2030(Dr. Tadesse) Chapter 2 Descriptive Statistics.
1 1 Slide © University of Minnesota-Duluth, Summer-2009 Econ-2030(Dr. Tadesse) Chapter-2: Descriptive Statistics: Tabular and Graphical Presentations Part.
1/54 Statistics Descriptive Statistics— Tables and Graphics.
QMS 6351 Statistics and Research Methods Chapter 2 Descriptive Statistics: Tabular and Graphical Methods Prof. Vera Adamchik.
1 1 Slide IS 310 – Business Statistics IS 310 Business Statistics CSU Long Beach.
1 1 Slide © 2006 Thomson/South-Western Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations Part A n Summarizing Qualitative Data n Summarizing.
1 1 Slide © 2009 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
X y Exploratory data analysis Cross tabulations and scatter diagrams.
1 1 Slide IS 310 – Business Statistics IS 310 Business Statistics CSU Long Beach.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd.. 1 Slide Slide Slides Prepared by Juei-Chao Chen Fu Jen Catholic University Slides Prepared.
1 1 Slide 統計學 Fall 2003 授課教師:統計系余清祥 日期: 2003 年 9 月 23 日 第二週:敘述性統計量.
Econ 3790: Business and Economics Statistics
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 2 Descriptive Statistics: Tabular and Graphical Methods.
1 1 Slide IS 310 – Business Statistics IS 310 Business Statistics CSU Long Beach.
© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license.
Copyright © 2008 Pearson Education, Inc.
1 1 Slide © 2006 Thomson/South-Western Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations Part B n Exploratory Data Analysis n Crosstabulations.
DATA FROM A SAMPLE OF 25 STUDENTS ABBAB0 00BABB BB0A0 A000AB ABA0BA.
Business Statistics **** Management Information Systems Business Statistics Third level First mid-term: Instructor: Dr. ZRELLI Houyem Majmaah.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 2 Descriptive Statistics: Tabular and Graphical Methods.
Business Statistics: Communicating with Numbers By Sanjiv Jaggia and Alison Kelly McGraw-Hill/Irwin Copyright © 2013 by The McGraw-Hill Companies, Inc.
Descriptive Statistics: Tabular and Graphical Presentations n Summarizing Qualitative Data n Summarizing Quantitative Data.
1 1 Slide STATISTICS FOR BUSINESS AND ECONOMICS Seventh Edition AndersonSweeneyWilliams Slides Prepared by John Loucks © 1999 ITP/South-Western College.
1 1 Slide © 2011 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
 Frequency Distribution is a statistical technique to explore the underlying patterns of raw data.  Preparing frequency distribution tables, we can.
BIA 2610 – Statistical Methods Chapter 2 – Descriptive Statistics: Tabular and Graphical Displays.
1 1 Slide © 2005 Thomson/South-Western Introduction to Statistics Chapter 2 Descriptive Statistics.
Chapter 2 – Descriptive Statistics
Chapter 2, Part A Descriptive Statistics: Tabular and Graphical Presentations n Summarizing Categorical Data n Summarizing Quantitative Data Categorical.
1 1 Slide © 2005 Thomson/South-Western OPIM 303-Lecture #1 Jose M. Cruz Assistant Professor.
1 1 Slide Slides by JOHN LOUCKS St. Edward’s University.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS St. Edward’s University.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 2 Descriptive Statistics: Tabular and Graphical Methods.
ISTANBUL STOCK EXCHANGE (BIST) FELL 6 POINTS IN AVERAGE TODAY THE UNITED STATES DOLLAR (USD) APPRECIATED BY 4 PERCENT LAST WEEK AGAINST TURKISH LIRA (TRL).
Fundamentals of Business Statistics chapter2 descriptive statistics: tabular and graphical presentations.
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
Graphing options for Quantitative Data
Descriptive Statistics: Tabular and Graphical Methods
Descriptive Statistics: Tabular and Graphical Methods
2.2 More Graphs and Displays
Summarizing Categorical Data
CHAPTER 2 : DESCRIPTIVE STATISTICS: TABULAR & GRAPHICAL PRESENTATION
Chapter 2 Descriptive Statistics
Chapter 2 Descriptive Statistics: Tabular and Graphical Methods
CHAPTER 1: Picturing Distributions with Graphs
THE STAGES FOR STATISTICAL THINKING ARE:
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
Basic Practice of Statistics - 3rd Edition
CHAPTER 1 Exploring Data
THE STAGES FOR STATISTICAL THINKING ARE:
Basic Practice of Statistics - 3rd Edition
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
Essentials of Statistics for Business and Economics (8e)
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
OAD30763 Statistics in Business and Economics
CHAPTER 1 Exploring Data
Organizing, Displaying and Interpreting Data
Design by : Ms Sheema Aftab
CHAPTER 1 Exploring Data
Presentation transcript:

Fu Jen Catholic University Slides Prepared by Juei-Chao Chen Fu Jen Catholic University

Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations Part B Exploratory Data Analysis Crosstabulations and Scatter Diagrams x y

2.3 Exploratory Data Analysis The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly. One such technique is the stem-and-leaf display.

Stem-and-Leaf Display A stem-and-leaf display shows both the rank order and shape of the distribution of the data. It is similar to a histogram on its side, but it has the advantage of showing the actual data values. The first digits of each data item are arranged to the left of a vertical line. To the right of the vertical line we record the last digit for each item in rank order. Each line in the display is referred to as a stem. Each digit on a stem is a leaf.

Stem-and-Leaf Display Example: Number of Questions Answered Correctly on An Aptitude Test

Stem-and-Leaf Display Stem: The numbers to the left of the vertical line (6, 7, 8, 9, 10, 11, 12, 13, and 14). Leaf: each digit to the right of the vertical line.

Stem-and-Leaf Display Although the stem-and-leaf display may appear to offer the same information as a histogram, it has two primary advantages. 1. The stem-and-leaf display is easier to construct by hand. 2. Within a class interval, the stem-and-leaf display provides more information than the histogram because the stem-and-leaf shows the actual data.

Example: Hudson Auto Repair The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.

Example: Hudson Auto Repair Sample of Parts Cost for 50 Tune-ups

Stem-and-Leaf Display 5 6 7 8 9 10 2 7 2 2 2 2 5 6 7 8 8 8 9 9 9 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9 0 0 2 3 5 8 9 1 3 7 7 7 8 9 1 4 5 5 9 a stem a leaf

Stretched Stem-and-Leaf Display If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s). Whenever a stem value is stated twice, the first value corresponds to leaf values of 0 - 4, and the second value corresponds to leaf values of 5 - 9.

Stretched Stem-and-Leaf Display 5 6 7 8 9 10 2 7 2 2 2 2 5 6 7 8 8 8 9 9 9 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9 0 0 2 3 5 8 9 1 3 7 7 7 8 9 1 4 5 5 9

Stem-and-Leaf Display Leaf Units A single digit is used to define each leaf. In the preceding example, the leaf unit was 1. Leaf units may be 100, 10, 1, 0.1, and so on. Where the leaf unit is not shown, it is assumed to equal 1.

a stem-and-leaf display of these data will be Example: Leaf Unit = 0.1 If we have data with values such as 8.6 11.7 9.4 9.1 10.2 11.0 8.8 a stem-and-leaf display of these data will be Leaf Unit = 0.1 8 9 10 11 6 8 1 4 2 0 7

a stem-and-leaf display of these data will be Example: Leaf Unit = 10 If we have data with values such as 1806 1717 1974 1791 1682 1910 1838 a stem-and-leaf display of these data will be Leaf Unit = 10 16 17 18 19 8 The 82 in 1682 is rounded down to 80 and is represented as an 8. 1 9 0 3 1 7

2.4 Crosstabulations and Scatter Diagrams Thus far we have focused on methods that are used to summarize the data for one variable at a time. Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables. Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Crosstabulation A crosstabulation is a tabular summary of data for two variables. Crosstabulation can be used when: one variable is qualitative and the other is quantitative, both variables are qualitative, or both variables are quantitative. The left and top margin labels define the classes for the two variables.

Crosstabulation Example: Data from Zagat’s Restaurant Review. Data on a restaurant’s quality rating and typical meal price are reported. Quality rating is a qualitative variable with rating categories of good, very good, and excellent. Meal price is a quantitative variable that ranges from $10 to $49.

Crosstabulation

Crosstabulation Example: Crosstabulation of Quality Rating and Meal Price for 300 Los Angeles Restaurants

Colonial Log Split A-Frame Crosstabulation Example: Finger Lakes Homes The number of Finger Lakes homes sold for each style and price for the past two years is shown below. quantitative variable qualitative variable Home Style Price Range Colonial Log Split A-Frame Total < $99,000 > $99,000 18 6 19 12 55 45 12 14 16 3 Total 30 20 35 15 100

Crosstabulation Insights Gained from Preceding Crosstabulation The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000. Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Crosstabulation Frequency distribution for the price variable Home Style Price Range Colonial Log Split A-Frame Total < $99,000 > $99,000 18 6 19 12 55 45 12 14 16 3 Total 30 20 35 15 100 Frequency distribution for the home style variable

Crosstabulation: Row or Column Percentages Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Crosstabulation: Row Percentages Price Range Home Style Colonial Log Split A-Frame Total 32.73 10.91 34.55 21.82 100 < $99,000 > $99,000 26.67 31.11 35.56 6.67 Note: row totals are actually 100.01 due to rounding. (Colonial and > $99K)/(All >$99K) × 100 = (12/45) × 100

Crosstabulation: Column Percentages Price Range Home Style Colonial Log Split A-Frame 60.00 30.00 54.29 80.00 < $99,000 > $99,000 40.00 70.00 45.71 20.00 Total 100 100 100 100 (Colonial and > $99K)/(All Colonial) × 100 = (12/30) × 100

Crosstabulation: Simpson’s Paradox Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation. We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation. Simpson’ Paradox: In some cases the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data. suggests the overall relationship between the variables.

Scatter Diagram and Trendline A scatter diagram is a graphical presentation of the relationship between two quantitative variables. One variable is shown on the horizontal axis and the other variable is shown on the vertical axis. The general pattern of the plotted points suggests the overall relationship between the variables. A trendline is an approximation of the relationship.

Scatter Diagram and Trendline A Positive Relationship y x

Scatter Diagram and Trendline A Negative Relationship y x

Scatter Diagram and Trendline No Apparent Relationship y x

Example: Panthers Football Team Scatter Diagram The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored. x = Number of Interceptions y = Number of Points Scored 1 3 2 14 24 18 17 30

Scatter Diagram y 5 10 15 20 25 30 35 Number of Points Scored x 1 2 3 4 Number of Interceptions

Example: Panthers Football Team Insights Gained from the Preceding Scatter Diagram The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored. Higher points scored are associated with a higher number of interceptions. The relationship is not perfect; all plotted points in the scatter diagram are not on a straight line.

Tabular and Graphical Procedures Data Qualitative Data Quantitative Data Tabular Methods Graphical Methods Tabular Methods Graphical Methods Bar Graph Pie Chart Frequency Distribution Rel. Freq. Dist. Percent Freq. Crosstabulation Dot Plot Histogram Ogive Scatter Diagram Frequency Distribution Rel. Freq. Dist. Cum. Freq. Dist. Cum. Rel. Freq. Stem-and-Leaf Display Crosstabulation

End of Chapter 2, Part B