Categorical Data

Example: Marada Inn

Guests staying at Marada Inn were asked to rate the quality of their accommodations as excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:

Below Average, Above Average, Average, Above Average, Average, Above Average, Average, Above Average, Below Average, Poor, Excellent, Above Average, Average, Above Average, Below Average, Poor, Above Average, Average

Frequency distribution of the Marada Inn ratings:

Rating           Frequency
Poor                 2
Below Average        3
Average              5
Above Average        9
Excellent            1
Total               20

Relative frequency and percent frequency distributions of the Marada Inn ratings:

Rating           Frequency   Relative Frequency   Percent Frequency
Poor                 2              .10                  10
Below Average        3              .15                  15
Average              5              .25                  25
Above Average        9              .45                  45
Excellent            1              .05                   5
Total               20             1.00                 100
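The relative and percent frequencies in the table can be reproduced with a short Python sketch. It starts from the tabulated frequencies; the variable names are illustrative and not part of the original example.

```python
# Frequencies as tabulated for the Marada Inn sample (n = 20).
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequency.values())

# Relative frequency = class frequency / n.
relative = {rating: count / n for rating, count in frequency.items()}
# Percent frequency = 100 * relative frequency.
percent = {rating: 100 * rel for rating, rel in relative.items()}

for rating in frequency:
    print(f"{rating:<14} {frequency[rating]:>3} "
          f"{relative[rating]:>6.2f} {percent[rating]:>5.0f}")
```

The relative frequencies sum to 1 and the percent frequencies to 100, which is a quick check on any table built this way.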

Bar chart: Marada Inn Quality Ratings. Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent); vertical axis: Frequency (1 to 10).

Pie chart: Marada Inn Quality Ratings. Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.

Quantitative Data

Example: Hudson Auto Repair

The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are:

 91   78   93   57   75   52   99   80   97   62
 71   69   72   89   66   75   79   75   72   76
104   74   62   68   97  105   77   65   80  109
 85   97   88   68   83   68   71   69   67   74
 62   82   98  101   79  105   79   69   62   73

Sorted values (each distinct cost shown once):

52 57 62 65 66 67 68 69 71 72 73 74 75 76 77 78 79 80 82 83 85 88 89 91 93 97 98 99 101 104 105 109

Minimum: 52. Maximum: 109.

Frequency distribution of the tune-up parts costs:

Cost ($)    Frequency
50-59           2
60-69          13
70-79          16
80-89           7
90-99           7
100-109         5

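Relative frequency and percent frequency distributions (each relative frequency is the class frequency divided by 50, for example 2/50 = .04):

Cost ($)    Frequency   Relative Freq   Percent Freq
50-59           2           .04              4
60-69          13           .26             26
70-79          16           .32             32
80-89           7           .14             14
90-99           7           .14             14
100-109         5           .10             10
Total          50          1.00            100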
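As a rough sketch of how the class frequencies could be tallied in Python, the snippet below bins the 50 invoice costs into classes of width 10. The helper `class_label` and the variable names are hypothetical; the class width of 10 follows the table.

```python
from collections import Counter

# The 50 tune-up parts costs from the Hudson Auto Repair invoices.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

def class_label(cost, width=10):
    """Return the class a cost falls in, e.g. 57 -> '50-59'."""
    lower = (cost // width) * width
    return f"{lower}-{lower + width - 1}"

counts = Counter(class_label(c) for c in costs)
n = len(costs)

# Print classes in increasing order of their lower limit.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    freq = counts[label]
    print(f"{label:<8} {freq:>3} {freq / n:>5.2f} {100 * freq / n:>4.0f}")
```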

Histogram: Tune-up Parts Cost. Horizontal axis: Parts Cost ($), 50 to 110; vertical axis: Frequency, 2 to 18.

Histogram shapes illustrated: Moderately Skewed Left, Symmetric, Highly Skewed Right.

Ogive for Hudson Auto Repair: cumulative frequency distribution

Parts Cost ($)   Frequency   Cost ($)   Cumulative Frequency
50-59                2        < 60               2
60-69               13        < 70              15
70-79               16        < 80              31
80-89                7        < 90              38
90-99                7        < 100             45
100-109              5        < 110             50
Total               50

Ogive for Hudson Auto Repair: cumulative relative and percent frequency distributions

Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
< 60                2                        .04                              4
< 70               15                        .30                             30
< 80               31                        .62                             62
< 90               38                        .76                             76
< 100              45                        .90                             90
< 110              50                       1.00                            100
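The cumulative columns are running totals of the class frequencies. A minimal Python sketch, assuming the class frequencies from the Hudson Auto Repair table and using illustrative variable names:

```python
from itertools import accumulate

# Upper class limits and class frequencies from the frequency distribution.
upper_limits = [60, 70, 80, 90, 100, 110]
frequency = [2, 13, 16, 7, 7, 5]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequency))
n = cumulative[-1]

cumulative_relative = [c / n for c in cumulative]
cumulative_percent = [100 * c / n for c in cumulative]

for limit, c, rel, pct in zip(upper_limits, cumulative,
                              cumulative_relative, cumulative_percent):
    print(f"< {limit:<4} {c:>3} {rel:>5.2f} {pct:>4.0f}")
```

Each (upper limit, cumulative percent) pair is one plotted point of the ogive.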

Ogive: Tune-up Parts Cost (Hudson Auto Repair). Horizontal axis: Parts Cost ($), 50 to 110; vertical axis: Cumulative Percent Frequency, 20 to 100. Plotted points: ($50, 0%), ($60, 4%), ($70, 30%), ($80, 62%), ($90, 76%), ($100, 90%), ($110, 100%).

Summarizing Two Variables

Example: Finger Lakes Homes.xls

Crosstabulation of price range (quantitative variable) by home style (qualitative variable):

                            Home Style
Price Range    Colonial    Log    Split    A-Frame    Total
< $99,000         18         6      19        12        55
> $99,000         12        14      16         3        45
Total             30        20      35        15       100


Row percentages for Finger Lakes Homes (each frequency divided by its row total):

                            Home Style
Price Range    Colonial    Log       Split     A-Frame    Total
< $99,000       0.3273     0.1091    0.3455    0.2182     1.0000
> $99,000       0.2667     0.3111    0.3556    0.0667     1.0000

Column percentages for Finger Lakes Homes (each frequency divided by its column total):

                            Home Style
Price Range    Colonial    Log       Split     A-Frame
< $99,000       0.6000     0.3000    0.5429    0.8000
> $99,000       0.4000     0.7000    0.4571    0.2000
Total           1.0000     1.0000    1.0000    1.0000
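The row and column percentages can be checked with a small Python sketch built on the Finger Lakes Homes frequencies. The data layout and variable names are assumptions made for illustration.

```python
# Crosstabulation frequencies: price range (rows) by home style (columns).
styles = ["Colonial", "Log", "Split", "A-Frame"]
counts = {
    "< $99,000": [18, 6, 19, 12],
    "> $99,000": [12, 14, 16, 3],
}

row_totals = {price: sum(row) for price, row in counts.items()}
col_totals = [sum(col) for col in zip(*counts.values())]

# Row proportions: each cell divided by its row total.
row_props = {
    price: [c / row_totals[price] for c in row]
    for price, row in counts.items()
}
# Column proportions: each cell divided by its column total.
col_props = {
    price: [c / t for c, t in zip(row, col_totals)]
    for price, row in counts.items()
}

for price in counts:
    print(price, [round(p, 4) for p in row_props[price]])
for price in counts:
    print(price, [round(p, 4) for p in col_props[price]])
```

Row proportions answer "within a price range, how are the styles distributed?", while column proportions answer "within a style, how are the price ranges distributed?".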

The crosstabulation for the aggregated UC-Berkeley data is:

          Admitted    Denied    Total
Male        3738       4704      8442
Female      1494       2827      4321
Total       5232       7531     12763

Dividing all of the frequencies above by the number of observations yields the joint probability table below:

          Admitted    Denied    Total
Male       0.2929     0.3686    0.6614
Female     0.1171     0.2215    0.3386
Total      0.4099     0.5901    1.0000

The male acceptance rate is higher when the data are aggregated.
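A minimal Python sketch of the two calculations on this slide, the joint probabilities and the aggregated acceptance rates, using the frequencies from the crosstabulation. The dictionary layout and names are illustrative.

```python
# Aggregated admissions frequencies from the crosstabulation.
counts = {
    "Male": {"Admitted": 3738, "Denied": 4704},
    "Female": {"Admitted": 1494, "Denied": 2827},
}

# Number of observations across all cells.
n = sum(sum(row.values()) for row in counts.values())

# Joint probability = cell frequency / number of observations.
joint = {
    sex: {outcome: freq / n for outcome, freq in row.items()}
    for sex, row in counts.items()
}

# Acceptance rate = admitted / row total for each sex.
acceptance_rate = {
    sex: row["Admitted"] / sum(row.values())
    for sex, row in counts.items()
}

for sex in counts:
    print(sex, {k: round(v, 4) for k, v in joint[sex].items()},
          round(acceptance_rate[sex], 4))
```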

data_simpson.xls

Compute the row percentages to show Simpson's Paradox.

Female    Admitted    Denied    Total
A             89         19       108
B             17          8        25
Total        106         27       133

Male      Admitted    Denied    Total
A            512        313       825
B            313        207       520
Total        825        520      1345

Row percentages:

Female    Admitted    Denied    Total
A          0.8241     0.1759    1.0000
B          0.6800     0.3200    1.0000

Male      Admitted    Denied    Total
A          0.6206     0.3794    1.0000
B          0.6019     0.3981    1.0000
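The reversal can be made explicit in a short Python sketch that places the department-level rates for A and B next to the aggregated rates from the UC-Berkeley crosstabulation. The data structures and variable names are assumptions for illustration only.

```python
# (admitted, denied) counts by sex and department, from the tables above.
admissions = {
    "Female": {"A": (89, 19), "B": (17, 8)},
    "Male": {"A": (512, 313), "B": (313, 207)},
}

# Row percentage for "Admitted" within each department.
rates = {
    sex: {dept: adm / (adm + den) for dept, (adm, den) in depts.items()}
    for sex, depts in admissions.items()
}

# Aggregated acceptance rates from the UC-Berkeley crosstabulation.
aggregated = {"Male": 3738 / 8442, "Female": 1494 / 4321}

# Within each department shown, the female rate is the higher one...
female_higher_by_dept = all(
    rates["Female"][dept] > rates["Male"][dept] for dept in ("A", "B")
)
# ...while in the aggregated data the male rate is the higher one.
male_higher_aggregated = aggregated["Male"] > aggregated["Female"]

print(rates)
print(female_higher_by_dept, male_higher_aggregated)
```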

Scatter diagram: A Negative Relationship. Horizontal axis (x): Q BigMacs; vertical axis (y): P BigMacs. Values marked on the axes: 0.50, 21, 5.00, 2.

Scatter diagram: No Apparent Relationship. Horizontal axis (x): P BigMacs; vertical axis (y): Q NoseHairTrimmers.

Example: Panthers Football Team

The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30
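One way to put a number on the relationship is the sample correlation coefficient. The slide itself relies on a scatter diagram, so the calculation below is an added illustration rather than part of the original example, and the variable names are hypothetical.

```python
from math import sqrt

# Panthers data: interceptions (x) and points scored (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squared deviations and cross-products about the means.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

# Pearson correlation coefficient.
r = sxy / sqrt(sxx * syy)
print(round(r, 4))
```

A value of r close to +1 is consistent with a positive relationship: games with more interceptions tend to have more points scored.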

Scatter diagram: Panthers Football Team. Horizontal axis (x): Number of Interceptions, 0 to 4; vertical axis (y): Number of Points Scored, 0 to 35.

Pelican Stores (continued)

data_pelican.xls

Pelican Stores is a chain of women's apparel stores. It recently ran a promotion in which discount coupons were sent to customers of other National Clothing stores. Data collected for a sample of 100 in-store credit card transactions at Pelican Stores during one day while the promotion was running are shown in Table 2.18. Customers who made a purchase using a discount coupon are referred to as promotional customers, and customers who made a purchase but did not use a discount coupon are referred to as regular customers. Because the promotional coupons were not sent to regular Pelican Stores customers, management considers the sales made to people presenting the promotional coupons as sales it would not otherwise make. Pelican's management would like to use this sample data to learn about its customer base and to evaluate the promotion involving discounts.

Managerial Report

1. Using graphs and tables, summarize the qualitative variables.
2. Using graphs and tables, summarize the quantitative variables.
3. Using pivot tables and scatter plots, summarize the variables.

