A qualitative variable with three classes (X, Y, and Z) is measured for each of 20 units randomly sampled from a target population. The data (observed class for each unit) are as follows:

Classes   Frequency   Relative Frequency   Class Percentage
X         ___         0.4                  ___
Y         ___         ___                  ___
Z         3           ___                  ___
Total     20          ___                  ___
TYPES OF GRAPHS
Three of the most widely used graphical methods for describing qualitative data are bar graphs, pie charts, and Pareto diagrams.
When using a bar graph, the categories (classes) of the qualitative variable are represented by bars, where the height of each bar is either the class frequency, class relative frequency, or class percentage.
Bar Graph

Classes   Frequency   Relative Frequency   Percentage
X         8           0.40                 40%
Y         9           0.45                 45%
Z         3           0.15                 15%
Total     20          1                    100%
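The three possible bar heights (frequency, relative frequency, percentage) can be computed directly from the raw observations. The following is a minimal sketch, not part of the original slides; the function name `frequency_table` and the order of the 20 observations are assumptions made for illustration, chosen so the counts match the table above.

```python
from collections import Counter

# Observed classes for the 20 sampled units: 8 X's, 9 Y's, 3 Z's.
# The order of the units is an assumption; only the counts come from the table.
observations = ["X"] * 8 + ["Y"] * 9 + ["Z"] * 3

def frequency_table(data):
    """Return {class: (frequency, relative frequency, percentage)}."""
    counts = Counter(data)
    n = len(data)
    return {
        cls: (freq, freq / n, 100 * freq / n)
        for cls, freq in sorted(counts.items())
    }

table = frequency_table(observations)
for cls, (freq, rel, pct) in table.items():
    print(f"{cls}: frequency={freq}, relative frequency={rel:.2f}, percentage={pct:.0f}%")
```

Any one of the three columns can serve as the bar height; the shape of the graph is the same, only the vertical scale changes.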
Bar Chart: Restaurant Complaints

Complaint            Count
Overpriced           789
Small portions       621
Wait time            109
Food is tasteless    65
No atmosphere        45
Not clean            30
Too noisy            27
Food is too salty    15
Unfriendly staff     12
Food not fresh       9

[Figure: Frequency Bar Chart]
Pie Chart

Classes   Relative Frequency
X         0.40
Y         0.45
Z         0.15
Total     1
Pie Chart: Restaurant Complaints

Complaint            Count   Percent
Overpriced           789     45.81882
Small portions       621     36.06272
Wait time            109     6.329849
Food is tasteless    65      3.774681
No atmosphere        45      2.61324
Not clean            30      1.74216
Too noisy            27      1.567944
Food is too salty    15      0.87108
Unfriendly staff     12      0.696864
Food not fresh       9       0.522648
A Pareto diagram is a bar graph with the categories (classes) of the qualitative variable (i.e., the bars) arranged by height in descending order from left to right. The goal of the Pareto diagram is to make it easy to locate the “most important” categories, that is, those with the largest frequencies.
Pareto Diagram

Complaint            Count   Cumulative %
Overpriced           789     45.8
Small portions       621     81.9
Wait time            109     88.2
Food is tasteless    65      92.0
No atmosphere        45      94.6
Not clean            30      96.3
Too noisy            27      97.9
Food is too salty    15      98.8
Unfriendly staff     12      99.5
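The cumulative percentages in a Pareto diagram come from sorting the categories by count and keeping a running total. The sketch below is an illustration rather than the method used to build the slide; the function name `pareto_table` is hypothetical, and the counts are the restaurant complaint counts listed in this section.

```python
# Complaint counts from the restaurant example.
complaints = {
    "Overpriced": 789,
    "Small portions": 621,
    "Wait time": 109,
    "Food is tasteless": 65,
    "No atmosphere": 45,
    "Not clean": 30,
    "Too noisy": 27,
    "Food is too salty": 15,
    "Unfriendly staff": 12,
    "Food not fresh": 9,
}

def pareto_table(counts):
    """Sort categories by count (descending) and attach cumulative percentages."""
    total = sum(counts.values())
    rows = []
    running = 0
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        rows.append((name, count, round(100 * running / total, 1)))
    return rows

rows = pareto_table(complaints)
for name, count, cumulative in rows:
    print(f"{name:<18} {count:>4} {cumulative:>6.1f}")
```

Reading down the last column shows how few categories account for most complaints: the first two alone cover more than 80% of the total.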
Misleading Graphs
A survey was conducted to determine what food would be served at the French Club Party. Explain how the following graph is misleading.
The ratio of the heights of bars within each category does not reflect the actual ratio.
There is an implied precision that is unrealistic. (To the penny, really?)
The percentages are computed incorrectly. A doubling of costs is only a 100% increase.
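The point about doubling can be checked with the usual percent-change formula, 100 × (new − old) / old. This is a small sketch with made-up cost figures (the values 50, 100, and 150 are hypothetical, as is the function name `percent_change`); the graph being critiqued is not reproduced here.

```python
def percent_change(old, new):
    """Percentage change from old to new, measured relative to old."""
    if old == 0:
        raise ValueError("percent change is undefined when the old value is 0")
    return 100 * (new - old) / old

doubling = percent_change(50, 100)   # cost doubles
tripling = percent_change(50, 150)   # cost triples
print(doubling, tripling)
```

A doubling gives a 100% increase and a tripling gives a 200% increase, so a graph that labels a doubling as a "200% increase" overstates the change.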
Lesson Objectives
You will be able to:
1. Describe Quantitative Data with Graphs
2. Use Summation Notation
3. Understand Central Tendency
Lesson Objective # 2: Describe Quantitative Data with Graphs
Recall that quantitative data sets consist of data that are recorded on a meaningful numerical scale. To describe, summarize, and detect patterns in such data, we can use three graphical methods: dot plots, stem-and-leaf displays and histograms.
Test Scores
56 59 64 65 69 70 71 73 76 77 78 80 82 82 85 86 91 91 92 92 95 98 99
Dot Plot
When using a dot plot, the numerical value of each quantitative measurement in the data set is represented by a dot on a horizontal scale. When data values repeat, the dots are placed above one another vertically. The dot plot condenses the data by grouping all values that are the same.
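The grouping step behind a dot plot is just a count of how often each value occurs. The sketch below uses the test scores from this section, with the repeated values 82, 91, and 92 taken from the stem-and-leaf display (an assumption where the flattened score list has gaps). For simplicity it prints a rotated plot, with values running down the page and dots stacked sideways; `dot_counts` is a hypothetical helper name.

```python
from collections import Counter

scores = [56, 59, 64, 65, 69, 70, 71, 73, 76, 77, 78, 80,
          82, 82, 85, 86, 91, 91, 92, 92, 95, 98, 99]

def dot_counts(data):
    """Map each distinct value to the number of dots it gets, in increasing order."""
    return dict(sorted(Counter(data).items()))

counts = dot_counts(scores)
for value, count in counts.items():
    # One dot per measurement; repeated values get stacked dots.
    print(f"{value:3d} " + "." * count)
```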
Stem-and-Leaf
The stem-and-leaf display condenses the data by grouping all data with the same stem. The possible stems are listed in order in a column. The leaf for each quantitative measurement in the data set is placed in the corresponding stem row. Leaves for observations with the same stem value are listed in increasing order horizontally.
Decimal point is 1 digit(s) to the right of the colon.

5 : 69
6 : 4
6 : 59
7 : 013
7 : 678
8 : 022
8 : 56
9 : 1122
9 : 589
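A display like the one above can be produced by splitting each score into a tens digit (the stem) and a units digit (the leaf). The sketch below is one possible way to do it, assuming non-negative two-digit integers and assuming that each stem is split into two rows (leaves 0 to 4, then leaves 5 to 9), which is how the display above appears to be laid out. The function name `stem_and_leaf` is hypothetical.

```python
scores = [56, 59, 64, 65, 69, 70, 71, 73, 76, 77, 78, 80,
          82, 82, 85, 86, 91, 91, 92, 92, 95, 98, 99]

def stem_and_leaf(data, split=True):
    """Return display rows like '7 : 013'; rows with no leaves are omitted."""
    rows = {}
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        # With split stems, leaves 0-4 go in the first row and 5-9 in the second.
        half = 1 if (split and leaf >= 5) else 0
        rows.setdefault((stem, half), []).append(str(leaf))
    return [f"{stem} : {''.join(leaves)}" for (stem, _), leaves in sorted(rows.items())]

lines = stem_and_leaf(scores)
print("\n".join(lines))
```

Passing `split=False` gives one row per stem instead, which is a coarser grouping of the same data.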
Histogram
When using a histogram, the possible numerical values of the quantitative variable are partitioned into class intervals, each of which has the same width. These intervals form the scale of the horizontal axis. A vertical bar is placed over each class interval, with the height of the bar equal to either the class frequency or class relative frequency. When constructing histograms, use more classes as the number of values in the data set gets larger.
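The bar heights of a histogram are the counts of measurements that fall in each equal-width class interval. The sketch below illustrates this for the test scores, under two assumptions not stated in the slides: five classes of width 10 starting at 50, and intervals that include their lower endpoint but not their upper one. `class_frequencies` is a hypothetical helper name.

```python
scores = [56, 59, 64, 65, 69, 70, 71, 73, 76, 77, 78, 80,
          82, 82, 85, 86, 91, 91, 92, 92, 95, 98, 99]

def class_frequencies(data, start, width, num_classes):
    """Count measurements in equal-width intervals [lower, lower + width)."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - start) // width)
        # Values outside the covered range are ignored in this sketch.
        if 0 <= index < num_classes:
            counts[index] += 1
    return counts

start, width, num_classes = 50, 10, 5  # assumed classes: [50, 60), ..., [90, 100)
freqs = class_frequencies(scores, start, width, num_classes)
for i, f in enumerate(freqs):
    lower = start + i * width
    rel = f / len(scores)
    print(f"[{lower}, {lower + width}): {'#' * f}  frequency={f}, relative frequency={rel:.2f}")
```

Dividing each frequency by the number of measurements gives the relative frequency, so the two versions of the histogram differ only in the vertical scale.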
Lesson Objective # 4: Understanding Central Tendency
The central tendency of the set of measurements, that is, the tendency of the data to cluster, or center, about certain numerical values.
The variability of the set of measurements, that is, the spread of the data.
There are three measures of central tendency: mean, median and mode.
MEDIAN
The median of a quantitative data set is the middle number when the measurements are arranged in ascending (or descending) order.
Calculating a Sample Median M
Arrange the n measurements from the smallest to the largest.
1. If n is odd, M is the middle number.
2. If n is even, M is the mean of the middle two numbers.
NOTE: Remember to order the data before calculating a value for the median.
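The two cases above translate directly into code. This is a minimal sketch with made-up measurements; the function name `sample_median` is chosen for illustration.

```python
def sample_median(measurements):
    """Median following the odd/even rule: sort first, then pick the middle."""
    ordered = sorted(measurements)  # the data must be ordered before locating the middle
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # n odd: the middle number
    return (ordered[mid - 1] + ordered[mid]) / 2   # n even: mean of the middle two

odd_case = sample_median([7, 3, 9])       # ordered: 3, 7, 9
even_case = sample_median([7, 3, 9, 5])   # ordered: 3, 5, 7, 9
print(odd_case, even_case)
```

Python's standard library offers `statistics.median`, which applies the same rule.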
MODE
The mode is the measurement that occurs most frequently in the data set. The mode is the only measure of center that has to be an actual data value in the sample.
NOTE: For some quantitative data sets, the mode may not be very meaningful.
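Finding the mode amounts to counting occurrences and keeping the most frequent value or values. The sketch below applies this to the test scores from this section (with the repeated values taken from the stem-and-leaf display, an assumption where the flattened list has gaps); `modes` is a hypothetical helper that returns every value tied for the highest count.

```python
from collections import Counter

scores = [56, 59, 64, 65, 69, 70, 71, 73, 76, 77, 78, 80,
          82, 82, 85, 86, 91, 91, 92, 92, 95, 98, 99]

def modes(data):
    """Return all values that occur most often, in increasing order."""
    if not data:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(data)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

score_modes = modes(scores)
print(score_modes)
```

Here three values tie, each occurring only twice, which illustrates the note above: with data like these the mode says little about where the scores center.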
The modal class is the measurement class with the largest relative frequency (for example, the tallest bar in a relative frequency histogram for quantitative data).
SKEWED
A data set is said to be skewed if one tail of the distribution has more extreme observations than the other tail.
Center of a Distribution: Mean
The mean is the center of gravity: it is the point where the histogram balances.
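One way to see the balancing property is that the deviations of the measurements from the mean always sum to zero: the distances below the mean exactly offset the distances above it. The sketch below checks this with a small made-up data set (the values are hypothetical), taking the mean as the sum of the measurements divided by their number.

```python
def mean(data):
    """Arithmetic mean: the sum of the measurements divided by how many there are."""
    if not data:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(data) / len(data)

values = [2, 3, 4, 11]        # hypothetical data with one large observation
center = mean(values)
deviations = [x - center for x in values]

print(center)                 # balance point
print(deviations)             # negative deviations offset the positive one
print(sum(deviations))        # total deviation about the mean
```

The single large value, 11, sits far to the right of the balance point and is offset by three smaller values sitting closer to it on the left.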
Center of a Distribution: Median
The median is the value with exactly half the data values below it and half above it.
With rightward skewed data, the right tail (high end) of the distribution has more extreme observations. Conversely, with leftward skewed data, the left tail (low end) of the distribution has more extreme observations.