1
CHAPTER 1 Exploring Data
1.2 Displaying Quantitative Data with Graphs
2
Displaying Quantitative Data with Graphs
MAKE and INTERPRET dotplots and stemplots of quantitative data
DESCRIBE the overall pattern of a distribution and IDENTIFY any outliers
IDENTIFY the shape of a distribution
MAKE and INTERPRET histograms of quantitative data
COMPARE distributions of quantitative data
3
Ways to chart quantitative data
Histograms, stemplots, dotplots, and boxplots: these are summary graphs for a single variable. They are very useful for understanding the pattern of variability in the data.
Line graphs (time plots): use these when there is a meaningful sequence, like time. The line connecting the points helps emphasize any change over time.
4
Dotplots
A dotplot is a simple display. It just places a dot along an axis for each case in the data. The dotplot to the right shows Kentucky Derby winning times, plotting each race as its own dot. You might see a dotplot displayed horizontally or vertically.
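As a rough sketch of the idea, a dotplot can be drawn in plain text by printing one dot per case next to each value on the axis. The function name `text_dotplot` and the sample values are made up for illustration (they are not the Kentucky Derby times), and the sketch assumes whole-number data with the axis running down the page.

```python
from collections import Counter

def text_dotplot(values):
    """Return a text dotplot: one row per axis value, one dot per case."""
    counts = Counter(values)
    lines = []
    # Walk every whole number from min to max so gaps stay visible on the axis.
    for v in range(min(values), max(values) + 1):
        lines.append(f"{v:>3} | " + "." * counts[v])
    return "\n".join(lines)

# Hypothetical values, invented for illustration only.
sample = [3, 5, 5, 6, 6, 6, 7, 9]
print(text_dotplot(sample))
```

Rows with no dots (here 4 and 8) are kept on purpose, because gaps are part of what a dotplot is meant to show.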
5
Displaying Quantitative Data
Examining the Distribution of a Quantitative Variable
The purpose of a graph is to help us understand the data. After you make a graph, always ask, “What do I see?”
How to Examine the Distribution of a Quantitative Variable
In any graph, look for the overall pattern and for striking departures from that pattern. Describe the overall pattern of a distribution by its Shape, Center, and Spread. Note individual values that fall outside the overall pattern. These departures are called Outliers. Don’t forget your SOCS!
6
Displaying Quantitative Data
Describing Shape
When you describe a distribution’s shape, concentrate on the main features. Look for rough symmetry or clear skewness.
Definitions: A distribution is roughly symmetric if the right and left sides of the graph are approximately mirror images of each other. A distribution is skewed to the right (right-skewed) if the right side of the graph (containing the half of the observations with larger values) is much longer than the left side. It is skewed to the left (left-skewed) if the left side of the graph is much longer than the right side.
[Figure: three example distributions labeled Symmetric, Skewed-left, and Skewed-right]
7
Displaying Quantitative Data
Examine this data
The table and dotplot below display the Environmental Protection Agency’s estimates of highway gas mileage in miles per gallon (MPG) for a sample of 24 model year 2009 midsize cars. Describe the shape, center, and spread of the distribution. Are there any outliers?
8
Displaying Quantitative Data
Comparing Distributions
Some of the most interesting statistics questions involve comparing two or more groups. Whenever you compare distributions of a quantitative variable, always discuss shape, center, spread, and possible outliers, in context.
Example, page 30: Compare the distributions of household size for these two countries. Don’t forget your SOCS!
[Figure: dotplots of household size for the U.K. and South Africa]
9
Displaying Quantitative Data
Stemplots (Stem-and-Leaf Plots)
Another simple graphical display for small data sets is a stemplot. Stemplots give us a quick picture of the distribution while including the actual numerical values.
How to Make a Stemplot
Separate each observation into a stem (all but the final digit) and a leaf (the final digit).
Write all possible stems from the smallest to the largest in a vertical column and draw a vertical line to the right of the column.
Write each leaf in the row to the right of its stem.
Arrange the leaves in increasing order out from the stem.
Provide a key that explains in context what the stems and leaves represent.
10
Stemplots
These data represent the responses of 20 female AP Statistics students to the question, “How many pairs of shoes do you have?” Construct a stemplot.
50 26 31 57 19 24 22 23 38 13 34 30 49 15 51
[Figure: the stemplot built in stages: write the stems 1 to 5, add the leaves, order the leaves, add a key]
Key: 4|9 represents a female student who reported having 49 pairs of shoes.
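The stemplot steps can be sketched in a few lines of Python. This is only an illustration: the function name `stemplot` is made up, it assumes non-negative whole numbers, and it uses the shoe counts exactly as they are listed in this transcript.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: stem = all but the final digit, leaf = final digit."""
    leaves = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # List every possible stem from smallest to largest, even if it has no leaves.
    for stem in range(min(leaves), max(leaves) + 1):
        ordered = "".join(str(leaf) for leaf in sorted(leaves[stem]))
        rows.append(f"{stem} | {ordered}")
    return rows

# Shoe counts as they appear in this transcript.
shoes = [50, 26, 31, 57, 19, 24, 22, 23, 38, 13, 34, 30, 49, 15, 51]
for row in stemplot(shoes):
    print(row)
print("Key: 4 | 9 represents a student who reported having 49 pairs of shoes")
```

Sorting the leaves inside each row corresponds to the "arrange the leaves in increasing order" step, and the final print supplies the key.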
11
Stemplots
When data values are “bunched up”, we can get a better picture of the distribution by splitting stems. Two distributions of the same quantitative variable can be compared using a back-to-back stemplot with common stems.
Females: 50 26 31 57 19 24 22 23 38 13 34 30 49 15 51
Males: 14 7 6 5 12 38 8 10 11 4 22 35
[Figure: back-to-back stemplot with split stems 0 to 5, females' leaves on the left and males' leaves on the right]
Key: 4|9 represents a student who reported having 49 pairs of shoes.
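Splitting stems can be sketched the same way: each stem gets two rows, one for leaves 0 to 4 and one for leaves 5 to 9. The function name `split_stemplot` is hypothetical, the sketch assumes non-negative whole numbers, and it uses the males' values as they are listed in this transcript.

```python
def split_stemplot(values):
    """Stemplot rows with each stem split in two: leaves 0-4, then leaves 5-9."""
    low, high = {}, {}
    for v in values:
        stem, leaf = divmod(v, 10)
        # Route each leaf to the lower or upper half of its stem.
        (low if leaf < 5 else high).setdefault(stem, []).append(leaf)
    rows = []
    for stem in range(min(values) // 10, max(values) // 10 + 1):
        for half in (low, high):
            leaves = "".join(str(d) for d in sorted(half.get(stem, [])))
            rows.append(f"{stem} | {leaves}")
    return rows

# Males' shoe counts as they appear in this transcript.
males = [14, 7, 6, 5, 12, 38, 8, 10, 11, 4, 22, 35]
for row in split_stemplot(males):
    print(row)
```

With ordinary stems, most of these values would pile up on stems 0 and 1; the split rows spread them out so the shape is easier to see.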
12
Stemplots versus histograms
Stemplots are quick-and-dirty histograms that can easily be drawn by hand, so they are very convenient for small data sets. However, they are rarely found in scientific or lay publications. When might you NOT want to use a stemplot?
13
Displaying quantitative data: Histograms
Displays counts or percents
Shows the overall pattern of the data
User defines the number of classes
Good for large data sets
Does not display actual data values
The bars have the same width and always touch (the edges of the bars are on class boundaries, which are described below). The width of a bar spans a range of a quantitative variable x, such as age, rather than a category. The height of each bar indicates frequency.
Quantitative variables often take many values. A graph of the distribution may be clearer if nearby values are grouped together. The most common graph of the distribution of one quantitative variable is a histogram.
14
Displaying Quantitative Data
How to Make a Histogram
Divide the range of data into classes of equal width.
Find the count (frequency) or percent (relative frequency) of individuals in each class.
Label and scale your axes and draw the histogram. The height of each bar equals its frequency. Adjacent bars should touch, unless a class contains no individuals.
To find the class width, first compute:
(largest value - smallest value) / (desired number of classes)
Increase the computed value to the next highest whole number, even if the computed value was already a whole number. This ensures that the classes cover the data.
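The class-width rule and the counting step can be sketched as follows. The function names are made up for this illustration, the sketch assumes whole-number data with classes of the form [lower, upper), and the data are the shoe counts as listed earlier in this transcript.

```python
import math

def class_width(values, num_classes):
    """Next whole number above (max - min) / num_classes, even if the quotient is whole."""
    return math.floor((max(values) - min(values)) / num_classes) + 1

def frequency_table(values, num_classes):
    """Return (lower, upper, count) for equal-width classes [lower, upper)."""
    width = class_width(values, num_classes)
    start = min(values)
    counts = [0] * num_classes
    for v in values:
        # Because width exceeds range / num_classes, the index never reaches num_classes.
        counts[int((v - start) // width)] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Shoe counts as they appear in this transcript.
shoes = [50, 26, 31, 57, 19, 24, 22, 23, 38, 13, 34, 30, 49, 15, 51]
for lower, upper, count in frequency_table(shoes, 5):
    print(f"{lower} to <{upper}: {count}")
```

Always rounding up past the quotient is what guarantees that the largest value falls inside the last class rather than on its open upper boundary.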
15
How to create a histogram
It is an iterative process: try and try again. What bin size should you use?
Not too many bins with counts of 0 or 1
Not so summarized that you lose all the information
Not so detailed that it is no longer a summary
Rule of thumb: start with 5 to 10 bins. Look at the distribution and refine your bins. (There isn’t a unique or “perfect” solution.)
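One way to make the "try and try again" advice concrete is to count, for several candidate bin counts, how many bins end up with 0 or 1 cases. This is only a sketch: the helper names are invented here, the bins are simple equal-width intervals from the minimum to the maximum, and it assumes the data are not all identical.

```python
def bin_counts(values, num_bins):
    """Count values in num_bins equal-width bins spanning min to max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins  # assumes hi > lo
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin.
        i = min(int((v - lo) / width), num_bins - 1)
        counts[i] += 1
    return counts

def sparse_bins(counts):
    """Number of bins holding 0 or 1 cases."""
    return sum(1 for c in counts if c <= 1)

# Shoe counts as they appear in this transcript.
shoes = [50, 26, 31, 57, 19, 24, 22, 23, 38, 13, 34, 30, 49, 15, 51]
for k in (3, 5, 10, 20):
    counts = bin_counts(shoes, k)
    print(k, "bins:", counts, "sparse bins:", sparse_bins(counts))
```

A large share of sparse bins suggests the display is too detailed, while very few bins overall suggests it is too summarized; the final choice is still a judgment call.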
16
Using the TI-83 to make histograms
The TI-83 can be used to make histograms, and will allow you to change the location and widths of the ranges. Turn to page 36 in your textbook and follow the directions in the Technology Corner.
17
Using the TI-83 to make histograms
You can also change the size and location of the ranges by using the Window button. Use the scale key to change the number of classes: enter the CLASS WIDTH. Press the Graph button to see the results.
18
When do we use the frequency key?
Suppose that the distribution of scores for a class on the AP test were: SCORE FREQUENCY 1 3 2 5 14 4 6
19
Histogram Tips
Be sure to choose classes that are all the same width. Use your judgment in choosing classes to display the shape. Too few classes will give a 'skyscraper' graph; too many will produce a 'pancake' graph.
20
[Figure: two histograms of the same data set, one not summarized enough and one too summarized]
21
Describing the Shape of a Histogram
Does the histogram have a single, central hump or several separated bumps? Humps in a histogram are called modes. A histogram with one main peak is dubbed unimodal; histograms with two peaks are bimodal; histograms with three or more peaks are called multimodal.
22
Humps and Bumps (cont.)
A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform.
23
Most common distribution shapes
Symmetric distribution: A distribution is symmetric if the right and left sides of the histogram are approximately mirror images of each other.
Skewed distribution: A distribution is skewed to the right if the right side of the histogram (the side with larger values) extends much farther out than the left side. It is skewed to the left if the left side of the histogram extends much farther out than the right side.
Complex, multimodal distribution: Not all distributions have a simple overall shape, especially when there are few observations.
24
Anything Unusual?
Don’t forget to make note of any unusual features in the shape of the distribution. Sometimes it’s the unusual features that tell us something interesting or exciting about the data. You should always mention any stragglers, or outliers, that stand away from the body of the distribution. Are there any gaps in the distribution? If so, we might have data from more than one group.
25
Outliers
Always look for outliers and try to explain them.
For example: the overall pattern is fairly symmetric except for two states that clearly do not belong to the main pattern. Alaska and Florida have unusual proportions of elderly residents in their populations. A large gap in the distribution is typically a sign of an outlier.
This example is from the book. Imagine you are doing a study of health care in the 50 US states and need to know how they differ in terms of their elderly population. This is a histogram of the number of states grouped by the percentage of their residents who are 65 or over. You can see there is one very small value and one very large value, with a gap between each of them and the rest of the distribution. Values that fall outside the overall pattern are called outliers. They might be interesting, or they might be mistakes; I get those in my data from typos made when entering RNA sequence data into the computer. They might only indicate that you need more samples. We will be paying a lot of attention to them throughout the class, both for what we can learn about biology and because they can cause trouble with your statistics. Guess which states they are (Florida and Alaska).
26
Histograms are similar to bar graphs, and so:
A relative frequency histogram displays the percentage of cases in each bin instead of the count. Relative frequency histograms are good for comparing distributions with unequal numbers of cases.
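A short sketch shows why percents help when group sizes differ. The function name and both sets of counts are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def relative_frequencies(counts):
    """Convert bin counts to percents of the total number of cases."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Hypothetical bin counts for two groups of very different sizes.
group_a = [2, 6, 10, 2]      # 20 cases
group_b = [10, 30, 50, 10]   # 100 cases

print(relative_frequencies(group_a))
print(relative_frequencies(group_b))
```

On a count scale, group B's bars would be five times as tall as group A's, yet the percents show that the two groups are distributed across the bins in the same way.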
27
Notice the shape does not change when comparing frequency and relative frequency histograms
AP Statistics, Section 1.1, Part 4
28
Displaying Quantitative Data
Using Histograms Wisely
Here are several cautions based on common mistakes students make when using histograms.
Cautions
Although they are similar, don’t confuse histograms and bar graphs.
Don’t use counts (in a frequency table) or percents (in a relative frequency table) as data.
Use percents instead of counts on the vertical axis when comparing distributions with different numbers of observations.
Just because a graph looks nice, it’s not necessarily a meaningful display of data.
29
Constructing Frequency Polygons
Make a frequency table that includes class midpoints and frequencies.
For each class, place a dot above the class midpoint at the height of the class frequency.
Put dots on the horizontal axis one class width to the left of the first class midpoint and one class width to the right of the last midpoint.
Connect the dots with straight lines.
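The points of a frequency polygon can be computed directly from a frequency table. In this sketch the function name, the lower bound, the class width, and the frequencies are all assumptions made for illustration; drawing the line segments is left to whatever plotting tool is at hand.

```python
def frequency_polygon_points(lower_bound, class_width, frequencies):
    """Return (x, y) points: class midpoints at their frequencies, closed on the axis."""
    mids = [lower_bound + (i + 0.5) * class_width for i in range(len(frequencies))]
    # Start on the axis one class width to the left of the first midpoint.
    points = [(mids[0] - class_width, 0)]
    points += list(zip(mids, frequencies))
    # End on the axis one class width to the right of the last midpoint.
    points.append((mids[-1] + class_width, 0))
    return points

# Hypothetical table: classes 10-20, 20-30, 30-40 with frequencies 3, 5, 2.
for x, y in frequency_polygon_points(10, 10, [3, 5, 2]):
    print(x, y)
```

Connecting consecutive points with straight lines gives the polygon; the two zero-height end points are what bring it back down to the horizontal axis.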
30
Frequency Polygon
31
Line graphs: time plots
In a time plot, time always goes on the horizontal (x) axis. We describe time series by looking for an overall pattern and for striking deviations from that pattern. In a time series: A trend is a rise or fall that persists over time, despite small irregularities. A pattern that repeats itself at regular intervals of time is called seasonal variation.
32
Retail price of fresh oranges over time
Time is on the horizontal, x axis. The variable of interest—here “retail price of fresh oranges”— goes on the vertical, y axis. This time plot shows a regular pattern of yearly variations. These are seasonal variations in fresh orange pricing most likely due to similar seasonal variations in the production of fresh oranges. There is also an overall upward trend in pricing over time. It could simply be reflecting inflation trends or a more fundamental change in this industry.
33
A time plot can be used to compare two or more data sets covering the same time period.
The pattern over time for the number of flu diagnoses closely resembles that for the number of deaths from the flu, indicating that about 8% to 10% of the people diagnosed that year died shortly afterward from complications of the flu.
34
Scales matter
How you stretch the axes and choose your scales can give a different impression. A picture is worth a thousand words, BUT there is nothing like hard numbers. Look at the scales.
35
Section 1.2 Displaying Quantitative Data with Graphs
Summary
In this section, we learned that…
You can use a dotplot, stemplot, or histogram to show the distribution of a quantitative variable.
When examining any graph, look for an overall pattern and for notable departures from that pattern. Describe the shape, center, spread, and any outliers. Don’t forget your SOCS!
Some distributions have simple shapes, such as symmetric or skewed. The number of modes (major peaks) is another aspect of overall shape.
When comparing distributions, be sure to discuss shape, center, spread, and possible outliers.
Histograms are for quantitative data; bar graphs are for categorical data. Use relative frequency histograms when comparing data sets of different sizes.
36
Looking Ahead… In the next Section…
We’ll learn how to describe quantitative data with numbers.
Mean and Standard Deviation
Median and Interquartile Range
Five-number Summary and Boxplots
Identifying Outliers
We’ll also learn how to calculate numerical summaries with technology and how to choose appropriate measures of center and spread.