Chapter 11 STA 200 Summer I 2011
Histograms Bar graphs and pie charts are appropriate graphs for categorical variables. To display the distribution of a quantitative variable graphically, we use a histogram. In most cases, quantitative variables take way too many values to have a separate bar for each value. Instead, we’ll have to group values together into intervals, or classes.
Creating a Histogram Divide the range of data into classes of equal width. Then, count the number of individuals in each class. When drawing the histogram: – the variable goes on the horizontal axis – the rate or count of occurrences goes on the vertical axis – the height of each bar is determined by the rate or count of occurrences – the bars should touch
Example Suppose we have the following exam grades:
99 69 71 32 62
81 61 67 80 90
62 83 59 82 55
77 95 74 93 58
71 72 76 95 51
85 86 77 15 67
55 68 75 82 79
73 76 68 96 97
Example (cont.) Step 1: Choosing Classes – Make sure all of your classes are the same size. – Make sure your classes cover all of the data. – Make sure that none of your classes overlap. Here, we’re going to set up the classes in the most accessible manner: – 90-99 – 80-89 – 70-79 – etc.
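The class-counting step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `class_counts` and the text rendering are assumptions made for the example, and the bars are drawn sideways (one row per class) because that is easier to print than vertical bars.

```python
from collections import Counter

# Exam grades from the example slide (40 observations).
grades = [
    99, 69, 71, 32, 62, 81, 61, 67, 80, 90,
    62, 83, 59, 82, 55, 77, 95, 74, 93, 58,
    71, 72, 76, 95, 51, 85, 86, 77, 15, 67,
    55, 68, 75, 82, 79, 73, 76, 68, 96, 97,
]

CLASS_WIDTH = 10  # every class has the same width: 10-19, 20-29, ..., 90-99


def class_counts(values, width=CLASS_WIDTH):
    """Return {lower bound of class: number of values in that class}."""
    # Integer division maps each value to the lower bound of its class,
    # so classes cannot overlap.
    counts = Counter((v // width) * width for v in values)
    lowest = (min(values) // width) * width
    highest = (max(values) // width) * width
    # List empty classes too, so the classes cover the whole range of data.
    return {lo: counts.get(lo, 0) for lo in range(lowest, highest + width, width)}


counts = class_counts(grades)
for lo, n in counts.items():
    # One row per class; the bar length is the count of grades in the class.
    print(f"{lo:>2}-{lo + CLASS_WIDTH - 1:<2} | {'#' * n} ({n})")
```

With these grades the tallest bar is the 70-79 class, which holds 11 of the 40 grades.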
Things to Look For Outliers: – observations outside the overall pattern of data – either significantly higher or significantly lower than the rest of the data Shape – roughly symmetric, left-skewed, or right-skewed
Outliers In the exam score example, are there any outliers? In a histogram, the outliers will usually stand out:
Shape If a distribution is roughly symmetric, the left and right sides will be approximate mirror images of each other. If a distribution is skewed, one tail will be longer than the other. – left-skewed: long left tail – right-skewed: long right tail
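A rough numerical check can back up what the picture shows. The sketch below is an assumption added for illustration and does not come from the slides: it relies on the common rule of thumb that a long tail pulls the mean toward it, so the mean tends to fall below the median in a left-skewed distribution and above it in a right-skewed one. The function name `rough_shape` and the `tolerance` value are hypothetical, and the rule can mislead on unusual data, so it complements the graph rather than replacing it.

```python
from statistics import mean, median

# Exam grades from the earlier example slide.
grades = [
    99, 69, 71, 32, 62, 81, 61, 67, 80, 90,
    62, 83, 59, 82, 55, 77, 95, 74, 93, 58,
    71, 72, 76, 95, 51, 85, 86, 77, 15, 67,
    55, 68, 75, 82, 79, 73, 76, 68, 96, 97,
]


def rough_shape(values, tolerance=0.01):
    """Guess the shape by comparing the mean with the median.

    `tolerance` is an arbitrary cutoff (a fraction of the data's range)
    chosen for this illustration, not a standard value.
    """
    m, med = mean(values), median(values)
    spread = max(values) - min(values)
    if spread == 0 or abs(m - med) <= tolerance * spread:
        return "roughly symmetric"
    # A long left tail drags the mean below the median, and vice versa.
    return "left-skewed" if m < med else "right-skewed"


shape = rough_shape(grades)
print(f"mean = {mean(grades):.2f}, median = {median(grades):.1f}, shape: {shape}")
```

For the exam grades the mean (72.85) sits below the median (74.5), which is consistent with a long left tail.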
More Shape Some types of data regularly produce distributions with a specific shape. – Symmetric (not necessarily bell-shaped): the size of organisms of the same species, IQ scores – Right-skewed: income – Left-skewed: exam scores (usually)
Stem-and-Leaf Plots A quick, easy alternative to a histogram is a stem-and-leaf plot. For small data sets, a stem-and-leaf plot may be preferable to a histogram: – stem-and-leaf plots are quicker to make – they show more information than a histogram
Constructing a Stem-and-Leaf Plot Separate each observation into a stem and a leaf. – The leaf is the final digit. – The stem is everything but the final digit. Write the stems in a vertical column with the smallest one on top. Write each leaf in the row next to the appropriate stem. The leaves in each row should increase as you get farther away from the stem. If you have to round off the observations, be sure to use a key or legend to explain the rounding.
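The construction steps above can be sketched as follows, using the exam grades from the histogram example because they are whole numbers and need no rounding. This is a minimal illustration: the function name `stem_and_leaf` is an assumption, it handles non-negative integers only, and listing stems that have no leaves is a common convention rather than something the slides state.

```python
# Exam grades from the earlier example slide.
grades = [
    99, 69, 71, 32, 62, 81, 61, 67, 80, 90,
    62, 83, 59, 82, 55, 77, 95, 74, 93, 58,
    71, 72, 76, 95, 51, 85, 86, 77, 15, 67,
    55, 68, 75, 82, 79, 73, 76, 68, 96, 97,
]


def stem_and_leaf(values):
    """Return {stem: sorted list of leaves} for non-negative integers."""
    rows = {}
    # Sorting first means each row's leaves increase away from the stem.
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # leaf = final digit, stem = the rest
        rows.setdefault(stem, []).append(leaf)
    # Keep stems with no leaves so gaps in the data remain visible.
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}


plot = stem_and_leaf(grades)
for stem, leaves in plot.items():  # smallest stem on top
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Unlike the histogram, each row still shows the individual grades: the row for stem 5 reads `5 | 1 5 5 8 9`, so the original values 51, 55, 55, 58, and 59 can be read straight off the plot.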
Example The following data represent the salaries (in millions) of a major league baseball team:
3.0 4.0 0.4 2.6
0.8 0.4 1.5 4.1
0.4 7.0 5.0 1.5
2.0 2.9 15.0 0.9
0.4 0.5 4.0 5.1
220.127.116.11.4
1.618.104.22.168
Create a stem-and-leaf plot of the data.
Example (cont.) Stem | Leaf What shape does the distribution have?