
1 Chapter 4 Displaying Quantitative Data

2 Quantitative variables Quantitative variables record measurements or amounts of something. A quantitative variable must have units, or be a variable in which the numbers act as numerical values.

3 Types of Displays Histograms, stem-and-leaf displays, dotplots

4 Histogram A histogram uses adjacent bars to show the distribution of values in a quantitative variable. It looks very similar to a bar graph, but there are differences: the horizontal axis is a continuous numeric scale, not just a set of category labels.

5 An example The histogram shown below gives the number of children who visited a particular zoo.

6 Histogram A histogram is more convenient than a dotplot or a stem-and-leaf display because you don't have to represent each data point. However, you don't get to see the value of each data point, so a table of data and summary statistics would help people interpret the data.

7 Be Careful A histogram gives the number of data points that fall into each of a set of equal-width intervals. Care must be taken in choosing the intervals because the choice can affect the shape of the graph and misrepresent the data.

8 1st graph The first graph uses intervals of width 10, yielding the intervals 40-50, 50-60, etc. In this case, Yemen had a life expectancy of 50 and was placed in the 50-60 column. Usually, borderline values are placed in the higher column.

9 2nd graph In the second graph, the intervals are 40-45, 45-50, 50-55, etc. This affects the shape of the graph.
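
The effect of the interval choice can be sketched in a few lines of Python. The life expectancies below are hypothetical (the data behind the two graphs are not in the transcript), and `bin_counts` is an illustrative helper, not a library function. Borderline values go into the higher interval, as described for Yemen.

```python
from collections import Counter

def bin_counts(values, start, width):
    """Count values per interval [lo, lo + width).

    A borderline value (e.g. 50 with intervals 40-50, 50-60) lands in the
    higher interval. Empty intervals are simply absent from the result.
    """
    counts = Counter()
    for v in values:
        lo = start + ((v - start) // width) * width
        counts[lo] += 1
    return dict(sorted(counts.items()))

# Hypothetical life expectancies, for illustration only.
ages = [43, 47, 50, 52, 58, 61, 64, 66, 68, 71, 73, 74, 76, 78, 79]

wide = bin_counts(ages, start=40, width=10)
narrow = bin_counts(ages, start=40, width=5)

for label, counts, width in (("width 10", wide, 10), ("width 5", narrow, 5)):
    print(label)
    for lo, n in counts.items():
        print(f"  {lo}-{lo + width}: {'#' * n}")
```

The same fifteen values produce four bars at width 10 and eight bars at width 5, so the picture changes even though the data do not.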

10 Stem and Leaf Displays A stem-and-leaf display shows quantitative data values in a way that sketches the distribution of the data. The stem-and-leaf plot below shows the number of students enrolled in a dance class in each of the past 12 years. The enrollments are 81, 84, 85, 86, 93, 94, 97, 100, 102, 103, 110, and 111.
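
Since the plot itself is not in the transcript, here is a minimal sketch that rebuilds a stem-and-leaf display from the twelve enrollment values listed on the slide. The helper name `stem_and_leaf` and the printed layout are assumptions; the stem is taken to be every digit except the last, and the leaf is the last digit.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Enrollment values as listed on the slide.
enrollment = [81, 84, 85, 86, 93, 94, 97, 100, 102, 103, 110, 111]

plot = stem_and_leaf(enrollment)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each row keeps the individual values readable (stem 8 with leaves 1 4 5 6 is 81, 84, 85, 86) while the row lengths sketch the shape of the distribution.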

11 Dotplot Graphs a dot for each case against a single axis. Graph the following numbers: 5, 5, 5, 5, 5, 5, 5, 10, 10, 10, 10, 10, etc.
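
A text dotplot is easy to sketch. The example uses only the twelve values written out on the slide (whatever follows "etc." is not known), and `dotplot` is a hypothetical helper that stacks one dot per case above its value.

```python
from collections import Counter

def dotplot(values):
    """Return text rows, tallest first, with one dot per case above its value."""
    counts = Counter(values)
    axis = sorted(counts)
    width = max(len(str(v)) for v in axis) + 1
    rows = []
    for level in range(max(counts.values()), 0, -1):
        rows.append("".join(
            ("." if counts[v] >= level else " ").rjust(width) for v in axis
        ))
    # The last row is the axis itself.
    rows.append("".join(str(v).rjust(width) for v in axis))
    return rows

# Only the values listed explicitly on the slide.
data = [5] * 7 + [10] * 5
print("\n".join(dotplot(data)))
```

The columns are placed side by side rather than at their true distance on the axis, so this is a rough picture, not a scaled plot.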

12 Dotplot with two sets of data Example

13 Shape To describe the shape of a distribution, look for: symmetry versus skewness; single versus multiple modes.

14 Symmetrical A distribution is symmetric if the two halves on either side of the center look approximately like mirror images of each other.

15 Symmetrical Symmetrical Histogram

16 Dotplot Dots are mirror images

17 Stem and leaf Example

18 Skewed A distribution is skewed if it is not symmetric and one tail stretches out farther than the other. Skewed left: the longer tail stretches to the left. Skewed right: the longer tail stretches to the right.
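
The slides judge skewness by eye. As a rough numerical companion (not part of the slides), one common summary is the moment-based skewness coefficient, the average cubed z-score: it is positive when the longer tail is on the right, negative when it is on the left, and near zero for a symmetric distribution. The data sets below are hypothetical.

```python
from statistics import mean, pstdev

def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical data sets, for illustration only.
right_tail = [1, 2, 2, 3, 3, 3, 4, 12]   # one straggler far to the right
left_tail = [-v for v in right_tail]     # mirror image: tail on the left
symmetric = [1, 2, 3, 4, 5]

for name, data in (("right", right_tail), ("left", left_tail), ("symmetric", symmetric)):
    print(f"{name:>9}: {skewness(data):+.2f}")
```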

19 Examples Skewed right

20 Examples Skewed left

21 All three Examples

22 Funny example http://www.herkimershideaway.org/apstatistics/ymmsum99/ymm111.htm

23 New seating chart http://www.random.org/integers/

24 Stem-and-Leaf Revisited Compare the histogram and stem-and-leaf display for the pulse rates of 24 women at a health clinic. Which graphical display do you prefer?

25 Think Before You Draw, Again Remember the “Make a picture” rule? Now that we have options for data displays, you need to Think carefully about which type of display to make. Before making a stem-and-leaf display, a histogram, or a dotplot, check the Quantitative Data Condition: The data are values of a quantitative variable whose units are known.

26 Constructing a Stem-and-Leaf Display First, cut each data value into leading digits (“stems”) and trailing digits (“leaves”). Use the stems to label the bins. Use only one digit for each leaf: either round or truncate the data values to one decimal place after the stem.
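
The cutting step can be sketched as follows, using truncation rather than rounding. `split_value` and its `leaf_unit` parameter (the place value that the leaf digit represents) are hypothetical names, and the sketch handles non-negative integers only.

```python
def split_value(value, leaf_unit=1):
    """Split a non-negative integer into (stem, leaf).

    Digits below leaf_unit are truncated, so each leaf is a single digit.
    """
    stem, leaf = divmod(value // leaf_unit, 10)
    return stem, leaf

print(split_value(86))                  # leaf is the ones digit
print(split_value(347, leaf_unit=10))   # leaf is the tens digit; the 7 is dropped
print(split_value(1289, leaf_unit=10))  # stems may have more than one digit
```

Truncating is usually the simpler choice by hand, because the leaf can be read straight off the original value.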

27 Center A value that attempts the impossible by summarizing the entire distribution with a single number, a “typical” value. Measures include the mean and median.

28 Spread A numerical summary of how tightly the values are clustered around the center. Measures of spread include the IQR and standard deviation.
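
As a quick illustration of the measures named on the Center and Spread slides, the sketch below computes them for the dance-class enrollments from slide 10 with Python's standard `statistics` module. Quartile conventions differ between textbooks and software, so the IQR here reflects the module's default method and may not match a hand calculation.

```python
from statistics import mean, median, quantiles, stdev

# Enrollment values from the stem-and-leaf slide.
enrollment = [81, 84, 85, 86, 93, 94, 97, 100, 102, 103, 110, 111]

center_mean = mean(enrollment)
center_median = median(enrollment)

# quantiles(n=4) returns the three quartile cut points (default method).
q1, _, q3 = quantiles(enrollment, n=4)
iqr = q3 - q1
spread_sd = stdev(enrollment)   # sample standard deviation

print(f"mean={center_mean}, median={center_median}")
print(f"IQR={iqr}, standard deviation={spread_sd:.2f}")
```

For these values the mean and median coincide, which is what one would expect from a roughly symmetric distribution.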

29 Mode A hump or local high point in the shape of the distribution of a variable is called a mode. The apparent location of modes can change as the scale of a histogram is changed.
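
To see how the apparent modes depend on the scale, the sketch below bins the same hypothetical values twice and counts the humps each time. `bin_heights` and `count_modes` are illustrative helpers: a hump is counted wherever a bar (or a run of equal bars) is taller than its neighbours on both sides.

```python
def bin_heights(values, start, width, n_bins):
    """Bar heights for n_bins intervals of the given width, beginning at start."""
    heights = [0] * n_bins
    for v in values:
        heights[(v - start) // width] += 1
    return heights

def count_modes(heights):
    """Count local high points, treating a run of equal bars as one bar."""
    compact = [h for i, h in enumerate(heights) if i == 0 or h != heights[i - 1]]
    padded = [-1] + compact + [-1]
    return sum(
        padded[i - 1] < padded[i] > padded[i + 1]
        for i in range(1, len(padded) - 1)
    )

# Hypothetical values, for illustration only.
values = [1, 3, 5, 6, 7, 8, 9, 10, 12, 14, 15, 16, 17, 18, 19, 19, 21, 23, 27]

narrow = bin_heights(values, start=0, width=5, n_bins=6)
wide = bin_heights(values, start=0, width=10, n_bins=3)

print(narrow, "modes:", count_modes(narrow))
print(wide, "modes:", count_modes(wide))
```

With bins of width 5 the data look bimodal; with bins of width 10 the two humps merge into one.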

30 Unimodal Having one mode.

31 Bimodal Having two modes.

32 Example

33 Uniform A histogram that doesn’t appear to have any mode and in which all the bars are approximately the same height is called uniform:

34 Anything Unusual? Do any unusual features stick out? Sometimes it’s the unusual features that tell us something interesting or exciting about the data. You should always mention any stragglers, or outliers, that stand off away from the body of the distribution. Are there any gaps in the distribution? If so, we might have data from more than one group.

35 Outliers The following histogram has outliers— there are three cities in the leftmost bar:

36 Outliers Are extreme values that do not appear to belong to the rest of the data. They may be unusual values that deserve further investigation, or they may be just mistakes; there’s no obvious way to tell. Do not delete them. Outliers can affect many statistical analyses, so you should always be alert to them.
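
The slides identify outliers by eye. One common screening convention, which is an assumption here and not something the slides state, flags values more than 1.5 IQRs beyond the quartiles. In keeping with the advice above, the sketch only reports the flagged values and leaves the data untouched. The data and the helper name `flag_outliers` are hypothetical.

```python
from statistics import quantiles

def flag_outliers(values, k=1.5):
    """Return values more than k IQRs below Q1 or above Q3 (nothing is deleted)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical data with one straggler, for illustration only.
data = [12, 48, 50, 51, 53, 54, 55, 57, 58, 60, 61]

suspects = flag_outliers(data)
print("investigate:", suspects)
print("data still has", len(data), "values")
```

A flagged value is a prompt to investigate, not a verdict: it may be a mistake or the most interesting case in the data.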

37 Outliers Away from the main portion of data

38 Where is the Center of the Distribution? If you had to pick a single number to describe all the data what would you pick? It’s easy to find the center when a histogram is unimodal and symmetric—it’s right in the middle. On the other hand, it’s not so easy to find the center of a skewed histogram or a histogram with more than one mode. For now, we will “eyeball” the center of the distribution. In the next chapter we will find the center numerically.

39 How Spread Out is the Distribution? Variation matters, and Statistics is about variation. Are the values of the distribution tightly clustered around the center or more spread out? In the next two chapters, we will talk about spread…

40 Comparing Distributions Often we would like to compare two or more distributions instead of looking at one distribution by itself. When looking at two or more distributions, it is very important that the histograms have been put on the same scale. Otherwise, we cannot really compare the two distributions. When we compare distributions, we talk about the shape, center, and spread of each distribution.
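
One way to guarantee a common scale is to compute the bin edges from all groups together and then bin each group on those edges. The sketch below does this for two hypothetical groups of ages; `shared_bins` is an illustrative helper, and the values are not the heart attack data from the slides.

```python
def shared_bins(groups, width):
    """Bin every group on the same edges so the histograms are comparable."""
    all_values = [v for values in groups.values() for v in values]
    start = (min(all_values) // width) * width
    n_bins = (max(all_values) - start) // width + 1
    table = {}
    for name, values in groups.items():
        heights = [0] * n_bins
        for v in values:
            heights[(v - start) // width] += 1
        table[name] = heights
    return start, table

# Hypothetical ages, for illustration only.
groups = {
    "female": [58, 63, 67, 71, 74, 78, 82],
    "male": [41, 49, 52, 55, 58, 61, 64, 72],
}

start, table = shared_bins(groups, width=10)
for name, heights in table.items():
    print(f"{name:>6}: {heights}")
```

Because both rows use the same intervals, differences in center and spread can be read by comparing positions directly.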

41 Example Compare the following distributions of ages for female and male heart attack patients:

42 HOMEWORK!!!!

43 Web Pages Used http://www.fao.org/wairdocs/ilri/x5469e/x5469e38.gif http://www.sciencebuddies.org/science-fair-projects/descriptive_statistics_files/BimodalDist.jpg http://images.absoluteastronomy.com/images/encyclopediaimages/b/bi/bimodal.png http://upload.wikimedia.org/wikipedia/commons/b/bc/Bimodal_geological.PNG

44 Web Pages Used http://mathworld.wolfram.com/images/eps-gif/OutlierHistogram_1000.gif

45 Timeplots: Order, Please! For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

46 *Re-expressing Skewed Data to Improve Symmetry

47 *Re-expressing Skewed Data to Improve Symmetry (cont.) One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function). Note the change in skewness from the raw data (Figure 4.11) to the transformed data (Figure 4.12):
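
Figures 4.11 and 4.12 are not in the transcript, so here is a small sketch of the same idea on hypothetical data. It re-expresses strongly right-skewed values with a base-10 logarithm and compares a moment-based skewness coefficient (an added numerical check, not something the slides use) before and after.

```python
from math import log10
from statistics import mean, pstdev

def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical right-skewed data: each value is ten times the previous one.
raw = [1, 10, 100, 1000, 10000]
logged = [log10(v) for v in raw]

print(f"raw skewness:    {skewness(raw):+.2f}")
print(f"logged skewness: {skewness(logged):+.2f}")
```

The logarithm pulls in the long right tail, so the re-expressed values are evenly spaced and the skewness drops to about zero. Logarithms only apply to positive values.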

48 What Can Go Wrong? Don’t make a histogram of a categorical variable— bar charts or pie charts should be used for categorical data. Don’t look for shape, center, and spread of a bar chart.

49 What Can Go Wrong? (cont.) Don’t use bars in every display—save them for histograms and bar charts. Below is a badly drawn timeplot and the proper histogram for the number of eagles sighted in a collection of weeks:

50 What Can Go Wrong? (cont.)

51 Choose a bin width appropriate to the data. Changing the bin width changes the appearance of the histogram:

52 What Can Go Wrong? (cont.) Avoid inconsistent scales, either within the display or when comparing two displays. Label clearly so a reader knows what the plot displays. Good intentions, bad plot:

53 What have we learned? We’ve learned how to make a picture for quantitative data to help us see the story the data have to Tell. We can display the distribution of quantitative data with a histogram, stem-and-leaf display, or dotplot. Tell about a distribution by talking about shape, center, spread, and any unusual features. We can compare two quantitative distributions by looking at side-by-side displays (plotted on the same scale). Trends in a quantitative variable can be displayed in a timeplot.

