Excursions in Modern Mathematics, 7e. Copyright © 2010 Pearson Education, Inc.

14 Descriptive Statistics
14.1 Graphical Descriptions of Data
14.2 Variables
14.3 Numerical Summaries
14.4 Measures of Spread

Variable
Before we continue with our discussion of graphs, we need to discuss briefly the concept of a variable. In statistical usage, a variable is any characteristic that varies with the members of a population. The students in Dr. Blackbeard’s Stat 101 course (the population) did not all perform equally on the exam. Thus, the test score is a variable, which in this particular case is a whole number between 0 and 25. In some instances, such as when the instructor gives partial credit, a test score may take on a fractional value, such as 18.5 or 18.25. Even in these cases, however, the possible increments for the values of the variable are given by some minimum amount: a quarter-point, a half-point, whatever.

In contrast to this situation, consider a different variable: the amount of time each student studied for the exam. In this case the variable can take on values that differ by any amount: an hour, a minute, a second, a tenth of a second, and so on.

Numerical Variable
A variable that represents a measurable quantity is called a numerical (or quantitative) variable. When the difference between the values of a numerical variable can be arbitrarily small, we call the variable continuous (for example, a person’s height, weight, or foot size, or the time it takes to run one mile). When the possible values of the numerical variable change by minimum increments, the variable is called discrete (for example, a person’s IQ, SAT score, or shoe size, or the score of a basketball game).

Categorical Variable
Variables can also describe characteristics that cannot be measured numerically: nationality, gender, hair color, and so on. Variables of this type are called categorical (or qualitative) variables.

In some ways, categorical variables must be treated differently from numerical variables: they cannot, for example, be added, multiplied, or averaged. In other ways, categorical variables can be treated much like discrete numerical variables, particularly when it comes to graphical descriptions, such as bar graphs and pictograms.

Example 14.4 Enrollments at Tasmania State University
Table 14-3 shows undergraduate enrollments in each of the five schools at Tasmania State University. A sixth category (“other”) includes undeclared students, interdisciplinary majors, and so on.

[Figure: vertical and horizontal bar graphs displaying the data in Table 14-3.]

When the number of categories is small, as is the case here, another common way to describe the relative frequencies of the categories is by using a pie chart. In a pie chart the “pie” represents the entire population (100%), and the “slices” represent the categories (or classes), with the size (angle) of each slice being proportional to the relative frequency of the corresponding category.

Some relative frequencies, such as 50% and 25%, are very easy to sketch, but how do we accurately draw the slice corresponding to a more complicated frequency, say, 32.47%? Here, a little elementary geometry comes in handy. Since 100% equals 360°, 1% corresponds to an angle of 360°/100 = 3.6°. It follows that the frequency 32.47% corresponds to an angle of 32.47 × 3.6° ≈ 117° (rounded to the nearest degree, which is good enough for most practical purposes).

[Figure: an accurate pie chart for the school-enrollment data given in Table 14-3.]

PIE CHARTS
The general rule in drawing pie charts is that a slice representing x% is given by an angle of (3.6)x degrees.
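The pie-chart rule can be sketched in a few lines of Python. This is only an illustration: the function name `slice_angle` is made up for this example, and the percentages are the ones mentioned in the discussion of Example 14.4.

```python
def slice_angle(percent):
    """Central angle, in degrees, of a pie slice representing `percent`% of the whole.

    The full pie (100%) spans 360 degrees, so x% spans 360 * x / 100 = 3.6x degrees.
    """
    return percent * 360 / 100


# Percentages taken from the text: two easy cases and the harder one.
for percent in (50, 25, 32.47):
    print(f"{percent}% -> {round(slice_angle(percent))} degrees")
```

Rounding to the nearest degree mirrors the text; a plotting tool would normally use the unrounded angle.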

Example 14.5 Who’s Watching the Boob Tube Tonight?
According to Nielsen Media Research data, the percentages of the TV audience watching TV during prime time (8 P.M. to 11 P.M.), broken up by age group, are as follows: adults (18 years and older), 63%; teenagers (12–17 years), 17%; children (2–11 years), 20%.

The pie chart shows this breakdown of audience composition by age group. A pie chart such as this one might be used to make the point that children and teenagers really do not watch as much TV as is generally believed.

The problem with this conclusion is that children make up only 15% of the population at large and teens only 8%. In relative terms, a higher percentage of teenagers (taken out of the total teenage population) watch prime-time TV than any other group, with children second and adults last. Using absolute percentages can be quite misleading. When comparing characteristics of a population that is broken up into categories, it is essential to take into account the relative sizes of the various categories.
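One rough way to check this reasoning is to divide each group's share of the audience by its share of the population. The sketch below does that in Python. The audience shares and the 15% and 8% population figures come from the example; the adult population share is an assumption (simply the remaining 77%), and the names `audience_share`, `population_share`, and `representation_ratio` are made up for illustration.

```python
# Share of the prime-time TV audience by age group, in percent (from the example).
audience_share = {"adults": 63, "teenagers": 17, "children": 20}

# Share of the population at large, in percent. The children and teenager
# figures come from the text; the adult figure is an ASSUMPTION (the remainder).
population_share = {"adults": 100 - 15 - 8, "teenagers": 8, "children": 15}


def representation_ratio(group):
    """Audience share divided by population share.

    A ratio above 1 means the group is over-represented in the audience
    relative to its size; below 1 means it is under-represented.
    """
    return audience_share[group] / population_share[group]


# Rank the groups from most to least over-represented.
ranking = sorted(audience_share, key=representation_ratio, reverse=True)
for group in ranking:
    print(f"{group}: {representation_ratio(group):.2f}")
```

Under these assumptions the ranking agrees with the text: teenagers first, children second, adults last.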

How Many Categories
When it comes to deciding how best to display graphically the frequencies of a population, a critical issue is the number of categories into which the data can fall. When the number of categories is too big (say, in the dozens), a bar graph or pictogram can become muddled and ineffective. This happens more often than not with numerical data: numerical variables can take on infinitely many values, and even when they don’t, the number of values can be too large for any reasonable graph.

Example 14.6 2007 SAT Math Scores
The college dreams and aspirations of millions of high school seniors often ride on their SAT scores. The SAT consists of three sections: a math section, a writing section, and a critical reading section, with the scores for each section ranging from a minimum of 200 to a maximum of 800 and going up in increments of 10 points. In 2007, there were 1,494,531 college-bound seniors who took the SAT. How do we describe the math section results for this group of students?

We could set up a frequency table (or a bar graph) with the number of students scoring each of the possible scores: 200, 210, 220, ..., 790, 800. The problem is that there are 61 different possible scores between 200 and 800, and this number is too large for an effective bar graph.

In situations such as this one it is customary to present a more compact picture of the data by grouping together, or aggregating, sets of scores into categories called class intervals. The decision as to how the class intervals are defined and how many there are will depend on how much or how little detail is desired, but as a general rule of thumb, the number of class intervals should be somewhere between 5 and 20.

SAT scores are usually aggregated into 12 class intervals of essentially the same size: 200–249, 250–299, 300–349, ..., 700–749, 750–800.
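The aggregation into class intervals can be sketched as follows. This is a minimal illustration, not an official scoring procedure: the function name `sat_class_interval` is hypothetical, and the only inputs are the facts stated in the example (scores from 200 to 800 in steps of 10, intervals 50 points wide, with 800 folded into the last interval).

```python
def sat_class_interval(score):
    """Return the (low, high) class interval containing an SAT section score."""
    if not 200 <= score <= 800:
        raise ValueError("SAT section scores range from 200 to 800")
    # Round down to the start of a 50-point interval; a score of 800 would
    # start a 13th interval, so it is capped into the last one (750-800).
    low = min(200 + 50 * ((score - 200) // 50), 750)
    high = 800 if low == 750 else low + 49
    return (low, high)


possible_scores = range(200, 801, 10)
intervals = sorted({sat_class_interval(score) for score in possible_scores})

print(len(possible_scores))  # 61 possible scores
print(len(intervals))        # 12 class intervals
print(intervals[0], intervals[-1])
```

The last interval is slightly wider than the others, which is why the text describes the intervals as being of "essentially" the same size.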

Example 14.7 Stat 101 Test Scores: Part 3
The process of converting test scores (a numerical variable) into grades (a categorical variable) requires setting up class intervals for the various letter grades. Typically, the professor has the latitude to decide how to do this. One standard approach is to use an absolute grading scale, usually with class intervals of (almost) equal length for all grades except F (e.g., A = 90–100%, B = 80–89%, C = 70–79%, D = 60–69%, F = 0–59%).
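An absolute grading scale of this kind amounts to a lookup of class intervals. The sketch below uses the cutoffs quoted in the example; the function name `letter_grade` is made up, and the treatment of fractional percentages (for instance, 89.5% counting as a B) is an assumption, since the intervals in the text are stated only for whole-number percentages.

```python
# Lower cutoff of each class interval, from the example's absolute scale.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(percent):
    """Convert a percentage score (numerical) into a letter grade (categorical)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    for minimum, grade in CUTOFFS:
        # ASSUMPTION: a fractional score belongs to the highest interval
        # whose lower cutoff it reaches.
        if percent >= minimum:
            return grade
    return "F"


for score in (95, 89, 70, 60, 59):
    print(score, letter_grade(score))
```

Note that the F interval (0–59%) is six times as long as the others, which matches the text's remark that F is the exception to the equal-length rule.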

Another frequently used approach is to use a relative grading scale. Here the professor fits the class intervals for the grades to the performance of the class on the test, often using class intervals of varying lengths. Some people call this “grading on the curve,” although this terminology is somewhat misused. To illustrate relative grading in action, let’s revisit the Stat 101 midterm scores discussed in Example 14.1.

After looking at the overall class performance, Dr. Blackbeard chooses to “curve” the test scores using class intervals of his own creation.

The grade distribution in the Stat 101 midterm can now be best seen by means of a bar graph. The picture speaks for itself: this was a very tough exam!

Histograms
When a numerical variable is continuous, its possible values can vary by infinitesimally small increments. As a consequence, there are no gaps between the class intervals, and our old way of doing things (using separated columns or stacks) will no longer work. In this case we use a variation of a bar graph called a histogram.

Example 14.8 Starting Salaries of TSU Graduates
Suppose we want to use a graph to display the distribution of starting salaries for last year’s graduating class at Tasmania State University. The starting salaries of the N = 3258 graduates range from a low of $40,350 to a high of $74,800. Based on this range and the amount of detail we want to show, we must decide on the length of the class intervals. A reasonable choice would be to use class intervals defined in increments of $5000.

Here is the histogram showing the relative frequency of each class interval. As we can see, a histogram is very similar to a bar graph.

Several important distinctions must be made, however. To begin with, because a histogram is used for continuous variables, there can be no gaps between the class intervals, and it follows, therefore, that the columns of a histogram must touch each other. Among other things, this forces us to make an arbitrary decision as to what happens to a value that falls exactly on the boundary between two class intervals.

Should it always belong to the class interval to the left or to the one to the right? This is called the endpoint convention. The superscript “plus” marks in Table 14-6 indicate how we chose to deal with the endpoint convention in Fig. 14-11. A starting salary of exactly $50,000, for example, would be listed under the 45,000⁺–50,000 class interval rather than the 50,000⁺–55,000 class interval.
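The endpoint convention described here (a boundary value goes to the interval on its left) can be sketched in Python. The function name `salary_class_interval` is hypothetical, salaries are assumed to be whole dollars, and the interval boundaries are assumed to fall on multiples of $5000, as in the 45,000⁺–50,000 example.

```python
def salary_class_interval(salary, width=5000):
    """Return (low, high) such that low < salary <= high.

    A salary that falls exactly on a boundary is assigned to the class
    interval on its left, matching the "plus" convention in the text.
    """
    # Round up to the next multiple of `width` using integer arithmetic;
    # a salary already on a boundary stays where it is.
    high = -(-salary // width) * width
    return (high - width, high)


for salary in (40_350, 50_000, 50_001, 74_800):
    low, high = salary_class_interval(salary)
    print(f"${salary:,} -> {low:,}+ to {high:,}")
```

With the stated range of $40,350 to $74,800, this convention yields seven class intervals, from 40,000⁺–45,000 up to 70,000⁺–75,000.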

Use Class Intervals of Equal Length
When creating histograms, we should try, as much as possible, to define class intervals of equal length. When the class intervals are of unequal length, the rules for creating a histogram are considerably more complicated, since it is no longer appropriate to use the heights of the columns to indicate the frequencies of the class intervals.
