The Mean A more compact notation for the mean is: $\bar{x} = \frac{\sum x_i}{n}$. The Σ (capital Greek letter sigma) in the formula for the mean is short for "sum": add up all the observations. The subscripts on the observations $x_i$ are just a way of keeping the n observations distinct. They do not necessarily indicate order or any other special facts about the data.
Example of Calculating the Mean EX: Refer to the data on travel times to work for the sample of 15 North Carolinians. (a) Find the mean travel time for all 15 workers. (b) Calculate the mean again, this time excluding the person who reported a 60-minute travel time to work. What do you notice?
Things to think about… The previous example illustrates an important weakness of the mean as a measure of center: the mean is sensitive to the influence of extreme observations. These may be outliers, but a skewed distribution that has no outliers will also pull the mean toward its long tail. Because the mean cannot resist the influence of extreme observations, we say that it is not a resistant measure of center.
Measuring the Center: The Median The median M is the midpoint of a distribution, the number such that half of the observations are smaller and the other half are larger. To find the median:
1. Arrange the observations in ascending order (from smallest to largest).
2. If the number of observations is odd, the median is the center observation in the ordered list.
3. If the number of observations is even, the median is the average of the two center observations in the ordered list.
Example of Finding the Median EX: People say that it takes a long time to get to work in New York State due to the heavy traffic near big cities. What do the data say? Here are the travel times in minutes of 20 randomly chosen New York workers: 10 30 5 25 40 20 10 15 30 20 15 20 85 15 65 15 60 60 40 45 a) Find the median of the travel times. Show your work.
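The three steps for finding the median can be sketched in Python as a way to check the hand calculation. This is only an illustration: the variable names are made up here, and the data are the 20 New York travel times copied from the example.

```python
# Travel times (minutes) of the 20 New York workers from the example.
ny_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
            15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

# Step 1: arrange the observations from smallest to largest.
ordered = sorted(ny_times)
n = len(ordered)

if n % 2 == 1:
    # Step 2: odd count, so the median is the center observation.
    m = ordered[n // 2]
else:
    # Step 3: even count, so average the two center observations.
    m = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(ordered)
print("median =", m)  # median = 22.5
```

With 20 observations, the two center values are the 10th and 11th in the ordered list (20 and 25), so the median is 22.5 minutes.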
Comparing Mean and Median The mean and median of a roughly symmetric distribution are close together. If the distribution is exactly symmetric, the mean and median are exactly the same. In a skewed distribution, the mean is usually farther out toward the long tail than the median. The mean is not resistant to outliers or skewness, while the median is.
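As a quick check of this comparison, the sketch below computes both measures for the New York travel times from the median example, using Python's standard `statistics` module. The variable names are chosen here for illustration.

```python
from statistics import mean, median

# Travel times (minutes) of the 20 New York workers from the median example.
ny_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
            15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

ny_mean = mean(ny_times)      # sum of the observations divided by n
ny_median = median(ny_times)  # midpoint of the ordered list

print("mean   =", ny_mean)    # mean   = 31.25
print("median =", ny_median)  # median = 22.5
```

The mean (31.25 minutes) is larger than the median (22.5 minutes), which is what we expect when a few long travel times pull the mean toward the right tail.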
Example of Finding the IQR EX: Find and interpret the IQR (IQR = Q3 − Q1) of the following data on reported travel times to work for 20 randomly selected New Yorkers. Notes: If the number of observations is odd, do not include the median when calculating the quartiles. If the number of observations is even, include all of the observations.
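The rule in the notes can be sketched as a small Python function. This is one possible implementation of the by-hand method, not a calculator's algorithm; the helper names `median_of` and `quartiles` are made up here, and the data are assumed to be the same 20 New York travel times used in the median example.

```python
def median_of(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(data):
    """Q1 and Q3 as the medians of the lower and upper halves."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Odd count: skip the median itself. Even count: use every observation.
    upper = ordered[half + 1:] if len(ordered) % 2 == 1 else ordered[half:]
    return median_of(lower), median_of(upper)


ny_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
            15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

q1, q3 = quartiles(ny_times)
iqr = q3 - q1
print("Q1 =", q1, " Q3 =", q3, " IQR =", iqr)  # Q1 = 15.0  Q3 = 42.5  IQR = 27.5
```

Interpretation: the middle half of the reported travel times spans a range of 27.5 minutes.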
Identifying Outliers An observation is an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile. ABOVE: greater than Q3 + (1.5 × IQR) OR BELOW: less than Q1 − (1.5 × IQR). EX: Does the 1.5 × IQR rule identify any outliers for the New York travel time data? In the previous example, we found that Q1 = 15 minutes, Q3 = 42.5 minutes, and IQR = 27.5 minutes.
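The 1.5 × IQR rule can be applied to the New York data with a few lines of Python. This is a minimal sketch that starts from the quartiles stated above; the names `low_fence` and `high_fence` are illustrative, not standard terms from the slides.

```python
# Travel times (minutes) of the 20 New York workers.
ny_times = [10, 30, 5, 25, 40, 20, 10, 15, 30, 20,
            15, 20, 85, 15, 65, 15, 60, 60, 40, 45]

# Quartiles found in the previous example.
q1, q3 = 15, 42.5
iqr = q3 - q1                  # 27.5

low_fence = q1 - 1.5 * iqr     # 15 - 41.25 = -26.25
high_fence = q3 + 1.5 * iqr    # 42.5 + 41.25 = 83.75

# An observation is an outlier if it lies beyond either cutoff.
outliers = [t for t in ny_times if t < low_fence or t > high_fence]
print("cutoffs:", low_fence, high_fence)  # cutoffs: -26.25 83.75
print("outliers:", outliers)              # outliers: [85]
```

No travel time can be below −26.25 minutes, so there are no low outliers. The 85-minute travel time exceeds 83.75 minutes, so the rule flags it as an outlier.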
Five-Number Summary The five-number summary of a set of observations consists of the minimum, Q1, the median, Q3, and the maximum. The five-number summary of a distribution leads us to a new type of graph: the boxplot (sometimes called a box-and-whisker plot). A central box is drawn from the first quartile (Q1) to the third quartile (Q3). A line in the box marks the median. Lines (called whiskers) extend from the box out to the smallest and largest observations that are not outliers.
EX: The 2009 roster of the Dallas Cowboys professional football team included 10 offensive linemen. Their weights (in pounds) were 338 318 353 313 318 326 307 317 311 311
1. Find the five-number summary for these data by hand. Show your work.
2. Calculate the IQR.
3. Determine if there are any outliers.
4. Draw a boxplot of the data.
Now let's use our calculators to construct the boxplot for these data!
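One way to check the hand work for this exercise is the Python sketch below. It assumes the by-hand quartile method (medians of the lower and upper halves of the ordered list); a calculator or other software may use a slightly different quartile rule. The helper `median_of` and the variable names are made up for this illustration.

```python
def median_of(ordered):
    """Median of a list that is already sorted."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Weights (pounds) of the 10 offensive linemen.
weights = [338, 318, 353, 313, 318, 326, 307, 317, 311, 311]
ordered = sorted(weights)

# Split into lower and upper halves (skip the median when the count is odd).
half = len(ordered) // 2
lower = ordered[:half]
upper = ordered[half + 1:] if len(ordered) % 2 == 1 else ordered[half:]

# 1. Five-number summary: minimum, Q1, median, Q3, maximum.
q1, med, q3 = median_of(lower), median_of(ordered), median_of(upper)
five_number = (ordered[0], q1, med, q3, ordered[-1])

# 2. IQR.
iqr = q3 - q1

# 3. Outliers by the 1.5 x IQR rule.
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [w for w in ordered if w < low_fence or w > high_fence]

# 4. For the boxplot, whiskers end at the extreme non-outlier observations.
inside = [w for w in ordered if low_fence <= w <= high_fence]
whiskers = (inside[0], inside[-1])

print("five-number summary:", five_number)  # (307, 311, 317.5, 326, 353)
print("IQR:", iqr)                          # 15
print("outliers:", outliers)                # [353]
print("whiskers end at:", whiskers)         # (307, 338)
```

The 353-pound lineman lies above the upper cutoff of 348.5 pounds, so the upper whisker stops at 338 and 353 is plotted as a separate point.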
Standard Deviation More important than the details of calculating the standard deviation are the properties that determine its usefulness:
– $s_x$ measures spread about the mean and should be used only when the mean is chosen as the measure of center.
– $s_x$ is always greater than or equal to 0. $s_x = 0$ only when there is no variability, which happens only when all observations have the same value.
– $s_x$ has the same units of measurement as the original observations.
– Like the mean, $s_x$ is not resistant. A few outliers can make $s_x$ very large.
EX: The heights (in inches) of the five starters on a basketball team are 67, 72, 76, 76, and 84.
a) Find and interpret the mean.
b) Use your calculator to calculate the standard deviation.
c) Interpret the meaning of the standard deviation in this setting.
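To check the calculator output for this exercise, the sketch below works through the computation in Python. The slides do not show the formula, so this assumes the usual sample standard deviation (sum of squared deviations from the mean divided by n − 1, then the square root), which is the quantity calculators report as Sx. Variable names are illustrative.

```python
from statistics import mean, stdev

# Heights (inches) of the five starters.
heights = [67, 72, 76, 76, 84]

x_bar = mean(heights)                                # 375 / 5 = 75

# Squared deviation of each height from the mean.
squared_devs = [(h - x_bar) ** 2 for h in heights]   # [64, 9, 1, 1, 81]

# Assumed definition: divide by n - 1 (sample variance), then take the root.
variance = sum(squared_devs) / (len(heights) - 1)    # 156 / 4 = 39.0
s_x = variance ** 0.5

print("mean =", x_bar)                               # mean = 75
print("s_x  =", round(s_x, 2))                       # s_x  = 6.24
print("statistics.stdev gives", round(stdev(heights), 2))
```

One reading of the result: the starters' heights typically differ from the mean height of 75 inches by roughly 6.24 inches.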
Numerical Summaries Using the Calculator Enter the data for travel times to work in List 1 and List 2 on your calculator, with the North Carolina times in L1 and the New York times in L2. Record the descriptive statistics for each set of data.