# The arithmetic mean of a variable is computed by determining the sum of all the values of the variable in the data set divided by the number of observations.


The arithmetic mean of a variable is computed by determining the sum of all the values of the variable in the data set divided by the number of observations.

© 2010 Pearson Prentice Hall. All rights reserved.

The population arithmetic mean is computed using all the individuals in a population. The population mean is a parameter. The population arithmetic mean is denoted by $\mu$.

The sample arithmetic mean is computed using sample data. The sample mean is a statistic. The sample arithmetic mean is denoted by $\bar{x}$.

If $x_1, x_2, \ldots, x_N$ are the $N$ observations of a variable from a population, then the population mean, $\mu$, is

$$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum x_i}{N}$$

If $x_1, x_2, \ldots, x_n$ are the $n$ observations of a variable from a sample, then the sample mean, $\bar{x}$, is

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$$
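As a minimal sketch of the two mean formulas, the snippet below divides the sum of the values by the number of observations. The function name `arithmetic_mean` and the `scores` list are made up for illustration and are not from the slides. The arithmetic is the same for the population mean and the sample mean; only the interpretation of the data (whole population versus sample) differs.

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical data, used only to illustrate the formula.
scores = [82, 77, 90, 71, 62]
mean_score = arithmetic_mean(scores)
print(mean_score)  # 76.4  (382 / 5)
```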

The median of a variable is the value that lies in the middle of the data when arranged in ascending order. We use M to represent the median.
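The definition can be sketched in a few lines of Python. One assumption is labeled here: when the number of observations is even there is no single middle value, and this sketch follows the common convention of averaging the two middle values. The function name `median` and the sample lists are illustrative only.

```python
def median(values):
    """Middle value of the data after sorting in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one observation sits in the middle.
        return ordered[mid]
    # Even count (assumed convention): average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([82, 77, 90, 71, 62]))  # 77
print(median([5, 1, 9, 3]))          # 4.0
```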

A numerical summary of data is said to be resistant if extreme values (very large or small) relative to the data do not affect its value substantially.
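A small experiment makes resistance concrete. The sketch below, which uses made-up numbers and Python's standard `statistics` module, replaces one observation with an extreme value and compares how the mean and the median respond.

```python
import statistics

# Hypothetical data: the second list replaces the value 5 with an extreme value.
original = [1, 2, 3, 4, 5]
with_extreme = [1, 2, 3, 4, 500]

mean_before = statistics.mean(original)
mean_after = statistics.mean(with_extreme)
median_before = statistics.median(original)
median_after = statistics.median(with_extreme)

# The mean jumps from 3 to 102, while the median stays at 3.
print(mean_before, mean_after)
print(median_before, median_after)
```

In this example the median behaves as a resistant summary and the mean does not.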

The mode of a variable is the most frequent observation of the variable that occurs in the data set. If there is no observation that occurs with the most frequency, we say the data has no mode.

Tally the data to determine the most frequent observation.
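The tally can be sketched with `collections.Counter`. Two choices here are assumptions rather than something the slides specify: the data is treated as having no mode when every observation occurs exactly once, and all tied values are returned when several observations share the highest frequency. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return the most frequent observation(s), or [] if nothing repeats."""
    counts = Counter(values)  # tally: observation -> frequency
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Assumed reading of "no mode": every observation occurs once.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([3, 5, 3, 7, 5, 3]))  # [3]
print(modes([1, 2, 2, 3, 3]))     # [2, 3]
print(modes([4, 8, 15]))          # []
```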

The range, R, of a variable is the difference between the largest data value and the smallest data value. That is, Range = R = Largest Data Value − Smallest Data Value.
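As a quick illustration, the range is a one-line computation. The name `data_range` and the data are hypothetical.

```python
def data_range(values):
    """Largest data value minus smallest data value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


scores = [82, 77, 90, 71, 62]  # made-up data
score_range = data_range(scores)
print(score_range)  # 28  (90 - 62)
```

Because the range depends only on the two most extreme observations, a single unusual value changes it directly.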

The population variance of a variable is the sum of squared deviations about the population mean divided by the number of observations in the population, N. That is, it is the mean of the squared deviations about the population mean.

The population variance is symbolically represented by $\sigma^2$ (lowercase Greek sigma squared):

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

Note: When using this formula, do not round until the last computation. Use as many decimals as allowed by your calculator in order to avoid round-off errors.
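The computation can be sketched step by step: find the population mean, square each deviation from it, and average the squares. The data set and the function name `population_variance` are made up for illustration. In keeping with the note about rounding, nothing is rounded along the way.

```python
def population_variance(values):
    """Mean of the squared deviations about the population mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty data set is undefined")
    mu = sum(values) / n
    squared_deviations = [(x - mu) ** 2 for x in values]
    return sum(squared_deviations) / n


# Hypothetical population of 8 observations with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma_squared = population_variance(data)
print(sigma_squared)  # 4.0  (squared deviations sum to 32; 32 / 8 = 4)
```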

The sample variance, $s^2$, is computed by determining the sum of squared deviations about the sample mean and then dividing this result by $n - 1$:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
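A minimal sketch of the sample variance follows. It differs from the population version only in the divisor, $n - 1$ instead of the number of observations. The data and the name `sample_variance` are illustrative; the result is compared against `statistics.variance` from the Python standard library, which also divides by $n - 1$.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical sample of 8 observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
s_squared = sample_variance(sample)
library_value = statistics.variance(sample)
print(round(s_squared, 3))  # 4.571  (32 / 7)
```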

Note: Whenever a statistic consistently overestimates or underestimates a parameter, it is called biased. To obtain an unbiased estimate of the population variance, we divide the sum of the squared deviations about the mean by n - 1.
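The effect of the divisor can be checked exactly on a tiny example. The sketch below is an illustration under stated assumptions, not part of the slides: it takes a made-up population {1, 2, 3}, lists every possible sample of size 2 drawn with replacement, and averages the variance estimates using each divisor. Exact fractions avoid any rounding.

```python
from fractions import Fraction
from itertools import product


def variance(values, ddof):
    """Sum of squared deviations about the mean, divided by (n - ddof)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


population = [1, 2, 3]                      # hypothetical population
sigma_squared = variance(population, ddof=0)

# Every ordered sample of size 2, drawn with replacement (9 samples in total).
samples = list(product(population, repeat=2))

avg_dividing_by_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

print(sigma_squared)              # 2/3
print(avg_dividing_by_n)          # 1/3, consistently too small
print(avg_dividing_by_n_minus_1)  # 2/3, matches the population variance
```

Averaged over all samples, dividing by n underestimates the population variance here, while dividing by n - 1 recovers it.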

The population standard deviation is denoted by $\sigma$. It is obtained by taking the square root of the population variance, so that $\sigma = \sqrt{\sigma^2}$.

The sample standard deviation is denoted by $s$. It is obtained by taking the square root of the sample variance, so that $s = \sqrt{s^2}$.
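A short sketch of both standard deviations, using made-up data and the standard-library functions `statistics.pvariance` (divides by the number of observations) and `statistics.variance` (divides by n - 1). The same list is treated first as a population and then as a sample purely to show the difference between the two results.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

# Square root of the population variance.
sigma = math.sqrt(statistics.pvariance(data))

# Square root of the sample variance.
s = math.sqrt(statistics.variance(data))

print(sigma)        # 2.0
print(round(s, 3))  # 2.138
```

The library shortcuts `statistics.pstdev(data)` and `statistics.stdev(data)` return the same two values.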

(a) Compute the population mean and standard deviation. (b) Draw a histogram to verify the data is bell-shaped. (c) Determine the percentage of patients that have serum HDL within 3 standard deviations of the mean according to the Empirical Rule. (d) Determine the percentage of patients that have serum HDL between 34 and 69.1 according to the Empirical Rule. (e) Determine the actual percentage of patients that have serum HDL between 34 and 69.1.

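(a) Using a TI-83 Plus graphing calculator, we find $\mu \approx 57.4$ and $\sigma \approx 11.7$ (values implied by the interval endpoints $34.0 = \mu - 2\sigma$ and $69.1 = \mu + \sigma$ used in part (d)).

(b) [Histogram of the serum HDL data]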

[Figure: horizontal axis marked at 22.3, 34.0, 45.7, 57.4, 69.1, 80.8, 92.5]

(c) According to the Empirical Rule, 99.7% of the patients have serum HDL within 3 standard deviations of the mean.

(d) According to the Empirical Rule, 13.5% + 34% + 34% = 81.5% of patients will have a serum HDL between 34.0 and 69.1.

(e) 45 out of the 54 patients, or 83.3%, have a serum HDL between 34.0 and 69.1.
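The arithmetic in parts (c) through (e) can be reproduced with a short sketch. Two things are assumptions here: the values `MU = 57.4` and `SIGMA = 11.7` are inferred from the seven axis values (which are evenly spaced around 57.4), and the band percentages are the usual Empirical Rule figures of 2.35%, 13.5%, and 34% on each side of the mean. The raw HDL data is not available in this transcript, so the actual percentage is taken from the stated count of 45 out of 54.

```python
MU = 57.4      # assumed population mean, inferred from the axis values
SIGMA = 11.7   # assumed population standard deviation, inferred likewise

# Boundaries at mu - 3*sigma, ..., mu + 3*sigma.
ticks = [round(MU + k * SIGMA, 1) for k in range(-3, 4)]

# Empirical Rule percentage in each one-sigma-wide band, left to right.
BAND_PERCENT = [2.35, 13.5, 34.0, 34.0, 13.5, 2.35]


def empirical_rule_percent(low_k, high_k):
    """Percent of data between mu + low_k*sigma and mu + high_k*sigma.

    low_k and high_k are integers from -3 to 3 with low_k < high_k.
    """
    return sum(BAND_PERCENT[low_k + 3:high_k + 3])


within_three_sd = empirical_rule_percent(-3, 3)   # part (c)
rule_percent = empirical_rule_percent(-2, 1)      # part (d): 34.0 to 69.1
actual_percent = 45 / 54 * 100                    # part (e)

print(ticks)
print(rule_percent)              # 81.5
print(round(actual_percent, 1))  # 83.3
```

The actual percentage is close to, but not exactly, the Empirical Rule figure, which is expected because the rule is an approximation for bell-shaped data.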

The $k$th percentile, denoted $P_k$, of a set of data is a value such that $k$ percent of the observations are less than or equal to the value.
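Several conventions exist for locating a percentile in a finite data set, and the slides do not say which one is intended. The sketch below uses the nearest-rank method as an assumption: it returns the smallest observation for which at least $k$ percent of the data is less than or equal to it. The function name `percentile` is hypothetical.

```python
import math


def percentile(values, k):
    """Nearest-rank k-th percentile (an assumed convention), 0 <= k <= 100."""
    if not values:
        raise ValueError("a percentile of an empty data set is undefined")
    if not 0 <= k <= 100:
        raise ValueError("k must be between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    # Smallest rank such that rank / n is at least k / 100.
    rank = max(1, math.ceil(k * n / 100))
    return ordered[rank - 1]


data = list(range(1, 11))    # made-up data: 1, 2, ..., 10
print(percentile(data, 25))  # 3   (30% of the observations are <= 3)
print(percentile(data, 50))  # 5
print(percentile(data, 90))  # 9
```

Other methods interpolate between neighboring observations and can give slightly different values for the same data.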

Quartiles divide data sets into fourths, or four equal parts. The 1st quartile, denoted $Q_1$, divides the bottom 25% of the data from the top 75%. Therefore, the 1st quartile is equivalent to the 25th percentile. The 2nd quartile divides the bottom 50% of the data from the top 50% of the data, so that the 2nd quartile is equivalent to the 50th percentile, which is equivalent to the median. The 3rd quartile divides the bottom 75% of the data from the top 25% of the data, so that the 3rd quartile is equivalent to the 75th percentile.
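One common way to find quartiles by hand is sketched below, as an assumption about method since the slides give only the definitions: the 2nd quartile is the median of all the data, the 1st quartile is the median of the bottom half, and the 3rd quartile is the median of the top half. When the number of observations is odd, this sketch leaves the middle observation out of both halves. The function name `quartiles` and the data are illustrative.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (assumed)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle observation so the halves have equal size.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )


print(quartiles([1, 3, 5, 7, 9, 11, 13, 15]))  # (4.0, 8.0, 12.0)
print(quartiles([7, 1, 4, 2, 6, 3, 5]))        # (2, 4, 6)
```

Software packages often use interpolation rules instead, so their quartiles may differ slightly from these values.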