3.1 Measures of Central Tendency. Ch. 3 Numerically Summarizing Data The arithmetic mean of a variable is computed by determining the sum of all the values.


1 3.1 Measures of Central Tendency

2 Ch. 3 Numerically Summarizing Data
The arithmetic mean of a variable is computed by dividing the sum of all the values of the variable in the data set by the number of observations. The population arithmetic mean is computed using all the individuals in a population.
– The population mean is a parameter.
– The population arithmetic mean is denoted by the symbol μ.

3 Population Mean If $x_1, x_2, \ldots, x_N$ are the N observations of a variable from a population, then the population mean, μ, is $\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum x_i}{N}$

4 Sample Mean The sample arithmetic mean is computed using sample data. The sample mean is a statistic. The sample arithmetic mean is denoted by $\bar{x}$.

5 Sample Mean If $x_1, x_2, \ldots, x_n$ are the n observations of a variable from a sample, then the sample mean is $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x_i}{n}$

6 Sample Problem Computing a Population Mean and a Sample Mean The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 a. Compute the population mean of this data. b. Then take a simple random sample of n = 3 employees. Compute the sample mean. Obtain a second simple random sample of n = 3 employees. Again compute the sample mean.

7 EXAMPLE Computing a Population Mean and a Sample Mean (b) Obtain a simple random sample of size n = 3 from the population of seven employees. Use this simple random sample to determine a sample mean. Find a second simple random sample and determine the sample mean.
Employee:     1,  2,  3,  4,  5,  6,  7
Travel time: 23, 36, 23, 18,  5, 26, 43
© 2010 Pearson Prentice Hall. All rights reserved
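The example above can be sketched in Python. This is only an illustration: the seed `2010` and the variable names are arbitrary choices, not part of the slides, and a different seed gives different samples.

```python
import random
import statistics

# Travel times (minutes) for all seven employees: this is the whole population.
travel_times = [23, 36, 23, 18, 5, 26, 43]

# (a) Population mean: sum of all values divided by N.
population_mean = statistics.mean(travel_times)
print(f"population mean = {population_mean:.1f} minutes")

# (b) Two simple random samples of size n = 3, drawn without replacement.
rng = random.Random(2010)  # arbitrary seed, chosen only for repeatability
sample_means = []
for trial in (1, 2):
    sample = rng.sample(travel_times, k=3)
    sample_means.append(statistics.mean(sample))
    print(f"sample {trial}: {sample}, sample mean = {sample_means[-1]:.1f} minutes")
```

The population mean is a fixed parameter, while the sample mean is a statistic that changes from sample to sample.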

8 Median The median of a variable is the value that lies in the middle of the data when arranged in ascending order. We use M to represent the median.

10 EXAMPLE Computing a Median of a Data Set with an Odd Number of Observations The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Determine the median of this data.

11 EXAMPLE Computing a Median of a Data Set with an Even Number of Observations Suppose the start-up company hires a new employee. The travel time of the new employee is 70 minutes. Determine the mean and median of the “new” data set. 23, 36, 23, 18, 5, 26, 43, 70

12 EXAMPLE Computing a Median of a Data Set with an Even Number of Observations The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Suppose a new employee is hired who has a 130-minute commute. How does this impact the value of the mean and median?

13 EXAMPLE Computing a Median of a Data Set with an Even Number of Observations The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Suppose a new employee is hired who has a 130-minute commute. How does this impact the value of the mean and median?
Mean before new hire: 24.9 minutes
Median before new hire: 23 minutes
Mean after new hire: 38 minutes
Median after new hire: 24.5 minutes
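The three data sets used in the median examples can be checked with Python's standard `statistics` module. This is a minimal sketch; the dictionary labels are illustrative names only.

```python
import statistics

original = [23, 36, 23, 18, 5, 26, 43]

# The same population with each hypothetical new hire added.
data_sets = {
    "original seven employees": original,
    "with 70-minute commute": original + [70],
    "with 130-minute commute": original + [130],
}

summary = {}
for label, data in data_sets.items():
    # statistics.median sorts the data and, for an even count,
    # averages the two middle values.
    summary[label] = (statistics.mean(data), statistics.median(data))
    mean, median = summary[label]
    print(f"{label}: mean = {mean:.1f}, median = {median:.1f}")
```

The extreme value pulls the mean up sharply while the median moves only from 23 to 24.5, which is why the median is the resistant measure.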

14 Resistance A numerical summary of data is said to be resistant if extreme values (very large or small) relative to the data do not affect its value substantially.

16 EXAMPLE Describing the Shape of the Distribution The following data represent the asking price of homes for sale in Lincoln, NE. Source: http://www.homeseekers.com
 79,995   128,950   149,900   189,900
 99,899   130,950   151,350   203,950
105,200   131,800   154,900   217,500
111,000   132,300   159,900   260,000
120,000   134,950   163,300   284,900
121,700   135,500   165,000   299,900
125,950   138,500   174,850   309,900
126,900   147,500   180,000   349,900

17 Sample Problem Find the mean and median. Use the mean and median to identify the shape of the distribution. Verify your result by drawing a histogram of the data.

18 One-Variable Statistics Nspire
1. Create a Lists & Spreadsheet page
2. Title the column
3. Enter data into the column
4. Create a Calculator page
5. Click Menu
6. 6: Statistics
   – 1: Stat Calculations
   – 1: One-Variable Stats

19 Find the mean and median. Use the mean and median to identify the shape of the distribution. Verify your result by drawing a histogram of the data. The mean asking price is $168,320 and the median asking price is $148,700. Because the mean is noticeably larger than the median, we would conjecture that the distribution is skewed right.
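As a cross-check on the calculator result, the same comparison can be sketched in Python. The "mean greater than median suggests right skew" test is the rule of thumb from the slide, not a formal test, and the variable names are illustrative.

```python
import statistics

# Asking prices (dollars) of the 32 homes for sale in Lincoln, NE.
asking_prices = [
    79995, 128950, 149900, 189900,
    99899, 130950, 151350, 203950,
    105200, 131800, 154900, 217500,
    111000, 132300, 159900, 260000,
    120000, 134950, 163300, 284900,
    121700, 135500, 165000, 299900,
    125950, 138500, 174850, 309900,
    126900, 147500, 180000, 349900,
]

mean_price = statistics.mean(asking_prices)
median_price = statistics.median(asking_prices)
print(f"mean   = ${mean_price:,.0f}")
print(f"median = ${median_price:,.0f}")

# Rule of thumb: a mean well above the median suggests a long right tail.
if mean_price > median_price:
    shape = "skewed right"
elif mean_price < median_price:
    shape = "skewed left"
else:
    shape = "roughly symmetric"
print(f"conjectured shape: {shape}")
```

A histogram is still needed to verify the conjecture, as the slide asks.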

21 Mode The mode of a variable is the most frequent observation of the variable that occurs in the data set. If there is no observation that occurs with the most frequency, we say the data has no mode.
– The data on the next slide represent the Vice Presidents of the United States and their state of birth. Find the mode.

24 Tally the data to determine the most frequent observation.
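The tallying step can be sketched with `collections.Counter`. The Vice President data did not survive in this transcript, so the sketch uses the travel-time data from the earlier slides instead. The helper name `modes` is hypothetical, and treating "every value occurs once" as "no mode" is one reading of the definition above.

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s), or an empty list if no value repeats."""
    tally = Counter(data)              # value -> number of occurrences
    highest = max(tally.values())
    if highest == 1:                   # nothing occurs more often than anything else
        return []
    return [value for value, count in tally.items() if count == highest]

travel_times = [23, 36, 23, 18, 5, 26, 43]
travel_modes = modes(travel_times)
print("tally:", Counter(travel_times).most_common())
print("mode(s):", travel_modes)

# Works for qualitative data too, which is where the mode is most useful.
print(modes(["NY", "OH", "NY", "KY", "OH", "NY"]))
```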

25 Sample Problem To order food at a McDonald’s Restaurant, one must choose from multiple lines, while at Wendy’s Restaurant, one enters a single line. The following data represent the wait time (in minutes) in line for a simple random sample of 30 customers at each restaurant during the lunch hour. For each sample, answer the following:
(a) What was the mean wait time?
(b) Draw a histogram of each restaurant’s wait time.
(c) Which restaurant’s wait time appears more dispersed? Which line would you prefer to wait in? Why?

26 Wait Time at Wendy’s
1.50  0.79  1.01  1.66  0.94  0.67
2.53  1.20  1.46  0.89  0.95  0.90
1.88  2.94  1.40  1.33  1.20  0.84
3.99  1.90  1.00  1.54  0.99  0.35
0.90  1.23  0.92  1.09  1.72  2.00
Wait Time at McDonald’s
3.50  0.00  0.38  0.43  1.82  3.04
0.00  0.26  0.14  0.60  2.33  2.54
1.97  0.71  2.22  4.54  0.80  0.50
0.00  0.28  0.44  1.38  0.92  1.17
3.08  2.75  0.36  3.10  2.19  0.23

27 (a) The mean wait time in each line is 1.39 minutes.
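The two samples have the same mean but not the same spread, which a short Python sketch makes visible. It assumes the first 30 values in the transcript are the Wendy's sample and the last 30 are the McDonald's sample, following the order of the table labels.

```python
import statistics

# Wait times in minutes; the Wendy's / McDonald's split follows the label order.
wendys = [
    1.50, 0.79, 1.01, 1.66, 0.94, 0.67, 2.53, 1.20, 1.46, 0.89,
    0.95, 0.90, 1.88, 2.94, 1.40, 1.33, 1.20, 0.84, 3.99, 1.90,
    1.00, 1.54, 0.99, 0.35, 0.90, 1.23, 0.92, 1.09, 1.72, 2.00,
]
mcdonalds = [
    3.50, 0.00, 0.38, 0.43, 1.82, 3.04, 0.00, 0.26, 0.14, 0.60,
    2.33, 2.54, 1.97, 0.71, 2.22, 4.54, 0.80, 0.50, 0.00, 0.28,
    0.44, 1.38, 0.92, 1.17, 3.08, 2.75, 0.36, 3.10, 2.19, 0.23,
]

results = {}
for name, waits in (("Wendy's", wendys), ("McDonald's", mcdonalds)):
    results[name] = {
        "mean": statistics.mean(waits),
        "range": max(waits) - min(waits),
        "stdev": statistics.stdev(waits),  # sample standard deviation (divides by n - 1)
    }
    r = results[name]
    print(f"{name}: mean = {r['mean']:.2f}, range = {r['range']:.2f}, s = {r['stdev']:.2f}")
```

Equal means with unequal spread is the motivation for the measures of dispersion in Section 3.2.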

28 3.2 Measures of Dispersion

29 Range The range, R, of a variable is the difference between the largest data value and the smallest data value. That is, Range = R = Largest Data Value – Smallest Data Value

30 EXAMPLE Finding the Range of a Set of Data The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Find the range.
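A minimal Python sketch of the range calculation follows. The second data set reuses the 130-minute new hire from the median example to show that the range depends only on the two extreme values.

```python
travel_times = [23, 36, 23, 18, 5, 26, 43]

# Range = largest data value - smallest data value.
data_range = max(travel_times) - min(travel_times)
print(f"range = {max(travel_times)} - {min(travel_times)} = {data_range} minutes")

# One extreme observation changes the range completely, so it is not resistant.
range_with_new_hire = max(travel_times + [130]) - min(travel_times + [130])
print(f"range with the 130-minute commute = {range_with_new_hire} minutes")
```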

31 Population Variance The population variance of a variable is the sum of squared deviations about the population mean divided by the number of observations in the population, N. That is, it is the mean of the squared deviations about the population mean.

32 The population variance is symbolically represented by σ² (lower case Greek sigma squared). $\sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N} = \frac{\sum (x_i - \mu)^2}{N}$ Note: When using the above formula, do not round until the last computation. Use as many decimals as allowed by your calculator in order to avoid round-off errors.

33 EXAMPLE Computing a Population Variance The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Compute the population variance of this data. Recall that $\mu = \frac{174}{7} \approx 24.85714$ minutes.

34
x_i    μ           x_i – μ      (x_i – μ)²
23     24.85714    -1.85714       3.44898
36     24.85714    11.14286     124.1633
23     24.85714    -1.85714       3.44898
18     24.85714    -6.85714      47.02041
 5     24.85714   -19.8571      394.3061
26     24.85714     1.142857      1.306122
43     24.85714    18.14286     329.1633
Σ(x_i – μ)² = 902.8571
σ² = 902.8571 / 7 ≈ 129.0 minutes²
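The table above can be reproduced step by step in Python. This is a sketch with illustrative variable names; it keeps full precision throughout, as the rounding note on the earlier slide advises.

```python
import statistics

travel_times = [23, 36, 23, 18, 5, 26, 43]
N = len(travel_times)
mu = sum(travel_times) / N  # population mean, about 24.85714

print(f"{'x_i':>4} {'x_i - mu':>12} {'(x_i - mu)^2':>14}")
sum_sq_dev = 0.0
for x in travel_times:
    deviation = x - mu
    sum_sq_dev += deviation ** 2
    print(f"{x:>4} {deviation:>12.5f} {deviation ** 2:>14.5f}")

# Population variance: sum of squared deviations divided by N.
sigma_sq = sum_sq_dev / N
print(f"sum of squared deviations = {sum_sq_dev:.4f}")
print(f"population variance       = {sigma_sq:.4f} minutes^2")

# The standard library gives the same value directly.
print(f"statistics.pvariance      = {statistics.pvariance(travel_times):.4f}")
```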

35 Sample Variance The sample variance is computed by determining the sum of squared deviations about the sample mean and then dividing this result by n – 1. $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

36 Note: Whenever a statistic consistently overestimates or underestimates a parameter, it is called biased. To obtain an unbiased estimate of the population variance, we divide the sum of the squared deviations about the mean by n – 1.
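The effect of dividing by n – 1 can be illustrated by brute force. The sketch below makes an assumption the slides do not state: samples of size 3 are drawn with replacement, so that every ordered sample is equally likely and all 7³ = 343 of them can be enumerated. Under that assumption the n – 1 version averages out to σ² exactly, while dividing by n underestimates it.

```python
import itertools
import statistics

population = [23, 36, 23, 18, 5, 26, 43]
sigma_sq = statistics.pvariance(population)  # true population variance

n = 3
# Every ordered sample of size n, drawn WITH replacement (an assumption of this sketch).
samples = list(itertools.product(population, repeat=n))

# statistics.variance divides by n - 1; statistics.pvariance divides by n.
avg_divide_by_n_minus_1 = statistics.fmean(statistics.variance(s) for s in samples)
avg_divide_by_n = statistics.fmean(statistics.pvariance(s) for s in samples)

print(f"population variance         = {sigma_sq:.4f}")
print(f"average of SS / (n - 1)     = {avg_divide_by_n_minus_1:.4f}")
print(f"average of SS / n           = {avg_divide_by_n:.4f}")
```

Dividing by n gives an average of (n – 1)/n times σ², which is the systematic underestimate the note describes.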

37 The population standard deviation is denoted by σ. It is obtained by taking the square root of the population variance, so that $\sigma = \sqrt{\sigma^2}$. The sample standard deviation is denoted by s. It is obtained by taking the square root of the sample variance, so that $s = \sqrt{s^2}$.

38 EXAMPLE Computing a Population Standard Deviation The following data represent the travel times (in minutes) to work for all seven employees of a start-up web development company. 23, 36, 23, 18, 5, 26, 43 Compute the population standard deviation and variance of this data using technology.

39 One-Variable Statistics Nspire
1. Create a Lists & Spreadsheet page
2. Title the column
3. Enter data into the column
4. Create a Calculator page
5. Click Menu
6. 6: Statistics
   – 1: Stat Calculations
   – 1: One-Variable Stats
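For readers without the calculator, Python's `statistics` module is one alternative form of "technology" for the same example. The sketch is not a reproduction of the calculator's One-Variable Stats output; it only computes the population and sample versions of the two measures.

```python
import math
import statistics

travel_times = [23, 36, 23, 18, 5, 26, 43]

# Population versions: divide by N (appropriate here, since all seven employees are included).
pop_variance = statistics.pvariance(travel_times)
pop_std_dev = statistics.pstdev(travel_times)

# Sample versions: divide by n - 1 (what one would use if these seven were only a sample).
sample_variance = statistics.variance(travel_times)
sample_std_dev = statistics.stdev(travel_times)

print(f"population variance = {pop_variance:.2f} minutes^2")
print(f"population std dev  = {pop_std_dev:.2f} minutes")
print(f"sample variance     = {sample_variance:.2f} minutes^2")
print(f"sample std dev      = {sample_std_dev:.2f} minutes")
```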

42 EXAMPLE Using the Empirical Rule The following data represent the serum HDL cholesterol of the 54 female patients of a family doctor.
41  48  43  38  35  37  44  44  44
62  75  77  58  82  39  85  55  54
67  69  69  70  65  72  74  74  74
60  60  60  61  62  63  64  64  64
54  54  55  56  56  56  57  58  59
45  47  47  48  48  50  52  52  53

43 (a) Compute the population mean and standard deviation.
(b) Draw a histogram to verify the data is bell-shaped.
(c) Determine the percentage of patients that have serum HDL within 3 standard deviations of the mean according to the Empirical Rule.
(d) Determine the percentage of patients that have serum HDL between 34 and 69.1 according to the Empirical Rule.
(e) Determine the actual percentage of patients that have serum HDL between 34 and 69.1.

44 (a) Using a TI-Nspire graphing calculator, we find μ ≈ 57.4 and σ ≈ 11.7.
(b) [Histogram not reproduced in this transcript.]

45 (c) Determine the percentage of patients that have serum HDL within 3 standard deviations of the mean according to the Empirical Rule.
(d) Determine the percentage of patients that have serum HDL between 34 and 69.1 according to the Empirical Rule.
(e) Determine the actual percentage of patients that have serum HDL between 34 and 69.1. (Look back at the original data!)

46 (c) According to the Empirical Rule, 99.7% of the patients have serum HDL within 3 standard deviations of the mean.
(d) 13.5% + 34% + 34% = 81.5% of patients will have a serum HDL between 34.0 and 69.1 according to the Empirical Rule.
(e) 45 out of the 54, or 83.3%, of the patients have a serum HDL between 34.0 and 69.1.
[Figure: bell curve with the horizontal axis marked at 22.3, 34.0, 45.7, 57.4, 69.1, 80.8, 92.5, that is, μ – 3σ through μ + 3σ.]
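Parts (a), (d) and (e) can be checked against the raw data in Python. The sketch uses the interval endpoints 34.0 and 69.1 exactly as given on the slide (they correspond to μ – 2σ and μ + σ); the variable names are illustrative.

```python
import statistics

hdl = [
    41, 48, 43, 38, 35, 37, 44, 44, 44,
    62, 75, 77, 58, 82, 39, 85, 55, 54,
    67, 69, 69, 70, 65, 72, 74, 74, 74,
    60, 60, 60, 61, 62, 63, 64, 64, 64,
    54, 54, 55, 56, 56, 56, 57, 58, 59,
    45, 47, 47, 48, 48, 50, 52, 52, 53,
]

# (a) Population mean and standard deviation (all 54 patients are included).
mu = statistics.mean(hdl)
sigma = statistics.pstdev(hdl)
print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")

# (d) Empirical Rule for mu - 2*sigma to mu + sigma: 13.5% + 34% + 34%.
empirical_pct = 13.5 + 34 + 34

# (e) Actual share of patients with HDL between 34.0 and 69.1.
count_in_interval = sum(1 for x in hdl if 34.0 <= x <= 69.1)
actual_pct = 100 * count_in_interval / len(hdl)

print(f"Empirical Rule: {empirical_pct:.1f}%")
print(f"Actual: {count_in_interval} of {len(hdl)} = {actual_pct:.1f}%")
```

The actual percentage is close to, but not the same as, the Empirical Rule figure, since the rule is an approximation for bell-shaped data.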

47 One measure of intelligence is the Stanford-Binet Intelligence Quotient (IQ). IQ scores have a bell-shaped distribution with a mean of 100 and a standard deviation of 15.
A. What percentage of people has an IQ score between 70 and 130?
B. What percentage of people has an IQ score less than 70 or greater than 130?
C. What percentage of people has an IQ score below 85?
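One way to work this problem is sketched below. The slide gives no answers, so the Empirical Rule figures here are derived from the 68-95-99.7 percentages. The comparison column assumes the scores are exactly normally distributed, which is stronger than "bell-shaped", and uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

mean, sd = 100, 15

# Empirical Rule: 70 and 130 are mean -/+ 2 sd, and 85 is mean - 1 sd.
rule_between_70_130 = 95.0                          # within 2 standard deviations
rule_outside_70_130 = 100 - rule_between_70_130     # both tails combined
rule_below_85 = (100 - 68) / 2                      # one tail beyond 1 standard deviation

# Comparison under an exact normal model (an assumption of this sketch).
iq = NormalDist(mu=mean, sigma=sd)
exact_between_70_130 = 100 * (iq.cdf(130) - iq.cdf(70))
exact_outside_70_130 = 100 - exact_between_70_130
exact_below_85 = 100 * iq.cdf(85)

print(f"A. between 70 and 130: rule {rule_between_70_130:.0f}%, normal model {exact_between_70_130:.1f}%")
print(f"B. below 70 or above 130: rule {rule_outside_70_130:.0f}%, normal model {exact_outside_70_130:.1f}%")
print(f"C. below 85: rule {rule_below_85:.0f}%, normal model {exact_below_85:.1f}%")
```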

