# 8-1 Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved. Sampling Methods & Central Limit Theorem

## Presentation transcript: Sampling Methods & Central Limit Theorem

8-1 Sampling Methods & Central Limit Theorem

8-2 When you have completed this chapter, you will be able to:

1. Explain under what conditions sampling is the proper way to learn something about a population.
2. Describe methods for selecting a sample.
3. Define and construct a sampling distribution of the sample mean.
4. Explain the central limit theorem.
5. Use the central limit theorem to find probabilities of selecting possible sample means from a specified population.

8-3 We use sample information to make decisions or inferences about the population. Two KEY steps:

1. Choice of a proper method for selecting sample data
2. Proper analysis of the sample data (more later)

8-5 KEY 1. If a proper method for selecting the sample is NOT used, the SAMPLE will not be truly representative of the TOTAL population, and wrong conclusions can be drawn!

8-6 Why Sample the Population? Because…

- …of the physical impossibility of checking all items in the population, and, also, it would be too time-consuming
- …studying all the items in a population would NOT be cost effective
- …the sample results are usually adequate
- …of the destructive nature of certain tests

8-7 Techniques

- Probability Sampling: Each data unit in the population has a known likelihood of being included in the sample
  - with Replacement: Each data unit in the population is allowed to appear in the sample more than once
  - without Replacement: Each data unit in the population is allowed to appear in the sample no more than once
- Non-Probability Sampling: Does not involve random selection; inclusion of an item is based on convenience

8-8 Methods

- Simple Random: …each item (person) in the population has an equal chance of being included
- Systematic Random: …items (people) of the population are arranged in some order. A random starting point is selected, and then every kth member of the population is selected for the sample
- Stratified Random: …a population is first divided into subgroups, called strata, and a sample is selected from each stratum
- Cluster: …a population is first divided into primary units, and samples are selected from each unit
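The four methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of the integers 1 to 100, the two strata, the ten primary units, and all function names are hypothetical and do not come from the slides. The cluster function also assumes that a random subset of primary units is chosen before sampling within them, which is one common reading of the method.

```python
import random


def simple_random(population, n, rng):
    """Every item has an equal chance of being included."""
    return rng.sample(population, n)


def systematic_random(population, k, rng):
    """Pick a random start among the first k items, then take every k-th item."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_random(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(items, n_per_stratum) for name, items in strata.items()}


def cluster(units, n_units, n_per_unit, rng):
    """Assumption: choose some primary units at random, then sample inside each."""
    chosen = rng.sample(sorted(units), n_units)
    return {name: rng.sample(units[name], n_per_unit) for name in chosen}


# Hypothetical data: a population of 100 numbered items.
rng = random.Random(8)
population = list(range(1, 101))
strata = {"low": list(range(1, 51)), "high": list(range(51, 101))}
units = {f"unit{i}": list(range(i * 10 + 1, i * 10 + 11)) for i in range(10)}

srs = simple_random(population, 10, rng)
sys_sample = systematic_random(population, 10, rng)
strat = stratified_random(strata, 5, rng)
clus = cluster(units, 3, 4, rng)
```

Passing an explicit `random.Random` instance keeps the draws reproducible; without a fixed seed each run selects a different sample.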

8-9 Terminology

- Sampling error: …is the difference between a sample statistic and its corresponding population parameter
- Sampling distribution of the sample mean: …is a probability distribution consisting of all possible sample means of a given sample size selected from a population

8-10 Example

The law firm of Hoya and Associates has five partners. At their weekly partners meeting each reported the number of hours they billed their clients last week:

| Partner | Hours |
|---|---|
| Dunn | 22 |
| Hardy | 26 |
| Kiers | 30 |
| Malinowski | 26 |
| Tillman | 22 |

If two partners are selected randomly… how many different samples are possible?

8-11 If two partners are selected randomly… how many different samples are possible? With 5 objects taken 2 at a time, using ${}_5C_2$, there are a total of 10 samples!

8-12 If two partners are selected randomly… how many different samples are possible?

$${}_5C_2 = \frac{5!}{2!\,(5-2)!} = 10 \text{ samples}$$

8-13

| Partners | Samples of 2 | Mean |
|---|---|---|
| 1&2 | (22+26)/2 | 24 |
| 1&3 | (22+30)/2 | 26 |
| 1&4 | (22+26)/2 | 24 |
| 1&5 | (22+22)/2 | 22 |
| 2&3 | (26+30)/2 | 28 |
| 2&4 | (26+26)/2 | 26 |
| 2&5 | (26+22)/2 | 24 |
| 3&4 | (30+26)/2 | 28 |
| 3&5 | (30+22)/2 | 26 |
| 4&5 | (26+22)/2 | 24 |

8-14 Example …continued

Organize the sample means of the 10 samples (24, 26, 24, 22, 28, 26, 24, 28, 26, 24) into a Sampling Distribution:

| Sample Mean | Frequency | Relative frequency (Probability) |
|---|---|---|
| 22 | 1 | 1/10 |
| 24 | 4 | 4/10 |
| 26 | 3 | 3/10 |
| 28 | 2 | 2/10 |

8-15 Example …continued

Compute the mean of the sample means. Compare it with the population mean.

| Sample Mean | Frequency |
|---|---|
| 22 | 1 |
| 24 | 4 |
| 26 | 3 |
| 28 | 2 |

$$\mu_{\bar{X}} = \frac{22(1) + 24(4) + 26(3) + 28(2)}{10} = 25.2$$

8-16 Example …continued

$$\mu = \frac{22 + 26 + 30 + 26 + 22}{5} = 25.2$$

Note: The population mean is the same as the mean of the sample means: 25.2 hours!
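The Hoya and Associates example can be checked by enumerating every sample of two partners. The sketch below uses only the hours given in the example; the variable names are illustrative, and exact fractions are used so that the comparison of the two means involves no rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hours billed last week, as given in the example.
hours = {"Dunn": 22, "Hardy": 26, "Kiers": 30, "Malinowski": 26, "Tillman": 22}

# All samples of 2 partners chosen without replacement (5C2 of them).
samples = list(combinations(hours, 2))

# Mean of each sample, kept exact.
sample_means = [Fraction(hours[a] + hours[b], 2) for a, b in samples]

# Sampling distribution: each distinct sample mean with its probability.
frequency = Counter(sample_means)
distribution = {
    mean: Fraction(count, len(samples)) for mean, count in sorted(frequency.items())
}

mean_of_sample_means = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(hours.values()), len(hours))

for mean, prob in distribution.items():
    print(f"sample mean {mean}: probability {prob}")
print(float(mean_of_sample_means), float(population_mean))
```

Both printed means equal 25.2, which illustrates that the mean of the sampling distribution equals the population mean.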

8-17 Central Limit Theorem

The sampling distribution of the means of all possible samples of size n generated from the population will be approximately normally distributed!

Sampling Distributions:

- Mean: $\mu_{\bar{X}} = \mu$
- Variance: $\sigma^2 / n$
- Standard Deviation (standard error of the mean): $\sigma / \sqrt{n}$
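A small simulation can make the theorem concrete. The sketch below is an illustration, not part of the slides: it assumes a strongly skewed exponential population with mean 1 and standard deviation 1, a sample size of 36, and 5000 repeated samples, all of which are arbitrary choices. If the theorem holds, the sample means should centre near 1 with a standard deviation near $\sigma/\sqrt{n} = 1/6$.

```python
import random
import statistics

rng = random.Random(2004)  # fixed seed so the run is reproducible
n = 36                     # size of each sample (assumed)
replications = 5000        # number of samples drawn (assumed)

# expovariate(1.0) draws from an exponential population: mean 1, sd 1, skewed.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(replications)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1.0 / n ** 0.5  # sigma / sqrt(n)

print(f"mean of sample means: {mean_of_means:.3f} (population mean 1.000)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) = {theoretical_se:.3f})")
```

A histogram of `sample_means` would look close to a bell curve even though the population itself is far from normal.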

8-18 Point Estimates

A point estimate is one value (a single point) that is used to estimate a population parameter. Examples:

- sample mean
- sample standard deviation
- sample variance
- sample proportion

8-19 Point Estimates

Population follows the normal distribution: the sampling distribution of the sample means also follows the normal distribution. To find the probability of a sample mean falling within a particular region, use:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Population does NOT follow the normal distribution: if the sample is of at least 30 observations, the sampling distribution of the sample means WILL approximately follow the normal distribution. To find the probability of a sample mean falling within a particular region, use:

$$z = \frac{\bar{X} - \mu}{s / \sqrt{n}}$$

8-25 Using See If you want whole numbers, use the FUNCTION WIZARD (fx) to ROUND to the nearest integer.

8-31 Using Input Data in Column A Select See Scroll to…Sampling Click OK

8-33 Using Since this is random number generation, you will get different numbers each time you do this…

8-34 Using the Sampling Distribution of the Sample Mean

Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes. What is the standard error of the mean?

Formula:

$$\frac{s}{\sqrt{n}} = \frac{80}{\sqrt{40}} = 12.6$$

8-35 Using the Sampling Distribution of the Sample Mean

Suppose it takes an average of 330 minutes for taxpayers to prepare, copy, and mail an income tax return form. A consumer watchdog agency selects a random sample of 40 taxpayers and finds the standard deviation of the time needed is 80 minutes. What is the likelihood the sample mean is greater than 320 minutes?

8-36 Using the Sampling Distribution of the Sample Mean

What is the likelihood the sample mean is greater than 320 minutes?

Data…
- average of 330 minutes
- random sample of 40
- standard deviation is 80 minutes

Formula:

$$z = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{320 - 330}{80 / \sqrt{40}} = -0.79$$

The area $a_1$ lies between 320 and 330.

8-37 Using the Sampling Distribution of the Sample Mean

What is the likelihood the sample mean is greater than 320 minutes?

Data…
- average of 330 minutes
- random sample of 40
- standard deviation is 80 minutes

Look up 0.79 in the table: $a_1 = 0.2852$, the area between 320 and 330.

Required Area = 0.2852 + 0.5 = 0.7852
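The table lookup above can be cross-checked with Python's standard library. This is a sketch that assumes `statistics.NormalDist` as a stand-in for the printed normal table; because it does not round z to two decimals first, the result differs from 0.7852 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu = 330  # stated average preparation time, in minutes
s = 80    # sample standard deviation, in minutes
n = 40    # sample size

standard_error = s / sqrt(n)      # s / sqrt(n)
z = (320 - mu) / standard_error   # z-score for a sample mean of 320 minutes

# P(sample mean > 320) is the area to the right of z under the standard normal curve.
prob_greater = 1 - NormalDist().cdf(z)

print(f"standard error = {standard_error:.2f}")
print(f"z = {z:.2f}")
print(f"P(sample mean > 320) = {prob_greater:.4f}")
```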

8-38 Sampling Distribution of Proportion

The normal distribution (a continuous distribution) yields a good approximation of the binomial distribution (a discrete distribution) for large values of n. Use when $np$ and $n(1-p)$ are both greater than 5!

8-39 Mean and Variance of a Binomial Probability Distribution

Formula:

$$\mu = np$$

$$\sigma^2 = np(1 - p)$$

8-40 Sampling Distribution of Proportion

A multinational company claims that 55% of its employees are bilingual. To verify this claim, a statistician selected a sample of 60 employees of the company using simple random sampling and found 48% to be bilingual. Based on this information, what can we say about the company's claim?

$np = 60(0.55) = 33$ and $n(1-p) = 60(0.45) = 27$

The sample size is big enough to use the normal approximation with a mean of 0.55 and a standard deviation of $\sqrt{(0.55)(0.45)/60} = 0.064$.

8-41 Sampling Distribution of Proportion …continued

Formula:

$$z = \frac{X - \mu}{s} = \frac{0.48 - 0.55}{0.064} = -1.09$$

Look up 1.09 in the table: $a_1 = 0.3621$, the area between 0.48 and 0.55.

Required Area = 0.5 – 0.3621 = 0.1379 or 14%

8-42 Sampling Distribution of Proportion …continued

With $z = (0.48 - 0.55)/0.064 = -1.09$, the required area is 0.5 – 0.3621 = 0.1379 or 14%.

Conclusion: If the company's claim is true, there is approximately a 14% chance of observing a sample proportion of 48% or less in a sample of this size.
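The bilingual-employee calculation can be reproduced as follows. This is a sketch using `statistics.NormalDist` in place of the printed table; the variable names are illustrative, and the standard error is the square root of $p(1-p)/n$.

```python
from math import sqrt
from statistics import NormalDist

p = 0.55      # proportion claimed by the company
n = 60        # sample size
p_hat = 0.48  # proportion observed in the sample

# Rule of thumb from the slides: np and n(1 - p) should both exceed 5.
normal_ok = n * p > 5 and n * (1 - p) > 5

standard_error = sqrt(p * (1 - p) / n)
z = (p_hat - p) / standard_error

# Area to the left of z: chance of a sample proportion this low if the claim holds.
prob_at_most = NormalDist().cdf(z)

print(f"normal approximation applies: {normal_ok}")
print(f"standard error = {standard_error:.3f}")
print(f"z = {z:.2f}")
print(f"P(sample proportion <= 0.48) = {prob_at_most:.4f}")
```

The probability is computed assuming the 55% claim is true, so it describes how unusual the sample is under that claim rather than the probability that the claim itself is true.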

8-43 Sampling Distribution of Mean

Suppose the mean selling price of a litre of gasoline in Canada is \$0.659. Further, assume the distribution is positively skewed, with a standard deviation of \$0.08. What is the probability of selecting a sample of 35 gasoline stations and finding the sample mean within \$0.03 of the population mean?

8-44 Sampling Distribution of Mean

Data…
- mean selling price is \$0.659
- SD of \$0.08
- Sample of 35 gasoline stations
- Probability of sample mean within \$0.03?

Find the z-scores for 0.659 ± 0.03, i.e. 0.629 and 0.689:

$$z_1 = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{0.629 - 0.659}{0.08 / \sqrt{35}} = -2.22$$

$$z_2 = \frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{0.689 - 0.659}{0.08 / \sqrt{35}} = 2.22$$

8-45 Sampling Distribution of Mean

Data…
- mean selling price is \$0.659
- SD of \$0.08
- Sample of 35 gasoline stations
- Probability of sample mean within \$0.03?

Find areas from the table for $z_1 = -2.22$ and $z_2 = 2.22$: $a_1 = 0.4868$ and $a_2 = 0.4868$

Required Area = 0.4868 + 0.4868 = 0.9736

We would expect about 97% of the sample means to be within \$0.03 of the population mean.
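The gasoline example can also be computed by modelling the sampling distribution of the mean directly as a normal distribution centred on the population mean with the standard error as its spread. This is a sketch under the slides' own assumption that a sample of 35 is large enough for the normal approximation despite the skewed population; `statistics.NormalDist` stands in for the printed table.

```python
from math import sqrt
from statistics import NormalDist

mu = 0.659     # mean selling price per litre, in dollars
sigma = 0.08   # standard deviation, in dollars
n = 35         # number of gasoline stations sampled
margin = 0.03  # "within $0.03 of the population mean"

standard_error = sigma / sqrt(n)

# Sampling distribution of the sample mean under the central limit theorem.
sampling_dist = NormalDist(mu, standard_error)

# Probability that the sample mean falls between 0.629 and 0.689.
prob_within = sampling_dist.cdf(mu + margin) - sampling_dist.cdf(mu - margin)

z = margin / standard_error  # size of z for either bound

print(f"z = ±{z:.2f}")
print(f"P(0.629 < sample mean < 0.689) = {prob_within:.4f}")
```

The result matches the table-based answer of about 97%; small differences in later decimals come from not rounding z to two places.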