Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 1 Slide Sampling and Sampling Distributions Sampling Distribution of Sampling Distribution of Introduction to Sampling Distributions Introduction to.

Similar presentations


Presentation on theme: "1 1 Slide Sampling and Sampling Distributions Sampling Distribution of Sampling Distribution of Introduction to Sampling Distributions Introduction to."— Presentation transcript:

1 1 1 Slide Sampling and Sampling Distributions Sampling Distribution of Sampling Distribution of Introduction to Sampling Distributions Introduction to Sampling Distributions Point Estimation Point Estimation Selecting a Sample Selecting a Sample Other Sampling Methods Other Sampling Methods

2 2 2 Slide Introduction A population is the set of all the elements of interest. A population is the set of all the elements of interest. A sample is a subset of the population. A sample is a subset of the population. An element is the entity on which data are collected. An element is the entity on which data are collected. The reason we select a sample is to collect data to The reason we select a sample is to collect data to answer a research question about a population. answer a research question about a population. The reason we select a sample is to collect data to The reason we select a sample is to collect data to answer a research question about a population. answer a research question about a population. A frame is a list of the elements that the sample will A frame is a list of the elements that the sample will be selected from. be selected from. A frame is a list of the elements that the sample will A frame is a list of the elements that the sample will be selected from. be selected from.

3 3 3 Slide The sample results provide only estimates of the The sample results provide only estimates of the values of the population characteristics. values of the population characteristics. The sample results provide only estimates of the The sample results provide only estimates of the values of the population characteristics. values of the population characteristics. With proper sampling methods, the sample results With proper sampling methods, the sample results can provide “good” estimates of the population can provide “good” estimates of the population characteristics. characteristics. With proper sampling methods, the sample results With proper sampling methods, the sample results can provide “good” estimates of the population can provide “good” estimates of the population characteristics. characteristics. Introduction The reason is simply that the sample contains only a The reason is simply that the sample contains only a portion of the population. portion of the population. The reason is simply that the sample contains only a The reason is simply that the sample contains only a portion of the population. portion of the population. This chapter provides the basis for determining how This chapter provides the basis for determining how large the estimation error might be. large the estimation error might be. This chapter provides the basis for determining how This chapter provides the basis for determining how large the estimation error might be. large the estimation error might be.

4 4 4 Slide Selecting a Sample n Sampling from a Finite Population n Sampling from a Process

5 5 5 Slide Sampling from a Finite Population n Finite populations are often defined by lists such as: Organization membership roster Organization membership roster Credit card account numbers Credit card account numbers Inventory product numbers Inventory product numbers n A simple random sample of size n from a finite population of size N is a sample selected such that each population of size N is a sample selected such that each possible sample of size n has the same probability of possible sample of size n has the same probability of being selected. being selected.

6 6 6 Slide In large sampling projects, computer-generated In large sampling projects, computer-generated random numbers are often used to automate the random numbers are often used to automate the sample selection process. sample selection process. Sampling without replacement is the procedure Sampling without replacement is the procedure used most often. used most often. Replacing each sampled element before selecting Replacing each sampled element before selecting subsequent elements is called sampling with subsequent elements is called sampling with replacement. replacement. Sampling from a Finite Population

7 7 7 Slide St. Andrew’s College received 900 applications for St. Andrew’s College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered, from 1 to 900, as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants. n Example: St. Andrew’s College Sampling from a Finite Population

8 8 8 Slide The random numbers generated by Excel’s The random numbers generated by Excel’s RAND function follow a uniform probability RAND function follow a uniform probability distribution between 0 and 1. distribution between 0 and 1. The random numbers generated by Excel’s The random numbers generated by Excel’s RAND function follow a uniform probability RAND function follow a uniform probability distribution between 0 and 1. distribution between 0 and 1. Step 1: Assign a random number to each of the 900 applicants. applicants. Step 2: Select the 30 applicants corresponding to the 30 smallest random numbers. 30 smallest random numbers. Sampling from a Finite Population Using Excel n Example: St. Andrew’s College

9 9 9 Slide n Excel Formula Worksheet Note: Rows 10-901 are not shown. Sampling from a Finite Population Using Excel AB 1 Applicant Number 21=RAND() 32 43 54 65 76 87 98 Random Number

10 10 Slide n Excel Value Worksheet Note: Rows 10-901 are not shown. Sampling from a Finite Population Using Excel AB 1 Applicant Number 21 32 43 54 65 76 87 98 Random Number 0.61021 0.83762 0.58935 0.19934 0.86658 0.60579 0.80960 0.33224

11 11 Slide n Put Random Numbers in Ascending Order Step 4 Choose Sort Smallest to Largest Step 3 In the Editing group, click Sort & Filter Step 2 Click the Home tab on the Ribbon Step 1 Select any cell in the range B2:B901 Sampling from a Finite Population Using Excel

12 12 Slide n Excel Value Worksheet (Sorted) Note: Rows 10-901 are not shown. Sampling from a Finite Population Using Excel AB 1 Applicant Number 2 3 4 5 6 7 8 9 Random Number 0.00027 0.00192 0.00303 0.00481 0.00538 0.00583 0.00649 0.00667 12 773 408 58 116 185 510 394

13 13 Slide n Populations are often defined by an ongoing process whereby the elements of the population consist of items generated as though the process would operate indefinitely. Sampling from a Process Some examples of on-going processes, with infinite Some examples of on-going processes, with infinite populations, are: populations, are: parts being manufactured on a production line parts being manufactured on a production line transactions occurring at a bank transactions occurring at a bank telephone calls arriving at a technical help desk telephone calls arriving at a technical help desk customers entering a store customers entering a store

14 14 Slide Sampling from a Process The sampled population is such that a frame cannot The sampled population is such that a frame cannot be constructed. be constructed. In the case of infinite populations, it is impossible to In the case of infinite populations, it is impossible to obtain a list of all elements in the population. obtain a list of all elements in the population. The random number selection procedure cannot be The random number selection procedure cannot be used for infinite populations. used for infinite populations.

15 15 Slide Sampling from a Process n A random sample from an infinite population is a sample selected such that the following conditions sample selected such that the following conditions are satisfied. are satisfied. Each of the sampled elements is independent. Each of the sampled elements is independent. Each of the sampled elements follows the same Each of the sampled elements follows the same probability distribution as the elements in the probability distribution as the elements in the population. population.

16 16 Slide s is the point estimator of the population standard s is the point estimator of the population standard deviation . deviation . s is the point estimator of the population standard s is the point estimator of the population standard deviation . deviation . In point estimation we use the data from the sample In point estimation we use the data from the sample to compute a value of a sample statistic that serves to compute a value of a sample statistic that serves as an estimate of a population parameter. as an estimate of a population parameter. In point estimation we use the data from the sample In point estimation we use the data from the sample to compute a value of a sample statistic that serves to compute a value of a sample statistic that serves as an estimate of a population parameter. as an estimate of a population parameter. Point Estimation We refer to as the point estimator of the population We refer to as the point estimator of the population mean . mean . We refer to as the point estimator of the population We refer to as the point estimator of the population mean . mean . is the point estimator of the population proportion p. is the point estimator of the population proportion p. Point estimation is a form of statistical inference. Point estimation is a form of statistical inference.

17 17 Slide Recall that St. Andrew’s College received 900 Recall that St. Andrew’s College received 900 applications from prospective students. The application form contains a variety of information including the individual’s Scholastic Aptitude Test (SAT) score and whether or not the individual desires on-campus housing. n Example: St. Andrew’s College Point Estimation At a meeting in a few hours, the Director of At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the population of 900 applicants.

18 18 Slide Point Estimation n Example: St. Andrew’s College However, the necessary data on the applicants have However, the necessary data on the applicants have not yet been entered in the college’s computerized database. So, the Director decides to estimate the values of the population parameters of interest based on sample statistics. The sample of 30 applicants selected earlier with Excel’s RAND function will be used.

19 19 Slide n Excel Value Worksheet (Sorted) Note: Rows 10-31 are not shown. Point Estimation Using Excel AB 1 Applicant Number 2 3 4 5 6 7 8 9 Random Number 0.00027 0.00192 0.00303 0.00481 0.00538 0.00583 0.00649 0.00667 12 773 408 58 116 185 510 394 CD SAT Score On-Campus Housing 1207No 1143Yes 1091Yes 1108No 1227Yes 982Yes 1363Yes 1108No

20 20 Slide as Point Estimator of  as Point Estimator of  n as Point Estimator of p Point Estimation Note: Different random numbers would have identified a different sample which would have resulted in different point estimates. s as Point Estimator of  s as Point Estimator of 

21 21 Slide n Population Mean SAT Score n Population Standard Deviation for SAT Score n Population Proportion Wanting On-Campus Housing Once all the data for the 900 applicants were entered Once all the data for the 900 applicants were entered in the college’s database, the values of the population parameters of interest were calculated. Point Estimation

22 22 Slide PopulationParameterPointEstimatorPointEstimateParameterValue  = Population mean SAT score SAT score 10901097  = Population std. deviation for deviation for SAT score SAT score 80 s = Sample std. s = Sample std. deviation for deviation for SAT score SAT score75.2 p = Population pro- portion wanting portion wanting campus housing campus housing.72.68 Summary of Point Estimates Obtained from a Simple Random Sample = Sample mean = Sample mean SAT score SAT score = Sample pro- = Sample pro- portion wanting portion wanting campus housing campus housing

23 23 Slide Practical Advice The target population is the population we want to The target population is the population we want to make inferences about. make inferences about. The target population is the population we want to The target population is the population we want to make inferences about. make inferences about. Whenever a sample is used to make inferences about Whenever a sample is used to make inferences about a population, we should make sure that the targeted a population, we should make sure that the targeted population and the sampled population are in close population and the sampled population are in close agreement. agreement. Whenever a sample is used to make inferences about Whenever a sample is used to make inferences about a population, we should make sure that the targeted a population, we should make sure that the targeted population and the sampled population are in close population and the sampled population are in close agreement. agreement. The sampled population is the population from The sampled population is the population from which the sample is actually taken. which the sample is actually taken. The sampled population is the population from The sampled population is the population from which the sample is actually taken. which the sample is actually taken.

24 24 Slide n Process of Statistical Inference The value of is used to make inferences about the value of . The sample data provide a value for the sample mean. A simple random sample of n elements is selected from the population. Population with mean  = ? Sampling Distribution of

25 25 Slide The sampling distribution of is the probability The sampling distribution of is the probability distribution of all possible values of the sample mean. Sampling Distribution of where:  = the population mean  = the population mean E ( ) =  Expected Value of Expected Value of

26 26 Slide Sampling Distribution of We will use the following notation to define the We will use the following notation to define the standard deviation of the sampling distribution of.  = the standard deviation of  = the standard deviation of the population n = the sample size N = the population size Standard Deviation of Standard Deviation of

27 27 Slide Sampling Distribution of Finite Population Infinite Population is referred to as the standard error of the is referred to as the standard error of the mean. mean. A finite population is treated as being A finite population is treated as being infinite if n / N <.05. infinite if n / N <.05. is the finite population is the finite population correction factor. correction factor. Standard Deviation of Standard Deviation of

28 28 Slide When the population has a normal distribution, the sampling distribution of is normally distributed for any sample size. In cases where the population is highly skewed or outliers are present, samples of size 50 may be needed. In most applications, the sampling distribution of can be approximated by a normal distribution whenever the sample is size 30 or more. Sampling Distribution of

29 29 Slide Sampling Distribution of The sampling distribution of can be used to provide probability information about how close the sample mean is to the population mean .

30 30 Slide SamplingDistributionof for SAT Scores n Example: St. Andrew’s College Sampling Distribution of

31 31 Slide What is the probability that a simple random What is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/  10 of the actual population mean  ? n Example: St. Andrew’s College Sampling Distribution of In other words, what is the probability that will In other words, what is the probability that will be between 1080 and 1100?

32 32 Slide Step 1: Calculate the z -value at the upper endpoint of the interval. the interval. z = (1100  1090)/14.6=.68 P ( z <.68) =.7517 Step 2: Find the area under the curve to the left of the upper endpoint. upper endpoint. Sampling Distribution of n Example: St. Andrew’s College

33 33 Slide Cumulative Probabilities for the Standard Normal Distribution the Standard Normal Distribution Sampling Distribution of n Example: St. Andrew’s College

34 34 Slide 10901100 Area =.7517 Sampling Distribution of n Example: St. Andrew’s College SamplingDistributionof for SAT Scores

35 35 Slide Step 3: Calculate the z -value at the lower endpoint of the interval. the interval. Step 4: Find the area under the curve to the left of the lower endpoint. lower endpoint. z = (1080  1090)/14.6= -.68 P ( z < -.68) =.2483 Sampling Distribution of n Example: St. Andrew’s College

36 36 Slide Sampling Distribution of for SAT Scores 10801090 Area =.2483 n Example: St. Andrew’s College SamplingDistributionof for SAT Scores

37 37 Slide Sampling Distribution of for SAT Scores Step 5: Calculate the area under the curve between the lower and upper endpoints of the interval. the lower and upper endpoints of the interval. P (-.68 < z <.68) = P ( z <.68)  P ( z < -.68) =.7517 .2483 =.5034 The probability that the sample mean SAT score will be between 1080 and 1100 is: P (1080 < < 1100) =.5034 n Example: St. Andrew’s College

38 38 Slide 110010801090 Sampling Distribution of for SAT Scores Area =.5034 n Example: St. Andrew’s College SamplingDistributionof for SAT Scores

39 39 Slide Relationship Between the Sample Size and the Sampling Distribution of and the Sampling Distribution of Suppose we select a simple random sample of 100 Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered. applicants instead of the 30 originally considered. E ( ) =  regardless of the sample size. In our E ( ) =  regardless of the sample size. In our example, E ( ) remains at 1090. example, E ( ) remains at 1090. Whenever the sample size is increased, the standard Whenever the sample size is increased, the standard error of the mean is decreased. With the increase error of the mean is decreased. With the increase in the sample size to n = 100, the standard error of in the sample size to n = 100, the standard error of the mean is decreased from 14.6 to: the mean is decreased from 14.6 to: n Example: St. Andrew’s College

40 40 Slide Relationship Between the Sample Size and the Sampling Distribution of and the Sampling Distribution of With n = 30, With n = 100, n Example: St. Andrew’s College

41 41 Slide Recall that when n = 30, P (1080 < < 1100) =.5034. Recall that when n = 30, P (1080 < < 1100) =.5034. Relationship Between the Sample Size and the Sampling Distribution of and the Sampling Distribution of We follow the same steps to solve for P (1080 < We follow the same steps to solve for P (1080 < < 1100) when n = 100 as we showed earlier when < 1100) when n = 100 as we showed earlier when n = 30. n = 30. Now, with n = 100, P (1080 < < 1100) =.7888. Now, with n = 100, P (1080 < < 1100) =.7888. Because the sampling distribution with n = 100 has a Because the sampling distribution with n = 100 has a smaller standard error, the values of have less smaller standard error, the values of have less variability and tend to be closer to the population variability and tend to be closer to the population mean than the values of with n = 30. mean than the values of with n = 30. n Example: St. Andrew’s College

42 42 Slide Relationship Between the Sample Size and the Sampling Distribution of and the Sampling Distribution of110010801090 Area =.7888 n Example: St. Andrew’s College SamplingDistributionof for SAT Scores

43 43 Slide Other Sampling Methods n Stratified Random Sampling n Cluster Sampling n Systematic Sampling n Convenience Sampling n Judgment Sampling

44 44 Slide The population is first divided into groups of The population is first divided into groups of elements called strata. elements called strata. The population is first divided into groups of The population is first divided into groups of elements called strata. elements called strata. Stratified Random Sampling Each element in the population belongs to one and Each element in the population belongs to one and only one stratum. only one stratum. Each element in the population belongs to one and Each element in the population belongs to one and only one stratum. only one stratum. Best results are obtained when the elements within Best results are obtained when the elements within each stratum are as much alike as possible each stratum are as much alike as possible (i.e. a homogeneous group). (i.e. a homogeneous group). Best results are obtained when the elements within Best results are obtained when the elements within each stratum are as much alike as possible each stratum are as much alike as possible (i.e. a homogeneous group). (i.e. a homogeneous group).

45 45 Slide Stratified Random Sampling A simple random sample is taken from each stratum. A simple random sample is taken from each stratum. Formulas are available for combining the stratum Formulas are available for combining the stratum sample results into one population parameter sample results into one population parameter estimate. estimate. Formulas are available for combining the stratum Formulas are available for combining the stratum sample results into one population parameter sample results into one population parameter estimate. estimate. Advantage: If strata are homogeneous, this method Advantage: If strata are homogeneous, this method is as “precise” as simple random sampling but with is as “precise” as simple random sampling but with a smaller total sample size. a smaller total sample size. Advantage: If strata are homogeneous, this method Advantage: If strata are homogeneous, this method is as “precise” as simple random sampling but with is as “precise” as simple random sampling but with a smaller total sample size. a smaller total sample size. Example: The basis for forming the strata might be Example: The basis for forming the strata might be department, location, age, industry type, and so on. department, location, age, industry type, and so on. Example: The basis for forming the strata might be Example: The basis for forming the strata might be department, location, age, industry type, and so on. department, location, age, industry type, and so on.

46 46 Slide Cluster Sampling The population is first divided into separate groups The population is first divided into separate groups of elements called clusters. of elements called clusters. The population is first divided into separate groups The population is first divided into separate groups of elements called clusters. of elements called clusters. Ideally, each cluster is a representative small-scale Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group). version of the population (i.e. heterogeneous group). Ideally, each cluster is a representative small-scale Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group). version of the population (i.e. heterogeneous group). A simple random sample of the clusters is then taken. A simple random sample of the clusters is then taken. All elements within each sampled (chosen) cluster All elements within each sampled (chosen) cluster form the sample. form the sample. All elements within each sampled (chosen) cluster All elements within each sampled (chosen) cluster form the sample. form the sample.

47 47 Slide Cluster Sampling Advantage: The close proximity of elements can be Advantage: The close proximity of elements can be cost effective (i.e. many sample observations can be cost effective (i.e. many sample observations can be obtained in a short time). obtained in a short time). Advantage: The close proximity of elements can be Advantage: The close proximity of elements can be cost effective (i.e. many sample observations can be cost effective (i.e. many sample observations can be obtained in a short time). obtained in a short time). Disadvantage: This method generally requires a Disadvantage: This method generally requires a larger total sample size than simple or stratified larger total sample size than simple or stratified random sampling. random sampling. Disadvantage: This method generally requires a Disadvantage: This method generally requires a larger total sample size than simple or stratified larger total sample size than simple or stratified random sampling. random sampling. Example: A primary application is area sampling, Example: A primary application is area sampling, where clusters are city blocks or other well-defined where clusters are city blocks or other well-defined areas. areas. Example: A primary application is area sampling, Example: A primary application is area sampling, where clusters are city blocks or other well-defined where clusters are city blocks or other well-defined areas. areas.

48 48 Slide Systematic Sampling If a sample size of n is desired from a population If a sample size of n is desired from a population containing N elements, we might sample one containing N elements, we might sample one element for every n / N elements in the population. element for every n / N elements in the population. If a sample size of n is desired from a population If a sample size of n is desired from a population containing N elements, we might sample one containing N elements, we might sample one element for every n / N elements in the population. element for every n / N elements in the population. We randomly select one of the first n / N elements We randomly select one of the first n / N elements from the population list. from the population list. We randomly select one of the first n / N elements We randomly select one of the first n / N elements from the population list. from the population list. We then select every n / N th element that follows in We then select every n / N th element that follows in the population list. the population list. We then select every n / N th element that follows in We then select every n / N th element that follows in the population list. the population list.

49 49 Slide Systematic Sampling This method has the properties of a simple random This method has the properties of a simple random sample, especially if the list of the population sample, especially if the list of the population elements is a random ordering. elements is a random ordering. This method has the properties of a simple random This method has the properties of a simple random sample, especially if the list of the population sample, especially if the list of the population elements is a random ordering. elements is a random ordering. Advantage: The sample usually will be easier to Advantage: The sample usually will be easier to identify than it would be if simple random sampling identify than it would be if simple random sampling were used. were used. Advantage: The sample usually will be easier to Advantage: The sample usually will be easier to identify than it would be if simple random sampling identify than it would be if simple random sampling were used. were used. Example: Selecting every 100 th listing in a telephone Example: Selecting every 100 th listing in a telephone book after the first randomly selected listing book after the first randomly selected listing Example: Selecting every 100 th listing in a telephone Example: Selecting every 100 th listing in a telephone book after the first randomly selected listing book after the first randomly selected listing

50 50 Slide Convenience Sampling It is a nonprobability sampling technique. Items are It is a nonprobability sampling technique. Items are included in the sample without known probabilities included in the sample without known probabilities of being selected. of being selected. It is a nonprobability sampling technique. Items are It is a nonprobability sampling technique. Items are included in the sample without known probabilities included in the sample without known probabilities of being selected. of being selected. Example: A professor conducting research might use Example: A professor conducting research might use student volunteers to constitute a sample. student volunteers to constitute a sample. Example: A professor conducting research might use Example: A professor conducting research might use student volunteers to constitute a sample. student volunteers to constitute a sample. The sample is identified primarily by convenience. The sample is identified primarily by convenience.

51 51 Slide Advantage: Sample selection and data collection are Advantage: Sample selection and data collection are relatively easy. relatively easy. Advantage: Sample selection and data collection are Advantage: Sample selection and data collection are relatively easy. relatively easy. Disadvantage: It is impossible to determine how Disadvantage: It is impossible to determine how representative of the population the sample is. representative of the population the sample is. Disadvantage: It is impossible to determine how Disadvantage: It is impossible to determine how representative of the population the sample is. representative of the population the sample is. Convenience Sampling

52 52 Slide Judgment Sampling The person most knowledgeable on the subject of the The person most knowledgeable on the subject of the study selects elements of the population that he or study selects elements of the population that he or she feels are most representative of the population. she feels are most representative of the population. The person most knowledgeable on the subject of the The person most knowledgeable on the subject of the study selects elements of the population that he or study selects elements of the population that he or she feels are most representative of the population. she feels are most representative of the population. It is a nonprobability sampling technique. It is a nonprobability sampling technique. Example: A reporter might sample three or four Example: A reporter might sample three or four senators, judging them as reflecting the general senators, judging them as reflecting the general opinion of the senate. opinion of the senate. Example: A reporter might sample three or four Example: A reporter might sample three or four senators, judging them as reflecting the general senators, judging them as reflecting the general opinion of the senate. opinion of the senate.

53 53 Slide Judgment Sampling Advantage: It is a relatively easy way of selecting a Advantage: It is a relatively easy way of selecting a sample. sample. Advantage: It is a relatively easy way of selecting a Advantage: It is a relatively easy way of selecting a sample. sample. Disadvantage: The quality of the sample results Disadvantage: The quality of the sample results depends on the judgment of the person selecting the depends on the judgment of the person selecting the sample. sample. Disadvantage: The quality of the sample results Disadvantage: The quality of the sample results depends on the judgment of the person selecting the depends on the judgment of the person selecting the sample. sample.


Download ppt "1 1 Slide Sampling and Sampling Distributions Sampling Distribution of Sampling Distribution of Introduction to Sampling Distributions Introduction to."

Similar presentations


Ads by Google