QMS 6351 Statistics and Research Methods Chapter 7 Sampling and Sampling Distributions Prof. Vera Adamchik.


1 QMS 6351 Statistics and Research Methods Chapter 7 Sampling and Sampling Distributions Prof. Vera Adamchik

2 Chapter 7 Outline Simple random sampling; Point estimation; Introduction to sampling distributions; Sampling distribution of $\bar{x}$; Sampling distribution of $\bar{p}$; Other sampling methods

3 Statistical inference The purpose of statistical inference is to obtain information about a population from information contained in a sample. A population is the set of all the elements of interest in a study. A sample is a subset of the population.

4 A parameter is a numerical characteristic of a population. A sample statistic is a numerical characteristic of a sample. We will use a sample statistic in order to judge tentatively or approximately the value of the population parameter.

5 The sample results provide only estimates (that is, rough and approximate values) of the values of the population characteristics. The reason is simply that the sample contains only a portion of the population. With proper sampling methods, the sample results will provide “good” estimates of the population characteristics.

6 Simple random sampling procedure

7 Selecting a sample Sampling from a finite population. Finite populations are often defined by lists such as organization membership roster, class roster, inventory product numbers, etc. Sampling from an infinite population (a process). The population is usually considered infinite if it involves an ongoing process that makes listing or counting every element impossible. For example, parts being manufactured on a production line, customers entering a store, etc.

8 Sampling from a finite population A simple random sample from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected. Replacing each sampled element before selecting subsequent elements is called sampling with replacement. Sampling without replacement is the procedure used most often. In large sampling projects, computer-generated random numbers are often used to automate the sample selection process.

9 Example: St. Andrew’s College St. Andrew’s College received 900 applications for admission in the upcoming year from prospective students. The applicants were numbered from 1 to 900 as their applications arrived. The Director of Admissions would like to select a simple random sample of 30 applicants.

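10 Sampling from a finite population using Excel =RAND() generates a random number greater than or equal to 0 and less than 1. =RAND()*N generates a random number greater than or equal to 0 and less than N. =INT(RAND()*900) truncates that value to an integer from 0 to 899, so =INT(RAND()*900)+1 yields an applicant number from 1 to 900.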
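The same selection can be sketched outside Excel. The following is a minimal Python sketch, assuming the applicants are numbered 1 to 900 as in the example; the function name `select_applicants` and the seed are illustrative assumptions, not part of the slides. It draws 30 distinct applicant numbers, that is, it samples without replacement.

```python
import random

def select_applicants(population_size=900, sample_size=30, seed=None):
    """Return a simple random sample of applicant numbers, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    # range(1, N + 1) stands in for the numbered list of applicants 1..N
    return rng.sample(range(1, population_size + 1), sample_size)

sample = select_applicants(seed=1)
print(sorted(sample))
```

Because `random.sample` never returns the same element twice, every subset of 30 applicants is equally likely, which is the defining property of a simple random sample from a finite population.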

11 Sampling from an infinite population In the case of infinite populations, it is impossible to obtain a list of all elements in the population. The random number selection procedure cannot be used for infinite populations.

12 Sampling from an infinite population A simple random sample from an infinite population is a sample selected such that the following conditions are satisfied: (1) Each element selected comes from the same population. (2) Each element is selected independently.

13 Point estimation

14 In point estimation we use the data from the sample to compute a value of a sample statistic that serves as an estimate of a population parameter. A point estimate is a statistic computed from a sample that gives a single value for the population parameter. An estimator is a rule or strategy for using the data to estimate the parameter.

15 Terminology of point estimation We refer to $\bar{x}$ as the point estimator of the population mean $\mu$. We refer to $s$ as the point estimator of the population standard deviation $\sigma$.

16 Terminology of point estimation We refer to $\bar{p}$ as the point estimator of the population proportion $p$. The actual numerical value obtained for $\bar{x}$, $s$, or $\bar{p}$ in a particular sample is called the point estimate of the parameter.

17 Example: St. Andrew’s College Recall that St. Andrew’s College received 900 applications from prospective students. The application form contains a variety of information including the individual’s scholastic aptitude test (SAT) score and whether or not the individual desires on-campus housing. At a meeting in a few hours, the Director of Admissions would like to announce the average SAT score and the proportion of applicants that want to live on campus, for the population of 900 applicants.

18 Example: St. Andrew’s College However, the necessary data on the applicants have not yet been entered in the college’s computerized database. So, the Director decides to estimate the values of the population parameters of interest based on sample statistics. The sample of 30 applicants selected earlier with Excel’s RAND() function will be used.

19 Point estimation using Excel Excel Value Worksheet (Note: Rows 10-31 are not shown.)
Row   A: Applicant Number   C: SAT Score   D: On-Campus Housing
2     12                    1107           No
3     773                   1043           Yes
4     408                   991            Yes
5     58                    1008           No
6     116                   1127           Yes
7     185                   982            Yes
8     510                   1163           Yes
9     394                   1008           No

20 Point estimates Note: Different random numbers would have identified a different sample which would have resulted in different point estimates.
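As a rough illustration of how the three point estimators are computed, the sketch below applies them to the eight worksheet rows shown on the preceding slide. This is an assumption-laden example: the remaining 22 rows of the sample are not shown, so these results will not match the estimates reported for the full sample of 30. The variable names are illustrative.

```python
import math

# The eight visible rows of the worksheet (rows 10-31 are not shown in the slides)
sat_scores = [1107, 1043, 991, 1008, 1127, 982, 1163, 1008]
wants_housing = ["No", "Yes", "Yes", "No", "Yes", "Yes", "Yes", "No"]

n = len(sat_scores)

# Sample mean: sum of the observations divided by n
x_bar = sum(sat_scores) / n

# Sample standard deviation: note the n - 1 divisor
s = math.sqrt(sum((x - x_bar) ** 2 for x in sat_scores) / (n - 1))

# Sample proportion: number of "Yes" answers divided by n
p_bar = wants_housing.count("Yes") / n

print(f"x_bar = {x_bar:.1f}, s = {s:.1f}, p_bar = {p_bar:.3f}")
```

Each formula is an estimator (a rule); the numbers it produces for one particular sample are the point estimates.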

21 Population parameters Once all the data for the 900 applicants were entered in the college’s database, the values of the population parameters of interest were calculated.

22 Summary of point estimates obtained from a simple random sample
Population parameter                                   Parameter value   Point estimator                                          Point estimate
$\mu$ = Population mean SAT score                      990               $\bar{x}$ = Sample mean SAT score                        997
$\sigma$ = Population std. deviation for SAT score     80                $s$ = Sample std. deviation for SAT score                75.2
$p$ = Population proportion wanting campus housing     .72               $\bar{p}$ = Sample proportion wanting campus housing     .68

23 Making inferences about a population mean

24 Making inferences about a population mean Population with mean $\mu$ = ? A simple random sample of n elements is selected from the population. The sample data provide a value for the sample mean $\bar{x}$. The value of $\bar{x}$ is used to make inferences about the value of $\mu$.

25 Population vs sampling distribution The population distribution is the probability distribution derived from the information on all elements of a population. The probability distribution of a sample statistic (for example, $\bar{x}$) is called its sampling distribution.

26 Sampling distribution of $\bar{x}$ The sampling distribution of the sample mean ($\bar{x}$) is the probability distribution of all possible values of $\bar{x}$. We need to know: the expected value of $\bar{x}$; the standard deviation of $\bar{x}$; the form of the sampling distribution of $\bar{x}$.

27 Mean of the sampling distribution of $\bar{x}$ The mean of the sampling distribution of $\bar{x}$ is equal to the mean of the population. Thus, $E(\bar{x}) = \mu$.

28 Standard deviation of the sampling distribution of $\bar{x}$ Finite population and $n/N > 0.05$: $\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}}$. (1) Infinite population (N is unknown) or (2) finite population and $n/N \le 0.05$: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$. $\sigma_{\bar{x}}$ is referred to as the standard error of the mean. $\sqrt{\frac{N-n}{N-1}}$ is the finite population correction factor.
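The rule on this slide can be sketched as a small Python function. This is a minimal sketch under stated assumptions: the function name and the `threshold` parameter are illustrative, the standard error is taken as sigma/sqrt(n), and the finite population correction factor sqrt((N - n)/(N - 1)) is applied only when n/N exceeds 0.05. The first call uses the St. Andrew's figures quoted elsewhere in the slides (population standard deviation 80, n = 30, N = 900); the second uses hypothetical numbers.

```python
import math

def standard_error_of_mean(sigma, n, N=None, threshold=0.05):
    """Standard error of the sample mean, with the finite population
    correction applied only when the population is finite and n/N > threshold."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > threshold:
        se *= math.sqrt((N - n) / (N - 1))  # finite population correction factor
    return se

# St. Andrew's example: n/N = 30/900 is below 0.05, so no correction is applied
se_30 = standard_error_of_mean(sigma=80, n=30, N=900)
print(round(se_30, 1))

# Hypothetical case where n/N = 25/100 exceeds 0.05, so the correction applies
se_corrected = standard_error_of_mean(sigma=10, n=25, N=100)
print(round(se_corrected, 3))
```

Passing `N=None` treats the population as infinite, which matches case (1) on the slide.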

29 Two important observations 1. The spread of the sampling distribution of $\bar{x}$ is smaller than the spread of the corresponding population distribution. In other words, $\sigma_{\bar{x}} < \sigma$. 2. The standard deviation of the sampling distribution of $\bar{x}$ decreases as the sample size increases.

30 Form of the sampling distribution of $\bar{x}$ 1. The population has a normal distribution. If the population from which the samples are drawn is normally distributed, then the sampling distribution of the sample mean will also be normally distributed for any sample size.

31 Form of the sampling distribution of $\bar{x}$ 2. The population is not normally distributed but the sample size is large ($n \ge 30$). According to the Central Limit Theorem, for a large sample size ($n \ge 30$), the sampling distribution of the sample mean is approximately normal, irrespective of the shape of the population distribution. In cases where the population is highly skewed or outliers are present, samples of size 50 or more may be needed.

32 Form of the sampling distribution of $\bar{x}$ 3. The sample size is small ($n < 30$) and the population is not normally distributed. Use special statistical procedures.
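The Central Limit Theorem statement can be explored with a short simulation. The sketch below is illustrative only: it assumes a skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here for convenience and not taken from the slides), draws many samples of size 30, and compares the mean and standard deviation of the resulting sample means with the population mean and with sigma/sqrt(n).

```python
import math
import random
import statistics

def simulate_sample_means(n=30, num_samples=5000, seed=0):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

means = simulate_sample_means()
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean is 1)")
print(f"sd of sample means:   {sd_of_means:.3f} (sigma/sqrt(n) is {1 / math.sqrt(30):.3f})")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is strongly skewed.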

33 Example: St. Andrew’s College Sampling distribution of $\bar{x}$ for SAT scores: $E(\bar{x}) = \mu = 990$ and $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 80/\sqrt{30} = 14.6$.

34 What is the probability that the sample mean will be between 980 and 1000? In other words, what is the probability that a simple random sample of 30 applicants will provide an estimate of the population mean SAT score that is within +/-10 of the actual population mean $\mu$? $P(980 < \bar{x} < 1000) = ?$

35 Example: St. Andrew’s College [Figure: sampling distribution of $\bar{x}$ for SAT scores, centered at 990; the area between 980 and 1000 is .5034.]

36 Example: St. Andrew’s College The probability of 0.5034 means that, for a large number of samples of size 30 selected from the population, we can expect that in 50.34% of all cases the sample mean will be within +/-10 of the actual population mean (that is, 980-1000) and in 49.66% of all cases the sample mean will be further than +/-10 from the actual population mean (that is, below 980 or above 1000).

37 Example: St. Andrew’s College Relationship between the sample size and the sampling distribution of $\bar{x}$ Suppose we select a simple random sample of 100 applicants instead of the 30 originally considered.

38 $E(\bar{x}) = \mu$ regardless of the sample size. In our example, $E(\bar{x})$ remains at 990. Whenever the sample size is increased, the standard error of the mean is decreased. With the increase in the sample size to n = 100, the standard error of the mean is decreased from 14.6 to: $\sigma_{\bar{x}} = \sigma/\sqrt{n} = 80/\sqrt{100} = 8.0$

39 Example: St. Andrew’s College Relationship between the sample size and the sampling distribution of $\bar{x}$ [Figure: the two sampling distributions of $\bar{x}$. With n = 30, $\sigma_{\bar{x}} = 14.6$. With n = 100, $\sigma_{\bar{x}} = 8.0$.]

40 Example: St. Andrew’s College Recall that when n = 30, $P(980 < \bar{x} < 1000) = .5034$. Now, with n = 100, $P(980 < \bar{x} < 1000) = .7888$. Because the sampling distribution with n = 100 has a smaller standard error, the values of $\bar{x}$ have less variability and tend to be closer to the population mean than the values of $\bar{x}$ with n = 30.

41 Example: St. Andrew’s College [Figure: sampling distribution of $\bar{x}$ for SAT scores with n = 100, centered at 990; the area between 980 and 1000 is .7888.]

42 Example: St. Andrew’s College The probability of 0.7888 means that, for a large number of samples of size 100 selected from the population, we can expect that in 78.88% of all cases the sample mean will be within +/-10 of the actual population mean (that is, 980-1000) and in 21.12% of all cases the sample mean will be further than +/-10 from the actual population mean (that is, below 980 or above 1000).
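The two probabilities in this example can be checked numerically. The following is a hedged sketch using Python's `statistics.NormalDist`; it assumes a population mean of 990 and a population standard deviation of 80, a normal sampling distribution, and a standard error of sigma/sqrt(n) with no finite population correction. The helper names are illustrative. The slides' value of .5034 appears to come from rounding z to two decimals for a table lookup, so the unrounded result for n = 30 differs slightly.

```python
import math
from statistics import NormalDist

MU, SIGMA = 990, 80  # population mean and standard deviation of SAT scores

def prob_within(n, margin=10):
    """P(MU - margin < x_bar < MU + margin) using the unrounded standard error."""
    se = SIGMA / math.sqrt(n)
    dist = NormalDist(MU, se)
    return dist.cdf(MU + margin) - dist.cdf(MU - margin)

def prob_within_table(n, margin=10):
    """Same probability, but with z rounded to two decimals as in a z-table lookup."""
    z = round(margin / (SIGMA / math.sqrt(n)), 2)
    return 2 * NormalDist().cdf(z) - 1

p_30 = prob_within(30)
p_30_table = prob_within_table(30)
p_100 = prob_within(100)

print(f"n = 30:  exact {p_30:.4f}, table-style {p_30_table:.4f}")
print(f"n = 100: exact {p_100:.4f}")
```

The increase in probability from n = 30 to n = 100 comes entirely from the smaller standard error; the center of the distribution does not move.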

43 Making inferences about a population proportion

44 Making inferences about a population proportion Population with proportion $p$ = ? A simple random sample of n elements is selected from the population. The sample data provide a value for the sample proportion $\bar{p}$. The value of $\bar{p}$ is used to make inferences about the value of $p$.

45 Sampling distribution of $\bar{p}$ The sampling distribution of the sample proportion ($\bar{p}$) is the probability distribution of all possible values of $\bar{p}$. We need to know: the expected value of $\bar{p}$; the standard deviation of $\bar{p}$; the form of the sampling distribution of $\bar{p}$.

46 Mean of the sampling distribution of $\bar{p}$ The mean of the sampling distribution of $\bar{p}$ is equal to the population proportion. Thus, $E(\bar{p}) = p$.

47 Standard deviation of the sampling distribution of $\bar{p}$ Finite population and $n/N > 0.05$: $\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \cdot \sqrt{\frac{p(1-p)}{n}}$. (1) Infinite population (N is unknown) or (2) finite population and $n/N \le 0.05$: $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$. $\sigma_{\bar{p}}$ is referred to as the standard error of the proportion.

48 Form of the sampling distribution of $\bar{p}$ The sampling distribution of $\bar{p}$ can be approximated by a normal probability distribution whenever the sample size is large. The sample size is considered large whenever the following two conditions are satisfied: $np \ge 5$ and $n(1-p) \ge 5$.

49 Form of the sampling distribution of $\bar{p}$ For values of p near .50, sample sizes as small as 10 permit a normal approximation. With very small (approaching 0) or very large (approaching 1) values of p, much larger samples are needed.

50 Example: St. Andrew’s College For our example, with n = 30 and p = .72, the normal distribution is an acceptable approximation because: np = 30(.72) = 21.6 > 5 and n(1 - p) = 30(.28) = 8.4 > 5.

51 Example: St. Andrew’s College Sampling distribution of $\bar{p}$: $E(\bar{p}) = .72$ and $\sigma_{\bar{p}} = \sqrt{.72(1 - .72)/30} = .082$.

52 Example: St. Andrew’s College Recall that 72% of the prospective students applying to St. Andrew’s College desire on-campus housing. What is the probability that a simple random sample of 30 applicants will provide an estimate of the population proportion of applicants desiring on-campus housing that is within plus or minus .05 of the actual population proportion? $P(.67 < \bar{p} < .77) = ?$

53 Example: St. Andrew’s College [Figure: sampling distribution of $\bar{p}$, centered at .72; the area between .67 and .77 is .4582.]
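The proportion example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: p = .72 and n = 30 as in the example, a standard error of sqrt(p(1 - p)/n) with no finite population correction (n/N = 30/900 is below 0.05), and a normal approximation justified by the np and n(1 - p) checks. Variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.72, 30

# Large-sample check for the normal approximation
large_enough = n * p >= 5 and n * (1 - p) >= 5

# Standard error of the proportion
se_p = math.sqrt(p * (1 - p) / n)

# P(.67 < p_bar < .77) under the normal approximation
dist = NormalDist(p, se_p)
prob = dist.cdf(0.77) - dist.cdf(0.67)

print(f"normal approximation acceptable: {large_enough}")
print(f"standard error = {se_p:.3f}, probability = {prob:.4f}")
```

Small differences in the last digit relative to the slide's .4582 can arise because table lookups round z to two decimals.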

54 Other sampling methods

55 Stratified random sampling Cluster sampling Systematic sampling Convenience sampling Judgment sampling

56 Stratified random sampling The population is first divided into groups of elements called strata. Each element in the population belongs to one and only one stratum. Best results are obtained when the elements within each stratum are as much alike as possible (i.e. a homogeneous group). A simple random sample is taken from each stratum.
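Stratified random sampling can be sketched as one simple random sample per stratum. The example below is hypothetical throughout: the strata names, their members, and the per-stratum sample sizes are invented for illustration, and the slides do not say how sample sizes should be allocated across strata.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a simple random sample (without replacement) from each stratum.

    strata: dict mapping stratum name -> list of elements
    sizes:  dict mapping stratum name -> number of elements to draw
    """
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}

# Hypothetical population split into two non-overlapping strata
strata = {
    "first_year": list(range(0, 100)),
    "transfer": list(range(100, 140)),
}
sizes = {"first_year": 10, "transfer": 4}

result = stratified_sample(strata, sizes, seed=3)
print(result)
```

Every stratum is represented in the final sample, which is the main contrast with a single simple random sample drawn from the pooled population.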

57 Cluster sampling The population is first divided into separate groups of elements called clusters. Ideally, each cluster is a representative small-scale version of the population (i.e. heterogeneous group). A simple random sample of the clusters is then taken. All elements within each sampled (chosen) cluster form the sample.
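Cluster sampling reverses the logic: whole groups are sampled, and every element of a chosen group is kept. The sketch below is a hypothetical illustration; the cluster names and contents are invented, and the function name is an assumption.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose num_clusters clusters and return (chosen names, all their elements).

    clusters: dict mapping cluster name -> list of elements
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)  # simple random sample of clusters
    elements = [e for name in chosen for e in clusters[name]]
    return chosen, elements

# Hypothetical population grouped into four clusters of five elements each
clusters = {
    "block_A": [1, 2, 3, 4, 5],
    "block_B": [6, 7, 8, 9, 10],
    "block_C": [11, 12, 13, 14, 15],
    "block_D": [16, 17, 18, 19, 20],
}

chosen, elements = cluster_sample(clusters, num_clusters=2, seed=4)
print(chosen, elements)
```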

58 Systematic sampling If a sample size of n is desired from a population containing N elements, we might sample one element for every N/n elements in the population. We randomly select one of the first N/n elements from the population list. We then select every (N/n)th element that follows in the population list.
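A systematic sample can be sketched as follows. This is a minimal sketch, assuming the sampling interval is k = N // n (population size divided by sample size, rounded down) and that the population is available as an ordered list; the function name is illustrative. The usage example reuses the 900 numbered applicants and a sample size of 30, which gives an interval of 30.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k elements, then every k-th element."""
    N = len(population)
    k = N // n  # sampling interval
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # index of one of the first k elements
    return [population[start + i * k] for i in range(n)]

applicants = list(range(1, 901))  # applicant numbers 1..900
picked = systematic_sample(applicants, 30, seed=2)
print(picked)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.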

59 Convenience sampling It is a nonprobability sampling technique. Items are included in the sample without known probabilities of being selected. The sample is identified primarily by convenience. Example: A professor conducting research might use student volunteers to constitute a sample.

60 Judgment sampling The person most knowledgeable on the subject of the study selects elements of the population that he or she feels are most representative of the population. It is a nonprobability sampling technique. Example: A reporter might sample three or four senators, judging them as reflecting the general opinion of the senate.

