Learning Objectives Determine when to use sampling. Determine the pros and cons of various sampling techniques. Be aware of the different types of errors.


1 Learning Objectives Determine when to use sampling. Determine the pros and cons of various sampling techniques. Be aware of the different types of errors that can occur in a study. Understand the impact of the Central Limit Theorem on statistical analysis. Use the sampling distributions of the sample mean and sample proportion.

2 Reasons for Sampling Sampling – A means of gathering information about a population without conducting a census – Information is gathered from a sample, and an inference is made about the population. Sampling has advantages over a census – Sampling can save money. – Sampling can save time.

3 Random versus Nonrandom Sampling Nonrandom sampling – Not every unit of the population has the same probability of being included in the sample. Random sampling – Every unit of the population has the same probability of being included in the sample.

4 Sampling from a Frame A sample is taken from a population list, map, directory, or other source used to represent the population, which is called a frame. Frames can be telephone directories, school lists, trade association lists, or even lists sold by brokers. In theory, the target population and the frame are the same. In reality, frames may have over-registration or under-registration.

5 Random Sampling Techniques Simple Random Sampling – basis for other random sampling techniques – Each unit is numbered from 1 to N (the size of the population) – A random number generator can be used to select n items that form the sample – Easier to perform on small populations. The process of numbering all members of a population is cumbersome for large populations
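The numbering-and-drawing procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `simple_random_sample`, the population size of 10,000, and the fixed seed are assumptions chosen for the example.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw sample_size distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # a fixed seed only makes the sketch reproducible
    # random.sample draws without replacement, so no unit is picked twice
    units = range(1, population_size + 1)
    return sorted(rng.sample(units, sample_size))

# Hypothetical population of N = 10,000 numbered units, sample of n = 50
sample = simple_random_sample(population_size=10_000, sample_size=50, seed=1)
print(sample[:5])
```

Using `range` avoids materialising the full list of unit numbers, which eases (but does not remove) the practical burden of numbering a large population.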

6 Random Sampling Techniques Systematic Random Sampling – Every kth item is selected to produce a sample of size n from a population of size N – The value of k is called the sampling cycle – Define k = N/n. Choose one random unit from the first k units, and then select every kth unit from there – Used because of convenience and relative ease of administration – A knowledgeable person can easily determine whether a sampling plan has been followed.

7 Systematic Random Sampling: Example Purchase orders for the previous fiscal year are serialized 1 to 10,000 (N = 10,000). A sample of fifty (n = 50) purchase orders is needed for an audit. k = 10,000/50 = 200

8 Systematic Sampling: Example First sample element randomly selected from the first 200 purchase orders. Assume the 45th purchase order was selected. Subsequent sample elements: 45, 245, 445, 645,...
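The purchase-order example can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are hypothetical; the `start` argument is fixed at 45 here only to match the slide, and would normally be drawn at random from the first k units.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Select every k-th unit, where k = N // n, from a random start in 1..k."""
    k = population_size // sample_size  # sampling cycle
    if start is None:
        start = random.Random(seed).randint(1, k)  # random unit among the first k
    return [start + i * k for i in range(sample_size)]

# N = 10,000 purchase orders, n = 50, first element assumed to be order 45
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```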

9 Random Sampling Techniques Systematic Random Sampling: Problems – Problems can occur if the data are subject to any periodicity and the sampling interval is in sync with it; the sample will then be nonrandom – Example: a list of 150 college students is actually a merged list of 5 classes with 30 students in each class, each class list being ordered with the top students first and the bottom students last. Systematic sampling of every 30th student may select all top, all bottom, or all mediocre students, i.e. the list is subject to cyclical organization.
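The periodicity problem is easy to see in a small sketch. The roster below is made up for illustration: 5 classes of 30 students, each class listed from rank 1 (top) to rank 30 (bottom), with an assumed starting position of 3.

```python
# Merged roster: (class_id, rank_within_class), each class ordered best to worst
roster = [(class_id, rank) for class_id in range(1, 6) for rank in range(1, 31)]

k = 30      # sampling interval, equal to the class size
start = 3   # assumed random start within the first 30 students

# Take every 30th student beginning at position `start` (1-based)
picked = roster[start - 1::k]
ranks = {rank for _, rank in picked}

print(picked)  # one student per class
print(ranks)   # {3}: every sampled student holds the same rank
```

Because the interval matches the length of each class list, the sample contains only students of a single rank, whichever start is drawn.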

10 Random Sampling Techniques Stratified Random Sampling – The population is broken down into strata, i.e. homogeneous segments with like characteristics (e.g. men and women, OR old, young, and middle-aged people, OR high-income, mid-income, and low-income groups), and then simple or systematic random sampling is done within each stratum. – Efficient when differences between strata exist – The technique capitalizes on the known homogeneity of subpopulations so that only relatively small samples are required to estimate the characteristic for each stratum or group – Proportionate: the percentage of the sample taken from each stratum equals the percentage of the whole population that the stratum represents
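A proportionate stratified sample can be sketched as follows. The income strata, their sizes (200, 500, 300), and the function name `proportionate_stratified_sample` are invented for the illustration; the slide does not give any numbers.

```python
import random

def proportionate_stratified_sample(strata, sample_size, seed=None):
    """Simple random sample within each stratum, sized by the stratum's population share."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding can make the overall total differ slightly from sample_size
        n_stratum = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, n_stratum)
    return sample

# Hypothetical population of 1,000 people in three income strata
strata = {
    "high_income": [f"H{i}" for i in range(200)],
    "mid_income": [f"M{i}" for i in range(500)],
    "low_income": [f"L{i}" for i in range(300)],
}
sample = proportionate_stratified_sample(strata, sample_size=100, seed=7)
sizes = {name: len(units) for name, units in sample.items()}
print(sizes)  # {'high_income': 20, 'mid_income': 50, 'low_income': 30}
```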

11 Random Sampling Techniques Cluster (or Area) Sampling – The population is in pre-determined clusters (students in classes, colleges, towns, companies, areas of a city, geographic regions, etc.) – The technique identifies clusters that tend to be internally heterogeneous – Each cluster contains a wide variety of elements, and is a miniature of the population – A random sample of clusters is chosen and all or some units within the chosen clusters are used as the sample – Advantages: convenience and cost. Clusters are convenient to obtain, and the cost of sampling is reduced as the scope of the study is reduced to the clusters
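The cluster approach, in the variant where all units of each chosen cluster are kept, might look like the sketch below. The towns, the resident labels, and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 10 towns with 20 residents each
clusters = {
    f"town_{t}": [f"town_{t}_resident_{r}" for r in range(20)]
    for t in range(10)
}
sample = cluster_sample(clusters, n_clusters=3, seed=11)
units = [unit for members in sample.values() for unit in members]
print(sorted(sample), len(units))
```

Only the selected towns need to be visited, which is where the cost saving mentioned on the slide comes from.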

12 Random Sampling Techniques Important to remember: in Stratified Random Sampling, each stratum is a homogeneous group of the population; in Cluster Sampling, each cluster is a heterogeneous group of the population.

13 Convenience (Nonrandom) Sampling Nonrandom sampling – sampling techniques used to select elements from the population by any mechanism that does not involve a random selection process – These techniques are not desirable for making statistical inferences – Examples: choosing members of this class as an accurate representation of all students at our university; selecting the first five people who walk into a store and asking them about their shopping preferences.

14 Non-sampling Errors Non-sampling Errors – all errors that exist other than the variation expected due to random sampling – Missing data, data entry, and analysis errors – Leading questions, poorly conceived concepts, unclear definitions, and defective questionnaires – Response errors occur when people do not know, will not say, or overstate in their answers

15 Process of Inferential Statistics Select a random sample. Proper analysis and interpretation of a sample statistic requires knowledge of its distribution.

16 What is a Sampling Distribution? Recall that a statistic has a numerical value that can be computed (observed) once a sample data set is available. Three points are crucial in this context: (1) Because a sample is only a part of the population, the numerical value of a statistic cannot be expected to give us the exact value of the parameter. (2) The observed value of a statistic depends on the particular sample that happens to be selected. (3) There will be some variability in the observed values of a statistic over different occasions of sampling.

17 What is a Sampling Distribution? The value of a statistic varies in repeated sampling. In other words, a statistic is a random variable and hence has its own probability distribution. The sampling distribution is the probability distribution of a statistic. The qualifier "sampling" indicates that the distribution is conceived in the context of repeated sampling from a population. The qualifier is often dropped, and one simply says "the distribution of a statistic".

18 Statistic and Sampling Distribution In any given situation, we are often limited to one sample and the corresponding single observed value of a statistic. However, over different samples the statistic varies according to its sampling distribution. The sampling distribution of a statistic is determined by the probability distribution f(x) that governs the population and by the sample size n.

19 Central Limit Theorem Consider taking a sample of size n from a population. The sampling distribution of the sample mean is the distribution of the means of repeated samples of size n from that population. The central limit theorem states that, as the sample size increases, the shape of this distribution approaches a normal distribution (this condition is typically considered to be met when n is at least 30). The variance of the sample mean is the population variance divided by n.
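A small simulation can make the theorem concrete. The sketch below is an illustration under assumed settings, not taken from the slides: the population is exponential with mean 1 and variance 1 (clearly non-normal), the sample size is 30, and 5,000 repeated samples are drawn with a fixed seed.

```python
import random
import statistics

def draw_exponential(rng):
    """One observation from an exponential population (mean 1, variance 1)."""
    return rng.expovariate(1.0)

def sample_means(draw, n, repetitions, rng):
    """Means of `repetitions` independent samples of size n."""
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(repetitions)]

rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 30
means = sample_means(draw_exponential, n=n, repetitions=5000, rng=rng)

mean_of_means = statistics.fmean(means)
scaled_variance = statistics.variance(means) * n

print(round(mean_of_means, 3))    # should be close to the population mean, 1
print(round(scaled_variance, 3))  # should be close to the population variance, 1
```

Multiplying the variance of the sample means by n recovers roughly the population variance, which is the "divided by n" relationship stated on the slide. A histogram of `means` would also look far more symmetric than the exponential population itself.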

20 Sampling from a Normal Population The distribution of sample means is normal for any sample size.

21 z Formula for Sample Means The distribution of sample means is normal for any sample size.
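The formula itself did not survive in the transcript. The standard form is shown below, with notation assumed rather than taken from the slide: \bar{x} is the sample mean, \mu the population mean, \sigma the population standard deviation, and n the sample size.

```latex
\[
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}},
\qquad
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
\]
```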

22 Tyre Store Example Suppose that the mean expenditure per customer at a tyre store is $85.00, with a standard deviation of $9.00. If a random sample of 40 customers is taken, what is the probability that the sample average expenditure per customer for this sample will be $87.00 or more? Solution: Because the sample size is greater than 30, the central limit theorem can be used to state that the sample mean is normally distributed and the problem can proceed using the normal distribution calculations.

23 Solution to Tyre Store Example
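The worked solution is missing from the transcript. The calculation can be sketched with Python's standard library; the variable names are assumptions. The exact normal CDF gives a value slightly different from a table lookup, because a z table rounds z to 1.41 first.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 85.0, 9.0, 40      # population mean, standard deviation, sample size
x_bar = 87.0                      # sample mean of interest

standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
z = (x_bar - mu) / standard_error
p_at_least_87 = 1 - NormalDist().cdf(z)  # upper-tail probability

print(round(z, 2))                # 1.41
print(round(p_at_least_87, 2))    # 0.08
```

With z rounded to 1.41, a standard normal table gives .5000 - .4207 = .0793, the figure quoted on the following slide.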

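24 Graphic Solution to Tyre Store Example [Figure: normal curves on the X scale (mean 85, cutoff 87) and on the Z scale (mean 0, cutoff 1.41). Area of each half: .5000; area between the mean and the cutoff: .4207; the upper-tail areas on the two scales are equal, at .0793.]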

25 Demonstration Problem 7.1 Suppose that during any hour in a large department store, the average number of shoppers is 448, with a standard deviation of 21 shoppers. What is the probability that a random sample of 49 different shopping hours will yield a sample mean between 441 and 446 shoppers?

26 Demonstration Problem 7.1
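The solution slide carries no text in the transcript. A sketch of the calculation, with assumed variable names, follows. The exact result differs a little from the table-based answer because a z table rounds both z values to two decimals.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 448.0, 21.0, 49    # mean shoppers per hour, std dev, sampled hours
standard_error = sigma / sqrt(n)  # 21 / 7 = 3

z_low = (441.0 - mu) / standard_error
z_high = (446.0 - mu) / standard_error

std_normal = NormalDist()
p_between = std_normal.cdf(z_high) - std_normal.cdf(z_low)

print(round(z_low, 2), round(z_high, 2))  # -2.33 -0.67
print(round(p_between, 2))                # 0.24
```

From a standard normal table, the areas for z = -2.33 and z = -0.67 are .4901 and .2486, giving .4901 - .2486 = .2415.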

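27 Graphic Solution for Demonstration Problem 7.1 [Figure: normal curves on the X scale (mean 448, limits 441 and 446) and on the Z scale (mean 0, limits -2.33 and -.67). Area between the mean and the lower limit: .4901; area between the mean and the upper limit: .2486; area between the two limits: .2415.]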

28 Exercise in R: Normal Distribution The commands you will learn: dnorm, lines, qqnorm, qqline, rnorm, qqnormsim, pnorm, qnorm. Open URL: www.openintro.org Go to Labs in R and select 3-Distributions

29 Exercise in R: Sampling Distribution Here you will learn about the Central Limit Theorem using the sample() command. Open URL: www.openintro.org Go to Labs in R and select 4A – Intro to inference


32 Z Formula for Sample Proportions
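The formula is missing from the transcript. The standard form is given below, with notation assumed rather than taken from the slide: \hat{p} is the sample proportion, x the number of items in the sample with the characteristic, p the population proportion, and n the sample size.

```latex
\[
\hat{p} = \frac{x}{n},
\qquad
z = \frac{\hat{p} - p}{\sqrt{\dfrac{p\,q}{n}}},
\qquad
q = 1 - p
\]
```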

33 Demonstration Problem 7.3 If 10% of a population of parts is defective, what is the probability of randomly selecting 80 parts and finding that 12 or more parts are defective?

34 Solution for Demonstration Problem 7.3
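The solution slide is empty in the transcript. The calculation can be sketched as below, with assumed variable names and the usual normal approximation for a sample proportion.

```python
from math import sqrt
from statistics import NormalDist

p = 0.10                # population proportion of defective parts
n = 80                  # parts selected
p_hat = 12 / n          # sample proportion at the threshold: 0.15

standard_error = sqrt(p * (1 - p) / n)  # standard deviation of the sample proportion
z = (p_hat - p) / standard_error
p_12_or_more = 1 - NormalDist().cdf(z)  # upper-tail probability

print(round(z, 2))             # 1.49
print(round(p_12_or_more, 3))  # 0.068
```

A standard normal table gives .4319 for z = 1.49, so the tail probability is .5000 - .4319 = .0681.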

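35 Graphic Solution for Demonstration Problem 7.3 [Figure: normal curves on the sample proportion (p-hat) scale (centre 0.10, cutoff 0.15) and on the Z scale (centre 0, cutoff 1.49). Area of each half: .5000; area between the centre and the cutoff: .4319.]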

