Determining the Sample Size

Doing research costs…
Power of a hypothesis test is generally an increasing function of sample size. Margin of error is generally a decreasing function of sample size. Cost of research is generally an increasing function of sample size. Who is paying the bills?

Before the research project begins…
- Formulation of hypotheses.
- Choice of significance/confidence levels.
- What size of effect are we looking for?
- What power do we want?
- How many observations do we need?
- Can we afford it?

Research Setting
You are a statistician working for a manufacturer. A process engineer is investigating the mean amount of time required to complete an assembly task. He wishes to show that the mean assembly time is less than 30 seconds. Past experience with similar studies leads him to believe that the assembly times will be approximately normally distributed, and that the sample range for 1000 observations will be approximately 9 seconds.

- Formulation of hypotheses: In this case, $H_a: \mu < 30$ sec., so that $H_0: \mu \ge 30$ sec.
- Choice of significance/confidence levels: We weigh the possible consequences of making either a Type I error or a Type II error, and decide that we want $\alpha = 0.05$. We also want to obtain a 95% confidence interval estimate of the mean assembly time.
- Size of effect: The engineer tells us that we want to be able to detect a mean difference of 0.5 sec., i.e., if the mean assembly time is 29.5 sec. or less, we should be able to detect it.
- Power: We want to be able to detect this size of effect with probability $1 - \beta$ (the desired power; the numeric value is not preserved in this transcript). We also want to be able to estimate the mean assembly time with an interval width of 0.8 sec.

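What size sample do we need?
Our test statistic is $T = \dfrac{\bar{X} - 30}{S/\sqrt{n}}$. Under $H_0$ (at the boundary $\mu = 30$), this statistic has a $t(n-1)$ distribution. What is the distribution under $H_a$? It is noncentral $t$ with $n-1$ degrees of freedom and noncentrality parameter $\delta = \dfrac{\mu - 30}{\sigma/\sqrt{n}}$.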

Power analysis
If $T_{n-1,\delta}$ is a random variable having the above noncentral $t$ distribution, then the power of the test is the probability that this r.v. is less than the critical value of the test, $P(T_{n-1,\delta} < t_{\alpha,\,n-1})$, where $t_{\alpha,\,n-1}$ denotes the $\alpha$ quantile (lower-tail point) of $t(n-1)$. Thus the power depends on the true value of $\mu$, the sample size $n$, and the true population standard deviation $\sigma$. We know what size of effect we want to be able to detect, but we need to know something about $\sigma$.

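“Guess-timating” $\sigma$
If the range of values of assembly times for 1000 observations is 9 sec., and if assembly time is a normally distributed r.v., then we can “guess-timate” that $\sigma \approx 9/6 = 1.5$ sec., since nearly all of a normal distribution lies within about $\pm 3\sigma$ of the mean.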
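As a rough check on this guess-timate, the sketch below simulates the average range of 1000 normal observations in units of $\sigma$. It assumes the guess-timate comes from the common "range is about $6\sigma$" rule of thumb, which matches the value 1.5 used in the SAS programs; the function name, seed, and number of replications are illustrative choices, not part of the original slides.

```python
import random
import statistics

def mean_range_in_sigma_units(n_obs=1000, n_reps=300, seed=12345):
    """Average (max - min) of n_obs standard normal draws over n_reps samples."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(n_reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        ranges.append(max(sample) - min(sample))
    return statistics.mean(ranges)

ratio = mean_range_in_sigma_units()

# Rule-of-thumb divisor (assumed to be 6) versus the simulated divisor.
sigma_rule_of_thumb = 9.0 / 6.0
sigma_simulated = 9.0 / ratio

print(f"average range / sigma for 1000 normal draws: {ratio:.2f}")
print(f"sigma from range/6: {sigma_rule_of_thumb:.2f}")
print(f"sigma from simulated divisor: {sigma_simulated:.2f}")
```

If the simulated divisor comes out somewhat above 6, the rule of thumb gives a slightly larger $\sigma$ than the simulation would, which errs on the side of a larger sample.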

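Sample Size for Hypothesis Test
We then want to find $n$ so that $P(T_{n-1,\delta} < t_{0.05,\,n-1})$ is at least the desired power, where $\delta = -0.5/(1.5/\sqrt{n})$ and $t_{0.05,\,n-1}$ is the lower 5% point of $t(n-1)$. Can we use SAS to do this calculation? If we run the following program using a range of possible sample sizes, we will be able to solve our problem.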

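data one;
  input n;
  cp = tinv(0.05, n-1);            /* lower 5% critical value of t(n-1) */
  delta = 0.5/(1.5/sqrt(n));       /* effect size divided by sigma/sqrt(n) */
  power = probt(cp, n-1, -delta);  /* P(noncentral t < cp) */
  put n cp power;
cards;
50
;
run;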
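To decide which range of sample sizes to feed into a program like this, a normal-approximation formula can give a starting value. The sketch below is not the noncentral $t$ calculation from the slides: it replaces the $t$ distributions with the standard normal, so it will tend to give a slightly smaller $n$. The values of `alpha`, `sigma`, and `effect` come from the slides; the power target of 0.90 is a hypothetical placeholder, because the transcript does not preserve the value actually used.

```python
import math
from statistics import NormalDist

def approx_n_one_sided(alpha, power, sigma, effect):
    """Normal-approximation sample size for a one-sided test of a mean."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha)   # critical value for the test
    z_power = z.inv_cdf(power)         # quantile matching the target power
    return math.ceil(((z_alpha + z_power) * sigma / effect) ** 2)

# alpha, sigma and effect are from the slides; power=0.90 is a hypothetical
# placeholder because the transcript does not state the target power.
n_start = approx_n_one_sided(alpha=0.05, power=0.90, sigma=1.5, effect=0.5)
print(n_start)
```

The result is only a starting point; the exact answer comes from evaluating the noncentral $t$ probability for sample sizes around it.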

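Sample Size for Estimation
The form of the confidence interval is $\bar{X} \pm t_{0.975,\,n-1}\, S/\sqrt{n}$, where $t_{0.975,\,n-1}$ is the upper 2.5% point of $t(n-1)$. We want the margin of error to be 0.4 sec., i.e., an interval width of 0.8 sec. Since $S$ is random, we work with the expected width, using $E[S] = \sigma\sqrt{2/(n-1)}\;\Gamma(n/2)/\Gamma((n-1)/2)$ for normal data. We then want a value of $n$ to satisfy the following inequality:

$$\frac{2\, t_{0.975,\,n-1}\,\sqrt{2}\;\Gamma(n/2)\,\sigma}{\sqrt{n(n-1)}\;\Gamma((n-1)/2)} \le 0.8, \qquad \sigma = 1.5.$$

The following SAS program, run for a range of values of $n$, will help us to solve our problem.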

data one;
  input n;
  cp = -tinv(0.025, n-1);          /* upper 2.5% point of t(n-1) */
  numg = gamma(n/2);
  deng = gamma((n-1)/2);
  /* expected width: 2 * cp * E[S] / sqrt(n), with sigma = 1.5 */
  width = (2*cp*sqrt(2)*numg*1.5)/(sqrt(n*(n-1))*deng);
  put n cp width;
cards;
50
;
run;
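The gamma-function ratio in this program is the factor by which the expected sample standard deviation falls short of $\sigma$ for normal data. The sketch below computes that factor (called `c4` here, a name not used in the slides) and searches for the smallest $n$ meeting the 0.8 sec. width. As a simplification, it uses a normal quantile in place of the $t$ quantile, so it should be read as a lower bound on the $n$ the SAS program would give.

```python
import math
from statistics import NormalDist

def c4(n):
    """E[S]/sigma for a normal sample: sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)."""
    # Work on the log scale so large n does not overflow the gamma function.
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def approx_expected_width(n, sigma, conf=0.95):
    """Expected CI width, with a normal quantile standing in for the t quantile."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf) / 2.0)
    return 2.0 * z * c4(n) * sigma / math.sqrt(n)

def smallest_n(target_width, sigma, conf=0.95, n_max=10_000):
    """Smallest n whose approximate expected width is at most target_width."""
    for n in range(2, n_max + 1):
        if approx_expected_width(n, sigma, conf) <= target_width:
            return n
    raise ValueError("no n up to n_max meets the target width")

n_width = smallest_n(target_width=0.8, sigma=1.5)
print(n_width, round(approx_expected_width(n_width, 1.5), 3))
```

Computing the gamma ratio through `math.lgamma` is a design choice for numerical safety; the formula itself is the same one the SAS program evaluates with `gamma`.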

Sample Size
We had two criteria for the sample size: one based on the power of the test, the other based on the margin of error of estimation. We choose the larger of the two values of $n$ to be sure that we achieve both the desired power and the desired margin of error.

To obtain our sample sizes, we had to make an assumption about the value of $\sigma$. If our “guess-timate” for $\sigma$ was too small, then our power would be less than the desired value and our margin of error would be larger than the desired value. If our “guess-timate” was too large, then we might be wasting resources with a sample size that would be larger than necessary.
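The sensitivity to $\sigma$ can be illustrated with a small sketch. It uses a normal approximation to the power of the one-sided test rather than the noncentral $t$ distribution, and the planned sample size of 80 is a hypothetical value chosen only for illustration; the effect size 0.5 and $\alpha = 0.05$ are taken from the slides.

```python
from statistics import NormalDist

def approx_power(n, sigma, effect=0.5, alpha=0.05):
    """Normal-approximation power of the one-sided test when the true SD is sigma."""
    z = NormalDist()
    return z.cdf(effect * n ** 0.5 / sigma - z.inv_cdf(1.0 - alpha))

n_planned = 80  # hypothetical sample size, for illustration only

# Power achieved at the planned n if the true sigma differs from the guess of 1.5.
for true_sigma in (1.0, 1.5, 2.0):
    print(true_sigma, round(approx_power(n_planned, true_sigma), 3))
```

With $n$ held fixed, a true $\sigma$ above the guess lowers the achieved power, and a true $\sigma$ below it means the study was larger than it needed to be.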