ESTIMATES AND SAMPLE SIZES

1 ESTIMATES AND SAMPLE SIZES
CHAPTER 7 ESTIMATES AND SAMPLE SIZES

2 ESTIMATION: AN INTRODUCTION
We have come a long way. We started by learning what statistics is and what the two areas of applied statistics are. In Chapter 1, we learned that: Descriptive statistics consists of methods for organizing, displaying, and describing data by using tables, graphs, and summary measures. Inferential statistics consists of methods that use sample results to make decisions or predictions about a population. In Chapters 2 and 3, we focused on descriptive statistics and learned how to draw tables, how to graph data, and how to calculate numerical summary measures such as mean, median, mode, variance, and standard deviation. Now, in Chapter 7, we will focus on inferential statistics. We begin by discussing estimation.

3 ESTIMATION: AN INTRODUCTION
Definition: Estimation is a process for assigning value(s) to a population parameter based on information collected from a sample.
There are many real-life examples in which estimation is used. For example, we may want to estimate the:
1. Mean fuel consumption for a particular model of car.
2. Proportion of students who completed the MAT 12 course with a passing grade over the past 10 years.
3. Proportion of female high school students who dropped out of school because of pregnancy.
4. Percentage of all California lawyers disbarred for committing a criminal offense.

4 ESTIMATION: AN INTRODUCTION
Of course, we could conduct a census to find the true mean or proportion of the population in items 1 through 4. However, from what we know about a census, it would be: Expensive. Difficult, because every member of the population must be reached or contacted. Time consuming. So, because of these problems with a census, a representative sample is generally drawn from the population and the appropriate sample statistic is calculated. Then a value is assigned to the population parameter based on the calculated value of the sample statistic. The value assigned to the population parameter based on the value of a sample statistic is called an estimate of the population parameter.

5 ESTIMATION: AN INTRODUCTION
For example, the Mathematics Department draws a sample of 50 students from all students who have taken MAT 12 over the past 10 years. The department records the number of students who passed and failed the course, and calculates the sample proportion, $\hat{p}$, of students who passed the course to be 0.65. If the department assigns the value of the sample proportion, $\hat{p}$, to the population proportion, $p$, then 0.65 is called an estimate of $p$ and $\hat{p}$ is called the estimator.
Summary: The estimation procedure involves the following steps:
1. Draw a sample from the population.
2. Collect the required information from each element of the sample.
3. Calculate the value of the sample statistic.
4. Assign the value to the corresponding population parameter.
Note: The sample must be a simple random sample.

6 7-2 ESTIMATING A POPULATION PROPORTION
The estimated value of a population parameter can be based on either a point estimate or an interval estimate. Point Estimate - Definition: A point estimate is the value of a sample statistic used to estimate a population parameter. So, if we use the sample proportion, $\hat{p} = 0.65$, as a point estimate of $p$, we can say that the proportion of all students who completed the MAT 12 course with a passing grade over the past 10 years is about 0.65. That is, $p \approx 0.65$. We discussed in Chapter 6 that the value of a sample statistic varies from one sample to another, even when the samples are of the same size and drawn from the same population. Therefore: The value assigned to the population proportion, $p$, based on a point estimate depends on the sample drawn. The value assigned to a population parameter is almost always different from the true value of the population parameter.

7 An Interval Estimation
Definition: An interval estimate is an interval built around the point estimate, together with a probabilistic statement that the interval contains the corresponding population parameter. Following on from our example, rather than saying that the proportion of all students who have passed MAT 12 in the last 10 years is 0.65, we would: Add and subtract a number to 0.65 to obtain an interval, and then say that the interval contains the population proportion, $p$. Now, let us add and subtract 0.2 to 0.65. Then we obtain the interval $(0.65 - 0.2,\ 0.65 + 0.2) = (0.45,\ 0.85)$. We state that the population proportion, $p$, is likely to be contained in the interval 0.45 to 0.85. In other words, we state that the proportion of all students who have taken MAT 12 with a passing grade in the past 10 years is between 0.45 and 0.85. The value 0.45 is called the lower limit and 0.85 is called the upper limit. The number we subtracted from and added to the point estimate is called the margin of error.

8 An Interval Estimation
The value of the margin of error depends on: 1. The standard deviation, $\sigma_{\hat{p}}$, of the sample proportion, $\hat{p}$. 2. The level of confidence that we would like to attach to the interval. So: The larger $\sigma_{\hat{p}}$ is, the greater the margin of error. To be more confident that the population proportion is contained in the interval, we have to use a higher confidence level, which widens the interval. We add a probabilistic statement, so the interval is based on the confidence level. An interval constructed based on the confidence level is called a confidence interval. A confidence interval is defined as: point estimate $\pm$ margin of error.

9 An Interval Estimation
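The confidence level associated with a confidence interval states how much confidence we have that the interval contains the true population parameter. It is written as $(1 - \alpha)100\%$.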

10 7.3-7.4 ESTIMATION OF A POPULATION MEAN: σ KNOWN
The three possible cases for constructing a confidence interval for the population mean $\mu$ when $\sigma$ is known are as follows.
We use the standard normal distribution to construct the confidence interval for $\mu$, with $\sigma_{\bar{x}} = \sigma/\sqrt{n}$, if:
Case I: The standard deviation $\sigma$ is known. The sample size is small, $n < 30$. The population is normally distributed, or at least close to normally distributed with no outliers.
Case II: The standard deviation $\sigma$ is known. The sample size is large, $n \ge 30$. By the central limit theorem, the sampling distribution of the sample mean is approximately normal. However, we may not be able to use the standard normal distribution if the population distribution is very different from the normal distribution.

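11 ESTIMATION OF A POPULATION MEAN: σ KNOWN
We use a nonparametric method to construct the confidence interval for $\mu$ if:
Case III: The standard deviation $\sigma$ is known. The sample size is small, $n < 30$. The population is not normally distributed, or its distribution is unknown.
The rest of this section deals with Cases I and II. We will not cover Case III.
Formula: The $(1 - \alpha)100\%$ confidence interval for $\mu$ is $\bar{x} \pm z\sigma_{\bar{x}}$, where $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.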

12 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Three Possible Cases

13 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Let us revisit the definition of confidence level. Remember that the confidence level is the area under the normal curve of $\bar{x}$ between two points that lie on opposite sides of $\mu$ and at equal distance from it.

14 ESTIMATION OF A POPULATION MEAN: σ KNOWN
How to determine z given the confidence level. To find the two locations for z: First, the area between the two z's is $1 - \alpha$, the confidence level. Since $z_1$ and $z_2$ are the same distance from the mean, the sum of the areas to the left of $z_1$ and to the right of $z_2$ is $\alpha$. Since the area to the left of $z_1$ and the area to the right of $z_2$ are equal, each of them is $\alpha/2$. Using Table A-2, we can find the values of $z_1$ and $z_2$ that correspond to the required area. Note that $z_1$ and $z_2$ have the same magnitude but opposite signs.
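The table lookup described above can also be done numerically. The following is a minimal sketch using Python's standard library; the function name `z_for_confidence` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def z_for_confidence(confidence_level: float) -> float:
    """Return the positive z with area `confidence_level` between -z and z."""
    alpha = 1.0 - confidence_level  # total area in the two tails
    # The upper point z2 has area 1 - alpha/2 to its left.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    z = z_for_confidence(level)
    print(f"{level:.0%} confidence: z1 = {-z:.3f}, z2 = {z:.3f}")
```

A printed table usually rounds these values, so a table lookup may differ slightly in the last digit.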

15 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Interpretation of confidence level. Let us consider 20 samples of the same size taken from the same population. Then: Let us calculate the sample mean, $\bar{x}$, for each sample. Let us then calculate the confidence interval for $\mu$ around each sample mean, $\bar{x}$, based on a confidence level of 90%. The normal curve of the sampling distribution of $\bar{x}$ is shown to the right. In the context of this example, we say that 90% of intervals such as those around $\bar{x}_1$ and $\bar{x}_2$ will include $\mu$, and 10%, such as the interval around $\bar{x}_3$, will not.
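This interpretation can be checked with a small simulation. The sketch below assumes a hypothetical normal population ($\mu = 50$, $\sigma = 10$) and a sample size of 25. None of these numbers come from the slides. It draws many samples, builds a 90% interval around each sample mean, and counts how often the interval contains $\mu$.

```python
import random
from statistics import NormalDist, mean


def coverage(num_samples, n, mu, sigma, confidence_level, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / n ** 0.5  # z * sigma_xbar, with sigma known
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / num_samples


# Hypothetical population values, chosen only for illustration.
print(coverage(num_samples=2000, n=25, mu=50.0, sigma=10.0,
               confidence_level=0.90, seed=1))
```

With only 20 samples, as on the slide, the observed fraction can easily differ from 90%. The agreement improves as the number of samples grows.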

16 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Width of a confidence interval. As stated previously, the confidence interval is defined as $\bar{x} \pm z\sigma_{\bar{x}}$. Its width depends on the margin of error $z\sigma_{\bar{x}}$, which depends on the confidence level through $z$ and on $n$ because $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. Since $\sigma$ is outside the control of the investigator, the width of a confidence interval can only be controlled through $z$ and $n$. Thus, the width is governed by the following relationships: The value of $z$ increases as the confidence level increases. The value of $z$ decreases as the confidence level decreases. With $n$ held constant, the higher the confidence level, the larger the width of the confidence interval. An increase in the sample size causes a decrease in the width of the confidence interval. In conclusion, we can reduce the width of a confidence interval by lowering the confidence level or by increasing the sample size.
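A short numerical sketch of these relationships follows. The value $\sigma = 10$ and the sample sizes are hypothetical and serve only to show how the width $2z\sigma/\sqrt{n}$ responds to the confidence level and to $n$.

```python
from statistics import NormalDist


def ci_width(sigma, n, confidence_level):
    """Width of the z-based confidence interval: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sigma / n ** 0.5


sigma = 10.0  # hypothetical population standard deviation
for level in (0.90, 0.95, 0.99):
    print(f"n = 25, {level:.0%}: width = {ci_width(sigma, 25, level):.2f}")
for n in (25, 100, 400):
    print(f"n = {n}, 95%: width = {ci_width(sigma, n, 0.95):.2f}")
```

Because the width is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the width.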

17 Determining the Sample Size for the Estimation of Mean
Because of the problems associated with conducting a census or even a sample survey, we need a way to determine a sample size that will produce the required results without wasting effort or financial resources on a larger sample than necessary. To find the appropriate sample size, $n$, we need: 1. The confidence level. 2. The width of the confidence interval, that is, the margin of error $E$. Given the confidence level and the standard deviation $\sigma$ of the population, the sample size that will produce a predetermined margin of error $E$ for the confidence interval estimate of $\mu$ is $n = z^2\sigma^2/E^2$. Note that if $\sigma$ is not known, one could take a small preliminary sample, calculate the sample standard deviation, $s$, and then use $s$ in lieu of $\sigma$ in the formula.
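As a sketch of the calculation, the function below applies the standard formula $n = z^2\sigma^2/E^2$ and rounds up to the next whole observation. The inputs $\sigma = 12$ and $E = 2$ are hypothetical values chosen for illustration, not values from the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence_level):
    """Smallest n for which z * sigma / sqrt(n) does not exceed the margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)


# Hypothetical inputs: sigma = 12, E = 2, 95% confidence.
print(sample_size_for_mean(sigma=12.0, margin_of_error=2.0,
                           confidence_level=0.95))
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.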

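18 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #1 – Problem 8.10 Find z for each of the following confidence levels. a) 90% b) 95%
Example #1 – Solution
[Figures: two standard normal curves. a) Area .90 between z1 and z2, with area .05 in each tail. b) Area .95 between z1 and z2, with area .025 in each tail.]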

19 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #2 For a data set obtained from a sample, $n = 81$ and $\bar{x}$ = [value missing]. It is known that $\sigma = 4.8$. a) What is the point estimate of $\mu$? b) Make a 95% confidence interval for $\mu$. c) What is the margin of error of estimate for part b? Example #2 – Solution

20 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #2 – Solution
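The sample mean for Example #2 did not survive in this transcript, so the interval in part b cannot be reproduced here. The margin of error in part c depends only on $z$, $\sigma$, and $n$, and can be sketched as follows. The function names are illustrative.

```python
from statistics import NormalDist


def margin_of_error(sigma, n, confidence_level):
    """z * sigma / sqrt(n) for a population with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sigma / n ** 0.5


def confidence_interval(x_bar, sigma, n, confidence_level):
    """Return (lower, upper) for the z-based interval around x_bar."""
    e = margin_of_error(sigma, n, confidence_level)
    return x_bar - e, x_bar + e


# Part c: sigma = 4.8 and n = 81 are given; the confidence level is 95%.
print(f"margin of error = {margin_of_error(4.8, 81, 0.95):.3f}")
```

Once the sample mean is known, `confidence_interval` gives the limits for part b by subtracting and adding this margin.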

21 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #3 The standard deviation for a population is $\sigma$ = [value missing]. A sample of 25 observations selected from this population gave a mean equal to [value missing]. The population is known to have a normal distribution. a) Make a 99% confidence interval for $\mu$. b) Construct a 95% confidence interval for $\mu$. c) Determine a 90% confidence interval for $\mu$. d) Does the width of the confidence intervals constructed in parts a through c decrease as the confidence level decreases? Explain your answer.

22 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #3 – Solution

23 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #3 – Solution

24 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #3 – Solution

25 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #4 For a population, the value of the standard deviation is [value missing]. A sample of 32 observations taken from this population produced the following data. [data missing] a) What is the point estimate of $\mu$? b) Make a 99% confidence interval for $\mu$. c) What is the margin of error of estimate for part b? Example #4 – Solution

26 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #4 – Solution

27 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #5 For a population data set, $\sigma$ = [value missing]. a) What should the sample size be for a 98% confidence interval for $\mu$ to have a margin of error of estimate equal to 5.50? b) What should the sample size be for a 95% confidence interval for $\mu$ to have a margin of error of estimate equal to 4.25? Example #5 – Solution

28 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #5 – Solution

29 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #6 Inside the Box Corporation makes corrugated cardboard boxes. The label on one type of these boxes states that its breaking capacity is 75 pounds. Fifty-five randomly selected boxes of this type were loaded until they broke. The average breaking capacity of these boxes was found to be [value missing] pounds. Suppose that the standard deviation of the breaking capacities of all such boxes is 2.63 pounds. Calculate a 99% confidence interval for the average breaking capacity of all boxes of this type.

30 ESTIMATION OF A POPULATION MEAN: σ KNOWN
Example #6 – Solution

31 ESTIMATION OF A POPULATION MEAN: σ NOT KNOWN
The three possible cases for constructing a confidence interval for the population mean $\mu$ when $\sigma$ is unknown are as follows.
We use the t distribution to construct the confidence interval for $\mu$ if:
Case I: The standard deviation, $\sigma$, is unknown. The sample size is small, $n < 30$. The population is normally distributed.
Case II: The standard deviation, $\sigma$, is unknown. The sample size is large, $n \ge 30$.
We use a nonparametric method to construct the confidence interval for $\mu$ if:
Case III: The sample size is small, $n < 30$. The population is not normally distributed.

32 ESTIMATION OF A POPULATION MEAN: σ NOT KNOWN
Three Possible Cases

33 The t Distribution
The t distribution is also called Student's t distribution. It is similar to the normal distribution because it has: A bell-shaped curve, a total area of 1.0 under the curve, and a mean, $\mu$, of zero. It is different from the standard normal distribution curve because: It has a lower height and a wider spread, its units are denoted by t, and its standard deviation is $\sigma = \sqrt{df/(df - 2)}$. Here df is the number of degrees of freedom, defined as the number of observations that can be chosen freely. It is given by $df = n - 1$. The t distribution depends on only one parameter, df. As the sample size becomes larger, the t distribution approaches the standard normal distribution.
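The two differences from the standard normal curve can be checked numerically. The sketch below uses the standard formulas for the t distribution: standard deviation $\sqrt{df/(df-2)}$ (valid for $df > 2$) and peak height $\Gamma((df+1)/2) / (\sqrt{df\,\pi}\,\Gamma(df/2))$. The function names are illustrative.

```python
import math


def t_std_dev(df):
    """Standard deviation of the t distribution; defined only for df > 2."""
    if df <= 2:
        raise ValueError("standard deviation requires df > 2")
    return math.sqrt(df / (df - 2))


def t_peak_height(df):
    """Height of the t density at t = 0, computed with log-gamma for stability."""
    log_ratio = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_ratio) / math.sqrt(df * math.pi)


normal_peak = 1 / math.sqrt(2 * math.pi)  # height of the standard normal at 0
for df in (9, 30, 100, 1000):
    print(f"df = {df}: sd = {t_std_dev(df):.4f}, "
          f"peak = {t_peak_height(df):.4f} (normal peak = {normal_peak:.4f})")
```

As df grows, the standard deviation falls toward 1 and the peak rises toward the normal peak, which matches the statement that the t distribution approaches the standard normal distribution.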

34 Figure 8.5 The t distribution for df = 9 and the standard normal distribution.

35 The t Distribution
Table A-3 lists the t value for a given number of degrees of freedom and a given area in the right tail under the t distribution curve. Because of symmetry, this area is the same as the corresponding area in the left tail under the t distribution curve. Steps to read the t distribution in Table V:
1. Locate the number of degrees of freedom under the column labeled "df", and draw a horizontal line through the row.
2. Locate the area under one of the columns for areas in the right tail under the t distribution curve, and draw a vertical line through the column.
3. The entry where the horizontal line and vertical line meet is the required t value.
For example, let us find the t value for a t distribution with a sample size of 9 (so df = 8) and an area of 0.01 in the right tail of the t distribution curve.

36 The t Distribution
Example #7 Find the value of t for the t distribution for each of the following. a) Area in the right tail = .05 and df = 12 b) Area in the left tail = .05 and df = 12 Example #7 – Solution

37 The t Distribution
Example #8 For each of the following, find the area in the appropriate tail of the t distribution. a) t = [value missing] and df = 28 b) t = [value missing] and df = 58 c) t = [value missing] and n = 55 Example #8 – Solution

38 Confidence Interval for μ Using the t Distribution
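In Section 4.3, we defined $\sigma_{\bar{x}}$ as $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. However, since $\sigma$ is normally unknown, we can calculate the sample standard deviation, $s$, and use it in lieu of $\sigma$, with $s_{\bar{x}}$ in place of $\sigma_{\bar{x}}$. The value of $s_{\bar{x}}$ is calculated as $s_{\bar{x}} = s/\sqrt{n}$. Therefore, the $(1 - \alpha)100\%$ confidence interval for $\mu$ is $\bar{x} \pm t s_{\bar{x}}$. Note: If $df > 75$, we can either use: The entries in the last row of Table V, where $df = \infty$, or a normal distribution to approximate the t distribution.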

39 Confidence Interval for μ Using the t Distribution
Example #9 Find the value of t from the t distribution table for each of the following. a) Confidence level = 99% & df = 13 b) Confidence level = 95% & n = 36 Example #9 – Solution

40 Confidence Interval for μ Using the t Distribution
Example #10 – Problem 8.47 A sample of 11 observations taken from a normally distributed population produced the following data. a) What is the point estimate of $\mu$? b) Make a 95% confidence interval for $\mu$. c) What is the margin of error of estimate for part b? Example #10 – Solution

41 Confidence Interval for μ Using the t Distribution
Example #10 – Solution
    x       x²
  -7.1    50.41
  10.3   106.09
   8.7    75.69
  -3.6    12.96
  -6.0    36.00
  -7.5    56.25
   5.2    27.04
   3.7    13.69
   9.8    96.04
  -4.4    19.36
   6.4    40.96
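The rest of the worked solution is not preserved in the transcript. The following sketch carries out the calculation from the 11 data values listed above. The critical value 2.228 is the standard t-table entry for df = 10 and area .025 in the right tail. It is entered here as a constant, which is an assumption about how the slide solved the problem.

```python
import math
from statistics import mean, stdev

data = [-7.1, 10.3, 8.7, -3.6, -6.0, -7.5, 5.2, 3.7, 9.8, -4.4, 6.4]

# t-table value for df = n - 1 = 10 and area .025 in each tail (95% level).
T_CRITICAL = 2.228

n = len(data)
x_bar = mean(data)          # point estimate of mu
s = stdev(data)             # sample standard deviation (n - 1 in the denominator)
s_x_bar = s / math.sqrt(n)  # estimated standard deviation of the sample mean
margin = T_CRITICAL * s_x_bar

print(f"point estimate of mu = {x_bar:.2f}")
print(f"s = {s:.2f}, margin of error = {margin:.2f}")
print(f"95% confidence interval: {x_bar - margin:.2f} to {x_bar + margin:.2f}")
```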

42 Confidence Interval for μ Using the t Distribution
Example #11 A random sample of 16 airline passengers at the Bay City airport showed that the mean time spent waiting in line to check in at the ticket counter was 31 minutes with a standard deviation of 7 minutes. Construct a 99% confidence interval for the mean time spent waiting in line by all passengers at this airport. Assume that such waiting times for all passengers are normally distributed. Example #11 – Solution
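The solution to Example #11 is missing from the transcript. A sketch of the calculation from the stated values follows. The critical value 2.947 is the standard t-table entry for df = 15 and area .005 in the right tail, entered as a constant rather than computed.

```python
import math

n = 16          # sample size
x_bar = 31.0    # sample mean waiting time, in minutes
s = 7.0         # sample standard deviation, in minutes

# t-table value for df = n - 1 = 15 and area .005 in each tail (99% level).
T_CRITICAL = 2.947

margin = T_CRITICAL * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"margin of error = {margin:.2f} minutes")
print(f"99% confidence interval: {lower:.2f} to {upper:.2f} minutes")
```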

43 ESTIMATION OF A POPULATION PROPORTION: LARGE SAMPLES
We already learned that for a large sample size, that is, $np > 5$ and $nq > 5$:
1. The sampling distribution of $\hat{p}$ is approximately normal.
2. The mean, $\mu_{\hat{p}}$, of the sampling distribution of $\hat{p}$ is equal to the population proportion, $p$.
3. The standard deviation, $\sigma_{\hat{p}}$, of the sampling distribution of the sample proportion, $\hat{p}$, is defined as $\sigma_{\hat{p}} = \sqrt{pq/n}$, where $q = 1 - p$.
Since we may not know $p$ and $q$, we will need to use $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n}$ as an estimate of $\sigma_{\hat{p}}$.
The confidence interval for $p$ = $\hat{p} \pm z s_{\hat{p}}$
Margin of error = $z s_{\hat{p}}$

44 DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF PROPORTION
Given the confidence level and the values of $\hat{p}$ and $\hat{q}$, the sample size that will produce a predetermined margin of error $E$ for the confidence interval estimate of $p$ is $n = z^2\hat{p}\hat{q}/E^2$.

45 DETERMINING THE SAMPLE SIZE FOR THE ESTIMATION OF PROPORTION
If the values of $\hat{p}$ and $\hat{q}$ are not known, we can do either of the following: 1. Make the most conservative estimate of the sample size $n$ by using $\hat{p} = 0.50$ and $\hat{q} = 0.50$. 2. Take a preliminary sample (of arbitrarily determined size), calculate $\hat{p}$ and $\hat{q}$ from this sample, and then use these values to find $n$.
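A minimal sketch of the sample-size calculation for a proportion follows, using the standard formula $n = z^2\hat{p}\hat{q}/E^2$ and rounding up. The margin of error 0.03 and the preliminary estimate 0.30 are hypothetical inputs, not values from the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence_level, p_hat=0.5):
    """Sample size for estimating p; p_hat = 0.5 is the conservative default."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    q_hat = 1 - p_hat
    return math.ceil(z ** 2 * p_hat * q_hat / margin_of_error ** 2)


# Conservative estimate: no prior information about p.
print(sample_size_for_proportion(0.03, 0.95))
# With a hypothetical preliminary estimate p_hat = 0.30.
print(sample_size_for_proportion(0.03, 0.95, p_hat=0.30))
```

The product $\hat{p}\hat{q}$ is largest at $\hat{p} = 0.50$, which is why that choice never understates the required sample size.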

46 Example
Example #12 Check if the sample size is large enough to use the normal distribution to make a confidence interval for $p$ in each of the following cases. a) $n = 50$, $\hat{p} = .25$ b) $n = 160$, $\hat{p} = .03$
Answers: a) $n\hat{p} = (50)(.25) = 12.5$ and $n\hat{q} = (50)(.75) = 37.5$. Both are greater than 5, so the sample size is large enough to use the normal distribution. b) $n\hat{p} = (160)(.03) = 4.8$, which is less than 5, so the sample size is not large enough to use the normal distribution.
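The check in this example is simple to automate. The sketch below applies the rule stated in these slides, $n\hat{p} > 5$ and $n\hat{q} > 5$, with an illustrative function name.

```python
def large_enough_for_normal(n, p_hat):
    """True when both n * p_hat and n * q_hat exceed 5."""
    q_hat = 1 - p_hat
    return n * p_hat > 5 and n * q_hat > 5


for n, p_hat in ((50, 0.25), (160, 0.03)):
    verdict = "large enough" if large_enough_for_normal(n, p_hat) else "not large enough"
    print(f"n = {n}, p_hat = {p_hat}: {verdict}")
```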

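47 Example
Example #13 A sample of 200 observations selected from a population produced a sample proportion equal to .91. Make a 90% confidence interval for $p$.
Answer: $n = 200$, $\hat{p} = .91$, $\hat{q} = 1 - .91 = .09$, $s_{\hat{p}} = \sqrt{\hat{p}\hat{q}/n} = \sqrt{(.91)(.09)/200} = .0202$. The 90% confidence interval for $p$ is $\hat{p} \pm z s_{\hat{p}} = .91 \pm 1.645(.0202) = .91 \pm .033$ = .877 to .943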
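The same interval can be reproduced numerically. This sketch uses the exact normal quantile from the standard library rather than a rounded table value, so intermediate digits may differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(n, p_hat, confidence_level):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    s_p_hat = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * s_p_hat
    return p_hat - margin, p_hat + margin


lower, upper = proportion_confidence_interval(n=200, p_hat=0.91,
                                              confidence_level=0.90)
print(f"90% confidence interval for p: {lower:.3f} to {upper:.3f}")
```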

