1 Confidence Interval for Population Mean The case when the population standard deviation is unknown (the more common case).

Presentation transcript:

1 Confidence Interval for Population Mean The case when the population standard deviation is unknown (the more common case).

2 Let’s review. When we do not know the population mean we want to use a sample to get a feel for what the population mean might be. From the sample we calculate a sample mean. Since we know in theory that different samples would provide potentially different sample means, we take our one sample mean and build a margin of error around the sample mean. Then we have a level of confidence that the unknown population mean is in the interval we calculated based on the sample. Up to now we have looked at the case where the population standard deviation was known.

3 More review. The margin of error I wrote about on the previous slide is calculated using a value of Z and the standard deviation of the sampling distribution. The values of Z most commonly used are Z = 1.645 for a 90% confidence interval, Z = 1.96 for a 95% confidence interval, and Z = 2.576 for a 99% confidence interval. The standard deviation of the sampling distribution is the population standard deviation divided by the square root of the sample size.
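If you would rather have a computer produce the Z values than read them from the table, here is a minimal sketch in Python (not part of the original slides). It uses the standard library's `NormalDist`; the function name `z_for_confidence` is just an illustrative choice.

```python
from statistics import NormalDist


def z_for_confidence(confidence):
    """Return the Z value that leaves (1 - confidence)/2 in each tail."""
    upper_tail = (1 - confidence) / 2
    # inv_cdf expects a cumulative (left side) area, so pass 1 - upper tail.
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_for_confidence(level):.3f}")
```

The three printed values should agree with the Z table to the number of decimals the table carries.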

4 So, the margin of error is Z times the standard deviation of the sampling distribution. The confidence interval is then (sample mean minus margin of error, sample mean plus margin of error). New information When the population standard deviation is not known then we have to modify our work just a little. The standard deviation of the sampling distribution will be called the standard error and will still be calculated similar to above. But we will not use a Z value in the margin of error. We will use a t value.

5 It turns out that when the population standard deviation is not known, the standardized sample mean (the sample mean minus the population mean, divided by the standard error) follows a t distribution rather than a normal distribution. The t distribution is a lot like the normal distribution, but when we use the t distribution we have to be aware of something called degrees of freedom (df). The main point for us here is that degrees of freedom = sample size minus 1, or df = n – 1. So, if n = 19, df = 18, if n = 11, df = 10, and so on. Now if we want a 95% confidence interval in this case we 1) calculate the sample mean, 2) calculate the sample standard deviation, 3) calculate the standard error as the sample standard deviation divided by the square root of the sample size, 4) find our t value in the t-table under the .025 column in the df = n – 1 row, 5) calculate our margin of error as t times the standard error, 6) calculate the interval as the sample mean minus and plus the margin of error.

6 Note we look in the .025 column on the t distribution for a 95% confidence interval because we would have .025 or 2.5% in each tail of the distribution. If we want a 99% confidence interval we look in the .005 column, and if we want a 90% confidence interval we look in the .05 column, for similar reasons. Let’s do an example. We have the following sample values from a population: 10, 8, 12, 15, 13, 11, 6, and 5. On the following slide I have a basic Excel printout to calculate the sample mean and sample standard deviation.

7 Excel printout: x values 10, 8, 12, 15, 13, 11, 6, 5; sample mean = 10; sample SD = 3.4641; n = 8; df = 7. a) The point estimate of the population mean is the sample mean = 10. b) The point estimate of the population standard deviation is the sample standard deviation, and when you round to two decimal places you get 3.46. c) To get the 95% confidence interval we need the standard error and the t value with an upper tail area of .025 and df = 7. The t value is 2.365. The standard error is 3.46/sqrt(8) = 1.22. Thus the margin of error is (2.365)(1.22) = 2.89. The interval is thus (10 – 2.89, 10 + 2.89) = (7.11, 12.89), and thus we are 95% confident the population mean is in the interval 7.11 to 12.89.
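The same six steps can be sketched in Python instead of Excel. This is only an illustration, not the method used in the slides: the t value 2.365 is typed in from the t table (upper tail .025, df = 7) rather than computed, and the variable names are my own.

```python
import math
import statistics

sample = [10, 8, 12, 15, 13, 11, 6, 5]
T_VALUE = 2.365  # assumption: read from a t table, upper tail .025, df = 7

n = len(sample)
df = n - 1
x_bar = statistics.mean(sample)       # step 1: sample mean
s = statistics.stdev(sample)          # step 2: sample SD (divides by n - 1)
std_error = s / math.sqrt(n)          # step 3: standard error
margin = T_VALUE * std_error          # step 5: margin of error
interval = (x_bar - margin, x_bar + margin)  # step 6: the interval

print(f"mean = {x_bar:.2f}, SD = {s:.4f}, df = {df}")
print(f"standard error = {std_error:.4f}, margin of error = {margin:.4f}")
print(f"95% interval = ({interval[0]:.2f}, {interval[1]:.2f})")
```

Because the code does not round the standard deviation or the standard error along the way, its interval endpoints can differ by about a hundredth from a hand calculation that rounds at each step.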

8 Z table and t table. If you look in the Z table at a Z = 1.96 you see the value .9750. This means .9750 of the possible Z values have values 1.96 or less. .9750 is a cumulative value, and .025 is in the upper tail. There is 1 standard normal distribution. But the t distribution is really a family of distributions, where each value of the degrees of freedom defines a new distribution. When you go to the t table in the book you see across the top the values of the upper tail area. When you go to the upper tail area .025 you see in the df infinity row the t value is 1.96. This means when the df is really big the t and the z distributions are the same. See the similarity with .05 and .005?
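To see the t values settle toward the Z value as df grows, here is a rough sketch that computes t table entries numerically with only the Python standard library. It is an illustration under my own assumptions, not how the book's table was built: the tail area comes from Simpson's rule applied to the t density, and the critical value is found by bisection. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    const = math.exp(log_const) / math.sqrt(df * math.pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_area(t, df, steps=1000):
    """Approximate P(T > t) as 0.5 minus the Simpson's rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(upper_tail, df):
    """Find the t value with the given upper tail area by bisection."""
    low, high = 0.0, 50.0
    for _ in range(50):
        mid = (low + high) / 2
        if upper_tail_area(mid, df) > upper_tail:
            low = mid   # too much area in the tail, so move right
        else:
            high = mid
    return (low + high) / 2


for df in (7, 11, 15, 1000):
    print(f"df = {df:>4}: t(.025) = {t_critical(0.025, df):.3f}")
```

The search range of 0 to 50 is an assumption that is wide enough for the tail areas and df values used in these notes; very small df with very small tail areas would need a wider range.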

9 Another idea. [Figure: a distribution curve divided into four areas labeled a, b, c, d from left to right.] If I want a 95% confidence interval (and I had the distribution drawn in) b would have area .95/2 and c would also have .95/2. Area a would be .05/2 and the same would work for d. So, if I have a t distribution (or Z) why do I look at .025 when I have a 95% confidence interval? The answer is the table works with the upper tail, and since the upper tail is just .025 we look there knowing that the other .025 is in the lower tail. In a confidence interval we want to focus on the middle of the distribution. Say the line I have between b and c is the sample mean and the arrows point to the low and the high end of the interval.

10 Problem (not in your book). Note when working with a t you have to pick the right column and the right row. The row is the df and equals n – 1. The column to look at is related to the story I had on the previous slide. In a problem we may get 1 – alpha = some decimal. From this, alpha = 1 minus that decimal. On the previous slide alpha was split in half between areas a and d. We focus on d. a. alpha = .05, so alpha/2 = .05/2 = .025, and df = 9, so the critical t = 2.262. b. alpha/2 = .01/2 = .005 and df = 9, so critical t = 3.250. c. alpha/2 = .1/2 = .05 and df = 15, so critical t = 1.753.
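Picking the column and the row can be written down as a tiny lookup. The sketch below is only an illustration: `T_TABLE` holds a handful of entries typed in from a standard t table (it is not a complete table), and the sample sizes 10 and 16 are my assumption, chosen so that df = n – 1 gives the 9 and 15 used in the problem.

```python
# A few entries typed in from a standard t table: {df: {upper tail area: t}}
T_TABLE = {
    9: {0.05: 1.833, 0.025: 2.262, 0.005: 3.250},
    15: {0.05: 1.753, 0.025: 2.131, 0.005: 2.947},
}


def table_position(confidence, n):
    """Return (column, row): the upper tail area alpha/2 and df = n - 1."""
    alpha = 1 - confidence
    # Rounding guards against floating point noise such as 0.0250000001.
    return round(alpha / 2, 4), n - 1


def critical_t(confidence, n):
    """Look up the critical t for a confidence level and sample size."""
    column, row = table_position(confidence, n)
    return T_TABLE[row][column]


for confidence, n in ((0.95, 10), (0.99, 10), (0.90, 16)):
    column, row = table_position(confidence, n)
    print(f"1 - alpha = {confidence}: column {column}, row df = {row}, "
          f"critical t = {critical_t(confidence, n)}")
```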

11 T-test for the Mean of a Population: Unknown population standard deviation Here we will focus on two methods of hypothesis testing: the critical value approach and the p-value approach.

12 In the case where the population standard deviation is known, we saw that when we do not know the true value of the population mean for a quantitative variable, a hypothesis test can be carried out using the z calculation (x bar minus mu under Ho)/(standard deviation of the mean). When the population standard deviation, sigma, of the variable is unknown we have to rely on the t distribution. In addition, in the calculation of the standard error we use the sample standard deviation. The t statistic = (x bar minus mu under Ho)/(standard error of the mean). Let’s work a few problems.

13 Example. When a company looks at its past records, the average dollar amount on an invoice has been $120. Over time this will be monitored to see if it changes. The question now is whether or not the population mean is still 120. We will make this the null hypothesis. So we have Ho: μ = 120 and Ha: μ ≠ 120. a) With the critical value approach the value of alpha has to be determined, and say we have alpha = .05. When the alternative hypothesis has a not equal to sign we have a two tail test. This means we have .025 in each tail. But since we do not know the population standard deviation we have to use the t distribution. With a sample size of 12 we look in the df = n – 1 = 12 – 1 = 11 row. The critical t’s are thus –2.201 and 2.201. Let’s see what this looks like in a graph on the next slide.

14 Let’s review what we have done. We have a null and alternative hypothesis. We have an alpha value and a sample size we will use. The critical values of t break up the t distribution into rejection and non-rejection regions for the null hypothesis. Our decision rule will be this: when we take a sample and calculate both a sample mean and the associated t value, called the t test statistic (and I will write tstat), if the tstat is less than the lower critical value or greater than the upper critical value we will reject the null. If the tstat is in the middle of the critical values we do not reject the null. [Figure: t distribution with df = 11 and alpha/2 = .025 in each tail; lower critical t = –2.201, upper critical t = 2.201.]

15 Now say we get an actual sample of 12 invoices and calculate the sample mean, x bar, and the sample standard deviation, 20.804. The tstat from the sample is (x bar – 120)/(20.804/sqrt(12)). Since the value of the tstat is in the middle of the critical values, we do not reject the null. b) To proceed with the p-value approach to hypothesis testing I would like us to explore the df = 11 row of the t distribution table. Let’s see this on the next slide.

16 [Figure: t distribution with DF = 11. Marked on the horizontal axis are the t values whose tail areas under the curve are .25, .10, .05, .025, .01, and .005.] Here I have marked off on the t distribution the positive and negative values. On the next slide I reproduce this with the tails colored in for when alpha is picked to be .05.

17 [Figure: the same t distribution with DF = 11, with the two tails beyond the critical values colored in.] So, with alpha = .05 the critical t’s are –2.201 and 2.201. Next we take the sample mean, calculate the tstat, and see where it occurs on the number line. In our example the tstat falls between the two tabled t values whose tail areas are 0.10 and 0.25, respectively (1.363 and 0.697 in absolute value).

18 The tail area for the tstat is thus between 0.10 and 0.25. This is the basis for the p-value. Because of the way the t distribution shows up in our book, the best we can say about the tail area for the tstat is that it is between 0.10 and 0.25. Since our alternative hypothesis Ha has a not equal to sign, we have to double the tail area, and so we say the p-value is between 0.20 and 0.50. (A computer or a better table would show us that the doubled tail area is .259; we do not need that here.) Here is how we use the p-value approach: if the p-value is less than or equal to alpha, reject the null; otherwise do not reject the null. In our example the p-value is at least 0.20, which is > .05, so we do not reject the null.
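The table reasoning above, bracketing the tail area and then doubling it, can be sketched as a small function. This is an illustration only. The df = 11 row is typed in from a standard t table, and the tstat magnitude of 1.19 is a hypothetical value I chose because it is consistent with the doubled tail area of .259 quoted above; it is not taken from the slides.

```python
# df = 11 row of a standard t table: {upper tail area: t value}
DF11_ROW = {0.25: 0.697, 0.10: 1.363, 0.05: 1.796,
            0.025: 2.201, 0.01: 2.718, 0.005: 3.106}


def bracket_p_value(tstat, row, two_tailed=True):
    """Return (low, high) bounds on the p-value using only tabled t values."""
    t = abs(tstat)
    factor = 2 if two_tailed else 1
    upper = 0.5  # the tail area at t = 0
    # Walk the row from the largest tail area (smallest t) to the smallest.
    for area in sorted(row, reverse=True):
        if t < row[area]:
            return factor * area, factor * upper
        upper = area
    return 0.0, factor * upper  # tstat is beyond the last tabled value


HYPOTHETICAL_TSTAT = 1.19  # assumption, see the note above
ALPHA = 0.05

low, high = bracket_p_value(HYPOTHETICAL_TSTAT, DF11_ROW)
print(f"p-value is between {low} and {high}")
if high <= ALPHA:
    print("reject the null")
elif low > ALPHA:
    print("do not reject the null")
else:
    print("the table alone cannot decide; a finer table is needed")
```

The third branch covers the case where alpha falls inside the bracket, which cannot happen when alpha is one of the doubled column values but can for other choices of alpha.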

19 Say from a problem we see Ho: μ = 50 and thus H1: μ ≠ 50. Also say x bar = 56 and s = 12 and the sample size is 16. The tstat = (56 – 50)/(12/sqrt(16)) = 2.00. On the next slide I show what the critical t’s would be in this problem if we wanted an alpha = .10. Note with n = 16 the df = 15.

20 [Figure: t distribution with df = 15 and alpha/2 = .10/2 = .05 in each tail; critical t = –1.753 and 1.753.] The critical values of t are –1.753 and 1.753. If our tstat is outside these two values then we are saying that the sample information is placing us in a low probability area. This makes us suspicious of the null hypothesis and thus we reject it. Our tstat = 2.00 places us in the rejection region. Note if alpha was .05 we would not reject the null (critical values of –2.131 and 2.131). The p-value is thus between .05 and .10.
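As a check on this last problem, here is a sketch that computes the tstat and an approximate two tail p-value with only the Python standard library. It is an illustration under my own assumptions, not the book's method: the tail area comes from Simpson's rule applied to the t density, and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    const = math.exp(log_const) / math.sqrt(df * math.pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_area(t, df, steps=1000):
    """Approximate P(T > t) as 0.5 minus the Simpson's rule area on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


mu_0, x_bar, s, n = 50, 56, 12, 16   # values from the problem
df = n - 1
std_error = s / math.sqrt(n)
tstat = (x_bar - mu_0) / std_error
# Ha has a not equal to sign, so double the tail area beyond |tstat|.
p_value = 2 * upper_tail_area(abs(tstat), df)

print(f"tstat = {tstat:.2f}, df = {df}, p-value = {p_value:.4f}")
for alpha in (0.10, 0.05):
    decision = "reject" if p_value <= alpha else "do not reject"
    print(f"alpha = {alpha}: {decision} the null")
```

The p-value approach and the critical value approach should always lead to the same decision for a given alpha, which is what the two printed lines are meant to confirm.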