Hypothesis testing.


Hypothesis testing

Null hypothesis Ho: if the data deviate from the norm in any way, that deviation is due strictly to chance.
Alternative hypothesis Ha: the data show something important.
Decision: reject or do not reject Ho (the decision centers on the null hypothesis).

Errors in hypothesis testing
Type I error (false positive): rejecting Ho when it is true. Its probability is α.
Type II error (false negative): not rejecting Ho when it is false. Its probability is β.

Test involving a sample from a normally distributed population
Because the population is normally distributed and σ is known, you use a z-score in the hypothesis test. The z-score here is called the test statistic: z = (x̄ − μ) / (σ / √n), where x̄ is the sample mean, μ is the mean specified in Ho, σ is the population standard deviation, and n is the sample size. This formula holds only for the mean. Tests for other statistics (e.g. variance) use different formulas.

Suppose you think that people living in a particular zip code have higher-than-average IQs. We know about IQ scores: μ = 100, σ = 16. Your data are in examples1_stat.xls, sheet ZIP: n = 16, sample mean x̄ZIP = 107.75, α = 0.05. Test this hypothesis.
What will be the Ho and Ha?
Ha: μZIP > 100
Ho: μZIP ≤ 100
Can you reject Ho?
- Because the population is normally distributed, the sampling distribution of the mean is normal for any sample size.

What is the value of z that cuts off the upper 5% of the area in a standard normal distribution? It’s approximately 1.645. So what’s the decision? The calculated value, z = (107.75 − 100) / (16 / √16) = 1.94, exceeds 1.645, so it’s in the rejection region. The decision is to reject Ho.
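The one-tailed z-test above can be reproduced outside Excel. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 100.0, 16.0      # mean under Ho and known population SD
n, sample_mean = 16, 107.75   # sample size and sample mean from the ZIP example
alpha = 0.05

std_normal = NormalDist()     # standard normal distribution (mean 0, SD 1)

# Test statistic: distance of the sample mean from mu0 in standard errors
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Upper-tail critical value and one-tailed p-value
z_crit = std_normal.inv_cdf(1 - alpha)
p_one_tailed = 1 - std_normal.cdf(z)

reject_h0 = z > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p = {p_one_tailed:.3f}")
print("reject Ho" if reject_h0 else "do not reject Ho")
```

Comparing z with the critical value and comparing the p-value with α are two views of the same decision, so they always agree.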

This hypothesis test is called one-tailed (one-sided): the rejection region is in one tail of the sampling distribution. A hypothesis test can also be one-tailed in the other direction:
Ha: μZIP < 100
Ho: μZIP ≥ 100
What is the critical value? −1.645
http://www.emathzone.com/tutorials/basic-statistics

A test can also be two-tailed. The rejection region is then in both tails of the Ho sampling distribution.
Ho: μZIP = 100
Ha: μZIP ≠ 100
What is the critical value now? Find the z-scores that cut off 2.5% of the area on the right (1.96) and on the left (−1.96). Since 1.94 does not exceed 1.96, we do not reject Ho.
http://www.emathzone.com/tutorials/basic-statistics

Using a one-tailed test we rejected Ho, while using a two-tailed test we did not! A two-tailed test indicates that you’re looking for a difference between the sample mean and the null-hypothesis mean, but you don’t know in which direction. A one-tailed test shows that you have a pretty good idea of how the difference should come out. For practical purposes, this means you should try to have enough knowledge to be able to specify a one-tailed test.

z-test in Excel: ZTEST
Do now: examples2.xlsx | ZIP
- Provide the sample IQ data, the null hypothesis value, and σ (if omitted, the sample standard deviation s is used).
- The p-value is returned.
- If p-value < α, reject Ho.
Will you reject Ho or not? p-value for the one-tailed test: 0.026 (reject)

This is the result of ZTEST

For the one-tailed test you reject Ho. What if you do a two-tailed test?
With α = 0.05, our actual value (red line, p-value = 0.026) is in the rejection region of the one-tailed test (0.026 < 0.05). However, it is outside the rejection region of the two-tailed test, where each tail holds only α/2 = 0.025: 0.026 > 0.025. Equivalently, multiply both sides by 2: 0.052 > 0.05. So you get the p-value for a two-sided test by doubling the one-sided p-value.
- p-value for the two-tailed test: 2 × 0.026 = 0.052 (do not reject Ho)
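The doubling rule can be checked numerically. This is a small sketch with Python's standard library, taking the rounded test statistic 1.94 from the ZIP example as given; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()
z = 1.94          # test statistic from the ZIP example (rounded as in the slides)
alpha = 0.05

p_one_tailed = 1 - std_normal.cdf(z)              # area in the upper tail only
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))   # same area counted in both tails

# Two-tailed critical value: alpha/2 in each tail
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed p = {p_one_tailed:.3f}, reject Ho: {p_one_tailed < alpha}")
print(f"two-tailed p = {p_two_tailed:.3f}, reject Ho: {p_two_tailed < alpha}")
```

The same data lead to opposite decisions because the two-tailed test splits α between both tails.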

t for one
In the real world you typically don’t have the luxury of working with such well-defined populations as IQ test results. Real world:
- small samples
- you often don’t know the population parameters
When that’s the case:
- you use the sample data to estimate the population standard deviation
- you treat the sampling distribution of the mean as a t-distribution
- you use t as the test statistic

The formula for the test statistic is t = (x̄ − μ) / (s / √n), with DF = n − 1, where s is the sample standard deviation. The higher the DF, the more closely the t-distribution resembles the normal distribution.

A company claims that its vacuum cleaners average four defects per unit. A consumer group believes this average is higher. The consumer group takes a sample of 9 cleaners and finds an average of 7 defects, with a standard deviation of 3.16. Is the company’s claim correct or not? Ho, Ha?
Ho: μ ≤ 4
Ha: μ > 4
And what else is missing in defining the hypothesis? α = 0.05

Now calculate the t test statistic: t = (7 − 4) / (3.16 / √9) = 2.85 with DF = 8. Can you reject Ho? Get the critical value from tables or TINV, or use Excel TDIST, which returns the p-value. Result: reject Ho.
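The vacuum cleaner example can also be sketched in Python. The standard library has no t-distribution, so this sketch integrates the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are made up for illustration; in practice a statistics library would normally be used.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the distribution lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3

mu0 = 4.0                       # company's claimed mean (Ho: mu <= 4)
n, sample_mean, s = 9, 7.0, 3.16
alpha = 0.05

df = n - 1
t_stat = (sample_mean - mu0) / (s / sqrt(n))
p_value = t_upper_tail(t_stat, df)   # one-tailed, because Ha: mu > 4

print(f"t = {t_stat:.2f}, DF = {df}, p = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```

The upper-tail probability plays the role of the one-tailed TDIST result mentioned above.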

Testing a variance
The family of distributions for the test is called chi-square (χ²). The formula for the test statistic is χ² = (n − 1)s² / σ², with DF = n − 1, where n is the number of scores in the sample, s² is the sample variance, and σ² is the population variance specified in Ho. With this test, you have to assume that what you’re measuring has a normal distribution.

Solve the following example using CHIDIST. You produce a machine part that has to be a certain length, with a standard deviation of at most 1.5 cm. After measuring a sample of 26 parts, you find a standard deviation of 1.8 cm. Is your process producing these parts OK?
Ho: σ² ≤ 2.25 (remember to square the “at-most” standard deviation of 1.5 cm)
Ha: σ² > 2.25
α = 0.05
- Notice I said standard deviation. This allows me to speak in terms of centimeters. If I said variance, the units would be square centimeters.
χ² = (26 − 1)(1.8)² / (1.5)² = 36 with DF = 25, p-value = 0.0716. Do not reject Ho.
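As a cross-check of the CHIDIST result, here is a hedged sketch that computes the chi-square statistic and its upper-tail probability with the standard library only. The helpers `chi2_pdf` and `chi2_upper_tail` are illustrative names, and the numerical integration assumes DF > 2.

```python
from math import exp, lgamma, log

def chi2_pdf(x, df):
    """Density of the chi-square distribution (taken as 0 at x <= 0, valid for df > 2)."""
    if x <= 0:
        return 0.0
    log_pdf = (df / 2 - 1) * log(x) - x / 2 - (df / 2) * log(2) - lgamma(df / 2)
    return exp(log_pdf)

def chi2_upper_tail(x, df, steps=2000):
    """P(X > x) using Simpson's rule on [0, x]; assumes df > 2 and an even step count."""
    h = x / steps
    total = chi2_pdf(0.0, df) + chi2_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * chi2_pdf(i * h, df)
    return 1 - total * h / 3

n = 26            # number of parts measured
s = 1.8           # sample standard deviation (cm)
sigma0 = 1.5      # "at most" standard deviation from Ho (cm)
alpha = 0.05

df = n - 1
chi2_stat = (n - 1) * s**2 / sigma0**2
p_value = chi2_upper_tail(chi2_stat, df)   # one-tailed, because Ha: variance > 2.25

print(f"chi-square = {chi2_stat:.1f}, DF = {df}, p = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```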

Two sample hypothesis testing
Compare one sample with another. Usually, this involves tests of hypotheses about population means. You can also test hypotheses about population variances. Here’s an example. Imagine a new training technique designed to increase IQ. Take a sample of 25 people and train them under the new technique. Take another sample of 25 people and give them no special training. Suppose that the sample mean for the new technique is 107, and for the no-training sample it’s 101.2. Did the technique really increase IQ?

Same principles: Ho (no difference between means), Ha, α
One-tailed test:
Ho: μ1 – μ2 = 0, Ha: μ1 – μ2 > 0
Ho: μ1 – μ2 = 0, Ha: μ1 – μ2 < 0
Two-tailed test:
Ho: μ1 – μ2 = 0, Ha: μ1 – μ2 ≠ 0
Zero is the typical case, but it’s possible to test for any value.

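The sampling distribution of the difference between means is built from pairs of samples. The first sample in each pair always has the same size (n1), and the second sample in each pair always has the same size (n2). The two sample sizes are not necessarily equal.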

CLT strikes again
If the samples are large, the sampling distribution of the difference between means is approximately a normal distribution. If the populations are normally distributed, the sampling distribution is a normal distribution even if the samples are small.
The mean of the sampling distribution: μ1 − μ2
The standard deviation of the sampling distribution (standard error of the difference between means): √(σ1²/n1 + σ2²/n2)

- the sampling distribution along with its parameters, as specified by the Central Limit Theorem

Because the CLT says that the sampling distribution is approximately normal for large samples (or for small samples from normally distributed populations), you use the z-score as your test statistic, i.e. you perform a z-test. The z test statistic: z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2)

Solve the following. Imagine a new training technique designed to increase IQ. Take a sample of 25 people and train them under the new technique. Take another sample of 25 people and give them no special training. Suppose that the sample mean for the new technique sample is 107, and for the no-training sample it’s 101.2. Did the technique really increase IQ?

Ho: μ1 – μ2 = 0, Ha: μ1 – μ2 > 0, α = 0.05
IQ is known to have a standard deviation of 16, and I assume that the standard deviation would be the same in the population of people trained on the new technique. The test statistic is z = (107 − 101.2) / √(16²/25 + 16²/25) = 1.28. Use either NORMSDIST (supply 1.28, you get p-value = 1 − 0.8997 ≈ 0.100) or NORMSINV (probability = 0.95, you get a critical value of 1.645). Do not reject Ho.
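The two-sample z-test for the training example can be sketched as follows, using the standard library's normal distribution in place of NORMSDIST and NORMSINV. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_trained, mean_untrained = 107.0, 101.2
n1 = n2 = 25
sigma1 = sigma2 = 16.0     # known population SD, assumed equal for both groups
hypothesized_diff = 0.0    # Ho: mu1 - mu2 = 0
alpha = 0.05

std_normal = NormalDist()

# Standard error of the difference between means
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

z = ((mean_trained - mean_untrained) - hypothesized_diff) / se_diff
p_value = 1 - std_normal.cdf(z)            # one-tailed, because Ha: mu1 - mu2 > 0
z_crit = std_normal.inv_cdf(1 - alpha)

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p = {p_value:.3f}")
print("reject Ho" if z > z_crit else "do not reject Ho")
```

The p-value here is computed from the unrounded z, so it can differ in the third decimal from a value obtained by first rounding z to 1.28.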

Excel provides a tool: z-Test: Two Sample for Means (Data | Data Analysis). Do now: IQ_Test sheet.

The variable variance is 16² = 256 (16 is the population standard deviation of the IQ test distribution).

t for Two
The previous example involves a situation you rarely encounter: known population variances. Not knowing the variances takes the CLT out of play. This means that you can’t use the normal distribution as an approximation of the sampling distribution of the difference between means. Instead, you use the t-distribution. You perform a t-test.

Unknown variances lead to two possibilities for hypothesis testing:
- although the variances are unknown, you have reason to assume they’re equal
- you cannot assume they’re equal

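t for Two – equal variances
Put the sample variances together to estimate a population variance (pooling): sp² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)
The test statistic: t = ((x̄1 − x̄2) − (μ1 − μ2)) / (sp √(1/n1 + 1/n2))
DF = (n1 − 1) + (n2 − 1) = n1 + n2 − 2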

FarKlempt Robotics is trying to choose between two machines to produce a component for its new microrobot. Speed is of the essence, so they have each machine produce ten copies of the component, and time each production run. Which machine should they choose? Do now using Data Analysis, Machine_speed sheet.

Ho: μ1 - μ2 = 0, Ha: μ1 - μ2 ≠ 0, α = 0.05
This is a two-tailed test, because we don’t know in advance which machine might be faster. Get the critical value using TINV (±2.10) or the p-value using TDIST (0.0252). Result: reject Ho.
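The pooled (equal-variances) t-test can be sketched in Python as below. The production times are HYPOTHETICAL values invented for illustration, since the real data live in the Machine_speed sheet, so the resulting t will not match the slides. Only the critical value 2.10 (α = 0.05, two-tailed, DF = 18) is taken from the text above.

```python
from math import sqrt
from statistics import mean, variance

# HYPOTHETICAL production times (minutes); the real data are in the Machine_speed sheet.
machine1 = [24.6, 25.1, 23.9, 26.0, 24.4, 25.5, 24.8, 25.9, 24.2, 25.3]
machine2 = [26.1, 25.4, 27.0, 26.3, 25.8, 26.9, 25.2, 26.6, 27.2, 26.0]

n1, n2 = len(machine1), len(machine2)
s1_sq, s2_sq = variance(machine1), variance(machine2)   # sample variances (n - 1 denominator)

# Pool the two sample variances, weighting each by its degrees of freedom
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / df

# t statistic for Ho: mu1 - mu2 = 0
t_stat = (mean(machine1) - mean(machine2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

t_crit = 2.10   # two-tailed critical value for alpha = 0.05, DF = 18, as quoted in the slides
print(f"t = {t_stat:.2f}, DF = {df}, reject Ho: {abs(t_stat) > t_crit}")
```

With equal sample sizes the pooled variance is simply the average of the two sample variances. The weighting only matters when n1 and n2 differ.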

Do now – Machines example in examples2.xlsx | Machine_speed
The worksheet function TTEST eliminates the muss, fuss, and bother of working through the formulas for the t-test. When the equal-variances assumption is justified, it’s more desirable to use the equal variances t-test, which typically provides more degrees of freedom than the unequal variances t-test.

Do now – Data|Data Analysis, use t-Test: Two-Sample Assuming Equal Variances

t for Two – unequal variances
In the case of unequal variances, the t-distribution with (N1 − 1) + (N2 − 1) DF is not as close an approximation to the sampling distribution. The DF must be reduced, and fairly involved formulas are used to do this. A pooled estimate is not appropriate. The t statistic is calculated as t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/N1 + s2²/N2)
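One common choice for the "fairly involved" DF reduction is the Welch–Satterthwaite approximation. The slides do not name the formula, so treating it as the intended one is an assumption. The sketch below uses illustrative function names and HYPOTHETICAL sample means; the variances and sample sizes are borrowed from the FarKlempt variance example purely as sample inputs.

```python
from math import sqrt

def welch_t(mean1, var1, n1, mean2, var2, n2, hypothesized_diff=0.0):
    """t statistic for two independent samples without pooling the variances."""
    return (mean1 - mean2 - hypothesized_diff) / sqrt(var1 / n1 + var2 / n2)

def welch_df(var1, n1, var2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    a, b = var1 / n1, var2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Sample sizes and variances borrowed from the FarKlempt variance example;
# the two means are HYPOTHETICAL.
n1, var1, mean1 = 10, 0.60, 25.0
n2, var2, mean2 = 15, 0.44, 26.2

t_stat = welch_t(mean1, var1, n1, mean2, var2, n2)
df = welch_df(var1, n1, var2, n2)

print(f"t = {t_stat:.2f}, reduced DF = {df:.1f}, pooled DF would be {n1 + n2 - 2}")
```

The reduced DF is generally not an integer and never exceeds the pooled value (N1 − 1) + (N2 − 1).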

Testing two variances
Classic: Ho: σ1² = σ2², Ha: σ1² ≠ σ2² (or >, or <), α = 0.05
When you test two variances, you don’t subtract one from the other. Instead, you divide one by the other to calculate the test statistic: F = s1² / s2². This statistic is called the F-ratio, and you’re doing an F-test.

The family of distributions for the test is called the F-distribution. Each member of the family is associated with two values of DF (each DF is n - 1)! And it makes a difference which DF is in the numerator and which DF is in the denominator.

Excel: FTEST, FDIST, FINV, F-Test: Two-Sample for Variances
One use of the F-distribution is in conjunction with the t-test for independent samples. Before you do the t-test, you use F to help decide whether to assume equal variances or unequal variances in the samples.
Do now: FarKlempt Robotics produces 10 parts with Machine 1 and finds a sample variance of 0.60 cm². They produce 15 parts with Machine 2 and find a sample variance of 0.44 cm². Are these variances the same? Data are in examples2.xlsx | Machine_var

Do not reject Ho.
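The F-test for the two machine variances can be sketched with the standard library by integrating the F density numerically. The helpers `f_pdf` and `f_upper_tail` are illustrative names, and the integration assumes a numerator DF above 2. The one-tailed value corresponds to FDIST and the doubled value to the two-tailed FTEST.

```python
from math import exp, lgamma, log

def f_pdf(x, d1, d2):
    """Density of the F-distribution (taken as 0 at x <= 0, valid for d1 > 2)."""
    if x <= 0:
        return 0.0
    log_beta = lgamma(d1 / 2) + lgamma(d2 / 2) - lgamma((d1 + d2) / 2)
    log_pdf = (0.5 * (d1 * log(d1 * x) + d2 * log(d2) - (d1 + d2) * log(d1 * x + d2))
               - log(x) - log_beta)
    return exp(log_pdf)

def f_upper_tail(f, d1, d2, steps=4000):
    """P(F > f) using Simpson's rule on [0, f]; assumes d1 > 2 and an even step count."""
    h = f / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(f, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3

var1, n1 = 0.60, 10    # Machine 1: sample variance (cm^2) and number of parts
var2, n2 = 0.44, 15    # Machine 2
alpha = 0.05

f_ratio = var1 / var2
df1, df2 = n1 - 1, n2 - 1          # numerator DF first, then denominator DF

p_one_tailed = f_upper_tail(f_ratio, df1, df2)
p_two_tailed = 2 * min(p_one_tailed, 1 - p_one_tailed)

print(f"F = {f_ratio:.2f}, DF = ({df1}, {df2}), two-tailed p = {p_two_tailed:.3f}")
print("reject Ho" if p_two_tailed < alpha else "do not reject Ho")
```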

Estimate variances from data using VAR. FDIST value is exactly ½ of FTEST. (FDIST is one-tailed, FTEST is two-tailed)

FINV finds a critical value. The probability is 0.025, because the test is two-tailed with α = 0.05.

For future use

http://www.intuitor.com/statistics/CurveApplet.html
The figure also shows α and β. These, as I mentioned earlier, are the probabilities of decision errors.
- The area that corresponds to α is in the H0 distribution. I shaded it in dark gray. It represents the probability that a sample mean comes from the H0 distribution, but it’s so extreme that you reject H0.
- Where you set the critical value determines α. In most hypothesis testing, you set α at .05. This means that you’re willing to tolerate a Type I error (incorrectly rejecting H0) 5 percent of the time. Graphically, the critical value cuts off 5 percent of the area of the sampling distribution.
- The area that corresponds to β is in the H1 distribution. This area represents the probability that a sample mean comes from the H1 distribution, but it’s close enough to the center of the H0 distribution that you don’t reject H0. You don’t get to set β. The size of this area depends on the separation between the means of the two distributions, and that’s up to the world we live in, not up to you.
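To make α, β, and power (1 − β) concrete, here is a hedged sketch for a one-tailed z-test with the IQ parameters used earlier (μ = 100, σ = 16, n = 16). The H1 means are HYPOTHETICAL, chosen only to show how β shrinks as the two distributions move apart.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 100.0, 16.0, 16, 0.05
se = sigma / sqrt(n)                      # standard error of the mean

# Critical sample mean: cuts off the upper 5% of the H0 sampling distribution
critical_mean = NormalDist(mu0, se).inv_cdf(1 - alpha)

def beta_for(mu1):
    """P(sample mean falls below the critical value | true mean is mu1)."""
    return NormalDist(mu1, se).cdf(critical_mean)

mu1 = 108.0                               # HYPOTHETICAL true mean under H1
beta = beta_for(mu1)
power = 1 - beta

print(f"critical sample mean = {critical_mean:.2f}")
print(f"beta = {beta:.3f}, power = {power:.3f}")
print(f"beta if the true mean were 112: {beta_for(112.0):.3f}")
```

The larger the separation between the H0 and H1 means, the smaller β becomes for the same critical value.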

Theoretically, when you test a null hypothesis versus an alternative hypothesis, each hypothesis corresponds to a separate sampling distribution. When you do a hypothesis test, you never know which distribution produces the results. You work with a sample mean - a point on the horizontal axis. It’s your job to decide which distribution the sample mean is part of. You set up a critical value - a decision criterion. If the sample mean is on one side of the critical value, you reject Ho. If not, you don’t.