Hypothesis testing

Classical hypothesis testing is a statistical method that appeared in the first third of the 20th century, alongside the “modern” conception of a “scientific” theory. The philosopher of science Karl Popper (1902 – 1994) asserted that it was the scientist’s responsibility to construct statements that would be either consistent or inconsistent with a scientific theory. In Popper’s view, a theory that is not falsifiable is not scientific. Science progresses by repeatedly “testing” aspects of theories against observation; falsified theories are those that, after much experimentation and observation, have not survived these tests.

Three individuals are responsible for developing the statistical analog to Popper’s falsificationism: Ronald Fisher (1890 – 1962), Jerzy Neyman (1894 – 1981), and Egon Pearson (1895 – 1980).

Statistical hypothesis testing, the statistical methodology analogous to falsificationism, proceeds by first summarizing the scientist’s observations in numeric form, i.e., calculating statistics. In theory, the statistics chosen would aid the scientist by creating hypotheses in numeric form that correspond to scientific hypotheses about what should happen if a statement or theory were true, and what should happen if it were false. Some set of values of the statistic(s) would be regarded as enough information to falsify a statement or theory, and a different set of values would be regarded as not (yet) enough information to falsify it.

An example: The Iron Butterfly Theory

Magnetic fields affect the orientation of a wide variety of animals.
* Homing pigeon (Columba livia domestica)
* European robin (Erithacus rubecula)
* Indigo bunting (Passerina cyanea)

One well-known migratory animal is the monarch butterfly (Danaus plexippus). These creatures migrate over tremendous distances, and it is not known how they locate their “winter homes.” One possibility: they use the Earth’s magnetic field.

If monarchs have magnetic material in their bodies, this would lend support to the possibility that they use the Earth’s magnetic field in navigation. The existence of magnetic material can be measured by a magnetometer. Unfortunately, the magnetometer itself contains magnetic material, which produces a background reading of 200 pico-emu. To demonstrate that monarchs have magnetic material in their bodies, it must be shown that the magnetic intensity measured for the butterflies exceeds the background 200 pico-emu.

For purposes of statistical inference, we pick the theory that can be expressed as a population parameter equal to some constant; this theory serves as the null hypothesis. If monarchs have no magnetic material in their bodies, we expect the magnetometer to record a population mean of 200 pico-emu.

What counts as evidence against the no-magnetic-material theory? Our alternative theory is that the monarchs do have magnetic material (and thus navigate using the Earth’s magnetic field). Evidence in favor of the magnetic navigation theory would be a mean magnetic intensity higher than 200 pico-emu.
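In symbols, the two competing theories might be written as follows. The symbol $\mu$ for the population mean magnetometer reading is notation assumed here; the slides do not name it.

$$H_0 : \mu = 200 \ \text{pico-emu} \qquad \text{versus} \qquad H_a : \mu > 200 \ \text{pico-emu}$$

The alternative is one-sided because only readings above the background level would count as evidence of magnetic material.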

Of course, the actual population mean can’t be observed, so we must rely on a random sample of these butterflies. If a sample mean, $\bar{x}$, is larger than 200 pico-emu, this might be due to one of two possible reasons: (i) the magnetometer is correctly detecting magnetic material in the butterflies, or (ii) the slings and arrows of outrageous sampling. The mathematics of hypothesis testing is designed to quantify the probability of getting a sample mean this far above 200 if chance alone is operating. The mechanism for this quantification requires a consideration of all the possible sample means that might result from random sampling, and the probabilities of these means appearing.
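The idea of "all the possible sample means that might result from random sampling" can be approximated by simulation. The sketch below is illustrative only: it assumes a normal population centered on the 200 pico-emu background, and the standard deviation, sample size, and observed mean are invented values, not figures from the butterfly study.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` random samples of size n from Normal(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    ]

# Only the 200 pico-emu background comes from the example; the rest is hypothetical.
NULL_MEAN = 200.0      # mean reading if monarchs carry no magnetic material
ASSUMED_SD = 20.0      # hypothetical population standard deviation
SAMPLE_SIZE = 16       # hypothetical number of butterflies measured
OBSERVED_MEAN = 210.0  # hypothetical sample mean

means = simulate_sample_means(NULL_MEAN, ASSUMED_SD, SAMPLE_SIZE, reps=20_000)

# Share of chance-only samples whose mean is at least as large as the observed one.
share = sum(m >= OBSERVED_MEAN for m in means) / len(means)
print(f"share of chance-only sample means >= {OBSERVED_MEAN}: {share:.4f}")
```

A small share suggests that chance alone rarely produces a mean that large under these assumed conditions.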

The probability distribution of a sample statistic is known as its sampling distribution. The sampling distribution of a statistic is a mathematical model of the possible results of taking a sample and, like all mathematical models, has certain driving assumptions. In the case of the sample mean, the assumptions are:
1. A simple random sample of size n is taken from a population of size N.
2. It is reasonable to assume that (a) the population is normal, or (b) the sample size is “large enough” for the Central Limit Theorem to work its magic.
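What "large enough" means can be explored numerically. The following sketch assumes a strongly right-skewed (exponential) population, chosen only for illustration, and reports how the skewness of the simulated sample means shrinks as the sample size grows. The function names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Skewness: mean of the cubed standardized deviations."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

def sample_means(n, reps, rng):
    """Means of `reps` random samples of size n from an exponential population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(1)
for n in (2, 10, 50):
    means = sample_means(n, reps=5_000, rng=rng)
    print(f"n = {n:2d}  skewness of sample means = {skewness(means):.2f}")
```

A skewness near zero indicates a roughly symmetric, more normal-looking sampling distribution; how close is close enough remains a judgment call.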

The specification of what counts as sufficient evidence against our theory is done by considering the probabilities associated with possible sample means. Recall our idea that a large sample mean might be due to one of two possible reasons: (i) the magnetometer is correctly detecting magnetic material in the butterflies, or (ii) chance. Our statistical hypothesis testing procedure is all about considering chance as a reason for acquiring a large sample mean. If we can eliminate chance as the reason to our satisfaction, we are left with evidence that the difference is “real” and that our “null hypothesis” is incorrect.

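For easy interpretation of results, sample statistics are “standardized” to get “test statistics.” Which test statistic is calculated depends on the sample statistic, and in the case of the sample mean, the test statistic is a “t” statistic:

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

Here $\bar{x}$ is the sample mean, $\mu_0$ is the population mean specified by the null hypothesis (200 pico-emu in the butterfly example), $s$ is the sample standard deviation, and $n$ is the sample size.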
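As a rough illustration, the one-sample t statistic (the sample mean minus the hypothesized mean, divided by the sample standard deviation over the square root of the sample size) can be computed as below. The readings are invented for the sketch and are not data from the butterfly study.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """One-sample t: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical magnetometer readings in pico-emu (invented for illustration).
readings = [212.0, 198.0, 205.0, 221.0, 207.0, 195.0, 216.0, 209.0, 203.0, 214.0]

t = t_statistic(readings, mu0=200.0)
print(f"n = {len(readings)}, mean = {statistics.fmean(readings):.1f}, t = {t:.2f}")
```

For these invented readings the sample mean is 208.0 and t comes out at about 3.12.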

How large the sample mean must be to provide sufficient evidence is thus translated to an equivalent question: how large must the test statistic (t in this case) be to provide sufficient evidence against the hypothesis? The answer to this question is provided by considering probabilities associated with the distribution of the test statistic.

If the test statistic is so extreme that the probability it would occur “by chance” is less than a predetermined value (the “level of significance”), the null hypothesis will be rejected, signifying sufficient evidence that the null hypothesis is discredited. Traditionally the level of significance (α) is set at 5%. The “p-value” is the probability, calculated assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one actually observed.
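The decision rule can be sketched in code. To keep the example free of third-party libraries, the upper-tail probability of the t distribution is obtained here by numerically integrating its density with Simpson's rule; that is a choice made for this sketch, and a statistics package would normally supply the value. The observed t and the degrees of freedom are hypothetical, and the test is one-sided because the alternative in the butterfly example is a higher mean.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=20_000):
    """P(T >= t), using Simpson's rule on [0, |t|] plus the symmetry of the density."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 - area if t >= 0 else 0.5 + area

ALPHA = 0.05               # traditional level of significance
t_observed, df = 3.12, 9   # hypothetical: a sample of 10 gives 9 degrees of freedom

p_value = upper_tail_p(t_observed, df)
decision = "reject" if p_value < ALPHA else "do not reject"
print(f"p-value = {p_value:.4f}; {decision} the null hypothesis at alpha = {ALPHA}")
```

With these hypothetical inputs the p-value falls below 5%, so the null hypothesis would be rejected.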

Unfortunately, in many situations the assumptions of the mathematical model used to construct the sampling distribution are less than credible.
1. We may not actually be sampling from any known or suspected population.
2. (a) We have no reason to believe the population is normal, and/or (b) we have no assurance that the size of the sample is large enough for the Central Limit Theorem to work its magic.
But worry not! An alternative approach, not dependent on these assumptions, is available: randomization tests.
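The slides do not describe the randomization procedure at this point, so the following is only one possible variant, assumed for illustration: a one-sample sign-flip test. It swaps the normality assumption for a different one, namely that under the null hypothesis the readings are symmetric about 200 pico-emu, so each deviation is equally likely to be positive or negative. The readings and the function name are hypothetical.

```python
import random
import statistics

def sign_flip_p_value(sample, mu0, reps=10_000, seed=0):
    """One-sided randomization p-value from random sign flips of (x - mu0)."""
    rng = random.Random(seed)
    deviations = [x - mu0 for x in sample]
    observed = statistics.fmean(deviations)
    hits = 0
    for _ in range(reps):
        # Under the assumed symmetry, each deviation keeps or flips its sign with probability 1/2.
        flipped = [d if rng.random() < 0.5 else -d for d in deviations]
        if statistics.fmean(flipped) >= observed:
            hits += 1
    # Add one to numerator and denominator so the observed arrangement counts as a draw.
    return (hits + 1) / (reps + 1)

# Hypothetical magnetometer readings in pico-emu (invented for illustration).
readings = [212.0, 198.0, 205.0, 221.0, 207.0, 195.0, 216.0, 209.0, 203.0, 214.0]

p = sign_flip_p_value(readings, mu0=200.0)
print(f"randomization p-value = {p:.4f}")
```

The p-value here is the share of sign-flipped data sets whose mean is at least as large as the observed one, so no reference to a t distribution is needed.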