HYPOTHESIS TESTING class of “Experimental Methods of Physics” Mikhail Yurov Kyungpook National University May 9 th, 2005.


Contents  Introduction  Use of the weighted sum of squared deviations  Errors of two sorts  Types of hypothesis testing  Using the z-statistic

Introduction In probability theory, we start with some well-defined problem and calculate from it the possible outcomes of a specific experiment; we thus proceed from the theory to the data. In statistics we try to solve the inverse problem: we use the data to deduce the rules or laws relevant to our experiment. The two basic sorts of problems that we deal with in statistics are hypothesis testing and parameter fitting. In the former we test whether our data are consistent with a specific theory (which may contain some free parameters); in the latter we use the data to determine the values of the free parameters.

Logically, hypothesis testing precedes parameter fitting, since if our hypothesis is incorrect, there is no point in determining the values of the free parameters contained within it. In fact, we deal with parameter fitting first, since it is easier to understand; in practice, one often does parameter fitting first anyway, because it may be impossible to perform a sensible test of a hypothesis before its free parameters have been set at their optimum values. Example Suppose we have data on an angular distribution, consisting of a set of values cosθ_i, one for each interaction, where θ_i is the angle that the observed particle makes with some fixed direction. We can ask: are the data consistent with an angular distribution of the assumed form? And if the data look inconsistent with it, can we make a numerical estimate of how confident we are that the experimental data show the angular distribution to be incorrect?

Use of weighted sum of squared deviations So the more fundamental question is whether our hypothesis concerning the form of the data is correct or not. In fact we will not be able to give a 'yes or no' answer, but only to state how confident we are about accepting or rejecting the hypothesis. In simple cases, the hypothesis may consist simply of a particular value for some parameter. This illustrates the desirability of examining a distribution, rather than simply determining a parameter, when we are hypothesis testing: if we fit either the solid or the dashed distribution in cosθ by an expression N[1 + (b/a)cos²θ], the value of b/a is liable to be close to zero. This does not imply that either distribution is isotropic.

It is preferable to perform distribution testing rather than parameter testing. Distributions are tested by the χ²-method. In order to test a hypothesis we have to:
a. construct S and minimize it with respect to the free parameters;
b. determine the number of degrees of freedom ν from ν = b − p, where b is the number of bins of the distribution and p is the number of free parameters;
c. look up in the relevant set of tables the probability that, for ν degrees of freedom, χ² is greater than or equal to our observed value S_min.
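The three-step recipe above can be sketched in Python. The 6-bin histogram counts, the Poisson errors and the scan range for the single free parameter N (an isotropic fit, so the expected content of every bin is just N) are all invented for illustration.

```python
import math

def s_statistic(observed, expected, errors):
    """Weighted sum of squared deviations: S = sum((obs - exp)^2 / err^2)."""
    return sum((o - e) ** 2 / err ** 2
               for o, e, err in zip(observed, expected, errors))

# Hypothetical binned cos(theta) histogram: 6 bins, Poisson errors sqrt(n).
counts = [102, 95, 108, 97, 110, 100]
errors = [math.sqrt(n) for n in counts]

# Step (a): construct S and minimize it over the free parameter N
# (here by a simple integer scan; a real fit would use a proper minimizer).
best_N, s_min = min(
    ((N, s_statistic(counts, [N] * 6, errors)) for N in range(80, 131)),
    key=lambda t: t[1])

# Step (b): nu = b - p = (number of bins) - (number of free parameters).
nu = 6 - 1
# Step (c) would then look up P(chi^2 >= s_min) for nu degrees of freedom.
```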

The χ²-distribution has the property that its expectation value is ⟨χ²⟩ = ν and its variance is σ²(χ²) = 2ν. (Figure: χ²-distributions for various numbers of degrees of freedom ν; as ν increases, so do the mean and variance of the distribution.) Thus, if our hypothesis is correct, S_min should be comparable with ν: much larger values of S_min are unlikely, and so our hypothesis is probably wrong. Very small values of S_min are also unlikely, and so again something is suspicious.

More useful than the χ²-distribution itself is F_ν(c) = P_ν(χ² > c), i.e. the probability that, for the given number of degrees of freedom ν, the value of χ² will exceed a particular specified value c. Such distributions are tabulated in almost all books on statistics.
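For an even number of degrees of freedom ν, the tail probability F_ν(c) has a simple closed form, so the table lookup can be done in a few lines of Python (a sketch; odd ν would need the incomplete gamma function instead).

```python
import math

def chi2_tail_even(c, nu):
    """F_nu(c) = P(chi^2 > c) for an even number of degrees of freedom nu,
    using the closed form exp(-c/2) * sum_{k=0}^{nu/2 - 1} (c/2)^k / k!."""
    if nu <= 0 or nu % 2 != 0:
        raise ValueError("this closed form needs an even nu > 0")
    x = c / 2.0
    return math.exp(-x) * sum(x ** k / math.factorial(k)
                              for k in range(nu // 2))

p = chi2_tail_even(20.0, 10)  # about 0.029, i.e. roughly 3%
```

This reproduces the table lookup used in the example on the next slide (ν = 10 and S_min = 20.0 give a probability of about 3%).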

Example In a cosθ histogram, let us assume that there are 12 bins and that, when we fit the expression N(1 + (b/a)cos²θ) to the data, we obtain a value of 20.0 for S_min. In this case we have ten degrees of freedom (12 bins less the two parameters N and b/a). From the figure, we see that the probability of getting a value of 20.0 or larger is about 3%. If our experiment were repeated many times, and assuming that our hypothesis is correct, then because of fluctuations we would get a larger value of S_min than the particular one we are considering in a fraction F of the experiments.

Errors of two sorts In deciding whether or not to reject a hypothesis, we can make two sorts of incorrect decision.  Error of the first kind In this case we reject the hypothesis H when it is in fact correct. This should happen in a well-known fraction F of the tests, where F is determined by the maximum accepted value of S_min. But if we have biases in our experiment, so that the actual value of the answer is incorrect, or if our errors are incorrectly estimated, then such errors of the first kind can happen more or less frequently than F suggests. The number of errors of the first kind can be reduced simply by raising the limit on S_min above which we reject the hypothesis.
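The claim that errors of the first kind occur in the known fraction F fixed by the cut (when the hypothesis really is correct and the errors are well estimated) can be checked with a small Monte Carlo. The choices ν = 10 and a cut at S_min = 20.0 echo the earlier example; the seed and the number of trials are arbitrary.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

def chi2_draw(nu):
    """One chi^2 variate: the sum of nu squared standard normal deviates."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))

# Correct hypothesis, well-estimated errors: reject whenever S_min > cut.
nu, cut, trials = 10, 20.0, 20_000
rejections = sum(chi2_draw(nu) > cut for _ in range(trials))
rate = rejections / trials  # close to F = P(chi^2 > 20.0), i.e. about 3%
```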

 Error of the second kind In this case we fail to reject the hypothesis when it is in fact false and some other hypothesis is correct: the value of S_min accidentally turns out to be small, even though the hypothesis H (i.e. the theoretical curve y_th that is being compared with the data) is incorrect. It is very difficult to estimate how frequent this effect is likely to be; it depends not only on the magnitude of the cut on S_min but also on the nature of the competing hypotheses. If these are known, then we may be able to predict what distribution they will give for S_min, and hence how often we will be incorrect in accepting H.

Types of hypothesis testing The hypothesis we are testing may relate to the experiment as a whole, or alternatively it may be used as a selector for subsets of a data sample which satisfy specific criteria.  Hypothesis relating to the whole experiment We observe an angular distribution from the decay of a resonance and ask "Does the resonance have spin zero?"; this would imply that the angular distribution is isotropic. In this case an error of the first kind is serious, and in this example so is an error of the second kind: in the former case we reject the spin-zero hypothesis when it is in fact true, and in the latter we accept it when the spin is non-zero. In this experiment the alternative hypotheses are well defined: if the spin is not zero, it is 1, 2, 3, … It may also be possible to calculate the angular distributions for these cases, and hence we can deduce how often each of them gives a low value of S_min.

Example Consider the angular distribution for the decay of a state whose spin we wish to determine. If the spin is zero, the distribution must be isotropic (dashed line). We calculate the value of S_min for this hypothesis. There are five experimental points and four degrees of freedom, since the only free parameter is the normalization. If S_min were larger than 10 we would reject this hypothesis, since the probability that χ² for 4 degrees of freedom exceeds 10 is only about 5%. In our case S_min is 8.7, so the hypothesis is not rejected. This does not necessarily mean that the spin is zero. If it were 1, the predicted decay distribution might be cos²θ (dotted curve). The S_min′ for this hypothesis is 4.1, which is also below our rejection cut. The errors on our data are so large that we have poor discrimination between these two hypotheses.
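The poor-discrimination situation can be imitated numerically. The five bin counts below are invented (the data values on the slide are not given), and each hypothesis has a single free normalization that a weighted least-squares fit determines analytically.

```python
# Hypothetical 5-point cos(theta) histogram with large (Poisson) errors;
# these counts are illustrative, not the data shown on the slide.
centres = [-0.8, -0.4, 0.0, 0.4, 0.8]
counts = [8, 5, 4, 6, 9]

def s_min_linear(counts, shape):
    """Fit expected_i = A * shape_i with weights 1/n_i (Poisson variances)
    and return the minimized S; the one free parameter A is analytic."""
    A = sum(shape) / sum(s * s / n for s, n in zip(shape, counts))
    return sum((n - A * s) ** 2 / n for n, s in zip(counts, shape))

s_iso = s_min_linear(counts, [1.0] * 5)                  # spin 0: isotropic
s_cos2 = s_min_linear(counts, [c * c for c in centres])  # spin 1: ~ cos^2
nu = 5 - 1  # five points, one free parameter in each hypothesis
```

With these low-statistics counts, both values of S_min come out below a rejection cut of 10, so neither hypothesis is rejected: exactly the ambiguity described above.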

 Hypothesis used as data selector An experiment may consist of a large set of interactions of a beam of protons with a hydrogen target, in each of which four charged tracks are observed and measured. We test the hypothesis that these interactions are examples of the reaction pp→ppπ+π-. The hypothesis is tested by seeing whether the measured directions and momenta of the tracks are consistent with those expected for the reaction on the basis of energy and momentum conservation. Here we are using our hypothesis to select individual sets of data to study, with a view to extracting some interesting physics. Errors of the first kind correspond to rejecting a small fraction of genuine examples of the reaction. This is not serious; the reduction in the size of the data sample due to the rejection of these events should be small.

Errors of the second kind correspond to accepting events as examples of the reaction when they are in fact produced by some other reaction with four visible charged tracks, for example pp→ppμ+μ- (*). Thus errors of the second kind constitute a potentially more dangerous problem: our data sample is contaminated. The extent of this contamination is difficult to estimate; it will depend on how frequently reaction (*) produces kinematical configurations resembling those of the reaction of interest. Since the μ mass is very close to that of the π, reaction (*) will be difficult to distinguish from the primary reaction simply on the basis of measurements of directions and momenta. Contaminations of this sort are in general reduced by lowering the value of the cut on S_min.

Using the z-statistic When σ is known, the distribution of the sample mean can be described by a z-statistic, z = (x̄ − μ)/(σ/√n), where μ is the population mean (either known or hypothesized under H0), x̄ is the sample mean, σ is the population standard deviation and n is the sample size. Critical region – the portion of the area under the curve containing those values of the statistic that lead to rejection of the null hypothesis. The most often used significance levels are 0.01, 0.05 and 0.1. For a one-tailed test using the z-statistic, these correspond to z-values of 2.33, 1.65 and 1.28 respectively. For a two-tailed test, the critical region at the 0.01 level is split into two equal outer areas marked by z-values of ±2.58.

Example Given a population with μ = 250 and σ = 50, what is the probability of drawing a sample of n = 100 values whose mean is at least 255? In this case z = (255 − 250)/(50/√100) = 1.00. Looking at a table of areas under the normal curve, the area from the mean up to z = 1.00 is 0.3413; the area to its right is 0.1587 (= 0.5 − 0.3413), or about 15.9%. Conclusion There are approximately 16 chances in 100 of obtaining a sample mean of at least 255 from this population when n = 100.
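Both this worked example and the one-tailed critical z-values quoted on the previous slide can be reproduced with Python's standard library, with no table needed; `statistics.NormalDist` provides the normal CDF and its inverse.

```python
import math
from statistics import NormalDist

mu, sigma, n, xbar = 250.0, 50.0, 100, 255.0

z = (xbar - mu) / (sigma / math.sqrt(n))  # = (255 - 250) / 5 = 1.00
p_right = 1.0 - NormalDist().cdf(z)       # area to the right of z, ~0.159

# One-tailed critical z-values for the usual significance levels.
crit = {alpha: NormalDist().inv_cdf(1 - alpha) for alpha in (0.01, 0.05, 0.1)}
# crit[0.01] ~ 2.33, crit[0.05] ~ 1.65, crit[0.1] ~ 1.28
```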

References
L. Lyons, "Statistics for Nuclear and Particle Physics", Cambridge University Press (1985)
W. R. Leo, "Techniques for Nuclear and Particle Physics Experiments", Springer-Verlag, Berlin Heidelberg (1987)