Statistical inference: distribution, hypothesis testing

Distribution of a correlation coefficient? Computer simulation…
1. Specify the true correlation coefficient: 0.15.
2. Select a random sample of 100 virtual men from the population.
3. Calculate the correlation coefficient for the sample.
4. Repeat steps (2) and (3) 15,000 times.
5. Explore the distribution of the 15,000 correlation coefficients.
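The five steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not say how the virtual population was built, so the sketch assumes a bivariate normal population with unit variances, and the function names (`sample_correlation`, `simulate_correlations`) are made up for this example.

```python
import math
import random


def sample_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_correlations(true_r=0.15, n=100, n_studies=15_000, seed=0):
    """Steps 1-4: draw n_studies samples of size n, return each sample's r."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(1.0 - true_r ** 2)
    results = []
    for _ in range(n_studies):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Assumed population: y = r*x + sqrt(1 - r^2)*noise, which has
        # variance 1 and correlation true_r with x.
        ys = [true_r * x + noise_scale * rng.gauss(0.0, 1.0) for x in xs]
        results.append(sample_correlation(xs, ys))
    return results


# Step 5: summarize the distribution of the simulated coefficients.
rs = simulate_correlations()
mean_r = sum(rs) / len(rs)
se_r = math.sqrt(sum((r - mean_r) ** 2 for r in rs) / (len(rs) - 1))
print(f"mean of sample correlations: {mean_r:.3f}")
print(f"standard error (SD of sample correlations): {se_r:.3f}")
```

The standard deviation of the simulated coefficients plays the role of the standard error: it measures how much the sample correlation varies from study to study.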

Distribution of a correlation coefficient…
Normally distributed!
Mean = 0.15 (true correlation)
Standard error = 0.10

Distribution of a correlation coefficient in general…
1. Shape of the distribution: normally distributed for large samples; t-distribution for small samples (n < 100)
2. Mean = true correlation coefficient (r)
3. Standard error
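The standard error formula itself is not shown here. As an assumption, and not necessarily the formula the slide used, one standard large-sample approximation for a bivariate normal population, with r the true correlation and n the sample size, is:

$$\mathrm{SE} \approx \frac{1 - r^2}{\sqrt{n}}$$

For r = 0.15 and n = 100 this gives (1 - 0.0225)/10 ≈ 0.098, which agrees with the simulated standard error of about 0.10.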

Many statistics follow normal (or t) distributions…
- Means/difference in means (t-distribution for small samples)
- Proportions/difference in proportions
- Regression coefficients
- Natural log of the odds ratio

Recall: 68-95-99.7 rule for normal distributions! There is a 95% chance that the sample mean will fall within two standard errors of the true mean.
Mean + 2 Std error = 68.6
Mean - 2 Std error = 55.4
To be precise, 95% of observations fall between Z = -1.96 and Z = +1.96 (so the “2” is a rounded number)…
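The rule and the exact 1.96 cut-off can be checked with Python's standard library. In this sketch the mean of 62 and the standard error of 3.3 are inferred from the two limits quoted above (68.6 and 55.4); they are not stated on this slide.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

# Share of a normal distribution within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    share = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} SD: {share:.1%}")

# Exact cut-off that leaves 2.5% in each tail (the "2" rounds this value).
z_975 = std_normal.inv_cdf(0.975)
print(f"exact 95% cut-off: {z_975:.2f}")

# Inferred values: (68.6 + 55.4) / 2 = 62 and (68.6 - 55.4) / 4 = 3.3.
mean, std_error = 62.0, 3.3
lower = mean - 2 * std_error
upper = mean + 2 * std_error
print(f"mean +/- 2 SE: {lower:.1f} to {upper:.1f}")
```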

95% confidence interval
Thus, for normally distributed statistics, the formula for the 95% confidence interval is:
sample statistic ± 2 x (standard error)

Simulation of 20 studies of 100 men…
95% confidence intervals for the mean vitamin D for each of the simulated studies. Vertical line indicates the true mean (62). Only 1 confidence interval missed the true mean.
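A simulation along these lines can be sketched as follows. The population is an assumption: vitamin D is taken to be normally distributed with the true mean of 62 and a standard deviation of 33, chosen so that a sample of 100 has a standard error of about 3.3. The helper names are hypothetical.

```python
import math
import random


def ci_95(sample):
    """Approximate 95% CI for the mean: sample mean +/- 2 standard errors."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    std_error = sd / math.sqrt(n)
    return mean - 2 * std_error, mean + 2 * std_error


def coverage(true_mean=62.0, sd=33.0, n=100, n_studies=20, seed=1):
    """Fraction of simulated studies whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        lower, upper = ci_95(sample)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_studies


print(f"coverage in 20 studies: {coverage(n_studies=20):.2f}")
long_run = coverage(n_studies=5000)
print(f"coverage in 5000 studies: {long_run:.3f}")
```

With only 20 studies the number of misses varies from run to run (0, 1, or 2 are all common); over many studies the coverage settles near 95%.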

The P-value
The P-value is the probability that we would have seen our data (or something more unexpected) just by chance if the null hypothesis (null value) is true. Small p-values mean our data are unlikely if the null value is true. Our data are so unlikely given the null hypothesis (<<1/10,000) that I’m going to reject the null hypothesis! (Don’t want to reject our data!)

P-value < .0001 means: the probability of seeing what you saw or something more extreme if the null hypothesis is true (due to chance) is < .0001.
P(empirical data | null hypothesis) < .0001

The P-value
By convention, p-values of <.05 are often accepted as “statistically significant” in the medical literature, but this is an arbitrary cut-off. A cut-off of p<.05 means that, when the null hypothesis is true, about 5 of 100 experiments would appear significant just by chance (“Type I error”).
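The "5 in 100" figure can be illustrated by simulating many experiments in which the null hypothesis really is true. This sketch assumes a one-sample z-test with a known standard deviation of 1 and 30 observations per experiment; those settings and the function names are chosen only for the example.

```python
import math
import random
from statistics import NormalDist


def two_sided_p(sample, null_mean, sd):
    """Two-sided p-value of a one-sample z-test with known SD."""
    n = len(sample)
    z = (sum(sample) / n - null_mean) / (sd / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=5000, n=30, alpha=0.05, seed=2):
    """Share of experiments with p < alpha when the null is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Data are generated with the null value (mean 0) as the truth.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if two_sided_p(sample, null_mean=0.0, sd=1.0) < alpha:
            rejections += 1
    return rejections / n_experiments


rate = false_positive_rate()
print(f"share of 'significant' results under the null: {rate:.3f}")
```

Every rejection counted here is a Type I error, since the data were generated under the null hypothesis.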

Summary: Hypothesis Testing
The Steps:
1. Define your hypotheses (null, alternative)
2. Specify your null distribution
3. Do an experiment
4. Calculate the p-value of what you observed
5. Reject or fail to reject (~accept) the null hypothesis

Hypothesis Testing
- Null hypothesis - Statement regarding the value(s) of unknown parameter(s). Typically will imply no association between explanatory and response variables in our applications (will always contain an equality)
- Alternative hypothesis - Statement contradictory to the null hypothesis (will always contain an inequality)
- Test statistic - Quantity based on sample data and null hypothesis used to test between null and alternative hypotheses
- Rejection region - Values of the test statistic for which we reject the null in favor of the alternative hypothesis

Hypothesis Testing
Goal: Keep $\alpha$ (the probability of a Type I error) and $\beta$ (the probability of a Type II error) reasonably small

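Sampling Distribution of Difference in Means
In large samples, the difference in two sample means is approximately normally distributed (with $\bar{Y}_1$, $\bar{Y}_2$ the sample means and $n_1$, $n_2$ the sample sizes):
$$\bar{Y}_1 - \bar{Y}_2 \sim N\left(\mu_1 - \mu_2,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$$
Under the null hypothesis, $\mu_1 - \mu_2 = 0$ and:
$$Z = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$$
$\sigma_1^2$ and $\sigma_2^2$ are unknown and estimated by $s_1^2$ and $s_2^2$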

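Elements of a Hypothesis Test
Test statistic - Difference between the sample means, scaled to number of standard deviations (standard errors) from the null difference of 0 for the population means (with $\bar{Y}_i$, $s_i^2$, and $n_i$ the sample mean, sample variance, and sample size of group $i$):
$$z_{obs} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Rejection region - Set of values of the test statistic that are consistent with $H_A$, such that the probability it falls in this region when $H_0$ is true is $\alpha$ (we will always set $\alpha = 0.05$)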

P-value (aka Observed Significance Level)
P-value - Measure of the strength of evidence the sample data provide against the null hypothesis: P(evidence this strong or stronger against $H_0$ | $H_0$ is true)

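2-Sided Tests
$H_0: \mu_1 - \mu_2 = 0$    $H_A: \mu_1 - \mu_2 \neq 0$
Test statistic is the same as before
Decision rule:
- Conclude $\mu_1 - \mu_2 > 0$ if $z_{obs} \geq z_{\alpha/2}$ ($\alpha = 0.05 \Rightarrow z_{\alpha/2} = 1.96$)
- Conclude $\mu_1 - \mu_2 < 0$ if $z_{obs} \leq -z_{\alpha/2}$ ($\alpha = 0.05 \Rightarrow -z_{\alpha/2} = -1.96$)
- Do not reject $\mu_1 - \mu_2 = 0$ if $-z_{\alpha/2} < z_{obs} < z_{\alpha/2}$
P-value: $2P(Z \geq |z_{obs}|)$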
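The two-sided decision rule and its p-value can be sketched in Python as below. The function name and the summary statistics in the usage lines are hypothetical, chosen only to exercise the rule; the large-sample z-test with estimated variances follows the slides.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_z_test(mean1, mean2, s1_sq, s2_sq, n1, n2, alpha=0.05):
    """Large-sample two-sided z-test for a difference in means.

    Returns (z_obs, p_value, conclusion).
    """
    std_error = math.sqrt(s1_sq / n1 + s2_sq / n2)
    z_obs = (mean1 - mean2) / std_error
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # 1.96 when alpha = 0.05
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z_obs)))
    if z_obs >= z_crit:
        conclusion = "mu1 - mu2 > 0"
    elif z_obs <= -z_crit:
        conclusion = "mu1 - mu2 < 0"
    else:
        conclusion = "do not reject mu1 - mu2 = 0"
    return z_obs, p_value, conclusion


# Hypothetical summary statistics, not taken from the slides.
z_obs, p_value, conclusion = two_sided_z_test(13.0, 10.0, 25.0, 25.0, 25, 25)
print(f"z_obs = {z_obs:.3f}, p-value = {p_value:.4f}, conclusion: {conclusion}")
```

Rejecting when the p-value falls below $\alpha$ gives the same decision as comparing $z_{obs}$ with $\pm z_{\alpha/2}$, apart from the boundary case $|z_{obs}| = z_{\alpha/2}$.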

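Power of a Test
Power - Probability a test rejects $H_0$ (depends on $\mu_1 - \mu_2$)
- $H_0$ true: Power = P(Type I error) = $\alpha$
- $H_0$ false: Power = 1 - P(Type II error) = $1 - \beta$
Example: $H_0: \mu_1 - \mu_2 = 0$, $H_A: \mu_1 - \mu_2 > 0$, $\sigma_1^2 = \sigma_2^2 = 25$, $n_1 = n_2 = 25$
Decision rule: Reject $H_0$ (at $\alpha = 0.05$ significance level) if:
$$z_{obs} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\dfrac{25}{25} + \dfrac{25}{25}}} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{2}} \geq z_{0.05} = 1.645$$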
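As a rough sketch of how power depends on the true difference in this example, the snippet below treats the variances as known (25 each, with 25 observations per group) and uses the one-sided normal cut-off for $\alpha = 0.05$. The function name and the list of true differences are illustrative assumptions, not values from the slides.

```python
import math
from statistics import NormalDist


def power_one_sided(delta, var1=25.0, var2=25.0, n1=25, n2=25, alpha=0.05):
    """Power of the one-sided z-test of H0: mu1 - mu2 = 0 vs HA: mu1 - mu2 > 0
    when the true difference in means is delta."""
    std_normal = NormalDist()
    std_error = math.sqrt(var1 / n1 + var2 / n2)  # sqrt(2) in the example
    z_crit = std_normal.inv_cdf(1 - alpha)        # about 1.645
    # Under a true difference delta, z_obs is normal with mean delta / SE
    # and SD 1, so P(z_obs >= z_crit) = 1 - Phi(z_crit - delta / SE).
    return 1 - std_normal.cdf(z_crit - delta / std_error)


# Illustrative true differences.
for delta in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0):
    print(f"true difference {delta:.0f}: power = {power_one_sided(delta):.3f}")
```

At a true difference of 0 the power equals $\alpha$, matching the "$H_0$ true" case, and it rises toward 1 as the true difference grows.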