Review Statistical inference and test of significance.



Basic concepts Suppose we want to study a population whose parameter (average or percentage) is unknown. We use ____ to estimate the parameter. With a simple random sample, the sample average/percentage can be used to estimate the population average/percentage. But the sample estimate will be off by some amount, due to chance error. The standard error (SE) measures the likely size of this chance error. When the composition of the population is unknown, we have to use the bootstrap method to estimate the SD of the population. What is the bootstrap method? The SD of the population is estimated by the SD of the sample. This bootstrap estimate is good when the sample is large.
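A minimal sketch of the bootstrap idea for a sample average, assuming draws from a large population so that the SE of the average is the SD divided by the square root of the sample size. The data and the function name `se_of_average` are hypothetical, and the sample is kept tiny for readability even though the bootstrap estimate is only reliable for large samples.

```python
import math
import statistics


def se_of_average(sample):
    """Estimate the SE of the sample average.

    The unknown population SD is replaced by the SD of the sample
    (the bootstrap estimate described above).
    """
    sd = statistics.pstdev(sample)  # SD of the sample, dividing by n
    return sd / math.sqrt(len(sample))


# Hypothetical sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
average = statistics.mean(sample)
se = se_of_average(sample)
print(f"estimate: {average}, give or take {se:.2f} or so")
```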

Basic concepts What is a confidence interval for the population parameter (average or percentage)? A confidence interval for the population parameter is obtained by going the right number of SEs either way from the sample estimate. The confidence level is read off the normal curve. This method should only be used with large samples, because it relies on the central limit theorem (CLT). How do you interpret the confidence level in terms of the frequency theory of probability? It is not the probability that the parameter lies in the interval, because parameters are not subject to chance variation. Rather, it is the frequency with which, over many samples, the corresponding confidence intervals cover the true value of the parameter.
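The frequency interpretation can be checked with a small simulation. This is only a sketch under stated assumptions: a hypothetical 0-1 box with 30% 1's, 400 draws made with replacement (a stand-in for a simple random sample from a large population), and intervals of the form estimate plus or minus 2 SEs. The function name `coverage` and all its parameters are illustrative.

```python
import math
import random


def coverage(true_pct=30, n=400, repetitions=1000, seed=0):
    """Fraction of repeated samples whose interval covers the parameter."""
    rng = random.Random(seed)
    p = true_pct / 100
    covered = 0
    for _ in range(repetitions):
        # Draw n tickets from the 0-1 box and count the 1's.
        ones = sum(1 for _ in range(n) if rng.random() < p)
        estimate = ones / n
        # Bootstrap: use the sample fraction to estimate the SD of the box.
        sd = math.sqrt(estimate * (1 - estimate))
        se = sd / math.sqrt(n)
        if estimate - 2 * se <= p <= estimate + 2 * se:
            covered += 1
    return covered / repetitions


rate = coverage()
print(f"{rate:.1%} of the intervals cover the true percentage")
```

The parameter stays fixed at 30% throughout; only the intervals change from sample to sample, and roughly 95% of them should cover it.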

Basic concepts The formulas for simple random samples may not apply to other kinds of samples. For instance, with samples of convenience, standard errors usually do not make sense. Even if the sample is drawn by a probability method, the SE formula for simple random samples may not apply when that method is not simple random sampling.

Basic concepts What is a test of significance? What is the null hypothesis, and what is the alternative hypothesis? A test of significance gets at the question of whether an observed difference is real (the alternative hypothesis) or just chance variation (the null hypothesis). The null must be based on the chance process (assuming no other factors or bias), and the alternative is based on the question or argument we put forward. We can use a test of significance to cast doubt on a statement (the null) or to support a statement (the alternative).

Basic concepts What is a test statistic? A test statistic measures the difference between the data and what is expected based on the null hypothesis. This means the calculation is based on the null. What is a z-statistic? The z-statistic says how many SEs away an observed value is from its expected value, where the expected value is calculated using the null hypothesis.
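As a sketch, the z-statistic is (observed − expected) / SE, with both the expected value and the SE computed under the null. The coin-tossing numbers below are hypothetical and chosen only to illustrate the arithmetic.

```python
import math


def z_statistic(observed, expected, se):
    """How many SEs the observed value is from its expected value."""
    return (observed - expected) / se


# Hypothetical data: 60 heads in 100 tosses; the null says the coin is fair.
tosses = 100
observed_heads = 60
expected_heads = tosses * 0.5        # expected number of heads under the null
se_heads = math.sqrt(tosses) * 0.5   # SD of the null 0-1 box is 0.5

z = z_statistic(observed_heads, expected_heads, se_heads)
print(f"z = {z}")
```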

Basic concepts What is the observed significance level or P-value? How do you interpret it? The P-value is not the chance of the null being correct. It is the chance of getting a test statistic as extreme as or more extreme than the observed one. (The calculation is based on the null.) Small P-values are evidence against the null: Less than 5%: statistically significant or significant. Less than 1%: highly significant.
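A minimal sketch of turning a z-statistic into a P-value by reading the area off the normal curve, here through the standard library's `math.erfc`. The helper names `one_sided_p` and `describe` are hypothetical, and the labels follow the 5% and 1% conventions stated above.

```python
import math


def one_sided_p(z):
    """Area under the normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def describe(p):
    """Label a P-value using the 5% and 1% conventions."""
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "statistically significant"
    return "not statistically significant"


z = 2.0  # hypothetical z-statistic
p = one_sided_p(z)
print(f"one-sided P = {p:.4f} ({describe(p)})")
print(f"two-sided P = {2 * p:.4f}")
```

The one-sided P-value is the chance, computed under the null, of a z-statistic this large or larger; it is not the chance that the null is correct.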

Basic concepts Suppose we only have a small sample, say the sample size is 5. If the observed values (or the errors) follow the normal curve and the SD of the population is unknown, do we still use the z-test? No. We use the t-test instead. Suppose we have a randomized controlled experiment. We want to compare the data from the treatment group and the control group. In order to show that the treatment really has an effect, what kind of test shall we use? How do we set up the null and the alternative? We use the two-sample z-test. The null is based on chance variation, so it says the treatment has no effect. The alternative is based on what we want to show: the treatment has an effect.
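A sketch of the two-sample z-test for the treatment/control comparison, assuming two large, independent groups so that the SE of the difference is the square root of the sum of the squared SEs. The group sizes, averages, and SDs are hypothetical.

```python
import math


def two_sample_z(avg_1, sd_1, n_1, avg_2, sd_2, n_2):
    """z-statistic for the difference between two sample averages.

    Under the null, the expected difference is 0.
    """
    se_1 = sd_1 / math.sqrt(n_1)
    se_2 = sd_2 / math.sqrt(n_2)
    se_diff = math.sqrt(se_1 ** 2 + se_2 ** 2)
    return (avg_1 - avg_2 - 0) / se_diff


# Hypothetical summary data: treatment group first, control group second.
z = two_sample_z(avg_1=66, sd_1=21, n_1=200, avg_2=60, sd_2=20, n_2=200)
p = 0.5 * math.erfc(z / math.sqrt(2))  # one-sided: treatment average is higher
print(f"z = {z:.2f}, one-sided P = {p:.4f}")
```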

Basic concepts Suppose we want to test whether a coin is fair or not. What kind of test shall we use? The one-sample z-test (with a two-sided P-value) or the χ²-test. But what if we want to test whether a die is fair or not? (More than 2 categories.) We use the χ²-test. The χ²-statistic is never negative. (Compare this with the z-statistic, which can be positive or negative.) The χ²-test can also be used to test for independence.
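A sketch of the χ²-statistic for a die: the sum of (observed − expected)² / expected over the six faces, with 6 − 1 = 5 degrees of freedom. The counts are hypothetical. The statistic is compared with 11.07, the 5% critical value of the χ² curve with 5 degrees of freedom, rather than converted to an exact P-value.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for faces 1 to 6 in 60 rolls.
observed = [4, 6, 17, 16, 8, 9]
rolls = sum(observed)
expected = [rolls / 6] * 6           # a fair die: each face expected 10 times

chi2 = chi_square(observed, expected)
degrees_of_freedom = len(observed) - 1
critical_5_percent = 11.07           # chi-square curve, 5 degrees of freedom

print(f"chi-square = {chi2:.1f} on {degrees_of_freedom} degrees of freedom")
print("significant at 5%" if chi2 > critical_5_percent else "not significant at 5%")
```

Every term in the sum is a square divided by a positive number, which is why the statistic cannot be negative.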

Calculation and formula

Example 1 A survey organization takes a simple random sample of 625 households from a city of 80,000 households. On average, there are 2.30 persons per sample household, and the SD is [value missing]. Find a 95% confidence interval for the average household size in the city.

Solution
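A sketch of the calculation for Example 1. The SD does not appear in the problem statement, so the value 1.75 below is an assumption: it is the value consistent with the interval of 2.16 to 2.44 persons quoted in the Remark, taking a 95% interval to be the estimate plus or minus 2 SEs. The sample is a small fraction of the 80,000 households, so no correction for drawing without replacement is applied.

```python
import math

n = 625
sample_average = 2.30
sample_sd = 1.75  # ASSUMPTION: inferred from the 2.16 to 2.44 interval in the Remark

# Bootstrap: the sample SD stands in for the SD of the population.
se_average = sample_sd / math.sqrt(n)

low = sample_average - 2 * se_average
high = sample_average + 2 * se_average
print(f"SE of the average: {se_average:.2f} persons")
print(f"95% confidence interval: {low:.2f} to {high:.2f} persons")
```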

Remark A variant of this problem could be: Suppose 30% of the sample households have 3 or more persons. Find a 95% confidence interval for the percentage of households in the city with 3 or more persons. In this case, you are doing a 0-1 box problem. You may also look at the statement (true or false): 95% of the households in the city contain between 2.16 and 2.44 persons. This is false. It confuses the SD with the SE. The SE measures the likely size of the chance error in the sample average, while the SD measures the spread of household sizes in the data.
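A sketch of the 0-1 box variant, using the figures from the Remark (30% of 625 sample households) and again taking a 95% interval to be the estimate plus or minus 2 SEs.

```python
import math

n = 625
sample_pct = 30.0

# Bootstrap: the SD of the 0-1 box is estimated from the sample fraction.
fraction = sample_pct / 100
sd_box = math.sqrt(fraction * (1 - fraction))

# SE for the sample percentage.
se_pct = sd_box / math.sqrt(n) * 100

low = sample_pct - 2 * se_pct
high = sample_pct + 2 * se_pct
print(f"SE of the percentage: {se_pct:.2f}%")
print(f"95% confidence interval: {low:.1f}% to {high:.1f}%")
```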

Example 2 According to the census, the median household income in Atlanta (1.5 million households) was $52,000 in 1999. In June 2003, a market research organization takes a simple random sample of 750 households in Atlanta; 56% of the sample households had incomes over $52,000. Did median household income in Atlanta increase over the period 1999 to 2003? Formulate the null and alternative hypotheses, and use a test of significance to answer the question.

Solution This problem asks whether the median increased or not. But we don’t have enough information about the incomes overall. Even if we knew the observed median in 2003, we still wouldn’t know how to compute the corresponding SE. So instead of looking at the incomes (a quantitative variable), we look at a qualitative variable: whether a household had income over $52,000 or not. The idea is that, since the median (50th percentile) income in 1999 was $52,000, if the percentage of households with income over $52,000 was really greater than 50% in 2003 (not just due to chance), then the median must have increased.

Solution So a 0-1 box is needed (to classify the qualitative data): The box has one ticket for each household in Atlanta in 2003. If the income is over $52,000, the ticket is marked 1; otherwise, 0. The null says: the median did not increase, or equivalently, the percentage of households with incomes over $52,000 is 50%. (The percentage of 1’s in the box is 50%.) The alternative says this percentage is bigger than 50%. (The median did increase.) The sample is just like 750 draws from the box.

Solution
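A sketch of the one-sample z-test for Example 2, following the 0-1 box set up above: under the null, 50% of the tickets are 1's, so the SD of the box is 0.5. The P-value is one-sided because the alternative says the percentage is bigger than 50%.

```python
import math

n = 750
observed_pct = 56.0
null_pct = 50.0

# SD of the 0-1 box and SE of the sample percentage, both under the null.
sd_box = math.sqrt(0.5 * 0.5)
se_pct = sd_box / math.sqrt(n) * 100

z = (observed_pct - null_pct) / se_pct
p_value = 0.5 * math.erfc(z / math.sqrt(2))  # area to the right of z

print(f"SE = {se_pct:.2f}%, z = {z:.2f}, one-sided P = {p_value:.4f}")
```

A P-value this far below 1% is highly significant: chance variation is a poor explanation for the 56%, which supports the conclusion that the median increased.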

Good Luck!