Section 9-1: Inference for Slope and Correlation Section 9-3: Confidence and Prediction Intervals Visit the Maths Study Centre.

Presentation transcript:

Section 9-1: Inference for Slope and Correlation. Section 9-3: Confidence and Prediction Intervals. Visit the Maths Study Centre, 11am-5pm, CB. This presentation is viewable on the course website.

Statistical inference is the process of drawing conclusions about an entire population based on information in a sample, either by constructing confidence intervals for population parameters or by setting up a hypothesis test on a population parameter.

Regression. The linear regression line characterises the relationship between two quantitative variables. Regression analysis helps us draw insights from data: it quantifies the impact of one variable on the other, examining the relationship between one independent variable (predictor/explanatory) and one dependent variable (response/outcome). The linear regression line equation is based on the equation of a line in mathematics: Y = β0 + β1X.
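As a minimal sketch, a least-squares line can be fitted in Python with scipy; all the numbers below are made up for illustration:

```python
# Fit a least-squares regression line to hypothetical data.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]                  # predictor, e.g. advertising spend
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]    # response, e.g. sales

fit = stats.linregress(x, y)
print(f"Fitted line: Y = {fit.intercept:.2f} + {fit.slope:.2f} X")
```

The fitted intercept and slope estimate β0 and β1 from the sample.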

X (predictor variable, explanatory variable, independent variable): the variable one can control. Y (outcome variable, response variable, dependent variable): the outcome to be measured or predicted.

General: Hypothesis Testing. We use hypothesis testing to infer conclusions about population parameters based on the statistics of a sample. In statistics, a hypothesis is a statement about a population parameter.
1. The null hypothesis, denoted H0, is a statement or claim about a population parameter that is initially assumed to be true: no "effect" or no "difference". It is always an equality (e.g. H0: population parameter = hypothesised null parameter).
2. The alternative hypothesis, denoted H1, is the competing claim: what we are trying to prove, the claim we seek evidence for (e.g. H1: population parameter ≠, <, or > hypothesised null parameter).
3. Test statistic: a measure of compatibility between the statement in the null hypothesis and the data obtained.
4. Decision criteria: the p-value is the probability of obtaining a test statistic as extreme as or more extreme than the observed sample value, given that H0 is true. If p-value ≤ 0.05, reject H0; if p-value > 0.05, do not reject H0.
5. Conclusion: state your conclusion in the context of the problem.

Hypothesis Test for the Correlation Coefficient. Correlation measures the strength of the linear association between two variables. H0: ρ = 0 (the correlation is not significant). H1: ρ ≠ 0 (the correlation is significant). Sample correlation: r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²).
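A sketch of this test on hypothetical data; scipy's pearsonr returns both r and the two-sided p-value for H0: ρ = 0:

```python
# Test H0: rho = 0 against H1: rho != 0 using the sample correlation.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.3f}, p-value = {p_value:.6f}")
if p_value <= 0.05:
    print("Reject H0: the correlation is significant.")
```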

Interpretation of the Slope. The slope β represents the predicted change in the response variable Y given a one-unit increase in the explanatory variable X: as the independent variable increases by 1 unit, the predicted dependent variable increases/decreases by β units on average. Y = α + βX.

Hypothesis Test for the Slope.
H0: β = 0. There is no association between the response variable and the independent variable (the regression is insignificant): y = α + 0·X.
H1: β ≠ 0. The independent variable affects the response variable (the regression is significant): y = α + βX.
If p-value ≤ 0.05, we reject H0: there is evidence that β ≠ 0, which means the independent variable is an effective predictor of the dependent variable, at the 5% level of significance.
If p-value > 0.05, we do not reject H0: there is no evidence that β ≠ 0, which means the independent variable is NOT an effective predictor of the dependent variable, at the 5% level of significance.
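The slope test can be read straight off scipy's regression output, whose p-value is the two-sided test of β = 0 (data below are hypothetical):

```python
# Test H0: beta = 0 (no linear association) via the slope's p-value.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

fit = stats.linregress(x, y)
print(f"slope = {fit.slope:.3f}, p-value = {fit.pvalue:.6f}")
if fit.pvalue <= 0.05:
    print("Reject H0: X is an effective predictor of Y.")
```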

Confidence Interval for Slope
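The interval takes the usual form b1 ± t* × SE(b1) with n − 2 degrees of freedom; a sketch on hypothetical data, assuming scipy is available:

```python
# 95% confidence interval for the slope: b1 +/- t* x SE(b1), df = n - 2.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

fit = stats.linregress(x, y)
t_crit = stats.t.ppf(0.975, df=len(x) - 2)   # two-sided 95% critical value
lower = fit.slope - t_crit * fit.stderr
upper = fit.slope + t_crit * fit.stderr
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

If the interval excludes 0, the conclusion agrees with rejecting H0: β = 0 at the 5% level.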

Coefficient of Determination R². R-squared gives the proportion of the total variability in the response variable (Y) that is "explained" by the least-squares regression line based on the predictor variable (X). It is usually stated as a percentage. Interpretation: on average, R²% of the variation in the dependent variable can be explained by the independent variable through the regression model.
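For simple linear regression, R² is just the square of the sample correlation; a sketch on hypothetical data:

```python
# R-squared: proportion of variation in Y explained by the regression on X.
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

fit = stats.linregress(x, y)
r_squared = fit.rvalue ** 2
print(f"R^2 = {r_squared:.1%} of the variation in Y is explained by X")
```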

Confidence Intervals and Prediction Intervals. The key point is that a prediction interval tells you about the distribution of individual values, not the uncertainty in estimating the population mean. A prediction interval must account for both the uncertainty in knowing the value of the population mean and the scatter of the data, so a prediction interval is always wider than a confidence interval.
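The extra "+1" inside the prediction-interval standard error is what makes it wider; a sketch of both intervals at a point x0, using the standard textbook formulas on hypothetical data:

```python
# Compare the CI for the mean response with the PI for a new observation
# at a chosen x0. Data and x0 are hypothetical.
import math
from scipy import stats

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)
fit = stats.linregress(x, y)

x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
# Residual standard error s from the fitted line
sse = sum((yi - (fit.intercept + fit.slope * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

x0 = 4.0
t_crit = stats.t.ppf(0.975, df=n - 2)
se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)      # CI for mean response
se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)  # PI for a new value
print(f"CI half-width at x0: {t_crit * se_mean:.3f}")
print(f"PI half-width at x0: {t_crit * se_pred:.3f}")  # always the larger of the two
```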

REVISION

Statistical inference is the process of drawing conclusions about an entire population based on information in a sample, either by constructing confidence intervals for population parameters or by setting up a hypothesis test on a population parameter.

General: Hypothesis Testing. We use hypothesis testing to infer conclusions about population parameters based on the statistics of a sample. In statistics, a hypothesis is a statement about a population parameter.
1. The null hypothesis, denoted H0, is a statement or claim about a population parameter that is initially assumed to be true: no "effect" or no "difference". It is always an equality (e.g. H0: population parameter = hypothesised null parameter).
2. The alternative hypothesis, denoted H1, is the competing claim: what we are trying to prove, the claim we seek evidence for (e.g. H1: population parameter ≠, <, or > hypothesised null parameter).
3. Test statistic: a measure of compatibility between the statement in the null hypothesis and the data obtained.
4. Decision criteria: the p-value is the probability of obtaining a test statistic as extreme as or more extreme than the observed sample value, given that H0 is true. If p-value ≤ 0.05, reject H0; if p-value > 0.05, do not reject H0.
5. Conclusion: state your conclusion in the context of the problem.

Hypothesis Testing for a Single Mean.
H0: μ = null parameter. Ha: μ ≠ null parameter (or μ < null parameter, or μ > null parameter).
Test statistic: t = (x̄ − μ0) / (s/√n), with df = n − 1.
If p-value < 0.05, we reject H0 and conclude that we have enough evidence to support the alternative hypothesis at the 5% level of significance.
If p-value ≥ 0.05, we do not reject H0 and conclude that we do not have enough evidence to support the alternative hypothesis at the 5% level of significance.
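A sketch of the one-sample t-test in Python on made-up data, with μ0 = 10 chosen purely for illustration:

```python
# One-sample t-test of H0: mu = 10 against H1: mu != 10.
from scipy import stats

sample = [9.2, 10.5, 8.8, 11.1, 9.7, 10.2, 9.5, 10.8]
t_stat, p_value = stats.ttest_1samp(sample, popmean=10)
print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}")
if p_value < 0.05:
    print("Reject H0")
else:
    print("Do not reject H0")
```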

Hypothesis Testing for Difference in Means (2 independent samples)
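A sketch of this test on two hypothetical independent samples; Welch's version is used here, which does not assume equal variances:

```python
# Two-sample t-test for a difference in means (independent samples).
from scipy import stats

group_a = [12.1, 13.4, 11.8, 12.9, 13.7, 12.5]
group_b = [10.2, 11.1, 10.8, 9.9, 10.5, 11.3]
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the population means differ.")
```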

Hypothesis Testing for Paired Difference in Means
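A paired test works on the differences within subjects; a sketch on hypothetical before/after measurements:

```python
# Paired t-test: before/after measurements on the same subjects.
from scipy import stats

before = [72, 68, 75, 71, 69, 74, 70]
after = [70, 66, 72, 70, 66, 71, 68]
t_stat, p_value = stats.ttest_rel(before, after)
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the mean difference is nonzero.")
```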

Hypothesis Testing for 1 Proportion
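A sketch of the one-proportion z-test computed from the standard formula z = (p̂ − p0)/√(p0(1 − p0)/n); the counts and null value below are hypothetical:

```python
# One-proportion z-test of H0: p = 0.5 against H1: p != 0.5.
import math
from scipy import stats

successes, n, p0 = 62, 100, 0.5
p_hat = successes / n
se = math.sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se
p_value = 2 * stats.norm.sf(abs(z))      # two-sided p-value
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```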

Hypothesis Testing for Difference in 2 Proportions
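A sketch of the two-proportion z-test using the pooled proportion under H0: p1 = p2; the counts below are hypothetical:

```python
# Two-proportion z-test with a pooled estimate under H0: p1 = p2.
import math
from scipy import stats

x1, n1 = 45, 100   # successes and sample size, group 1
x2, n2 = 30, 100   # successes and sample size, group 2
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * stats.norm.sf(abs(z))      # two-sided p-value
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```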

Hypothesis Testing for a Single Categorical Variable (Goodness-of-Fit Test).
H0: the hypothesised proportions hold for each category, pi = …. Ha: at least one pi is different.
Test statistic: calculate the expected count for each cell as n·pi, and make sure they are all greater than 5 to proceed. Calculate the chi-squared statistic: χ² = Σ (observed − expected)² / expected.
Find the p-value as the area in the right tail (always select the right tail) of a chi-squared distribution with df = (number of categories − 1), and compare it to the significance level α = 0.05 (α is 5% by default unless stated otherwise).
If p-value < α, reject H0: conclude that we have evidence to support the alternative at the α level of significance.
If p-value ≥ α, do not reject H0: conclude that we do not have enough evidence to support the alternative at the α level of significance.
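The steps above can be sketched with scipy's chisquare on hypothetical counts and null proportions:

```python
# Chi-squared goodness-of-fit test against hypothesised proportions.
from scipy import stats

observed = [48, 35, 17]                   # counts in each category
hypothesised = [0.5, 0.3, 0.2]            # H0 proportions
n = sum(observed)
expected = [n * p for p in hypothesised]  # check: all expected counts exceed 5
chi2, p_value = stats.chisquare(observed, f_exp=expected)
print(f"chi-squared = {chi2:.3f}, df = {len(observed) - 1}, p-value = {p_value:.4f}")
```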

Hypothesis Testing for an Association Between Two Categorical Variables.
H0: the two variables are not associated. Ha: the two variables are associated.
Test statistic: calculate the expected count for each cell as (row total × column total)/n, and make sure they are all greater than 5 to proceed. Calculate the chi-squared statistic: χ² = Σ (observed − expected)² / expected.
Find the p-value as the area in the right tail (always select the right tail) of a chi-squared distribution with df = (r − 1)(c − 1), and compare it to the significance level α = 0.05 (α is 5% by default unless stated otherwise).
If p-value < α, reject H0: conclude that we have evidence to support the alternative hypothesis (in the context of the question) at the α level of significance.
If p-value ≥ α, do not reject H0: conclude that we do not have enough evidence to support the alternative hypothesis (in the context of the question) at the α level of significance.
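scipy's chi2_contingency computes the expected counts, df = (r − 1)(c − 1), and the p-value in one call; the contingency table below is hypothetical:

```python
# Chi-squared test of association between two categorical variables.
from scipy import stats

# Hypothetical 2x2 contingency table: rows = group, columns = preference
table = [[30, 20],
         [15, 35]]
chi2, p_value, df, expected = stats.chi2_contingency(table)
print(f"chi-squared = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject H0: the two variables are associated.")
```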

Hypothesis Testing for the Difference in Means of More Than Two Samples (ANOVA, Analysis of Variance).
Equal-variance condition: when the ratio of the largest sample standard deviation to the smallest is less than 2, the assumption of equal variances holds and it is appropriate to use the ANOVA table when testing the difference in the means.
H0: μ1 = μ2 = μ3 (the means are all equal to each other). Ha: at least one mean is different.
Construct an ANOVA table to calculate the F test statistic from your sample data: F = MSG/MSE.
Find the p-value as the area in the right tail (always select the right tail) of an F distribution with df1 = k − 1 and df2 = n − k, where k = number of groups and n = total number of observations, and compare it to the significance level α = 0.05 (α is 5% by default unless stated otherwise).
If p-value < α, reject H0: conclude that we have evidence to support the alternative hypothesis (in the context of the question) at the α level of significance.
If p-value ≥ α, do not reject H0: conclude that we do not have enough evidence to support the alternative hypothesis (in the context of the question) at the α level of significance.
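A sketch of one-way ANOVA on three hypothetical groups; f_oneway returns the F statistic (MSG/MSE) and its p-value:

```python
# One-way ANOVA: F-test for a difference among three group means.
from scipy import stats

group1 = [23, 25, 21, 24, 22]
group2 = [30, 28, 31, 29, 27]
group3 = [24, 26, 23, 25, 22]
f_stat, p_value = stats.f_oneway(group1, group2, group3)
print(f"F = {f_stat:.2f}, p-value = {p_value:.5f}")
if p_value < 0.05:
    print("Reject H0: at least one mean is different.")
```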