Statistical Techniques I EXST7005 Simple Linear Regression.

Measuring and describing a relationship between two variables
- Simple linear regression provides a measure of the rate of change of one variable relative to another variable.
- The variables will always be paired: one is termed the independent variable (often referred to as the X variable) and the other the dependent variable (the Y variable).
- The value of the variable Y changes as the value of the variable X changes.

Simple Linear Regression (continued)
For each value of X there is a population of values for the variable Y (normally distributed).
[Figure: distributions of Y at each value of X]

Simple Linear Regression (continued)
The linear model that describes this relationship is given as
- Yi = b0 + b1Xi
- This is the equation for a straight line, where:
- b0 is the value of the intercept (the value of Y when X = 0)
- b1 is the amount of change in Y for each unit change in X (i.e. if X changes by 1 unit, Y changes by b1 units). b1 is also called the slope or REGRESSION COEFFICIENT.

Simple Linear Regression (continued)
Population parameters
- μY.X = the true population mean of Y at each value of X
- β0 = the true value of the Y intercept
- β1 = the true value of the slope, the change in Y per unit of X
- μY.X = β0 + β1Xi; this is the population equation for a straight line.

Simple Linear Regression (continued)
The equations above describe a perfect line with no variation. In practice there is always variation about the line, so we include an additional term to represent this variation.
- Yi = β0 + β1Xi + εi for a population
- Yi = b0 + b1Xi + ei for a sample
- When we put this term in the model, we are describing individual points as their position on the line, plus or minus some deviation.
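As a numerical sketch of the sample model Yi = b0 + b1Xi + ei (the intercept, slope, and deviations below are invented values for illustration, not from the handout):

```python
# Sketch of the sample model Yi = b0 + b1*Xi + ei.
# The intercept, slope, and deviations are invented values for illustration.
b0, b1 = 2.0, 0.5                    # intercept and slope
x = [1, 2, 3, 4, 5]                  # values of the independent variable X
e = [0.3, -0.2, 0.1, -0.4, 0.2]     # deviations of each point from the line

# Each observation is its position on the line, plus or minus a deviation
y = [b0 + b1 * xi + ei for xi, ei in zip(x, e)]
print(y)
```

Dropping the e list reproduces the "perfect line" version of the model with no variation.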

Simple Linear Regression (continued)
- The SS of the deviations from the line will form the basis of a variance for the regression line.
- When we leave the ei off the sample model, we are describing a point on the regression line predicted from the sample. To indicate this we put a HAT on the Yi value: Ŷi.

Characteristics of a Regression Line
- The line will pass through the point (X̄, Ȳ), and also the point (0, b0).
- The sum of squared deviations (measured vertically) of the points from the regression line will be a minimum.
- Values on the line can be described by the equation Ŷi = b0 + b1Xi.
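These characteristics can be checked numerically. This sketch fits a line using the least-squares formulas derived later in the handout (b1 = SXY/SXX, b0 = Ȳ − b1X̄); the small data set is invented for illustration:

```python
# Tiny invented data set for illustration
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 6]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# Least-squares estimates (formulas derived later in the handout)
sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# The fitted line passes through (X-bar, Y-bar) ...
print(b0 + b1 * xbar, ybar)   # both are approximately 4.2
# ... and the vertical deviations from the line sum to zero
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(sum(residuals))
```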

Fitting the line
Fitting the line starts with a corrected SS of deviations: this is the SS of deviations of the observations from a horizontal line through the mean, Ȳ.
[Figure: deviations of the observations from the horizontal line at Ȳ]

Fitting the line (continued)
The fitted line is pivoted on the point (X̄, Ȳ) until it has a minimum SS of deviations.
[Figure: line pivoting about (X̄, Ȳ)]

Fitting the line (continued)
How do we know the SS of deviations is a minimum? Actually, we solve the equation for ei, and use calculus to determine the solution that has a minimum of Σei².
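A quick numerical check of the minimum (a sketch with invented data, using the least-squares formulas derived later in the handout): nudging b0 or b1 away from the least-squares solution can only increase Σei².

```python
x = [1, 2, 3, 4, 5]          # invented data for illustration
y = [2, 4, 5, 4, 6]
n = len(x)

def sse(b0, b1):
    """Sum of squared vertical deviations from the line Y = b0 + b1*X."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Least-squares estimates (formulas derived later in the handout)
b1 = (sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n) \
     / (sum(xi ** 2 for xi in x) - sum(x) ** 2 / n)
b0 = sum(y) / n - b1 * sum(x) / n

best = sse(b0, b1)
# The smallest SS of deviations among several perturbed lines
# is still larger than the least-squares minimum
worse = min(sse(b0 + d0, b1 + d1)
            for d0 in (-0.5, 0.5) for d1 in (-0.2, 0.0, 0.2))
print(best, worse)   # best is the smaller of the two
```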

Fitting the line (continued)
The line has some desirable properties:
- E(b0) = β0
- E(b1) = β1
- E(Ŷi) = μY.X
Therefore, the parameter estimates and predicted values are unbiased estimates.
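The unbiasedness of b1 can be illustrated by simulation (a sketch: the "true" values β0 = 2, β1 = 0.5 and the noise level are arbitrary choices, not from the handout). Averaging the slope estimate over many simulated samples comes out very close to the true slope:

```python
import random

random.seed(12345)
beta0, beta1, sigma = 2.0, 0.5, 1.0   # arbitrary "true" parameter values
x = list(range(20))
n = len(x)
sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

slopes = []
for _ in range(2000):
    # Draw Y at each X from a normal population centered on the true line
    y = [beta0 + beta1 * xi + random.gauss(0, sigma) for xi in x]
    sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    slopes.append(sxy / sxx)   # b1 = SXY / SXX for this sample

print(sum(slopes) / len(slopes))   # close to beta1 = 0.5
```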

The regression of Y on X
- Y = the "dependent" variable, the variable to be predicted.
- X = the "independent" variable, also called the regressor or predictor variable.
General assumptions:
- The Y variable is normally distributed at each value of X.
- The variance is homogeneous (across X).
- Observations are independent of each other, and ei is independent of the rest of the model.

The regression of Y on X (continued)
Special assumption for regression:
- Assume that all of the variation is attributable to the dependent variable (Y), and that the variable X is measured WITHOUT ERROR.
- Note that the deviations are measured vertically, not horizontally or perpendicular to the line.

Derivation of the formulas
Any observation can be written as
- Yi = b0 + b1Xi + ei for a sample
- where ei = the deviation of the observed point from the regression line.
Note: the idea of regression is to minimize the deviations of the observations from the regression line; this is called a Least Squares Fit.

Derivation of the formulas (continued)  ei = 0 n the sum of the squared deviations   ei2 =  (Yi - Yhat)2   ei2 =  (Yi - b0 + b1Xi )2 The objective is to select b0 and b1 such that  ei2 is a minimum, this is done with calculus n You do not need to know this derivation!

A note on calculations
We have previously defined the uncorrected sum of squares and corrected sum of squares of a variable Yi:
- The uncorrected SS is ΣYi².
- The correction factor is (ΣYi)²/n.
- The corrected SS is ΣYi² − (ΣYi)²/n.
- Your book calls this SYY; the correction factor is CYY.
- We could define the exact same series of calculations for Xi and call it SXX.

A note on calculations (continued)
We will also need a crossproduct for regression, and a corrected crossproduct.
- The crossproduct is XiYi.
- The sum of crossproducts is ΣXiYi, which is uncorrected.
- The correction factor is (ΣXi)(ΣYi)/n = CXY.
- The corrected crossproduct is ΣXiYi − (ΣXi)(ΣYi)/n,
- which your book calls SXY.
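These quantities are straightforward to compute directly; the sketch below uses a small invented data set:

```python
x = [1, 2, 3, 4, 5]   # invented data for illustration
y = [2, 4, 5, 4, 6]
n = len(x)

# Uncorrected sums of squares and sum of crossproducts
uss_y = sum(yi ** 2 for yi in y)              # sum of Yi^2
uss_x = sum(xi ** 2 for xi in x)              # sum of Xi^2
ucp = sum(xi * yi for xi, yi in zip(x, y))    # sum of XiYi

# Correction factors
cyy = sum(y) ** 2 / n          # (sum Yi)^2 / n
cxx = sum(x) ** 2 / n          # (sum Xi)^2 / n
cxy = sum(x) * sum(y) / n      # (sum Xi)(sum Yi) / n

# Corrected sums of squares and corrected crossproduct
syy = uss_y - cyy
sxx = uss_x - cxx
sxy = ucp - cxy
print(syy, sxx, sxy)
```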

Derivation of the formulas (continued)
The partial derivative of Σei² is taken with respect to each of the parameters. For b0:
∂(Σei²)/∂b0 = 2Σ(Yi − b0 − b1Xi)(−1)

Set the partial derivative to 0 and solve for b0:
- 2Σ(Yi − b0 − b1Xi)(−1) = 0
- −ΣYi + nb0 + b1ΣXi = 0
- nb0 = ΣYi − b1ΣXi
- b0 = Ȳ − b1X̄
So b0 is estimated using b1 and the means of X and Y.

Derivation of the formulas (continued)
Likewise for b1 we obtain the partial derivative:
∂(Σei²)/∂b1 = 2Σ(Yi − b0 − b1Xi)(−Xi)

Derivation of the formulas (continued)
Set the partial derivative to 0 and solve for b1:
- 2Σ(Yi − b0 − b1Xi)(−Xi) = 0
- −Σ(YiXi − b0Xi − b1Xi²) = 0
- −ΣYiXi + b0ΣXi + b1ΣXi² = 0
- and since b0 = Ȳ − b1X̄, then
- ΣYiXi = (ΣYi/n − b1ΣXi/n)ΣXi + b1ΣXi²
- ΣYiXi = ΣXiΣYi/n − b1(ΣXi)²/n + b1ΣXi²
- ΣYiXi − ΣXiΣYi/n = b1[ΣXi² − (ΣXi)²/n]
- b1 = [ΣYiXi − ΣXiΣYi/n] / [ΣXi² − (ΣXi)²/n]

Derivation of the formulas (continued)
- b1 = [ΣYiXi − ΣXiΣYi/n] / [ΣXi² − (ΣXi)²/n]
- b1 = SXY / SXX
- So b1 is the corrected crossproduct over the corrected SS of X.
The intermediate statistics needed to solve all elements of an SLR are ΣXi, ΣYi, n, ΣXi², ΣYiXi, and ΣYi² (this last term we haven't seen in the calculations above, but we will need it later).

Derivation of the formulas (continued)
Review
- We want to fit the best possible line; we define this as the line that minimizes the vertically measured distances from the observed values to the fitted line.
- The line that achieves this is defined by the equations
- b0 = Ȳ − b1X̄
- b1 = [ΣYiXi − ΣXiΣYi/n] / [ΣXi² − (ΣXi)²/n]
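The review formulas can be sketched directly in code, starting from the intermediate statistics ΣXi, ΣYi, n, ΣXi², ΣYiXi (the data are invented for illustration):

```python
x = [1, 2, 3, 4, 5]   # invented data for illustration
y = [2, 4, 5, 4, 6]
n = len(x)

# Intermediate statistics: sum Xi, sum Yi, sum Xi^2, sum XiYi
sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(xi ** 2 for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# b1 = [sum YiXi - sum Xi sum Yi / n] / [sum Xi^2 - (sum Xi)^2 / n]
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
# b0 = Ybar - b1 * Xbar
b0 = sum_y / n - b1 * sum_x / n

print(b0, b1)   # approximately 1.8 and 0.8
```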

Derivation of the formulas (continued)
These calculations provide us with two parameter estimates that we can then use to get the equation for the fitted line.

Numerical example
See the Regression handout.

About Crossproducts
Crossproducts are used in a number of related calculations.
- A crossproduct = YiXi
- Sum of crossproducts = ΣYiXi; the corrected sum of crossproducts is SXY
- Covariance = SXY / (n − 1)
- Slope: b1 = SXY / SXX
- SSRegression = SXY² / SXX
- Correlation: r = SXY / √(SXX·SYY)
- R² = r² = SXY² / (SXX·SYY) = SSRegression / SSTotal
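All of these related calculations can be sketched with the same style of invented data set used above (SSTotal here is SYY, the corrected SS of Y):

```python
import math

x = [1, 2, 3, 4, 5]   # invented data for illustration
y = [2, 4, 5, 4, 6]
n = len(x)

# Corrected sums of squares and corrected crossproduct
sxx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
syy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
sxy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

cov = sxy / (n - 1)                 # covariance of X and Y
b1 = sxy / sxx                      # slope
ss_reg = sxy ** 2 / sxx             # SSRegression
r = sxy / math.sqrt(sxx * syy)      # correlation
r2 = sxy ** 2 / (sxx * syy)         # R^2 = SSRegression / SSTotal

print(cov, b1, ss_reg, r, r2)
```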