Quantitative Methods 2 Lecture 3 The Simple Linear Regression Model Edmund Malesky, Ph.D., UCSD
What is Ordinary Least Squares?
• Ordinary Least Squares (OLS) finds the linear model that minimizes the sum of the squared errors.
• Such a model provides the best explanation/prediction of the data.
• Later we'll show that OLS is the "Best Linear Unbiased Estimator" (BLUE).
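As a concrete illustration of the "least" in least squares, here is a minimal Python sketch (with invented data, since the lecture's worked example comes later): the OLS line attains a strictly smaller SSR than nearby candidate lines.

```python
# Sketch: the OLS line has a smaller sum of squared errors (SSR)
# than nearby candidate lines. The data are invented for illustration.

def ssr(x, y, b0, b1):
    """Sum of squared errors for the line yhat = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly y = 2x

# Closed-form OLS estimates (derived later in the lecture)
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
b0_hat = ybar - b1_hat * xbar

# Perturbing either parameter away from the OLS estimates raises the SSR
best = ssr(x, y, b0_hat, b1_hat)
assert best < ssr(x, y, b0_hat + 0.1, b1_hat)
assert best < ssr(x, y, b0_hat, b1_hat + 0.1)
```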
"Explained" and "Unexplained" Variation
[Figure: scatterplot with a fitted regression line, marking an observation (xᵢ, yᵢ) and the mean of y, to show the explained and unexplained portions of its deviation.]
[Figure: the same scatterplot, decomposing the deviation of yᵢ from ȳ at xᵢ.]
• The total deviation (yᵢ − ȳ), squared and summed across all observations, gives our SST (Total Sum of Squares): SST = Σ(yᵢ − ȳ)²
• The explained deviation (ŷᵢ − ȳ), squared and summed across all observations, gives our SSE (Explained Sum of Squares): SSE = Σ(ŷᵢ − ȳ)²
• The residual (yᵢ − ŷᵢ), squared and summed across all observations, gives our SSR (Residual Sum of Squares): SSR = Σ(yᵢ − ŷᵢ)²
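The decomposition above can be checked numerically. A minimal Python sketch with invented data (fitting the line with the closed-form estimators derived later in the lecture), confirming that SST = SSE + SSR for an OLS fit with an intercept:

```python
# Sketch: for an OLS fit with an intercept, SST = SSE + SSR.
# The data are invented for illustration.
x = [1, 2, 3, 4, 5]
y = [3.0, 5.0, 4.0, 8.0, 10.0]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total variation
sse = sum((yh - ybar) ** 2 for yh in yhat)            # explained variation
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual variation

assert abs(sst - (sse + ssr)) < 1e-9  # the decomposition holds
```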
Some Useful Properties of Summation
Proofs of 7 and 8 in Appendix A of Wooldridge
Minimizing the Sum of Squared Errors
• How do we put the "Least" in OLS?
• In mathematical jargon, we seek to minimize the residual sum of squares (SSR), where:
SSR = Σ(yᵢ − ŷᵢ)²
Picking the Parameters
• To minimize the SSR, we need parameter estimates.
• In calculus, if you wish to know when a function is at its minimum, you take the first derivative.
• In this case we must take partial derivatives, since we have two parameters (β₀ and β₁) to worry about.
Parameters that Minimize SSR
Minimize the Squared Errors
• The SSR function is:
SSR = Σ(yᵢ − ŷᵢ)²
• Substitute in our equation for ŷᵢ, namely ŷᵢ = β₀ + β₁xᵢ:
SSR = Σ(yᵢ − β₀ − β₁xᵢ)²
Here comes the magic, baby!
• Simplify terms
• Take the partial derivative with respect to β₀
• Take the partial derivative with respect to β₁
Simplify Terms
• Separate the terms, then expand the square using First, Outside, Inside, Last (F.O.I.L.):
(A − B)² = A² − BA − AB + B² = A² − 2AB + B²
• Multiply the −2yᵢ through, which gives:
SSR = Σ(yᵢ − β₀ − β₁xᵢ)² = Σ(yᵢ² − 2yᵢβ₀ − 2yᵢβ₁xᵢ + β₀² + 2β₀β₁xᵢ + β₁²xᵢ²)
Partial Derivative with respect to β₀
• Take the derivative only of the terms that include β₀:
∂SSR/∂β₀ = Σ(−2yᵢ + 2β₀ + 2β₁xᵢ)
• Simplify:
∂SSR/∂β₀ = −2Σ(yᵢ − β₀ − β₁xᵢ)
Partial Derivative with respect to β₁
• Take the derivative only of the terms that include β₁:
∂SSR/∂β₁ = Σ(−2yᵢxᵢ + 2β₀xᵢ + 2β₁xᵢ²)
• Simplify:
∂SSR/∂β₁ = −2Σxᵢ(yᵢ − β₀ − β₁xᵢ)
Partial Derivatives for β₀ and β₁
• The first equation is the partial derivative with respect to β₀
• The second equation is the partial derivative with respect to β₁
Simplify and Set Equal to Zero
• The first equation is for β₀, the second is for β₁:
−2Σ(yᵢ − β̂₀ − β̂₁xᵢ) = 0
−2Σxᵢ(yᵢ − β̂₀ − β̂₁xᵢ) = 0
• Set each equal to zero to find the minimum point
• (Hats denote that the parameters are estimates)
The Normal Equations
• Divide equation 1 by −2 and equation 2 by 2
• Multiply the −xᵢ through in β₁'s equation
• Separate the summation terms and rearrange to yield:
Σyᵢ = nβ̂₀ + β̂₁Σxᵢ
Σxᵢyᵢ = β̂₀Σxᵢ + β̂₁Σxᵢ²
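The normal equations can be verified symbolically for a small concrete dataset. A sketch using SymPy (assumed installed; the data are invented), which takes the two partial derivatives of the SSR and solves them set to zero:

```python
# Sketch: derive and solve the normal equations for a small concrete
# dataset by setting the partial derivatives of the SSR to zero.
# Assumes SymPy is installed; the data are invented for illustration.
from sympy import symbols, diff, solve

b0, b1 = symbols("b0 b1")
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]  # generated exactly from y = 1 + 2x

# SSR as a symbolic function of the two parameters
SSR = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

# Setting each partial derivative to zero gives the two normal equations
normal_eqs = [diff(SSR, b0), diff(SSR, b1)]
sol = solve(normal_eqs, [b0, b1])
print(sol[b0], sol[b1])  # 1 2
```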
Solving the Normal Equations
• Now we have two equations with two unknown terms: β₀ and β₁
• These can be solved using algebra to calculate the values of both β₀ and β₁
Solving for β₁
• Multiply the first normal equation by Σxᵢ:
ΣxᵢΣyᵢ = nβ̂₀Σxᵢ + β̂₁(Σxᵢ)²
• Multiply the second normal equation by n:
nΣxᵢyᵢ = nβ̂₀Σxᵢ + nβ̂₁Σxᵢ²
Still Solving for β₁…
• Now subtract the first equation from the second
• This yields:
nΣxᵢyᵢ − ΣxᵢΣyᵢ = nβ̂₁Σxᵢ² − β̂₁(Σxᵢ)²
Still Solving for β₁…
• The β̂₀ terms cancel one another out
• Then we factor out β̂₁ from both terms on the right-hand side:
nΣxᵢyᵢ − ΣxᵢΣyᵢ = β̂₁[nΣxᵢ² − (Σxᵢ)²]
• Then divide through by the bracketed quantity to yield:
β̂₁ = (nΣxᵢyᵢ − ΣxᵢΣyᵢ) / (nΣxᵢ² − (Σxᵢ)²)
A Solution for β₁
• Now multiply the numerator and denominator by 1/n² (i.e., by (1/n)·(1/n))
• Recall that: x̄ = (1/n)Σxᵢ and ȳ = (1/n)Σyᵢ
• This yields:
β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)²
Tricky: to distribute a single factor of 1/n across a product of two sums, multiply both sides by 1/n·1 = (1/n)·(n/n) = n/n². Why? I need to multiply two separate factors, and I can't simply split the 1/n. Imagine I wanted to compute ½(5·10) = 25. I can't solve it by multiplying ½(5)·½(10), which equals 12.5; that is like multiplying by ¼. I need to multiply by ½·1 = ½·(2/2) = 2/4. Now 2·(½·5)·(½·10) = 25.
Now Solving for β₀
• Take the first normal equation:
Σyᵢ = nβ̂₀ + β̂₁Σxᵢ
• Then divide both sides by n and rearrange to yield:
β̂₀ = (1/n)Σyᵢ − β̂₁(1/n)Σxᵢ
A Solution for β₀
• Now recall again that:
x̄ = (1/n)Σxᵢ and ȳ = (1/n)Σyᵢ
• Thus:
β̂₀ = ȳ − β̂₁x̄
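The two closed-form solutions can be written directly as code. A minimal Python sketch with an invented dataset generated exactly from y = 3 + 2x, so the estimates should recover those values:

```python
# Sketch of the closed-form OLS estimators:
#   b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
#   b0 = ybar - b1 * xbar
# The dataset is invented for illustration.
def ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar
    return b0, b1

b0, b1 = ols([1, 2, 3, 4, 5], [5, 7, 9, 11, 13])  # data lie exactly on y = 3 + 2x
print(b0, b1)  # 3.0 2.0
```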
But What Does It Mean?
• The equation for β₁ may not seem to make intuitive sense at first
• But if we break it down into pieces, we can begin to see the logic
Understanding What Makes β₁
• The numerator of β₁ is made up of TWO parts:
– Deviations of x from its mean
– Deviations of y from its mean
– Then we multiply those deviations together
– And sum them up across all observations
• We know this as… the covariance of x and y (up to a factor of n).
Understanding What Makes β₁
• The denominator of β₁ is made up of the deviation of x from its mean, times itself
• We square this term
• And sum up across all observations
• We know this as… the variance of the independent variable (up to a factor of n).
Understanding What Makes β₁
• Thus β₁ is made up of changes in x times changes in y, divided by changes in x squared
– A.K.A. "rise over run"
• Notice that if the changes in x are EQUAL to the changes in y, then β₁ = 1
Understanding What Makes β₁
• If the changes in y are LARGER than the changes in x, then β₁ > 1
– i.e., a 1-unit change in x creates more than a 1-unit change in y
• If the changes in y are SMALLER than the changes in x, then β₁ < 1
– i.e., a 1-unit change in x creates less than a 1-unit change in y
Understanding What Makes β₁
• This corresponds to our intuitive understanding of the slope of a line
– How much change in y do we observe for each change in x?
• We can also see that β₁ is calculated in units of the dependent variable
– It is changes in the dependent variable over changes in the independent variable
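The point about units can be demonstrated numerically: rescaling the dependent variable rescales the slope by the same factor. A Python sketch with invented data (e.g., y measured in dollars versus thousands of dollars):

```python
# Sketch: the slope is measured in units of y per unit of x, so rescaling
# y (dollars -> thousands of dollars) rescales the slope by the same
# factor. The data are invented for illustration.
def slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)

x = [1, 2, 3, 4, 5]
y = [1000, 3000, 5000, 7000, 9000]     # y in dollars
y_thousands = [yi / 1000 for yi in y]  # same y, in thousands of dollars

print(slope(x, y))            # 2000.0 (dollars per unit of x)
print(slope(x, y_thousands))  # 2.0    (thousands of dollars per unit of x)
```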
Let’s Do An Example!
Calculating β₀ & β₁
• The mean of x is 4
• The mean of y is 14
Calculating β₀ & β₁
Σ(xᵢ − x̄)(yᵢ − ȳ) = 186
Σ(xᵢ − x̄)² = 62
β̂₁ = 186 / 62 = 3
Calculating β₀ and β₁
• β̂₀ = ȳ − β̂₁x̄
• Recall that: ȳ = 14 and x̄ = 4
• β̂₀ = 14 − 3(4)
• β̂₀ = 2
• Our equation is: ŷ = 2 + 3x
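The worked example can be checked in code. The slide's data table is not reproduced in this transcript, so the x values below are invented (chosen so that x̄ = 4 and ȳ = 14, matching the slide's means); y is generated exactly from the fitted line y = 2 + 3x, and OLS recovers β₀ = 2 and β₁ = 3:

```python
# Sketch: verifying a fit of the form y = 2 + 3x. The x values are
# invented (the slide's data table is not in the transcript); y is
# generated exactly from the line, so OLS must recover b0 = 2, b1 = 3.
x = [1, 2, 3, 4, 5, 6, 7]          # xbar = 4, as in the slide
y = [2 + 3 * xi for xi in x]       # ybar = 14, as in the slide

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
print(b0, b1)  # 2.0 3.0
```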
Which Looks Like…This!
Calculating R²
• Let's return to the sums of squares
• Plug in β̂₀ and β̂₁ and solve to get the SST, SSE, and SSR
• Here SSE = Σ(ŷᵢ − ȳ)² = β̂₁²Σ(xᵢ − x̄)² = 9(62) = 558
Calculating R²
• R² = SSE/SST = 1
• Our model perfectly explains the variation in y.
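The R² calculation can be sketched in Python. With data generated exactly from a line (invented here, since the slide's table is not in the transcript), the residuals are zero and R² = SSE/SST = 1:

```python
# Sketch: R^2 = SSE/SST. The data are invented and lie exactly on
# y = 2 + 3x, so the fit is perfect and R^2 = 1.
x = [1, 2, 3, 4, 5, 6, 7]
y = [2 + 3 * xi for xi in x]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)     # total sum of squares
sse = sum((yh - ybar) ** 2 for yh in yhat)  # explained sum of squares
r_squared = sse / sst
print(r_squared)  # 1.0
```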