Ordinary Least Squares (OLS) Regression
What is it? Regression is closely allied with correlation: both are interested in the strength of the linear relationship between two variables. In regression, one variable is specified as the dependent variable; the other is the independent (or explanatory) variable.
Regression Model
Y = a + bX + e
What is Y? What is a? What is b? What is X? What is e?
Elements of the Regression Line
a = Y-intercept (what Y is predicted to equal when X = 0)
b = slope (the change in Y associated with a one-unit increase in X)
e = error (the difference between the predicted Y, written Ŷ and read "Y hat", and the observed Y)
Regression
Can quantify precisely the relative importance of a variable
Can quantify how much variance is explained by a variable (or set of variables)
Is used more often than any other statistical technique
The Regression Line
Y = a + bX + e
Y = sentence length; X = prior convictions
Each point represents the number of priors (X) and the sentence length (Y) of a particular defendant. The regression line is the best-fit line through the overall scatter of points.
X and Y are observed. We need to estimate a and b.
Calculus 101 Least Squares Method and differential calculus
Differentiation is a very powerful tool that is used extensively in model estimation. Practical examples of differentiation are usually in the form of minimization/optimization problems or rate of change problems.
How do you draw a line when the line could be drawn in almost any direction?
The Method of Least Squares: draw the line that minimizes the sum of the squared distances from the line (Σe²). This is a minimization problem, so we can use differential calculus to estimate the line.
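As a concrete sketch of this minimization, the snippet below finds the line minimizing Σe² by plain gradient descent, using the five (x, y) points from the worked example on the following slides. (Gradient descent is an illustration of the minimization idea, not the method the slides use; they solve the calculus problem exactly.)

```python
# Minimize sum((y - (a + b*x))**2) by gradient descent, using the
# worked example's points: (0,1), (1,3), (2,2), (3,4), (4,5).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 4, 5]

a, b = 0.0, 0.0     # starting guesses for intercept and slope
lr = 0.01           # step size, small enough for this problem
for _ in range(5000):
    # partial derivatives of the summed squared error w.r.t. a and b
    grad_a = sum(-2 * (y - (a + b * x)) for x, y in zip(xs, ys))
    grad_b = sum(-2 * x * (y - (a + b * x)) for x, y in zip(xs, ys))
    a -= lr * grad_a
    b -= lr * grad_b

print(round(a, 2), round(b, 2))   # converges to a = 1.2, b = 0.9
```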
Least Squares Method

x   y   deviation = y - (a + bx)    d² (expanded)
0   1   1 - a                       1 - 2a + a²
1   3   3 - a - b                   9 - 6a + a² - 6b + 2ab + b²
2   2   2 - a - 2b                  4 - 4a + a² - 8b + 4ab + 4b²
3   4   4 - a - 3b                  16 - 8a + a² - 24b + 6ab + 9b²
4   5   5 - a - 4b                  25 - 10a + a² - 40b + 8ab + 16b²
Summing the squares of the deviations yields:
f(a, b) = 55 - 30a + 5a² - 78b + 20ab + 30b²
Calculate the first-order partial derivatives of f(a, b):
fa = -30 + 10a + 20b and fb = -78 + 20a + 60b
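As a quick sanity check, the analytic partials can be compared against central finite differences at an arbitrary point (a sketch; the test point is chosen arbitrarily):

```python
# Check the partials of f(a, b) = 55 - 30a + 5a^2 - 78b + 20ab + 30b^2
# against central finite differences.
def f(a, b):
    return 55 - 30*a + 5*a**2 - 78*b + 20*a*b + 30*b**2

def fa(a, b):
    return -30 + 10*a + 20*b   # analytic partial with respect to a

def fb(a, b):
    return -78 + 20*a + 60*b   # analytic partial with respect to b

h = 1e-6
a0, b0 = 1.0, 2.0              # arbitrary test point
num_fa = (f(a0 + h, b0) - f(a0 - h, b0)) / (2 * h)
num_fb = (f(a0, b0 + h) - f(a0, b0 - h)) / (2 * h)
print(abs(num_fa - fa(a0, b0)) < 1e-4)   # True
print(abs(num_fb - fb(a0, b0)) < 1e-4)   # True
```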
Set each partial derivative to zero:
Manipulate fa: 0 = -30 + 10a + 20b, so 10a = 30 - 20b and a = 3 - 2b
Substitute (3-2b) into fb:
0 = fb = -78 + 20a + 60b = -78 + 20(3 - 2b) + 60b = -18 + 20b
20b = 18
b = 0.9
Slope = 0.9
Substituting this value of b back into fa to obtain a:
a = 3 - 2(0.9) = 1.2
Y-intercept = 1.2
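Setting both partials to zero gives two linear equations in a and b (10a + 20b = 30 and 20a + 60b = 78), and the substitution above solves them by hand. A minimal sketch that solves the same 2×2 system by Cramer's rule:

```python
# Solve the equations obtained by setting fa and fb to zero:
#   10a + 20b = 30
#   20a + 60b = 78
# using Cramer's rule for a 2x2 linear system.
a11, a12, c1 = 10, 20, 30
a21, a22, c2 = 20, 60, 78

det = a11 * a22 - a12 * a21          # 10*60 - 20*20 = 200
a = (c1 * a22 - a12 * c2) / det      # (30*60 - 20*78) / 200 = 1.2
b = (a11 * c2 - c1 * a21) / det      # (10*78 - 30*20) / 200 = 0.9
print(a, b)   # 1.2 0.9
```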
Estimating the model (the easy way)
Calculating the slope (b): b = SP / SSx = Σ(X - X̄)(Y - Ȳ) / Σ(X - X̄)²
Sum of squares for X: SSx = Σ(X - X̄)²
Sum of squares for Y: SSy = Σ(Y - Ȳ)²
Sum of products: SP = Σ(X - X̄)(Y - Ȳ)
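These quantities are easy to compute directly; a sketch, assuming the five data points from the least-squares slide:

```python
# Compute SSx, SSy, and SP for the five example points, then the
# slope b = SP / SSx.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 4, 5]
n = len(xs)
x_bar = sum(xs) / n          # 2.0
y_bar = sum(ys) / n          # 3.0

SSx = sum((x - x_bar) ** 2 for x in xs)                        # 10.0
SSy = sum((y - y_bar) ** 2 for y in ys)                        # 10.0
SP = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))    # 9.0
b = SP / SSx
print(SSx, SSy, SP, b)   # 10.0 10.0 9.0 0.9
```

The result b = 0.9 matches the value found by calculus; the closed-form formula is just the solved-out version of the same minimization.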
We’ve seen these values before
Regression is strongly related to Correlation
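One way to see the connection: the slope is the correlation coefficient rescaled by the ratio of standard deviations, b = r·(sy/sx). A sketch with the example data (where SSx = SSy, so the ratio is 1 and b = r):

```python
import math

# Relationship between the regression slope and the Pearson
# correlation coefficient: b = r * (s_y / s_x).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 4, 5]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

SSx = sum((x - x_bar) ** 2 for x in xs)
SSy = sum((y - y_bar) ** 2 for y in ys)
SP = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

r = SP / math.sqrt(SSx * SSy)     # Pearson correlation = 0.9
b = r * math.sqrt(SSy / SSx)      # equals SP / SSx; here sy = sx, so b = r
print(round(r, 3), round(b, 3))   # 0.9 0.9
```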
Calculating the Y-intercept (a): a = Ȳ - bX̄
Calculating the error term (e): e = Y - Ŷ, where Ŷ ("Y hat") is the predicted value of Y. e is different for every observation; it measures how far off our prediction is for that observation.
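Putting the pieces together for the example data (a sketch; the points are the ones from the least-squares slide): fit a and b with the closed-form formulas, then compute each predicted value Ŷ and residual e = Y − Ŷ:

```python
# Fit the example data with the closed-form formulas, then compute
# the predicted values (Y hat) and the residuals e = y - y_hat.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 2, 4, 5]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))        # slope = 0.9
a = y_bar - b * x_bar                            # intercept = 1.2

y_hat = [a + b * x for x in xs]                  # predicted values
e = [y - yh for y, yh in zip(ys, y_hat)]         # one residual per point
print(round(sum(e), 10))                         # residuals sum to 0
print(round(sum(err ** 2 for err in e), 10))     # minimized SSE = 1.9
```

Note that the least-squares residuals always sum to zero when an intercept is included; the minimized Σe² (1.9 here) is the smallest value any straight line can achieve on these points.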