Geometric Representation of Regression
‘Multipurpose’ Dataset from class website
Attitude towards job (ATTRATE): higher scores indicate a more unfavorable attitude toward the company
Number of years worked (YEARS)
Days absent (DAYSABS)
12 cases

EMP  DAYSABS  ATTRATE  YEARS
a          1        1      1
b          0        2      1
c          1        2      2
d          4        3      2
e          3        5      4
f          2        5      6
g          5        6      5
h          6        7      4
i          9       10      8
j         13       11      7
k         15       11      9
l         16       12     10
Typical representation with response surface
Correlations .89 and up*
R² model = .903

Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   -2.2630      1.0959   -2.065    0.0689 .
ATTRATE        1.5497      0.4805    3.225    0.0104 *
YEARS         -0.2385      0.6064   -0.393    0.7032

           DAYSABS    ATTRATE      YEARS
DAYSABS  1.0000000  0.9497803  0.8902164
ATTRATE  0.9497803  1.0000000  0.9505853
YEARS    0.8902164  0.9505853  1.0000000
Typical representation with response surface
The point where the response surface crosses the y axis (DAYSABS) gives the intercept in our formula.
Holding a variable ‘constant’ is like adding a plane perpendicular to that variable’s axis.
The process as a whole minimizes the sum of the squared distances, measured along the DAYSABS axis, between the original data points and the corresponding points on the plane.
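The least-squares idea on this slide can be checked directly against the dataset. The sketch below is a minimal illustration, not the code used to produce the slides: it solves the two-predictor normal equations by hand, and the helper names (fit_plane, sse) are made up for this example. The coefficients it produces should agree with the Estimate column above to rounding.

```python
# Data transcribed from the 'Multipurpose' dataset (12 employees).
daysabs = [1, 0, 1, 4, 3, 2, 5, 6, 9, 13, 15, 16]
attrate = [1, 2, 2, 3, 5, 5, 6, 7, 10, 11, 11, 12]
years = [1, 1, 2, 2, 4, 6, 5, 4, 8, 7, 9, 10]


def mean(v):
    return sum(v) / len(v)


def centered(v):
    m = mean(v)
    return [x - m for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def fit_plane(y, x1, x2):
    """Intercept and slopes of the least-squares plane y ~ x1 + x2.

    Hypothetical helper: solves the 2x2 normal equations with Cramer's rule.
    """
    yc, c1, c2 = centered(y), centered(x1), centered(x2)
    s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
    s1y, s2y = dot(c1, yc), dot(c2, yc)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return b0, b1, b2


def sse(y, x1, x2, b0, b1, b2):
    """Sum of squared distances, along the y axis, from the points to the plane."""
    return sum((yi - (b0 + b1 * a + b2 * b)) ** 2
               for yi, a, b in zip(y, x1, x2))


b0, b1, b2 = fit_plane(daysabs, attrate, years)
best = sse(daysabs, attrate, years, b0, b1, b2)
# Moving a coefficient away from its fitted value can only increase the SSE.
worse = sse(daysabs, attrate, years, b0, b1 + 0.1, b2)

print("intercept, ATTRATE, YEARS:", round(b0, 4), round(b1, 4), round(b2, 4))
print("SSE at fit:", round(best, 3), " SSE after nudging ATTRATE slope:", round(worse, 3))
```

Any other plane through the same points gives a larger sum of squares, which is what the nudged ATTRATE slope demonstrates.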
Alternative
Given a variable, we can instead view it as a vector projecting from an origin into some n-dimensional space.
Put another way, the space has one dimension for each individual (12 dimensions for these data), and the vector that holds everyone's values on some predictor occupies only a single dimension (a line) within that space.
Assume now two standardized variables of equal N.
Now we have 2 vectors (of N components) emanating from the origin*.
The cosine of the angle they create is the simple correlation of the two variables.
If they were perfectly correlated they would occupy the same dimension (i.e. be right on top of one another).
[Figure: vectors X1 and X2 emanating from the origin]
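The cosine-equals-correlation claim can be verified on the dataset. The sketch below is illustrative only, with function names chosen for this example. It mean-centers each variable rather than fully standardizing it, which is enough here because rescaling a vector does not change its angle with another vector.

```python
import math

# Two predictors from the 'Multipurpose' dataset (12 employees).
attrate = [1, 2, 2, 3, 5, 5, 6, 7, 10, 11, 11, 12]
years = [1, 1, 2, 2, 4, 6, 5, 4, 8, 7, 9, 10]


def centered(v):
    """Subtract the mean so the vector starts at the origin of person space."""
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


# Cosine of the angle between the centered vectors = Pearson correlation.
r_att_years = cosine(centered(attrate), centered(years))
angle_deg = math.degrees(math.acos(r_att_years))

# A perfectly correlated pair: a linear rescaling of ATTRATE lies along
# the same line after centering, so the cosine is 1 (angle of 0).
rescaled = [2 * x + 3 for x in attrate]
r_perfect = cosine(centered(attrate), centered(rescaled))

print("r(ATTRATE, YEARS) =", round(r_att_years, 7))
print("angle in degrees  =", round(angle_deg, 1))
print("r(ATTRATE, 2*ATTRATE + 3) =", round(r_perfect, 7))
```

The first value should match the ATTRATE/YEARS entry in the correlation matrix shown earlier.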
Adding a third variable, we can again understand their simple correlations as the cosines of the respective angles they create.
Given the plane created by X1 and X2, might we find a way to project Y onto it?
[Figure: vectors X1, X2 and Y emanating from the origin]
That is in fact what multiple regression does, and this projection is that linear combination* resulting in our predicted values.
The cosine of the angle created by Y and Y-hat is the multiple R, which when squared gives the proportion of variance in Y accounted for by the model containing X1 and X2.
Regression attempts to minimize that angle (equivalently, to maximize its cosine).
Partial correlations may be represented too, by creating a plane perpendicular** to one variable and projecting the others onto that plane.
The cosine of the angle the projected vectors create will be their partial correlation.
[Figure: vectors X1, X2 and Y, with Y-hat as the projection of Y onto the plane of X1 and X2]
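Both geometric claims can be checked numerically on the dataset, taking Y = DAYSABS, X1 = ATTRATE and X2 = YEARS. This is a sketch under that assumption, not the code behind the slides. The helper names (project_onto_plane, residual_from) are invented for the example, and the textbook first-order partial correlation formula is included only as a cross-check on the geometric result.

```python
import math

# Data transcribed from the 'Multipurpose' dataset (12 employees).
daysabs = [1, 0, 1, 4, 3, 2, 5, 6, 9, 13, 15, 16]
attrate = [1, 2, 2, 3, 5, 5, 6, 7, 10, 11, 11, 12]
years = [1, 1, 2, 2, 4, 6, 5, 4, 8, 7, 9, 10]


def centered(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


def project_onto_plane(y, x1, x2):
    """Orthogonal projection of y onto the plane spanned by x1 and x2."""
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    s1y, s2y = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return [b1 * a + b2 * b for a, b in zip(x1, x2)]


def residual_from(v, z):
    """Projection of v onto the hyperplane perpendicular to z."""
    k = dot(v, z) / dot(z, z)
    return [a - k * b for a, b in zip(v, z)]


y, x1, x2 = centered(daysabs), centered(attrate), centered(years)

# Multiple R: cosine of the angle between Y and its projection Y-hat.
y_hat = project_onto_plane(y, x1, x2)
multiple_r = cosine(y, y_hat)
r_squared = multiple_r ** 2

# The leftover part of Y is perpendicular to the plane.
error = [a - b for a, b in zip(y, y_hat)]

# Partial correlation of DAYSABS and ATTRATE controlling for YEARS:
# project both onto the hyperplane perpendicular to YEARS, then take the cosine.
partial_geom = cosine(residual_from(y, x2), residual_from(x1, x2))

# Cross-check with the usual first-order partial correlation formula.
r_y1, r_y2, r_12 = cosine(y, x1), cosine(y, x2), cosine(x1, x2)
partial_formula = (r_y1 - r_y2 * r_12) / math.sqrt(
    (1 - r_y2 ** 2) * (1 - r_12 ** 2))

print("multiple R =", round(multiple_r, 4), " R squared =", round(r_squared, 4))
print("partial r (geometric) =", round(partial_geom, 4))
print("partial r (formula)   =", round(partial_formula, 4))
```

The squared cosine should agree with the model R² reported earlier to rounding, and the two partial correlation values should coincide.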