Published by Juan Carlos Pereyra Cárdenas. Modified over 6 years ago.
1
Covariance
[Figure: scatter plot on x and y axes highlighting a point (x, y) with x – x̄ > 0 and y – ȳ > 0]
2
Covariance
[Figure: scatter plot on x and y axes highlighting a point (x, y) with x – x̄ < 0 and y – ȳ > 0]
3
Covariance
[Figure: scatter plot on x and y axes divided into four quadrants at the averages of x and y]
Below-average values of x are paired with above-average values of y.
Above-average values of x are paired with above-average values of y.
Below-average values of x are paired with below-average values of y.
Above-average values of x are paired with below-average values of y.
So what happens on balance?
4
Covariance
What happens on balance? Calculate the average of the products of the deviations, (x – x̄)(y – ȳ).
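The "on balance" calculation can be sketched in a few lines. This is a minimal illustration, assuming the sample version of covariance (dividing by n - 1 rather than n); the data values are hypothetical and do not come from the slides.

```python
def covariance(xs, ys):
    """Sample covariance: sum of the products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Each product is positive when x and y fall on the same side of their
    # averages and negative when they fall on opposite sides.
    products = [(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(covariance(x, y))
```

A positive result means the same-side quadrants outweigh the opposite-side quadrants; a negative result means the reverse.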
6
Covariance Example
Sxy = 1.999
[Figure: scatter plot of Wage and Aptitude]
7
Correlation
rxy = 0.476
[Figure: scatter plot of Wage and Aptitude]
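Correlation rescales covariance by the two standard deviations, which removes the units. The sketch below assumes the usual sample formulas; the data are hypothetical, since the wage and aptitude values behind the slide are not available.

```python
def correlation(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_x = (sum((x - x_bar) ** 2 for x in xs) / (n - 1)) ** 0.5
    s_y = (sum((y - y_bar) ** 2 for y in ys) / (n - 1)) ** 0.5
    return s_xy / (s_x * s_y)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)

# Changing the units of x (for example dollars to cents) leaves r unchanged.
r_rescaled = correlation([100 * v for v in x], y)
print(round(r, 3), round(r_rescaled, 3))
```

Unlike the covariance, the result always lies between -1 and 1, so values from different data sets can be compared.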
8
Perfect Correlation
9
Fit That Line!
y=2,500+1,800x
y=10,000+1,000x
y=13, x
10
Fit That Line!
y=8, ,233x minimizes the sum of the squared errors
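To make "minimizes the squared errors" concrete, here is a small sketch that fits a line by the standard least-squares formulas and compares its sum of squared errors with two other candidate lines. The data and the candidate lines are hypothetical, not the ones plotted on the slides.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_errors(intercept, slope, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

best = fit_line(x, y)
candidates = [best, (2.0, 0.7), (3.0, 0.4)]  # the last two are arbitrary guesses
for intercept, slope in candidates:
    sse = sum_squared_errors(intercept, slope, x, y)
    print(f"y = {intercept:.2f} + {slope:.2f}x  SSE = {sse:.3f}")
```

Any line other than the fitted one gives a larger sum of squared errors on the same data.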
11
Word Problem
Students in a small class were polled by a researcher attempting to establish a relationship between hours of study in the week preceding a test and the result of the test. If you get data on hours studied and exam results, which variable is the dependent variable? Why?
12
Word Problem y= x
13
Word Problem
Excel Regression Output (Data Analysis Add-In)

Regression Statistics
Multiple R        0.770
R Squared         0.594
Adj. R Squared    0.543
Standard Error   10.710
Obs.             10

ANOVA        df   SS   MS   F        Significance
Regression    1             11.686   0.009
Residual      8
Total         9

            Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept   39.401   12.153       3.242    0.012     11.375      67.426
hours        2.122    0.621       3.418               0.691       3.554
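The columns of the coefficient table are linked by simple arithmetic, which a short sketch can check: t stat is the coefficient divided by its standard error, and the 95% limits are the coefficient plus or minus a critical t value times the standard error. The critical value 2.306 (Student's t, 8 residual degrees of freedom, two-sided 95%) is supplied here as an assumption; it is not printed in the Excel output.

```python
# Coefficient and standard error for each row, copied from the Excel output.
ROWS = {"Intercept": (39.401, 12.153), "hours": (2.122, 0.621)}

# Assumed: two-sided 95% critical value of Student's t with 8 degrees of freedom.
T_CRIT = 2.306


def summarize(coef, std_err, t_crit=T_CRIT):
    """Return (t statistic, lower 95% limit, upper 95% limit)."""
    t_stat = coef / std_err
    half_width = t_crit * std_err
    return t_stat, coef - half_width, coef + half_width


results = {name: summarize(coef, se) for name, (coef, se) in ROWS.items()}
for name, (t_stat, lower, upper) in results.items():
    print(f"{name}: t = {t_stat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")

# With a single explanatory variable, F equals the square of its t statistic.
f_from_t = results["hours"][0] ** 2
print(f"t squared for hours = {f_from_t:.3f}")
```

The recomputed values agree with the table up to rounding of the published three-decimal inputs.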
14
Word Problem
Excel Regression Output (StatPad Add-In)

Regression analysis to predict score from hours.
The prediction equation is: Score = 39.401 + 2.122 hours

 0.594   R squared
10.710   Standard error of estimate
10       Number of observations
11.686   F statistic
 0.009   P value

           Coeff   95% LowerCI   95% UpperCI   StdErr   t       p       Significant
Constant           11.375        67.426        12.153   3.242   0.012   Yes (p<0.05)
hours      2.122    0.691         3.554         0.621   3.418
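As a minimal sketch of how the prediction equation is used, the function below plugs study hours into the reported constant (39.401) and hours coefficient (2.122). The hours values tried are hypothetical, and the function name is illustrative only.

```python
INTERCEPT = 39.401  # "Constant" in the StatPad output
SLOPE = 2.122       # coefficient on hours


def predict_score(hours):
    """Predicted test score for a given number of study hours."""
    return INTERCEPT + SLOPE * hours


# Hypothetical study times, for illustration only.
for hours in (0, 5, 10):
    print(hours, round(predict_score(hours), 3))
```

Each additional hour of study raises the predicted score by the slope, 2.122 points, and the constant is the predicted score at zero hours.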
15
The Nine Lives of Goldfish
Regression Statistics
Multiple R        0.671
R Squared         0.450
Adj. R Squared    0.340
Standard Error   45.214
Obs.              7

ANOVA        df   SS   MS   F       Significance
Regression    1             4.089   0.099
Residual      5
Total         6

            Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept   91.500   22.607        4.047   0.010     33.387
filter               34.533       -2.022                         18.936
16
Predicting Job Performance
Regression Statistics
R Squared            0.107
Adj. R Squared
Standard Error       1.955
Obs.              3525

ANOVA          df   SS   MS      F   Significance
Regression      3                    0.000
Residual     3521        3.824
Total        3524

            Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept    4.865   0.171        28.423              4.529       5.200
Age         -0.037   0.002                           -0.041      -0.034
Seniority    0.011   0.003         3.325   0.001      0.004       0.017
Cognitive   -0.032   0.033        -0.983   0.326     -0.097       0.032

Simple Regression: Perform = – age
17
Predicting Job Performance
Perform = 4.865 – 0.037 age + 0.011 seniority – 0.032 cognitive

Age   Seniority   Cognitive   Predicted Performance   Net Difference
35    10          1           3.626
36    10          1           3.589                   -0.037
45    10          1           3.251
46    10          1           3.214                   -0.037
35    20          1           3.731
35    21          1           3.742                    0.011

Note importance of ceteris paribus (all else constant)
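The ceteris paribus point can be checked with a short sketch: raise one variable by one unit, hold the others fixed, and the prediction changes by exactly that variable's coefficient, whatever the baseline. The coefficients below are the three-decimal values from the regression table (intercept 4.865, age -0.037, seniority 0.011, cognitive -0.032), so predicted levels can differ from the slide's in the second decimal; the helper names are illustrative.

```python
# Rounded coefficients from the regression output for job performance.
COEF = {"intercept": 4.865, "age": -0.037, "seniority": 0.011, "cognitive": -0.032}


def predict(age, seniority, cognitive):
    """Predicted performance from the fitted linear equation."""
    return (COEF["intercept"]
            + COEF["age"] * age
            + COEF["seniority"] * seniority
            + COEF["cognitive"] * cognitive)


def effect_of_one_more(variable, **baseline):
    """Change in the prediction when one variable rises by 1, all else constant."""
    bumped = dict(baseline)
    bumped[variable] += 1
    return predict(**bumped) - predict(**baseline)


print(round(effect_of_one_more("age", age=35, seniority=10, cognitive=1), 3))
print(round(effect_of_one_more("age", age=45, seniority=10, cognitive=1), 3))
print(round(effect_of_one_more("seniority", age=35, seniority=20, cognitive=1), 3))
```

Because the model is linear, the starting values of the other variables do not affect the size of the change.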
18
Predicting Job Performance
Perform = 4.865 – 0.037 age + 0.011 seniority – 0.032 cognitive
[Figure: predicted performance plotted against age, holding seniority constant at 10 and cognitive constant at 1]
19
Predicting Job Performance
Perform = 4.865 – 0.037 age + 0.011 seniority – 0.032 cognitive
[Figure: predicted performance plotted against age, holding seniority constant at 20 and cognitive constant at -1]
With linear models, other values don’t matter; just all else constant.
20
Predicting Job Perf. With a Dummy Variable
Regression Statistics
R Squared            0.110
Adj. R Squared       0.109
Standard Error       1.953
Obs.              3525

ANOVA          df   SS   MS      F   Significance
Regression      4                    0.000
Residual     3520        3.815
Total        3524

                  Coeff.   Std. Error   t stat   p value   Lower 95%   Upper 95%
Intercept          4.820   0.172        28.096              4.484       5.156
Age               -0.037   0.002                           -0.041      -0.034
Seniority          0.010   0.003         3.271   0.001      0.004       0.017
Cognitive         -0.025   0.033        -0.756   0.450     -0.090       0.040
Structured int.    2.850   0.922         3.092              1.043       4.658

Structured Interview Dummy Variable: 1=yes, 0=no
21
Predicting Job Perf. With a Dummy Variable
Perform = 4.820 – 0.037 age + 0.010 seniority – 0.025 cognitive + 2.850 structured interview

Age   Seniority   Cognitive   Structured Interview   Predicted Performance   Net Difference
35    10          1           0                      3.600
35    10          1           1                      6.450                   2.850
45     5          2           0                      3.155
45     5          2           1                      6.005                   2.850

Dummy variable turns “on” and “off” with all else constant.
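The predictions on this slide can be reproduced from the coefficients in the dummy-variable regression output (intercept 4.820, age -0.037, seniority 0.010, cognitive -0.025, structured interview 2.850). The sketch below is one way to write that calculation; the function and argument names are illustrative.

```python
# Coefficients from the regression output that includes the dummy variable.
COEF = {
    "intercept": 4.820,
    "age": -0.037,
    "seniority": 0.010,
    "cognitive": -0.025,
    "structured_interview": 2.850,
}


def predict(age, seniority, cognitive, structured_interview):
    """Predicted performance; structured_interview is 1 for yes and 0 for no."""
    return (COEF["intercept"]
            + COEF["age"] * age
            + COEF["seniority"] * seniority
            + COEF["cognitive"] * cognitive
            + COEF["structured_interview"] * structured_interview)


for age, seniority, cognitive in [(35, 10, 1), (45, 5, 2)]:
    off = predict(age, seniority, cognitive, 0)
    on = predict(age, seniority, cognitive, 1)
    print(f"off = {off:.3f}, on = {on:.3f}, difference = {on - off:.3f}")
```

Switching the dummy from 0 to 1 shifts the prediction by its coefficient, which is equivalent to raising the intercept from 4.820 to 7.670 while leaving every slope unchanged.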
22
Predicting Job Perf. With a Dummy Variable
Perform = 4.820 – 0.037 age + 0.010 seniority – 0.025 cognitive + 2.850 structured interview
[Figure: predicted performance plotted against age, holding seniority constant at 10 and cognitive constant at 1]
23
Predicting Job Perf. With a Dummy Variable
Note new y-intercept.
Seniority=20, Cognitive=0
24
Multiple Dummy Variables
[Stata regression output: perform regressed on age, seniorty, cognitve, strucint, job1 through job10, and _cons; F(14, 3510); columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]]
Note: job1-job10 are dummy variables representing 10 different job classes (job11 is the omitted reference category)
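A set of dummy variables with an omitted reference category can be built as in the sketch below. The job-class values are hypothetical, and the helper name make_dummies is illustrative; it is not a Stata or library function.

```python
def make_dummies(categories, levels, reference):
    """Return one 0/1 column per level, skipping the omitted reference level."""
    kept = [level for level in levels if level != reference]
    return {
        f"job{level}": [1 if value == level else 0 for value in categories]
        for level in kept
    }


# Hypothetical job classes (1 to 11) for six workers.
job_class = [1, 11, 3, 10, 11, 2]
dummies = make_dummies(job_class, levels=range(1, 12), reference=11)

print(sorted(dummies))  # job1 ... job10, with no job11 column
print(dummies["job1"])  # 1 only for the worker in job class 1
```

Workers in the reference class (job11) have zeros in every column, so each job coefficient is read as a difference from job11.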
25
Interaction Variables
[Stata regression output: perform regressed on age, seniorty, cognitve, strucint, manual, manl_age, and _cons; F(6, 3518); columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]]
Note: manual is a dummy variable indicating a manual occupation; manl_age is age interacted with manual (i.e. manl_age = manual*age)
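The effect of an interaction term can be sketched as follows. The coefficient values are hypothetical placeholders chosen for illustration, not the slide's estimates, and the other regressors (seniorty, cognitve, strucint) are left out to keep the example short.

```python
# Hypothetical coefficients, for illustration only.
B = {"_cons": 5.0, "age": -0.04, "manual": 1.0, "manl_age": -0.02}


def predict(age, manual):
    """Predicted performance with an age-by-manual interaction."""
    manl_age = manual * age  # the interaction variable
    return (B["_cons"]
            + B["age"] * age
            + B["manual"] * manual
            + B["manl_age"] * manl_age)


def age_slope(manual):
    """Change in the prediction per extra year of age for each group."""
    return B["age"] + B["manl_age"] * manual


print(round(age_slope(0), 3))  # slope for non-manual occupations
print(round(age_slope(1), 3))  # slope for manual occupations
```

The manual dummy alone shifts the intercept, while the interaction term changes the slope on age, so the two groups get lines that are not parallel.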
26
Interaction Variables
Note different slopes, too. Seniority=20, Cognitive=0, StrucInt=0
27
Another Interaction Variable Example
[Stata regression output: earnwkly regressed on married, female, exper, parttime, exp_pt, and _cons; F(5, 15315)]
exper is potential labor market experience (age-educ-6)
parttime is a dummy variable indicating a part-time worker
exp_pt is exper interacted with parttime (i.e. exp_pt = exper*parttime)
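The two derived variables defined on this slide can be constructed as in this sketch. The worker records are hypothetical, and the helper name add_derived is illustrative.

```python
def add_derived(worker):
    """Add exper (age - educ - 6) and exp_pt (exper * parttime) to a record."""
    exper = worker["age"] - worker["educ"] - 6
    return {**worker, "exper": exper, "exp_pt": exper * worker["parttime"]}


# Hypothetical workers, for illustration only.
workers = [
    {"age": 30, "educ": 12, "parttime": 1},
    {"age": 45, "educ": 16, "parttime": 0},
]
derived = [add_derived(w) for w in workers]
for row in derived:
    print(row)
```

For full-time workers exp_pt is always zero, so the exp_pt coefficient measures only how the return to experience differs for part-time workers.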
28
Interaction Variables
Married=1, Female=1
29
Adjusted R2
[Stata regression output: perform regressed on age, seniorty, cognitve, strucint, job1 through job10, and _cons; Number of obs = 3525; columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]]
Note: job1-job10 are dummy variables representing 10 different job classes (job11 is the omitted reference category)
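Adjusted R squared penalizes R squared for the number of explanatory variables. The sketch below uses the standard formula 1 - (1 - R²)(n - 1)/(n - k - 1), with n observations and k explanatory variables, and checks it against the R squared, adjusted R squared, and observation counts reported in the earlier Excel outputs in this deck.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# (R squared, observations, explanatory variables) from the earlier outputs.
examples = {
    "test score on hours": (0.594, 10, 1),
    "goldfish on filter": (0.450, 7, 1),
    "performance with dummy": (0.110, 3525, 4),
}
for label, (r2, n, k) in examples.items():
    print(f"{label}: {adjusted_r2(r2, n, k):.3f}")
```

With few observations the penalty is large (0.450 falls to 0.340 with 7 observations), while with 3,525 observations it is almost invisible.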
30
Causality ? Workforce Optimization Sue Bostrom: Leadership on IT—What’s It Worth? September 10, 2001 “For those who still doubt that Internet-related investments will pay off, consider this: A PricewaterhouseCoopers study released earlier this year found that productivity gains in 2000 were 2.7 times greater for Internet-enabled companies than for businesses that have not leveraged the Web.”
31
Causality
Reasons for an estimated statistical relationship:
The explanatory variable is the direct cause of the response (dependent) variable.
The response variable is causing a change in the explanatory variable (reverse causality).
The explanatory variable is a contributing, but not sole, cause of the response variable.
Confounding variables may exist.
Both variables may stem from a common cause.
Both variables are changing over time.
Coincidence.
Source: Jessica M. Utts (1999) Seeing Through Statistics, 2nd ed., Pacific Grove, CA: Duxbury, p. 186.
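One of these reasons, a common cause, can be illustrated with a small simulation. In this sketch x and y never influence each other; both are built from a shared variable z plus independent noise. The distributions, sample size, and seed are arbitrary assumptions chosen for illustration.

```python
import random


def correlation(xs, ys):
    """Sample correlation coefficient."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (s_xx * s_yy) ** 0.5


random.seed(0)
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]   # the common cause
x = [zi + random.gauss(0, 0.3) for zi in z]  # x depends on z only
y = [zi + random.gauss(0, 0.3) for zi in z]  # y depends on z only

r_raw = correlation(x, y)
# Remove the common cause from both variables and correlate what is left.
r_net = correlation([xi - zi for xi, zi in zip(x, z)],
                    [yi - zi for yi, zi in zip(y, z)])
print(round(r_raw, 3), round(r_net, 3))
```

The raw correlation is strong even though neither variable causes the other, and it largely disappears once the common cause is taken out.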