Regression Concept & Examples, Latent Variables, & Partial Least Squares (PLS) 1.


1 Regression Concept & Examples, Latent Variables, & Partial Least Squares (PLS)

2 Simple Regression Model: Make a prediction about the starting salary of a current college graduate, using a data set of starting salaries of recent college graduates. From the data set, compute the average salary. How certain are we of this prediction? There is variability in the data.

3 Simple Regression Model: Compute Total Variation (the sum of squared deviations of each salary from the average). The smaller the total variation, the more accurate (certain) our prediction will be. Use total variation as an index of uncertainty about our prediction.

4 Simple Regression Model: How can we “explain” the variability? Perhaps it depends on the student’s GPA. [Figure: Salary plotted against GPA]

5 Simple Regression Model: Find a linear relationship between GPA and starting salary. As GPA increases or decreases, starting salary increases or decreases with it.

6 Simple Regression Model: Use the least squares method to find the regression model. Choose a and b in the regression equation Y-hat = a + b * X so that they minimize the sum of the squared deviations, where each deviation is the actual Y value minus the predicted Y value (Y-hat).
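
The least squares choice of a and b has a closed form, which can be sketched in a few lines of Python. The data set behind these introductory slides is not shown in the transcript, so this sketch borrows the ten GPA and salary observations from the Data Set slide of the detailed example. It therefore reproduces that example's Salary = f(GPA) coefficients rather than the a and b quoted for the introductory model. The function name `least_squares` is an illustrative choice.

```python
# Closed-form simple least squares: b = Sxy / Sxx, a = mean(y) - b * mean(x).
# Data borrowed from the Data Set slide of the detailed example (an assumption
# made for illustration; the introductory slides use a data set not shown).
gpa = [2.8, 3.4, 3.2, 3.8, 3.2, 3.4, 4.0, 2.6, 3.2, 3.8]
salary = [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500]


def least_squares(x, y):
    """Return (a, b) minimizing the sum of squared (actual - predicted) values."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


a, b = least_squares(gpa, salary)
residuals = [yi - (a + b * xi) for xi, yi in zip(gpa, salary)]
sse = sum(r * r for r in residuals)

print(f"a = {a:.2f}, b = {b:.2f}")            # a = 1928.57, b = 6428.57
print(f"sum of squared residuals = {sse:.0f}")  # 17500000
```

The residuals from this fit sum to zero (up to floating-point rounding), which is the property the next slide states for the u-hat values.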

7 Simple Regression Model: How good is the model?
– u-hat is a “residual” value: the actual Y value minus the predicted Y value.
– The sum of all u-hats is zero.
– The sum of all squared u-hats is the total variation not explained by the model.
– This “unexplained variation” is 7,425,926.
– a = 4,779 and b = 5,370; a computer program computed these values.

8 Simple Regression Model: Total Variation = 23,000,000

9 Simple Regression Model: Total Unexplained Variation = 7,425,726

10 Simple Regression Model: Relative Goodness of Fit
– Summarize the improvement in prediction from using the regression model.
– Compute R², the coefficient of determination: the share of the total variation that the model explains.
– The regression model (equation) is a better predictor than guessing the average salary: GPA is a more accurate predictor of starting salary than the average alone.
– R² is the “performance measure” for the model.
– Predicted Starting Salary = 4,779 + 5,370 * GPA
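
As a rough worked example, R² can be computed from the two figures quoted on the preceding slides as one minus the ratio of unexplained to total variation. The variable names are illustrative. The unexplained figure appears as 7,425,926 on slide 7 and as 7,425,726 on slide 9; the result is the same to three decimals either way.

```python
# R^2 = 1 - (unexplained variation / total variation), using the slides' figures.
total_variation = 23_000_000        # slide 8
unexplained_variation = 7_425_926   # slide 7 (slide 9 shows 7,425,726)

r_squared = 1 - unexplained_variation / total_variation
print(round(r_squared, 3))  # 0.677
```

Read this way, the GPA model accounts for roughly 68% of the variation that remains when the average salary is used as the prediction.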

11 Detailed Regression Example

12 Data Set
Obs #   Salary   GPA   Months Work
1       20000    2.8   48
2       24500    3.4   24
3       23000    3.2   24
4       25000    3.8   24
5       20000    3.2   48
6       22500    3.4   36
7       27500    4.0   20
8       19000    2.6   48
9       24000    3.2   36
10      28500    3.8   12
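
Before fitting anything, the baseline prediction (the average salary) and its total variation can be computed directly from this data set. A minimal sketch, with illustrative variable names:

```python
# Baseline for the detailed example: predict every salary with the average,
# then measure total variation as the sum of squared deviations from it.
salary = [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500]

mean_salary = sum(salary) / len(salary)
total_variation = sum((s - mean_salary) ** 2 for s in salary)

print(mean_salary)      # 23400.0
print(total_variation)  # 90400000.0
```

The total variation matches the Total row (SS = 90400000) in the ANOVA tables of the three regressions that follow.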

13 Scatter Plot - GPA vs Salary

14 Scatter Plot - Work vs Salary

15 Pearson Correlation Coefficients (-1 <= r <= 1)
              Salary     GPA        Months Work
Salary        1
GPA           0.898007   1
Months Work   -0.93927   -0.82993   1
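
The correlation table can be reproduced from the data set with the textbook Pearson formula. This is a sketch using only the standard library; the helper name `pearson_r` is illustrative.

```python
from math import sqrt

gpa = [2.8, 3.4, 3.2, 3.8, 3.2, 3.4, 4.0, 2.6, 3.2, 3.8]
work = [48, 24, 24, 24, 48, 36, 20, 48, 36, 12]
salary = [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500]


def pearson_r(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy), using centered sums."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


r_gpa_salary = pearson_r(gpa, salary)
r_work_salary = pearson_r(work, salary)
r_gpa_work = pearson_r(gpa, work)

print(f"{r_gpa_salary:.3f}")   # 0.898
print(f"{r_work_salary:.3f}")  # -0.939
print(f"{r_gpa_work:.3f}")     # -0.830
```

The strong negative correlation between GPA and Months Work means the two predictors overlap in what they explain, which is worth keeping in mind when both enter the same regression.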

16 Three Regressions
– Salary = f(GPA)
– Salary = f(Work)
– Salary = f(GPA, Work)
Interpret Excel Output

17 Interpreting Results
Regression Statistics
– Multiple R
– R²
– Adjusted R²
– Standard Error S_y
Statistical Significance
– t-test
– p-value
– F test

18 Regression Statistics Table
– Multiple R: R = square root of R²
– R²: coefficient of determination
– Adjusted R²: used if there is more than one x variable
– Standard Error S_y: the sample estimate of the standard deviation of the error (actual minus predicted)
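
To make these four statistics concrete, the sketch below computes them for Salary = f(GPA) from the data set. It assumes the usual definitions: adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1) and standard error = sqrt(SSE/(n - k - 1)), where k is the number of x variables. Variable names are illustrative.

```python
from math import sqrt

gpa = [2.8, 3.4, 3.2, 3.8, 3.2, 3.4, 4.0, 2.6, 3.2, 3.8]
salary = [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500]
n, k = len(salary), 1  # 10 observations, 1 x variable

# Fit Salary = a + b * GPA by least squares.
x_bar, y_bar = sum(gpa) / n, sum(salary) / n
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(gpa, salary))
     / sum((x - x_bar) ** 2 for x in gpa))
a = y_bar - b * x_bar

sst = sum((y - y_bar) ** 2 for y in salary)                      # total variation
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(gpa, salary))   # unexplained

r_squared = 1 - sse / sst
multiple_r = sqrt(r_squared)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
std_error = sqrt(sse / (n - k - 1))

print(f"Multiple R     {multiple_r:.4f}")     # 0.8980
print(f"R Square       {r_squared:.4f}")      # 0.8064
print(f"Adj. R Square  {adj_r_squared:.4f}")  # 0.7822
print(f"Standard Error {std_error:.2f}")      # 1479.02
```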

19 ANOVA Table
– Table 1 gives the F statistic.
– It tests the claim that there is no significant relationship between all of your independent variables and your dependent variable.
– The Significance F value is a p-value.
– Reject the claim of NO significant relationship between your independent and dependent variables if p < α.
– Generally α = 0.05.
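
The F statistic is the ratio of two mean squares, each a sum of squares divided by its degrees of freedom. The sketch below uses the sums of squares and the Significance F reported in the Salary = f(GPA) output; the variable names are illustrative.

```python
# F = MS(regression) / MS(residual), with MS = SS / df.
# Figures are taken from the Salary = f(GPA) ANOVA output.
ss_regression, df_regression = 72_900_000, 1
ss_residual, df_residual = 17_500_000, 8

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
print(round(f_stat, 5))  # 33.32571

# Decision rule from the slide: reject "no relationship" if p < alpha.
alpha = 0.05
significance_f = 0.00041792  # p-value reported in the same output
reject_no_relationship = significance_f < alpha
print(reject_no_relationship)  # True
```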

20 Regression Coefficients Table
The Coefficients column gives the b_0, b_1, b_2, …, b_n values for the regression equation.
– b_0 is the intercept.
– b_1 is next to your independent variable x_1.
– b_2 is next to your independent variable x_2.
– b_3 is next to your independent variable x_3.

21 Regression Coefficients Table
– The P-value column gives the p-values for the individual t tests, one for each independent variable.
– Each t test tests the claim that there is no relationship between the independent variable (in the corresponding row) and your dependent variable.
– Reject the claim of NO significant relationship between that independent variable and the dependent variable if p < α.
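
Each row's t statistic is the coefficient divided by its standard error, and the 95% bounds are the coefficient plus or minus a critical t value times that standard error. The sketch below checks this for the GPA row of the Salary = f(GPA) output. The critical value 2.306 (two-sided 95%, 8 degrees of freedom) is taken from a standard t table and is an assumption of the sketch, not something printed in the output.

```python
# GPA row of the Salary = f(GPA) coefficients table.
coefficient = 6428.571429
standard_error = 1113.589

t_stat = coefficient / standard_error
print(round(t_stat, 3))  # 5.773

# Two-sided 95% critical value of Student's t with n - k - 1 = 8 degrees of
# freedom, looked up in a t table (assumed here, not taken from the slides).
t_critical = 2.306
lower_95 = coefficient - t_critical * standard_error
upper_95 = coefficient + t_critical * standard_error
print(round(lower_95, 1), round(upper_95, 1))
```

Because the interval excludes zero, it agrees with the small p-value in that row: the claim of no relationship between GPA and salary is rejected at α = 0.05.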

22 Salary = f(GPA)
Regression Statistics
Multiple R          0.898006642
R Square            0.806415929
Adjusted R Square   0.78221792
Standard Error      1479.019946
Observations        10
ANOVA
             df   SS         MS         F          Significance F
Regression   1    72900000   72900000   33.32571   0.00041792
Residual     8    17500000   2187500
Total        9    90400000
Coefficients
            Coefficients   Standard Error   t Stat     P-value    Lower 95%     Upper 95%
Intercept   1928.571429    3748.677         0.514467   0.620833   -6715.89326   10573.04
GPA         6428.571429    1113.589         5.772843   0.000418   3860.63173    8996.511

23 Salary = f(Work)
Regression Statistics
Multiple R          0.939265177
R Square            0.882219073
Adjusted R Square   0.867496457
Standard Error      1153.657002
Observations        10
ANOVA
             df   SS            MS         F          Significance F
Regression   1    79752604.17   79752604   59.92271   5.52993E-05
Residual     8    10647395.83   1330924
Total        9    90400000
Coefficients
              Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%
Intercept     30691.66667    1010.136344      30.38369   1.49E-09   28362.28808    33021.0453
Months Work   -227.864583    29.43615619      -7.74098   5.53E-05   -295.7444812   -159.98469

24 Salary = f(GPA, Work)
Regression Statistics
Multiple R          0.962978985
R Square            0.927328525
Adjusted R Square   0.906565246
Standard Error      968.7621974
Observations        10
ANOVA
             df   SS         MS         F          Significance F
Regression   2    83830499   41915249   44.66195   0.00010346
Residual     7    6569501    938500.2
Total        9    90400000
Coefficients
              Coefficients   Standard Error   t Stat     P-value    Lower 95%      Upper 95%
Intercept     19135.92896    5608.184         3.412144   0.011255   5874.682112    32397.176
GPA           2725.409836    1307.468         2.084495   0.075582   -366.2602983   5817.08
Months Work   -151.2124317   44.30826         -3.41274   0.011246   -255.9848174   -46.440046
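
The two-predictor coefficients can be checked by hand. With centered sums of squares and cross-products, the least squares normal equations for two x variables reduce to a 2-by-2 linear system that can be solved directly. This is a standard-library sketch of that calculation, not a description of how Excel computes its output; variable names are illustrative.

```python
gpa = [2.8, 3.4, 3.2, 3.8, 3.2, 3.4, 4.0, 2.6, 3.2, 3.8]
work = [48, 24, 24, 24, 48, 36, 20, 48, 36, 12]
salary = [20000, 24500, 23000, 25000, 20000, 22500, 27500, 19000, 24000, 28500]
n = len(salary)

g_bar, w_bar, y_bar = sum(gpa) / n, sum(work) / n, sum(salary) / n


def cross(u, u_bar, v, v_bar):
    """Centered sum of cross-products of two columns."""
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))


s_gg = cross(gpa, g_bar, gpa, g_bar)
s_ww = cross(work, w_bar, work, w_bar)
s_gw = cross(gpa, g_bar, work, w_bar)
s_gy = cross(gpa, g_bar, salary, y_bar)
s_wy = cross(work, w_bar, salary, y_bar)

# Solve [[s_gg, s_gw], [s_gw, s_ww]] @ [b_gpa, b_work] = [s_gy, s_wy].
det = s_gg * s_ww - s_gw ** 2
b_gpa = (s_ww * s_gy - s_gw * s_wy) / det
b_work = (s_gg * s_wy - s_gw * s_gy) / det
b_0 = y_bar - b_gpa * g_bar - b_work * w_bar

sst = cross(salary, y_bar, salary, y_bar)
sse = sum((y - (b_0 + b_gpa * g + b_work * w)) ** 2
          for g, w, y in zip(gpa, work, salary))
r_squared = 1 - sse / sst

print(f"{b_0:.2f} {b_gpa:.2f} {b_work:.2f}")  # 19135.93 2725.41 -151.21
print(f"{r_squared:.4f}")                     # 0.9273
```

The GPA coefficient drops from about 6,429 in the one-variable model to about 2,725 here because GPA and Months Work are strongly correlated, so part of what GPA explained alone is now attributed to Months Work.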

25 Compare Three “Models”
Regression Statistics   f(GPA, Work)   f(Work)       f(GPA)
Multiple R              0.962978985    0.939265177   0.898006642
R Square                0.927328525    0.882219073   0.806415929
Adjusted R Square       0.906565246    0.867496457   0.78221792
Standard Error          968.7621974    1153.657002   1479.019946
Observations            10             10            10
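
When models with different numbers of x variables are compared, adjusted R² is the fairer yardstick because it penalizes each extra variable. The sketch below recomputes it from the reported R² values, assuming the usual formula 1 - (1 - R²)(n - 1)/(n - k - 1). The function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k x variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# (reported R Square, number of x variables) for each of the three models
models = {
    "f(GPA)": (0.806415929, 1),
    "f(Work)": (0.882219073, 1),
    "f(GPA, Work)": (0.927328525, 2),
}

adjusted = {name: adjusted_r_squared(r2, 10, k) for name, (r2, k) in models.items()}
for name, value in adjusted.items():
    print(name, round(value, 4))
# f(GPA) 0.7822
# f(Work) 0.8675
# f(GPA, Work) 0.9066
```

Even after the penalty for the second variable, the two-variable model has the highest adjusted R² and the smallest standard error of the three.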

26 Latent Variables (Theoretical Entities)

27 Latent Variables
– Explanatory variables that are not directly measured
– Identified by “Exploratory Factor Analysis”
– Confirmed by “Confirmatory Factor Analysis”
Statistical Methods for Latent Variables
– Principal Components Analysis
– PLS
– SEM

28 Example: Confirmatory Factor Analysis. Intention to Use Travelocity Website

29 Research Instrument
