CORRELATION AND MULTIPLE REGRESSION ANALYSIS


1 CORRELATION AND MULTIPLE REGRESSION ANALYSIS
By PROF. KAMBALE F.J. DEPARTMENT OF ECONOMICS, Rayat Shikshan Sanstha’s S.M.JOSHI COLLEGE HADAPSAR, PUNE

2 Areas where STATISTICS are used
Business: Economics, Engineering, Marketing, Computer Science
Physical Sciences: Astronomy, Chemistry, Physics
Health & Medicine: Genetics, Clinical Trials, Epidemiology, Pharmacology
Environment: Agriculture, Ecology, Forestry, Animal Populations
Government: Census, Law, National Defense

3 CORRELATION ANALYSIS
Simple Correlation
Measures the association between two variables.
Used to establish the relationship between variables.
Karl Pearson’s method is commonly used to establish the relationship.
The correlation can be positive, negative, perfect or zero.
It takes values from -1 to +1.

4 Correlation
Measures the relative strength of the linear relationship between two variables
Unit-less
Ranges between –1 and 1
The closer to –1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship

5 Correlation coefficient
Pearson’s Correlation Coefficient is standardized covariance (unitless): $r = \dfrac{\mathrm{cov}(x, y)}{\sqrt{\mathrm{var}(x)\,\mathrm{var}(y)}}$
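A minimal sketch of that definition, assuming sample (n - 1) divisors; the function name `pearson_r` and the data are made up for illustration. Because the same divisor appears in the covariance and in both variances, it cancels, which is why r is unitless.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sample covariance and variances; the (n - 1) divisors cancel in the ratio.
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    return cov_xy / sqrt(var_x * var_y)

# Illustrative data (hypothetical, not from the slides)
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
r = pearson_r(hours, score)
print(round(r, 3))
```

Rescaling either variable (for example, hours to minutes) changes the covariance but leaves r unchanged.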

6 Estimation
Var(x) = …, Var(y) = …, Cov(x, y) = …
R2 = “Coefficient of determination” = SSexplained/TSS
Interpretation of R2: 50% of the total variation in the sum of the two dice is explained by the roll on the first die. Makes perfect intuitive sense!
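The 50% figure can be checked by enumerating all 36 outcomes of two fair dice. This is a sketch under the assumption that x is the roll on the first die and y is the sum of both dice, as the interpretation above suggests; it uses population (divide-by-N) moments and exact fractions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
x = [Fraction(d1) for d1, d2 in outcomes]        # roll on the first die
y = [Fraction(d1 + d2) for d1, d2 in outcomes]   # sum of the two dice

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    # Population covariance over the full outcome space.
    mean_a, mean_b = mean(a), mean(b)
    return mean([(ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)])

var_x = cov(x, x)
var_y = cov(y, y)
cov_xy = cov(x, y)
r_squared = cov_xy ** 2 / (var_x * var_y)
print(var_x, var_y, cov_xy, r_squared)  # prints: 35/12 35/6 35/12 1/2
```

The sum has twice the variance of a single die, and the first die accounts for exactly half of it.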

7 Scatter Plots of Data with Various Correlation Coefficients
[Figure: six scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = +1, r = +.3 and r = 0]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall

8 Linear Correlation
[Figure: scatter plots of Y against X contrasting linear relationships with curvilinear relationships]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall

9 Linear Correlation
[Figure: scatter plots of Y against X contrasting strong relationships with weak relationships]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall

10 Linear Correlation
[Figure: scatter plots of Y against X showing no relationship]
Slide from: Statistics for Managers Using Microsoft® Excel 4th Edition, 2004 Prentice-Hall

11 Some calculation formulas…
Note: Easier computation formulas:

12 PARTIAL CORRELATION
The correlation between two variables may be due to other variables in the study. In partial correlation, the effect of the other variables is eliminated or held constant.
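For a single control variable z, the standard first-order partial correlation can be computed from the three pairwise correlations. The sketch below assumes that formula; the function name and the correlation values are hypothetical and chosen only for illustration.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations (not from the slides)
r_xy, r_xz, r_yz = 0.6, 0.5, 0.7
r_xy_given_z = partial_corr(r_xy, r_xz, r_yz)
print(round(r_xy_given_z, 3))
```

Here part of the raw correlation between x and y is attributable to z, so the partial correlation is smaller than 0.6. When z is uncorrelated with both variables, the partial correlation equals the simple correlation.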

13 MULTIPLE CORRELATION
Definition: The joint effect of K independent variables on the dependent variable is studied.
Coefficient of multiple determination (R square): The square of the multiple correlation is called the coefficient of multiple determination.

14 Multiple Correlation
Consider the following Multiple Linear Regression equation:
Y = a + b1 X1 + b2 X2 + b3 X3 + e
Y: Yield of crop (dependent variable)
X1: N levels
X2: Phosphorus levels
X3: Potash levels
Let R square = 0.45
CONCLUSION: This indicates that 45% of the variation in the dependent variable (Y) is explained by the three independent variables in the study.
It has application in all fields.
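As a rough sketch of how such an equation and its R square could be obtained, the code below fits Y = a + b1 X1 + b2 X2 + b3 X3 by ordinary least squares using the normal equations. The helper names (`solve`, `fit_ols`), the fertilizer levels and the yields are all hypothetical; in practice a statistics package would be used instead.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_ols(xs, y):
    """Least-squares fit of y = a + b1*x1 + ... + bk*xk; returns ([a, b1, ..., bk], R square)."""
    rows = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return coef, 1 - ss_resid / ss_total

# Hypothetical trial: each row is (N level, phosphorus level, potash level).
levels = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
          (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
yields = [10, 14, 12, 11, 17, 15, 12, 19]  # made-up crop yields

coef, r_square = fit_ols(levels, yields)
print([round(c, 3) for c in coef], round(r_square, 3))
```

R square is computed as 1 minus the ratio of residual to total sum of squares, which matches the "proportion of variation explained" reading in the conclusion above.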

15 Regression Analysis
Regression: The functional relationship between two or more variables is studied in regression.
Regression Coefficient: Measures the change in the value of the dependent variable for a unit change in the value of the independent variable.

16 Multiple Regression
Y = a + b1 X1 + b2 X2 + b3 X3 + b4 X4 + b5 X5 + … + e
Y: Dependent variable
X1, X2, X3, … are independent variables
R Square: Measures the proportion of variation in the dependent variable that is explained by the independent variables
It has application in all fields.

17 Multivariate regression pitfalls
Multi-collinearity
Residual confounding
Overfitting

18 Regression coefficient
The regression coefficient is the slope of the regression line and tells you the nature of the relationship between the variables: how much change in the dependent variable is associated with a unit change in the independent variable. The larger the absolute value of the regression coefficient, the more the dependent variable changes per unit change.

19 Regression Line
The regression line is the best straight-line description of the plotted points, and you can use it to describe the association between the variables. If all the points fall exactly on the line, then the error is 0 and you have a perfect relationship.

20 Regression Coefficients
B - These are the values for the regression equation for predicting the dependent variable from the independent variable. These are called unstandardized coefficients because they are measured in their natural units.  As such, the coefficients cannot be compared with one another to determine which one is more influential in the model, because they can be measured on different scales. 

21 Coefficients for Two Independent Variables
This chart looks at two variables and shows how the different bases affect the B value. That is why you need to look at the standardized Beta to see the differences.

22 USING SPSS
When you run a regression analysis in SPSS you get three tables. Each tells you something about the relationship. The first is the model summary. R is the Pearson Product Moment Correlation Coefficient. In this case R is .736. R is the square root of R-Squared and is the correlation between the observed and predicted values of the dependent variable.

23 R-Square R-Square is the proportion of variance in the dependent variable (income per capita) which can be predicted from the independent variable (level of education).  This value indicates that 54.2% of the variance in income can be predicted from the variable education.  R-Square is also called the coefficient of determination.

24 Adjusted R-square
The adjusted R-square attempts to yield a more honest value to estimate the R-squared for the population. The value of R-square was .542, while the value of Adjusted R-square was …. There isn’t much difference because we are dealing with only one variable. When the number of observations is small and the number of predictors is large, there will be a much greater difference between R-square and adjusted R-square.
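The usual adjustment is 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of predictors. The sketch below applies it to the R-square of .542 from the slides; the sample sizes are hypothetical, since the transcript does not give n.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-square for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

r2 = 0.542  # R-square reported in the slides

# One predictor, moderately large sample (n = 50 is an assumption): small penalty.
adj_one_predictor = adjusted_r_squared(r2, n=50, k=1)

# Few observations and many predictors (both values hypothetical): large penalty.
adj_many_predictors = adjusted_r_squared(r2, n=10, k=5)

print(round(adj_one_predictor, 3), round(adj_many_predictors, 3))
```

The penalty grows as k approaches n, which is the situation the slide warns about.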

25 ANOVA If the p-value were greater than 0.05, you would say that the group of independent variables does not show a statistically significant relationship with the dependent variable, or that the group of independent variables does not reliably predict the dependent variable. 

26 Regression Coefficients
Beta - These are the standardized coefficients. These are the coefficients that you would obtain if you standardized all of the variables in the regression, including the dependent and all of the independent variables, and ran the regression. By standardizing the variables before running the regression, you have put all of the variables on the same scale, and you can compare the magnitude of the coefficients to see which one has more of an effect. You will also notice that the larger betas are associated with the larger t-values.
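The link between B and Beta can be sketched for the one-predictor case, where Beta = B × (standard deviation of X) / (standard deviation of Y). The education and income figures below are invented for illustration; the point is that changing the unit of income changes B but not Beta.

```python
from statistics import mean, stdev

def simple_slope(x, y):
    """Unstandardized slope B of the least-squares line of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def standardized_beta(b, x, y):
    """Standardized coefficient: B rescaled by the ratio of standard deviations."""
    return b * stdev(x) / stdev(y)

# Hypothetical data (not from the slides)
education_years = [8, 10, 12, 14, 16]
income_thousands = [12, 15, 19, 24, 30]
income_units = [v * 1000 for v in income_thousands]  # same income, different unit

b_thousands = simple_slope(education_years, income_thousands)
b_units = simple_slope(education_years, income_units)
beta_thousands = standardized_beta(b_thousands, education_years, income_thousands)
beta_units = standardized_beta(b_units, education_years, income_units)

print(b_thousands, b_units)                             # B depends on the unit
print(round(beta_thousands, 4), round(beta_units, 4))   # Beta does not
```

With a single predictor, Beta equals Pearson's r between the two variables.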

27 How to translate a typical table
Regression Analysis Level of Education by Income per capita

28 Multiple Regression Single

29 Single Regression Multiple Regression

30 Research Designs
Sampling Technique
Random Sampling
Purposive Sampling

31 Sample Survey Designs
Simple Random Sampling: Samples are selected randomly without disturbing the population.
Stratified Random Sampling: The population is first divided into homogeneous subgroups called strata, and from each stratum samples are randomly selected.
Probability Proportionate To Size (PPS): Units are selected with probability proportionate to their size.
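The three designs can be sketched with Python's standard `random` module. The function names, the village data and the seeds are hypothetical; the PPS version draws with replacement, which is a simplification of the schemes used in real surveys.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n units without replacement, each unit equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of fixed size within each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

def pps_sample(units, sizes, n, seed=None):
    """Draw n units with replacement, with probability proportional to size."""
    rng = random.Random(seed)
    return rng.choices(units, weights=sizes, k=n)

# Hypothetical sampling frame: 20 farms split into two strata by farm size.
farms = [f"farm_{i}" for i in range(20)]
strata = {"small": farms[:12], "large": farms[12:]}

# Hypothetical villages with household counts as the size measure.
villages = ["A", "B", "C", "D"]
households = [50, 200, 100, 650]

srs = simple_random_sample(farms, 5, seed=1)
stratified = stratified_sample(strata, 3, seed=1)
pps = pps_sample(villages, households, 3, seed=1)
print(srs, stratified, pps, sep="\n")
```

Fixing the seed makes a draw reproducible; omit it for a fresh random selection.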

32 Economics and Social Science
Simple Growth Rate: Gives growth in absolute form.
Compound Growth Rate: Indicates percent per annum growth in output.
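A small sketch of the two measures, assuming "simple growth" means the absolute change between two periods and "compound growth rate" means the constant annual percentage rate linking the first and last values. Both readings and the output figures are assumptions; compound rates are also often estimated by fitting a log-linear trend instead.

```python
def simple_growth(start, end):
    """Absolute change in output between the first and last period."""
    return end - start

def compound_growth_rate(start, end, periods):
    """Constant percent-per-period rate that takes start to end over the given periods."""
    return ((end / start) ** (1 / periods) - 1) * 100

# Hypothetical output figures: 100 units rising to 121 units over 2 years.
absolute_change = simple_growth(100, 121)
annual_rate = compound_growth_rate(100, 121, 2)
print(absolute_change, round(annual_rate, 2))
```

Here the absolute change is 21 units, while the compound rate is about 10 percent per annum, since 100 × 1.10 × 1.10 = 121.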

33 Economics and Social Science
Index Numbers: Change over the base year is studied.
Time Series Analysis: Trend over a period of time, generally taken as years.
Seasonal Variation: Analysis related to change over seasons.
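A simple index number expresses each year's value as a percentage of the base-year value. The sketch below assumes that basic (unweighted) form; the function name and the price series are hypothetical.

```python
def index_numbers(values, base_position=0):
    """Express each value as a percentage of the value in the base period."""
    base = values[base_position]
    return [v * 100 / base for v in values]

# Hypothetical prices for three consecutive years, with the first year as base.
prices = [200, 220, 250]
index = index_numbers(prices)
print(index)  # prints: [100.0, 110.0, 125.0]
```

An index of 125 in the third year means the value is 25 percent above the base year.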

34 Statistical Software
SPSS, SAS, SIS STAT, M STAT, INDO STAT

35 And Always Remember
THERE IS STRENGTH IN NUMBERS

36 THANK YOU

