WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam


1 WFM 5201: Data Management and Statistical Analysis
Akm Saiful Islam
Lecture-6: Correlation and Regression Analysis
June, 2008
Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)

2 Correlation
Correlation is concerned with describing the direction (positive or negative) and strength of a relationship between two variables. Correlation makes no distinction between the two variables (it is a measure of how they vary jointly), whereas regression theory treats one variable as dependent on an independent variable that is assumed to be error-free.

3 Correlation coefficient
The direction and strength of the relationship can be expressed by means of a correlation coefficient “r”, which is mathematically defined as:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}$$
The sum of cross products of deviations: $\sum (X - \bar{X})(Y - \bar{Y})$

4 Correlation coefficient
The sum of squared deviations for X: $\sum (X - \bar{X})^2$
The sum of squared deviations for Y: $\sum (Y - \bar{Y})^2$

5 Pearson’s “r”
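Pearson's r can be computed directly from the three quantities named on the preceding slides: the sum of cross products of deviations, divided by the square root of the product of the sums of squared deviations for X and for Y. The following is a minimal sketch under that standard definition; the function name `pearson_r` and the sample data are hypothetical and do not come from the lecture.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross products of deviations
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for X and for Y
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sp_xy / math.sqrt(ss_x * ss_y)


# Hypothetical illustrative data (not from the lecture)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises exactly in step with x
y_down = [10, 8, 6, 4, 2]  # falls exactly in step with x

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```

The two series are exact straight lines in x, so the sketch should return the two extreme values of the coefficient, one for each direction.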

6 Correlation coefficient
A correlation coefficient varies from -1 to +1:
-1 indicates a perfect negative relationship (one variable increases while the other decreases),
0 indicates no linear relationship,
+1 indicates a perfect positive relationship.
The absolute size of the correlation indicates the strength of the relationship; for example, a correlation coefficient of -0.89 indicates a stronger relationship than a coefficient of +0.60.

7 Linear Regression
Regression is primarily concerned with using the relationship for the purpose of predicting one variable from knowledge of the other. Correlation, on the other hand, is primarily concerned with discovering whether or not a relationship exists in the first place, and then specifying the strength and direction of this relationship.

8 Linear Regression
The simple linear regression equation is given as:
$$Y = b_0 + b_1 X$$
X = given data
$b_0$ = intercept of regression line
$b_1$ = slope of regression line
Fitting the line in this way is also known as the least squares method.

9 Regression line

10 Coefficient of Regression
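The formulas from this slide did not survive in the transcript. As a hedged sketch, the standard least squares estimates for a line with intercept b0 and slope b1 are: slope = (sum of cross products of deviations) / (sum of squared deviations for X), and intercept = mean of Y minus slope times mean of X. The function name `fit_line` and the sample data below are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least squares line Y = b0 + b1 * X."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross products of deviations and sum of squared deviations for X
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    if ss_x == 0:
        raise ValueError("slope is undefined when X is constant")
    b1 = sp_xy / ss_x            # slope of regression line
    b0 = mean_y - b1 * mean_x    # intercept of regression line
    return b0, b1


# Hypothetical data lying exactly on Y = 1 + 2X (not from the lecture)
x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]
b0, b1 = fit_line(x, y)
print(f"intercept b0 = {b0}, slope b1 = {b1}")
```

Because the sample points lie exactly on a line, the fit should recover that line's intercept and slope, which makes the sketch easy to check by hand.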

11 Coefficient of Determination
The decomposition of the sample variation of Y leads to a measure of the "goodness of fit", which is known as the coefficient of determination and denoted by $R^2$.

12 The coefficient of determination is a measure commonly used to describe how well the sample regression line fits the observed data.
Range: 0 (poorest fit) to 1 (best fit of the regression model).

13 Exercise-1: Fit a regression equation between Boro production and rainfall and find $R^2$

Year      Boro Production   Rainfall
1975-76   424536            216
1976-77   152273            319
1977-78   437007            164
1978-79   278287            141
1979-80   417225            237
1980-81   500207            197
1981-82   395940            255
1982-83   418170            221
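One possible way to set up Exercise-1 is sketched below. It rests on several assumptions that the source does not state: each data row is read as a year, a six-digit production figure, and a three-digit rainfall figure; rainfall is taken as the independent variable X and Boro production as the dependent variable Y; and units are left unspecified. For a simple linear regression, the coefficient of determination equals the square of Pearson's r, which is what the sketch uses. The function name `fit_with_r2` is hypothetical.

```python
# Exercise-1 data as read from the slide (units are not given in the source)
years = ["1975-76", "1976-77", "1977-78", "1978-79",
         "1979-80", "1980-81", "1981-82", "1982-83"]
boro_production = [424536, 152273, 437007, 278287,
                   417225, 500207, 395940, 418170]
rainfall = [216, 319, 164, 141, 237, 197, 255, 221]


def fit_with_r2(xs, ys):
    """Return (b0, b1, r2) for the least squares line Y = b0 + b1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    b1 = sp_xy / ss_x                  # slope
    b0 = mean_y - b1 * mean_x          # intercept
    r2 = sp_xy ** 2 / (ss_x * ss_y)    # R^2 = r^2 for one predictor
    return b0, b1, r2


# Assumption: rainfall is X, Boro production is Y
b0, b1, r2 = fit_with_r2(rainfall, boro_production)
print(f"Boro production = {b0:.1f} + ({b1:.2f}) * rainfall")
print(f"R^2 = {r2:.3f}")
```

Swapping the two arguments would regress rainfall on production instead; the value of $R^2$ would be unchanged, but the slope and intercept would differ.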

14 Deviations or Errors
The sum of squares of these deviations from the fitted line is:
$$\sum (Y - \hat{Y})^2$$
where $\hat{Y} = b_0 + b_1 X$ is the value on the fitted line.
Total deviation = Explained deviation + Unexplained deviation
$$(Y - \bar{Y}) = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$$

15 Total, explained, and unexplained deviation
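The split into total, explained, and unexplained deviation can be checked numerically. The sketch below assumes a least squares line with an intercept, with fitted value b0 + b1 * X; under that assumption the total sum of squares about the mean of Y equals the explained plus the unexplained sum of squares, and their ratio gives the coefficient of determination. The function name `sums_of_squares` and the sample data are hypothetical.

```python
def sums_of_squares(xs, ys):
    """Return (total, explained, unexplained) sums of squares for a
    least squares line Y = b0 + b1 * X fitted to the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b1 = sp_xy / ss_x
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * x for x in xs]
    # Total: observed values about the mean of Y
    total = sum((y - mean_y) ** 2 for y in ys)
    # Explained: fitted values about the mean of Y
    explained = sum((f - mean_y) ** 2 for f in fitted)
    # Unexplained: observed values about the fitted line
    unexplained = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return total, explained, unexplained


# Hypothetical noisy data (not from the lecture)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
total, explained, unexplained = sums_of_squares(x, y)
r_squared = explained / total
print(total, explained + unexplained)  # equal up to rounding error
print(r_squared)
```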

16 Regression diagnostics
Patterns for residual plots: (a) satisfactory, (b) funnel, (c) double bow, (d) non-linear
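Residual plots are built from the residuals, observed Y minus fitted Y, usually plotted against the fitted values or against X. The sketch below only computes the residuals and leaves plotting to the reader. It uses hypothetical data from a curved relationship (Y = X squared) so that a straight-line fit leaves a systematic pattern of the non-linear kind; the function name `residuals_of_line_fit` is an assumption, not something from the lecture.

```python
def residuals_of_line_fit(xs, ys):
    """Fit Y = b0 + b1 * X by least squares and return (fitted, residuals)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b1 = sp_xy / ss_x
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    return fitted, residuals


# Hypothetical curved data: Y = X^2 (not from the lecture)
x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]
fitted, residuals = residuals_of_line_fit(x, y)
for f, e in zip(fitted, residuals):
    print(f"fitted = {f:6.2f}   residual = {e:6.2f}")
```

Here the residuals are positive at both ends and negative in the middle, rather than scattered evenly around zero, which is the kind of curvature a non-linear residual pattern reveals.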

