CORRELATION & REGRESSION compiled by Dr Kunal Pathak


Flow of Presentation
Correlation: definition; types of correlation; methods of studying correlation (scatter diagram, Karl Pearson's correlation and Spearman's rank correlation); partial correlation.
Regression: regression lines; methods to find regression lines (scatter diagram and least squares method); multiple regression.

Correlation: The degree of relationship between the variables under consideration is measured through correlation analysis. The measure of correlation is called the correlation coefficient and is denoted by 'r'. Its value ranges from -1 to +1 (-1 ≤ r ≤ 1). Correlation analysis gives an idea of the degree and direction of the relationship between the two variables under study.

Correlation & Causation: Causation means a cause-and-effect relation, in which a change in one variable brings about a change in the other. Correlation denotes only the interdependence among the variables; two phenomena can be correlated without one causing the other. Causation implies correlation, but the reverse is not true.

Types of Correlation. Type I: Direction of the correlation. Positive correlation; negative correlation.

Examples:
Positive relationships: water consumption and temperature; study time and grades.
Negative relationships: alcohol consumption and driving ability; price and crops.

Type II: Number of variables considered. Simple correlation; multiple correlation (partial or total).

Type III: Relationship assumed. Linear correlation; non-linear correlation.

Methods of Studying Correlation: scatter diagram method; Karl Pearson's coefficient of correlation; Spearman's rank correlation coefficient.

Scatter Diagram Method: A scatter diagram is a graph of observed points in which each point represents the values of X and Y as coordinates. It portrays the relationship between the two variables graphically.

[Figure: a perfect positive correlation (a linear relationship); axes: Height, Weight; the points for persons A and B lie on one straight line.]

[Figure: high degree of positive correlation, r = +0.80; axes: Weight, Height.]

[Figure: moderate positive correlation, r = +0.4; axes: Shoe Size, Weight.]

[Figure: perfect negative correlation, r = -1.0; axes: TV watching per week, Exam score.]

[Figure: high degree of negative correlation, r = -0.80; axes: TV watching per week, Exam score.]

[Figure: weak negative correlation, r = -0.2; axes: Shoe Size, Weight.]

[Figure: no correlation (horizontal line), r = 0.0; axes: IQ, Height.]

Advantages and Disadvantage of the Scatter Diagram. Advantages: it is a simple, non-mathematical method; it is not influenced by the size of extreme items; it is the first step in investigating the relationship between two variables. Disadvantage: it cannot give the exact degree of correlation.

Karl Pearson's Coefficient of Correlation: Pearson's correlation coefficient (r) is the most common correlation coefficient. It measures the degree of linear relationship between two variables, say x and y.

Interpretation of Correlation Coefficient (r)
If r = +1, the correlation between the two variables is said to be perfect and positive.
If r = -1, the correlation between the two variables is said to be perfect and negative.
If r = 0, there exists no correlation between the variables.
If 0 < r < 1, the correlation between the two variables is said to be partial and positive.
If -1 < r < 0, the correlation between the two variables is said to be partial and negative.

Assumptions of Pearson's Correlation Coefficient: There is a linear relationship between the two variables, i.e. when the two variables are plotted on a scatter diagram, the points lie along a straight line. A cause-and-effect relation exists between the forces operating on the items of the two variable series.

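Example: Find the correlation coefficient using Karl Pearson's method for the following data.
x: 6 2 10 4 8
y: 9 11 5 8 7

Solution:
X      Y      XY     X*X    Y*Y
6      9      54     36     81
2      11     22     4      121
10     5      50     100    25
4      8      32     16     64
8      7      56     64     49
Total: 30     40     214    220    340
With n = 5, r = (nΣXY - ΣXΣY) / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²]) = (5×214 - 30×40) / sqrt((5×220 - 30²)(5×340 - 40²)) = -130 / sqrt(20000) ≈ -0.92.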
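The arithmetic of this example can be checked with a short Python sketch. It assumes the raw-sums form of Pearson's formula and takes the fourth y value as 8, which is what the totals in the solution table imply; the function name `pearson_r` is illustrative only.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from raw sums (the 'direct' formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Data of the worked example (fourth y value assumed to be 8).
x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]

r = pearson_r(x, y)
print(round(r, 4))  # -0.9192
```

The strongly negative value agrees with the data: y falls as x rises.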

Advantage of Pearson's Coefficient: It summarizes in one value both the degree and the direction of correlation.

Limitations of Pearson's Coefficient: it always assumes a linear relationship; interpreting the value of r is difficult; the value of the correlation coefficient is affected by extreme values; it is a time-consuming method.

Coefficient of Determination: the square of the coefficient of correlation, e.g. if r = 0.9, then r² = 0.81. This would mean that 81% of the variation in the dependent variable is explained by the independent variable.

Example: Suppose r = 0.60 in one case and r = 0.30 in another. This does not mean that the first correlation is twice as strong as the second; the strength of 'r' is better understood by computing r². When r = 0.60, r² = 0.36 (1); when r = 0.30, r² = 0.09 (2). This implies that in the first case 36% of the total variation is explained, whereas in the second case only 9% of the total variation is explained.
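The comparison can be reproduced in a few lines of Python. This is only an illustrative sketch; the helper name `coefficient_of_determination` is not from the slides.

```python
def coefficient_of_determination(r):
    """Share of total variation explained: the square of r."""
    return r * r

r2_first = coefficient_of_determination(0.60)
r2_second = coefficient_of_determination(0.30)

print(round(r2_first, 2), round(r2_second, 2))  # 0.36 0.09
# In terms of explained variation, the first case is four times the second, not twice.
print(round(r2_first / r2_second, 2))  # 4.0
```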

Partial Correlation: Let x1, x2, …, xn be n variables. The correlation between xi and xj after removing the influence of the other variables is called partial correlation. It is denoted by rij.12…(i-1)(i+1)…(j-1)(j+1)…n, e.g. for three variables, r12.3 = (r12 - r13·r23) / sqrt((1 - r13²)(1 - r23²)).

Example: 1. If r12 = 0.6, r13 = 0.2 and r23 = 0.8, find r12.3.
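A minimal Python sketch of this exercise, assuming the standard first-order partial correlation formula r12.3 = (r12 - r13·r23) / sqrt((1 - r13²)(1 - r23²)); the function name `partial_corr` is illustrative.

```python
from math import sqrt

def partial_corr(r12, r13, r23):
    """Correlation between variables 1 and 2 with variable 3 held fixed."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

r12_3 = partial_corr(0.6, 0.2, 0.8)
print(round(r12_3, 3))  # 0.748
```

Removing the influence of the third variable raises the correlation here from 0.6 to about 0.75.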

Spearman's Rank Coefficient of Correlation: When the variables under study cannot be measured quantitatively but can be arranged in serial order, Pearson's correlation coefficient cannot be used; in such cases Spearman's rank correlation can be used.

Interpretation of Rank Correlation Coefficient (R)
The value of the rank correlation coefficient R ranges from -1 to +1.
If R = +1, there is complete agreement in the order of the ranks and the ranks are in the same direction.
If R = -1, there is complete disagreement in the order of the ranks: the ranks are in exactly opposite directions.
If R = 0, there is no correlation.

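Rank Correlation Coefficient (R): R = 1 - 6ΣD² / (n(n² - 1)), where D is the difference between the two ranks of an item and n is the number of pairs. Equal ranks or ties in ranks: in such cases the average rank should be assigned to each tied item, and R = 1 - 6(ΣD² + AF) / (n(n² - 1)), where AF = Σ mi(mi² - 1)/12 is the adjustment factor and mi is the number of times an item is repeated.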

Example: Find Spearman's rank correlation coefficient for the following data.
x: 39 65 62 90 82 75 25 98 36 78
y: 47 53 58 86 62 68 60 91 51 84

Solution:
X      y      Rx     Ry     D      D*D
39     47     8      10     -2     4
65     53     6      8      -2     4
62     58     7      7      0      0
90     86     2      2      0      0
82     62     3      5      -2     4
75     68     5      4      1      1
25     60     10     6      4      16
98     91     1      1      0      0
36     51     9      9      0      0
78     84     4      3      1      1
ΣD² = 30, so with n = 10, R = 1 - 6×30 / (10×99) = 1 - 180/990 ≈ 0.82.
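The ranking and the sum of squared differences can be checked with the sketch below. It assumes rank 1 goes to the largest value, as in the solution table, and takes the fifth y value as 62 (that entry is lost in the transcript; any value between 60 and 68 gives the same ranks). The helper names are illustrative.

```python
def ranks_descending(values):
    """Rank 1 = largest value. Assumes there are no tied values."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_no_ties(xs, ys):
    """Return (sum of D squared, R) using R = 1 - 6*sum(D^2) / (n(n^2 - 1))."""
    n = len(xs)
    rx, ry = ranks_descending(xs), ranks_descending(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n * n - 1))

x = [39, 65, 62, 90, 82, 75, 25, 98, 36, 78]
y = [47, 53, 58, 86, 62, 68, 60, 91, 51, 84]  # fifth value assumed to be 62

sum_d2, R = spearman_no_ties(x, y)
print(sum_d2, round(R, 4))  # 30 0.8182
```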

Exp. 2: A physiologist wants to compare two methods, A and B, of teaching. He selected a random sample of 22 students and grouped them into 11 pairs having approximately equal scores in an intelligence test. In each pair, one student was taught by method A and the other by method B, and both were examined after the course. The marks obtained by them are as follows:
Pair: 1 2 3 4 5 6 7 8 9 10 11
A: 24 29 19 14 30 19 27 30 20 28 11
B: 37 35 16 26 23 27 19 20 16 11 21

Solution:
A      B      R_A    R_B    D      D*D
24     37     6      1      5      25
29     35     3      2      1      1
19     16     8.5    9.5    -1     1
14     26     10     4      6      36
30     23     1.5    5      -3.5   12.25
19     27     8.5    3      5.5    30.25
27     19     5      8      -3     9
30     20     1.5    7      -5.5   30.25
20     16     7      9.5    -2.5   6.25
28     11     4      11     -7     49
11     21     11     6      5      25
ΣD² = 225. There are three tied pairs (30 and 19 in A, 16 in B), each with m = 2, so AF = 3 × 2(2² - 1)/12 = 1.5 and R = 1 - 6(225 + 1.5) / (11×120) ≈ -0.03.
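The tied-rank computation can be sketched as follows. The sketch assumes the usual correction R = 1 - 6(ΣD² + AF) / (n(n² - 1)) with AF = Σ m(m² - 1)/12, and uses the marks as reconstructed from the ranks and differences in the solution table (repeated values are missing from the transcript). Function names are illustrative.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values, reverse=True)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
    counts = Counter(values)
    # A value occupying positions p .. p+m-1 receives rank p + (m-1)/2.
    return [first_pos[v] + (counts[v] - 1) / 2 for v in values]

def adjustment_factor(values):
    """Sum of m(m^2 - 1)/12 over every value repeated m > 1 times."""
    return sum(m * (m * m - 1) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    """Return (sum of D squared, AF, R) with the tie correction applied."""
    n = len(xs)
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    af = adjustment_factor(xs) + adjustment_factor(ys)
    return sum_d2, af, 1 - 6 * (sum_d2 + af) / (n * (n * n - 1))

# Marks reconstructed from the solution table (an assumption, see above).
A = [24, 29, 19, 14, 30, 19, 27, 30, 20, 28, 11]
B = [37, 35, 16, 26, 23, 27, 19, 20, 16, 11, 21]

sum_d2, af, R = spearman_with_ties(A, B)
print(sum_d2, af, round(R, 4))  # 225.0 1.5 -0.0295
```

A value this close to zero suggests almost no association between the marks under the two methods.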

Advantages of Spearman's Rank Correlation: This method is simpler to understand and easier to apply than Karl Pearson's correlation method. It is useful where only ranks (qualitative terms), and not the actual data, can be given. It is the method to use where the initial data are in the form of ranks.

Disadvantage of Spearman's Rank Correlation: It cannot be used for finding the correlation in a grouped frequency distribution.

Advantages of Correlation Studies: they show the amount (strength) of relationship present; they can be used to make predictions about the variables under study; correlational data are easier to collect.

Regression: a statistical technique that defines the functional relationship between two variables.

Regression Lines (linear regression). Methods: scatter diagram; least squares method.

Regression Line y on x: y - ȳ = byx (x - x̄), where byx = r·σy/σx is the regression coefficient of y on x.

Regression Line x on y: x - x̄ = bxy (y - ȳ), where bxy = r·σx/σy is the regression coefficient of x on y.

Properties:
r² = byx · bxy.
byx and bxy always have the same sign.
byx and bxy cannot both exceed 1 in magnitude.
Both lines intersect at the point of means (x̄, ȳ).
If r = ±1, the two lines coincide; if r = 0, the lines are perpendicular.

Examples: 1. Obtain the regression lines using the following data and hence find the correlation coefficient.
X: 1 2 3 4 5 6 7
Y: 9 8 10 12 11 13 14
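One way to work this exercise is the least squares sketch below. It assumes the regression coefficients are computed from deviations about the means, byx = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and bxy = Σ(x - x̄)(y - ȳ) / Σ(y - ȳ)², and recovers r from r² = byx · bxy with the sign of the coefficients. Function and variable names are illustrative.

```python
from math import copysign, sqrt

def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) from deviations about the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy / sxx, sxy / syy

x = [1, 2, 3, 4, 5, 6, 7]
y = [9, 8, 10, 12, 11, 13, 14]

mean_x, mean_y, b_yx, b_xy = regression_coefficients(x, y)

# Line of y on x: y = a_yx + b_yx * x; line of x on y: x = a_xy + b_xy * y.
a_yx = mean_y - b_yx * mean_x
a_xy = mean_x - b_xy * mean_y

# r is the geometric mean of the two coefficients, carrying their common sign.
r = copysign(sqrt(b_yx * b_xy), b_yx)

print(f"y = {a_yx:.4f} + {b_yx:.4f} x")  # y = 7.2857 + 0.9286 x
print(f"x = {a_xy:.4f} + {b_xy:.4f} y")  # x = -6.2143 + 0.9286 y
print(f"r = {r:.4f}")                    # r = 0.9286
```

Both lines pass through the means (4, 11). The two regression coefficients coincide here because X and Y happen to have the same sum of squared deviations (28).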

2. In a partially destroyed laboratory record of an analysis of correlation data, only the following results are legible: Find the means and the standard deviation of y.

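Multiple Regression: Multiple regression is used to estimate the value of a dependent variable in terms of two or more independent variables, e.g. if x, y and z are correlated variables, then the regression plane of x on y and z is x = a + b·y + c·z, with the constants a, b and c determined by the method of least squares.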

Examples: Find the regression plane of z on x and y for the following data.
X: 1.2 2.3 3.4 4.5 5.6 6.7 7.8 8.9 9
Y: 2 4 6 8 7 10 14 16 18
Z: 2.1 4.1 6.1 8.1 7.1 10.2 14.2 16.2 18.2
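A possible way to fit this plane is sketched below in plain Python. It assumes the plane has the form z = a + b·x + c·y and that the constants come from the least squares normal equations, solved here by a small Gauss-Jordan elimination. The helper names `solve_linear_system` and `fit_plane` are illustrative, not from the slides.

```python
def solve_linear_system(a, b):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda idx: abs(m[idx][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [v - factor * p for v, p in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_plane(xs, ys, zs):
    """Least squares plane z = a + b*x + c*y via the normal equations."""
    cols = [[1.0] * len(xs), list(xs), list(ys)]
    normal = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * z for p, z in zip(ci, zs)) for ci in cols]
    return solve_linear_system(normal, rhs)

x = [1.2, 2.3, 3.4, 4.5, 5.6, 6.7, 7.8, 8.9, 9]
y = [2, 4, 6, 8, 7, 10, 14, 16, 18]
z = [2.1, 4.1, 6.1, 8.1, 7.1, 10.2, 14.2, 16.2, 18.2]

a, b, c = fit_plane(x, y, z)
residuals = [zi - (a + b * xi + c * yi) for xi, yi, zi in zip(x, y, z)]

print(f"z = {a:.4f} + {b:.4f} x + {c:.4f} y")
print("sum of squared residuals:", round(sum(e * e for e in residuals), 6))
```

Since every z in the data exceeds the corresponding y by only 0.1 or 0.2, the fitted plane should track y very closely and the residuals should be small.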

Thank You