Regression Analysis The term "Regression" was first used by the British biometrician Sir Francis Galton in the latter part of the 19th century.


Meaning of Regression Analysis Regression means the estimation or prediction of the unknown value of a dependent variable from the known value of an independent variable. In other words, regression analysis is a mathematical measure of the average relationship between two or more variables. Definition of Regression By M.M. Blair: "Regression is the measure of the average relationship between two or more variables." By Hirsch: "Regression analysis measures the nature and extent of the relation between two or more variables, and thus enables us to make predictions."

Utility of Regression 1. Regression analysis is essential for planning and policy making: It is widely used to estimate the coefficients of economic relationships. These coefficients help in the formulation of the government's economic policy. 2. Testing economic theory: Regression analysis is one of the basic tools for testing the accuracy of economic theory. 3. Prediction: By regression analysis, the value of a dependent variable can be predicted on the basis of the value of an independent variable. For example, if the price of a commodity rises, the probable fall in demand can be predicted by regression. 4. Regression is used to study functional relationships: In agriculture, for example, it is important to study the response of yield to fertilizer.

Difference between Regression and Correlation
1. Degree and nature of relationship: Correlation is a measure of the degree of relationship between X and Y. Regression studies the relationship between two variables so that one may predict the value of one variable on the basis of the other.
2. Cause and effect relationship: Correlation does not always assume a cause-and-effect relationship. Regression analysis expresses the cause-and-effect relationship.
3. Prediction: Correlation does not make any prediction. Regression analysis enables us to make predictions.
4. Non-sense correlation: In correlation analysis there is sometimes a non-sense correlation, such as between a rise in income and a rise in weight. In regression analysis there is nothing like non-sense regression.

Types of Regression Analysis 1. Simple and Multiple Regression: In simple regression analysis, we study only two variables at a time, one dependent and the other independent. The relationship between income and expenditure is an example of simple regression. In multiple regression analysis, we study more than two variables at a time, in which one is the dependent variable and the others are independent variables. The study of the effect of rain and irrigation on the yield of wheat is an example of multiple regression. 2. Linear and Non-linear Regression: When one variable changes with the other in a fixed ratio, this is called linear regression. When one variable varies with the other in a changing ratio, it is called non-linear regression. 3. Partial and Total Regression: When two or more variables are studied for a functional relationship, but at a time the relationship between only two variables is studied while the other variables are held constant, it is known as partial regression. When all the variables are studied simultaneously for the relationship among them, it is called total regression.

Simple Linear Regression Regression Lines Regression Equations Regression Coefficients

Regression Lines The regression line shows the average relationship between two variables. It is also known as the line of best fit. If two variables X and Y are given, then there are two regression lines related to them: Regression line of X on Y: gives the best estimate for the value of X for any given value of Y. Regression line of Y on X: gives the best estimate for the value of Y for any given value of X.
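The distinction between the two lines can be seen numerically. Below is a short Python sketch (the data are illustrative, not from the slides) that fits both lines and uses each for its own kind of prediction:

```python
# The two regression lines generally differ: Y on X minimizes vertical
# deviations and is used to predict Y; X on Y minimizes horizontal deviations
# and is used to predict X. Data below are illustrative.
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n
Sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
byx = Sxy / sum((a - mx) ** 2 for a in X)   # slope of Y on X
bxy = Sxy / sum((b - my) ** 2 for b in Y)   # slope of X on Y

y_hat = my + byx * (4 - mx)   # best estimate of Y when X = 4
x_hat = mx + bxy * (6 - my)   # best estimate of X when Y = 6
```

Unless the correlation is perfect, the two fitted lines are different, which is why each is used only to predict its own dependent variable.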

Method of obtaining Regression lines 1. Scatter diagram method 2. Least Square method

Scatter diagram method This is the simplest method of constructing regression lines. In this method, the values of the related variables are plotted on a graph, and a straight line is drawn freehand through the plotted points. The shape of the regression line can be linear or non-linear.

Least Square Method Regression lines can also be constructed by this method. Under this method, a regression line is fitted through the plotted points in such a way that the sum of squares of the deviations of the observed values from the fitted line is least. The two regression lines are drawn so that the sum of squared deviations becomes minimum, and a line so drawn is called the line of best fit.

Regression Equations Regression equations are the algebraic formulation of regression lines; they represent the regression lines. There are two regression equations: Regression equation of Y on X Regression equation of X on Y

Regression Equation of Y on X This equation is used to estimate the probable value of Y on the basis of the given values of X. It can be expressed as: Y=a+bX Here a and b are constants. The regression equation of Y on X can also be written in another way: Y−Ȳ=byx(X−X̄)

Regression Equation of X on Y This equation is used to estimate the probable values of X on the basis of the given values of Y. It is expressed as: X=a+bY Here a and b are constants. The regression equation of X on Y can also be written in another way: X−X̄=bxy(Y−Ȳ)

Regression Coefficients A regression coefficient measures the average change in the value of one variable for a unit change in the value of the other variable; it represents the slope of a regression line. There are two regression coefficients: Regression coefficient of Y on X Regression coefficient of X on Y

Regression coefficient of Y on X This coefficient shows the average change in the value of Y for a unit change in the value of X. It is represented by byx. Its formula is: byx=r.σy∕σx Regression coefficient of X on Y This coefficient shows the average change in the value of X for a unit change in the value of Y. It is represented by bxy. Its formula is: bxy=r.σx∕σy
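These formulas agree with the covariance-over-variance form of the slope. A small Python sketch with illustrative data:

```python
import math

# byx = r*(sigma_y/sigma_x) equals cov(X,Y)/var(X), and
# bxy = r*(sigma_x/sigma_y) equals cov(X,Y)/var(Y).
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
n = len(X)
mx, my = sum(X) / n, sum(Y) / n
sx = math.sqrt(sum((a - mx) ** 2 for a in X) / n)   # population SD of X
sy = math.sqrt(sum((b - my) ** 2 for b in Y) / n)   # population SD of Y
cov = sum((a - mx) * (b - my) for a, b in zip(X, Y)) / n
r = cov / (sx * sy)                                 # correlation coefficient

byx = r * sy / sx    # same as cov / sx**2
bxy = r * sx / sy    # same as cov / sy**2
```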

Properties of Regression Coefficients Following are the properties of regression coefficients: 1. The coefficient of correlation is the geometric mean of the two regression coefficients: r=√(bxy×byx) 2. Both regression coefficients must have the same sign: either both are positive or both are negative. It is never possible for one regression coefficient to be negative while the other is positive.

3. The coefficient of correlation has the same sign as the regression coefficients: if both regression coefficients are negative, the correlation is negative; if byx and bxy have positive signs, r also takes the plus sign. 4. Both regression coefficients cannot be greater than unity: if the regression coefficient of Y on X is greater than unity, then the regression coefficient of X on Y must be less than unity. This is because byx×bxy=r²≤1.

5. The arithmetic mean of the two regression coefficients is equal to or greater than the correlation coefficient. In terms of the formula: (byx+bxy)÷2≥r 6. Shift of origin does not affect the regression coefficients, but shift of scale does: regression coefficients are independent of a change of origin but not of scale.
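The properties above can be checked numerically. A Python sketch with illustrative data (the helper `slope` is ours, not from the slides):

```python
import math

# Numerical check of properties 1, 2, 5 and 6 on illustrative data.
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]

def slope(U, V):
    """Regression coefficient of V on U: sum of uv over sum of u squared."""
    mu, mv = sum(U) / len(U), sum(V) / len(V)
    num = sum((u - mu) * (v - mv) for u, v in zip(U, V))
    return num / sum((u - mu) ** 2 for u in U)

byx, bxy = slope(X, Y), slope(Y, X)
r = math.copysign(math.sqrt(byx * bxy), byx)   # property 1, with matching sign

assert byx * bxy >= 0                # property 2: both coefficients share a sign
assert (byx + bxy) / 2 >= abs(r)     # property 5: arithmetic mean >= r
# property 6: a shift of origin leaves the coefficients unchanged
assert abs(slope([x + 100 for x in X], [y - 50 for y in Y]) - byx) < 1e-9
```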

Methods to obtain Regression Equations Regression Equations in case of Individual Series Regression Equations in case of Grouped Data

In case of Individual Series In individual series, regression equations can be worked out by two methods: Using Normal Equations Using Regression Coefficients

Regression Equations using Normal Equations This method is also called the Least Square Method. Under this method, the regression equations are computed by solving two normal equations. Regression Equation of Y on X: Y=a+bX Under the least square method, the values of a and b are obtained by using the following two normal equations: ∑Y=Na+b∑X ∑XY=a∑X+b∑X² Solving these equations, we get the following values of a and b: b=(N∑XY−∑X∑Y)÷(N∑X²−(∑X)²) a=Ȳ−bX̄ Finally a and b are put in the equation.
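The elimination of a from the two normal equations can be carried out in a few lines of Python; the data here are illustrative:

```python
# Solving the two normal equations for Y = a + bX:
#   SumY  = N*a + b*SumX
#   SumXY = a*SumX + b*SumX2
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
n = len(X)
Sx, Sy = sum(X), sum(Y)
Sxx = sum(a * a for a in X)
Sxy = sum(a * b for a, b in zip(X, Y))
b = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)   # a eliminated between the two equations
a = (Sy - b * Sx) / n                           # back-substitution into the first
# fitted line for this data: Y = 1.1 + 1.3X
```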

Regression Equation of X on Y The regression equation of X on Y is expressed as follows: X=a+bY Under the least square method, the values of a and b are obtained by using the following normal equations: ∑X=Na+b∑Y ∑XY=a∑Y+b∑Y² Solving these equations, we get the values of a and b. The calculated values of a and b are then put in the equation.

Example Calculate the regression equation of X on Y from the following by the least square method:

X      Y      X²     Y²     XY
1      2      1      4      2
2      5      4      25     10
3      3      9      9      9
4      8      16     64     32
5      7      25     49     35
∑X=15  ∑Y=25  ∑X²=55  ∑Y²=151  ∑XY=88   (N=5)

Solution Regression Equation of X on Y: X=a+bY The two normal equations are: ∑X=Na+b∑Y ∑XY=a∑Y+b∑Y² Substituting the values we get: 15=5a+25b …(1) 88=25a+151b …(2) Multiplying (1) by 5: 75=25a+125b …(3) Subtracting (3) from (2): 13=26b b=13÷26=0.5

Putting the value of b in equation (1): 15=5a+25(0.5) 5a=15−12.5=2.5 a=2.5÷5=0.5 Therefore: X=0.5+0.5Y
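As a check, the same result follows directly from the totals of the example (the series consistent with them being X = 1…5 and Y = 2, 5, 3, 8, 7):

```python
# Check of the worked example: fitting X = a + bY from the totals
# N=5, SumX=15, SumY=25, SumY2=151, SumXY=88.
X = [1, 2, 3, 4, 5]
Y = [2, 5, 3, 8, 7]
n = len(X)
Sx, Sy = sum(X), sum(Y)                      # 15, 25
Syy = sum(b * b for b in Y)                  # 151
Sxy = sum(a * b for a, b in zip(X, Y))       # 88
b = (n * Sxy - Sx * Sy) / (n * Syy - Sy ** 2)   # 65/130 = 0.5
a = (Sx - b * Sy) / n                           # (15 - 12.5)/5 = 0.5
# X = 0.5 + 0.5Y, matching the solution above
```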

Regression Equation using Regression Coefficients Regression equations can also be computed with the help of regression coefficients. For this we have to find X̄, Ȳ, byx and bxy from the data. Regression equations can be computed from the regression coefficients by the following methods: 1. Using actual values of X and Y 2. Using deviations from actual means 3. Using deviations from assumed means 4. Using r, σx, σy, X̄ and Ȳ

Using Actual Values of X and Y Series In this method, the actual values of X and Y are used to determine the regression equations. Regression Equation of Y on X: Y−Ȳ=byx(X−X̄) Using the actual values, byx can be calculated as: byx=(N∑XY−∑X∑Y)÷(N∑X²−(∑X)²) Regression Equation of X on Y: X−X̄=bxy(Y−Ȳ) Using the actual values, bxy can be calculated as: bxy=(N∑XY−∑X∑Y)÷(N∑Y²−(∑Y)²)

Example Obtain the two lines of regression from the following:

X      Y      XY     X²     Y²
2      5      10     4      25
4      7      28     16     49
6      9      54     36     81
8      8      64     64     64
10     11     110    100    121
∑X=30  ∑Y=40  ∑XY=266  ∑X²=220  ∑Y²=340   (N=5)

Regression coefficient of Y on X: byx=(N∑XY−∑X∑Y)÷(N∑X²−(∑X)²) =(5×266−30×40)÷(5×220−30²) =(1330−1200)÷(1100−900) =130÷200 =0.65 Regression coefficient of X on Y: bxy=(N∑XY−∑X∑Y)÷(N∑Y²−(∑Y)²) =(5×266−30×40)÷(5×340−40²) =(1330−1200)÷(1700−1600) =130÷100 =1.30 X̄=∑X÷N=30÷5=6 Ȳ=∑Y÷N=40÷5=8

Regression equation of Y on X Y−Y‾=byx(X−X‾) Y−8=0.65(X− 6) Y=4.1+0.65X Regression equation of X on Y X−X‾=bxy(Y−Y‾) X−6=1.30(Y−8) X=−4.40+1.30Y
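A quick Python check of this example, recomputing both coefficients and equations from the data:

```python
# Recomputing the example: X = 2,4,6,8,10 and Y = 5,7,9,8,11.
X = [2, 4, 6, 8, 10]
Y = [5, 7, 9, 8, 11]
n = len(X)
Sx, Sy = sum(X), sum(Y)                       # 30, 40
Sxx = sum(a * a for a in X)                   # 220
Syy = sum(b * b for b in Y)                   # 340
Sxy = sum(a * b for a, b in zip(X, Y))        # 266
byx = (n * Sxy - Sx * Sy) / (n * Sxx - Sx ** 2)   # 130/200 = 0.65
bxy = (n * Sxy - Sx * Sy) / (n * Syy - Sy ** 2)   # 130/100 = 1.30
mx, my = Sx / n, Sy / n                            # 6, 8
# Y on X: Y = my + byx*(X - mx)  ->  Y = 4.1 + 0.65X
# X on Y: X = mx + bxy*(Y - my)  ->  X = -4.4 + 1.30Y
```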

Using Deviations taken from Actual Means When the values of X and Y are very large, the method using actual values becomes cumbersome. In such cases, deviations taken from the arithmetic means are used. Regression equation of Y on X: Y−Ȳ=byx(X−X̄) Using deviations from the actual means: byx=∑xy÷∑x² where x=X−X̄ and y=Y−Ȳ. Regression equation of X on Y: X−X̄=bxy(Y−Ȳ) bxy=∑xy÷∑y²

Example Obtain the two regression equations, given the deviations x=X−X̄ and y=Y−Ȳ and the following totals: N=6, ∑X=42 (so X̄=7), ∑Y=30 (so Ȳ=5), ∑x²=70, ∑y²=40, ∑xy=18

Solution Since the actual means of X and Y are whole numbers, we take deviations from X̄ and Ȳ. byx=∑xy÷∑x²=18÷70=0.257 bxy=∑xy÷∑y²=18÷40=0.45 Regression equation of Y on X: Y−Ȳ=byx(X−X̄) Y−5=0.257(X−7) Y−5=0.257X−1.799 Y=0.257X+3.201 Regression equation of X on Y: X−X̄=bxy(Y−Ȳ) X−7=0.45(Y−5) X−7=0.45Y−2.25 X=0.45Y+4.75

Using Deviations taken from Assumed Means When the actual means turn out to be fractions rather than whole numbers (e.g. 24.14, 56.89), deviations from assumed means are used instead of deviations from the actual means. Following are the regression equations: Regression equation of Y on X: Y−Ȳ=byx(X−X̄) Using deviations from the assumed means, byx can be calculated as: byx=(N∑dxdy−∑dx∑dy)÷(N∑dx²−(∑dx)²)

Regression equation of X on Y: X−X̄=bxy(Y−Ȳ) Using deviations from the assumed means, bxy can be calculated as: bxy=(N∑dxdy−∑dx∑dy)÷(N∑dy²−(∑dy)²) where dx=X−Ax and dy=Y−Ay (Ax and Ay being the assumed means).

Example Obtain the two regression equations for the following data (assumed means Ax=69, Ay=112):

X      Y      dx     dx²    dy     dy²    dxdy
78     125    9      81     13     169    117
89     137    20     400    25     625    500
97     156    28     784    44     1936   1232
69     112    0      0      0      0      0
59     107    −10    100    −5     25     50
79     136    10     100    24     576    240
68     124    −1     1      12     144    −12
61     108    −8     64     −4     16     32
∑X=600  ∑Y=1005  ∑dx=48  ∑dx²=1530  ∑dy=109  ∑dy²=3491  ∑dxdy=2159   (N=8)

Solution X̄=∑X÷N=600÷8=75 Ȳ=∑Y÷N=1005÷8=125.625 byx=(N∑dxdy−∑dx∑dy)÷(N∑dx²−(∑dx)²) =(8×2159−48×109)÷(8×1530−48²) =(17272−5232)÷(12240−2304) =12040÷9936 =1.212 Regression equation of Y on X: Y−Ȳ=byx(X−X̄) Y−125.625=1.212(X−75) Y−125.625=1.212X−90.9 Y=1.212X+34.725

bxy=(N∑dxdy−∑dx∑dy)÷(N∑dy²−(∑dy)²) =(8×2159−48×109)÷(8×3491−109²) =(17272−5232)÷(27928−11881) =12040÷16047 =0.75 Regression equation of X on Y: X−X̄=bxy(Y−Ȳ) X−75=0.75(Y−125.625) X−75=0.75Y−94.22 X=0.75Y−19.22
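Recomputing both coefficients directly from the raw X and Y series of the table is a useful check on the arithmetic:

```python
# Assumed-mean shortcut on the example's raw data (Ax = 69, Ay = 112).
X = [78, 89, 97, 69, 59, 79, 68, 61]
Y = [125, 137, 156, 112, 107, 136, 124, 108]
Ax, Ay = 69, 112
n = len(X)
dx = [a - Ax for a in X]
dy = [b - Ay for b in Y]
Sdx, Sdy = sum(dx), sum(dy)                       # 48, 109
Sdxdy = sum(a * b for a, b in zip(dx, dy))        # 2159
Sdx2 = sum(a * a for a in dx)                     # 1530
Sdy2 = sum(b * b for b in dy)                     # 3491
byx = (n * Sdxdy - Sdx * Sdy) / (n * Sdx2 - Sdx ** 2)   # 12040/9936  ~ 1.212
bxy = (n * Sdxdy - Sdx * Sdy) / (n * Sdy2 - Sdy ** 2)   # 12040/16047 ~ 0.750
```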

Obtain Regression Equations from the Coefficient of Correlation, Standard Deviations and Arithmetic Means of X and Y When the values of X̄, Ȳ, σx, σy and r of the X and Y series are given, the regression equations are expressed in the following way: 1. Regression equation of Y on X: Y−Ȳ=byx(X−X̄), where byx=r.σy÷σx 2. Regression equation of X on Y: X−X̄=bxy(Y−Ȳ), where bxy=r.σx÷σy

Example You are given the following information; obtain the two regression equations:

                        X      Y
Arithmetic mean         5      12
Standard deviation      2.6    3.6
Correlation coefficient: r=0.7

Solution 1. Regression equation of X on Y: X−X̄=(r.σx÷σy)(Y−Ȳ) Putting the values in the equation we get: X−5=(0.7×2.6÷3.6)(Y−12) X−5=0.51(Y−12) X−5=0.51Y−6.12 X=0.51Y−1.12

2. Regression equation of Y on X: Y−Ȳ=(r.σy÷σx)(X−X̄) Putting the values in the equation, we get: Y−12=(0.7×3.6÷2.6)(X−5) Y−12=0.97(X−5) Y−12=0.97X−4.85 Y=0.97X+7.15
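A short Python check of both equations from the given means, standard deviations and r (the intercepts drift slightly from the hand-rounded values above, since the hand solution rounds the slopes first):

```python
# Both regression equations from summary values only.
mx, my = 5, 12          # arithmetic means of X and Y
sx, sy = 2.6, 3.6       # standard deviations
r = 0.7

bxy = r * sx / sy       # ~0.506 (0.51 after rounding, as in the solution)
byx = r * sy / sx       # ~0.969 (0.97 after rounding)

a_x = mx - bxy * my     # intercept of X on Y: X = a_x + bxy*Y
a_y = my - byx * mx     # intercept of Y on X: Y = a_y + byx*X
```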

Obtain Regression Equations in case of Grouped Data For obtaining regression equations from grouped data, first of all we have to construct a correlation table. Special adjustments must be made while calculating the regression coefficients because regression coefficients are independent of a change of origin but not of scale. For grouped data with class intervals ix and iy, the regression coefficients are computed by using the formulas: 1. bxy=(N∑fdxdy−∑fdx∑fdy)÷(N∑fdy²−(∑fdy)²)×(ix÷iy) 2. byx=(N∑fdxdy−∑fdx∑fdy)÷(N∑fdx²−(∑fdx)²)×(iy÷ix)
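A sketch of the grouped-data computation in Python. The cell frequencies, step deviations and class intervals below are assumed purely for illustration; each entry stands for one cell of a correlation table:

```python
# Grouped data: step deviations dx, dy are in units of the class intervals
# ix, iy, so byx is rescaled by iy/ix and bxy by ix/iy.
f  = [2, 3, 4, 1]        # cell frequencies from a correlation table (assumed)
dx = [-1, 0, 1, 2]       # step deviations of the X class mid-points (assumed)
dy = [-1, 1, 0, 2]       # step deviations of the Y class mid-points (assumed)
ix, iy = 5, 10           # class interval widths of X and Y (assumed)

N = sum(f)
Sfdx = sum(fi * a for fi, a in zip(f, dx))
Sfdy = sum(fi * b for fi, b in zip(f, dy))
Sfdx2 = sum(fi * a * a for fi, a in zip(f, dx))
Sfdy2 = sum(fi * b * b for fi, b in zip(f, dy))
Sfdxdy = sum(fi * a * b for fi, a, b in zip(f, dx, dy))

byx = (N * Sfdxdy - Sfdx * Sfdy) / (N * Sfdx2 - Sfdx ** 2) * (iy / ix)
bxy = (N * Sfdxdy - Sfdx * Sfdy) / (N * Sfdy2 - Sfdy ** 2) * (ix / iy)
```

Without the ix÷iy and iy÷ix factors, the coefficients would be in step-deviation units rather than in the original units of X and Y.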

Obtain the Mean Values and the Correlation Coefficient from the Regression Equations 1. To find the mean values from the regression equations: the two regression lines intersect each other at the mean point (X̄, Ȳ). 2. To find the coefficient of correlation from the two regression equations: the correlation coefficient can be worked out from the regression coefficients bxy and byx: r=√(byx.bxy)

Example From the following two regression equations, identify which one is X on Y and which one is Y on X: 2X+3Y=42 …(1) X+2Y=26 …(2) Solution In the absence of any clear indication, let us assume equation (1) to be Y on X and equation (2) to be X on Y. Taking equation (1) as the regression equation of Y on X: 3Y=42−2X Y=42÷3−(2÷3)X

From this it follows that byx=coefficient of X in (1)=−2÷3. Now let equation (2) be the regression equation of X on Y: X+2Y=26 X=26−2Y bxy=coefficient of Y in (2)=−2 Now we calculate r² on the basis of these two regression coefficients: r²=byx.bxy=(−2÷3)×(−2)=4÷3>1

Here r²>1, which is impossible since r²≤1, so our assumption is wrong. We now take equation (1) as the regression equation of X on Y and equation (2) as the regression equation of Y on X. Assuming the first equation to be X on Y: 2X+3Y=42 2X=42−3Y X=42÷2−(3÷2)Y From this it follows that bxy=coefficient of Y in the above equation=−3÷2. Now, assuming the second equation to be Y on X: X+2Y=26 2Y=26−X Y=26÷2−(1÷2)X From this it follows that byx=coefficient of X in the above equation=−1÷2.

Now, r²=bxy.byx=(−3÷2)×(−1÷2)=3÷4=0.75 Here r²<1, which is possible; r² is within the limit r²≤1. Hence it is proved that the first equation is of X on Y and the second equation is of Y on X.
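The whole identification procedure can be scripted: assume an assignment, read off the coefficients, and reject the assumption whenever r² exceeds 1. A Python sketch of the two trials above:

```python
import math

# Equations: 2X + 3Y = 42  and  X + 2Y = 26.

# Assumption 1: first is Y on X, second is X on Y.
byx = -2 / 3      # from Y = 14 - (2/3)X
bxy = -2          # from X = 26 - 2Y
assert byx * bxy > 1          # r^2 = 4/3 > 1: impossible, assumption rejected

# Assumption 2: first is X on Y, second is Y on X.
bxy = -3 / 2      # from X = 21 - (3/2)Y
byx = -1 / 2      # from Y = 13 - (1/2)X
r_squared = byx * bxy          # 3/4, admissible
r = -math.sqrt(r_squared)      # r takes the common sign of the coefficients
```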