Regression Analysis Using Excel

Econometrics. Econometrics is simply the statistical analysis of economic phenomena. Here we summarize some of the concepts of basic econometrics and relate them primarily to the econometric estimation of economic phenomena using Excel. We discuss some basics of estimating demand, production functions, and supply and cost functions.

Brief instruction on estimating a demand function. Suppose there exist underlying data on the relation between a dependent variable, Q, and an explanatory or independent variable, X. Suppose we have six data points, pairs of Q and X values, as shown in the following graph (in the graphs that follow, the dependent variable is labeled Y).

[Figure: six data points A-F plotted in the (X, Y) plane, some of which are not on a straight line.] Some of the points do not lie on a straight line, or even on a smooth curve, so the graph shows no exact relationship between Y and X. The job of the econometrician is to find the line or curve that best fits the data points, in order to assess the relationship between Y and X.

Suppose a linear relationship between Y and X. [Figure: the six points A-F scattered around a straight line.] There is some random variation in this linear relationship, as seen by the data points relative to the straight line. Mathematically, this variation implies that the linear relationship between Y and X is given by Y = a + bX + ε, where ε is a random error term.

a and b are unknown parameters that define the linear relationship: a is the Y-axis intercept and b is the slope of the line. [Figure: the line Y = a + bX drawn through the six points.] Because these parameters are unknown, the econometrician must estimate the values of a and b from the data. Note that for any line drawn through the points in the graph there will be some discrepancy between the actual points and the line: points A and D lie above the line, and points C and E lie below it.

[Figure: the line Y = a + bX with dashed vertical segments from points A, C, D, and E to the line.] The deviations between the actual points and the line are given by the lengths of the dashed segments, denoted e^A, e^C, e^D, and e^E. The econometrician finds the values of the parameters a and b that minimize the sum of squared deviations between the actual points and the line. The line Y = a + bX represents the expected, or average, relationship between Y and X, so the deviations are analogous to the deviations from the mean used to calculate the variance of a random variable.

[Figure: the fitted line through points A-F, with the dashed deviations e^A, e^C, e^D, and e^E.] The regression line is the line that minimizes the sum of squared deviations between the line (the expected relation) and the actual data points A, B, C, D, E, and F. The resulting values of a and b, frequently denoted a^ and b^, are called the parameter estimates: they are the values of a and b that produce the smallest sum of squared errors between a line and the actual data. The corresponding line, called the least squares regression line for the equation Y = a + bX + ε, is given by Y^ = a^ + b^X.
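The least squares calculation the slides describe can be sketched in a few lines of Python. The six data points here are hypothetical, standing in for A-F (the slides' actual values are not shown); only the formulas matter.

```python
# Least squares estimates for Y = a + bX + e, using hypothetical data
# in place of the slides' six points A-F.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 9.9, 12.3]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# b^ = sum((X - X_bar)(Y - Y_bar)) / sum((X - X_bar)^2)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b_hat = s_xy / s_xx

# a^ = Y_bar - b^ * X_bar  (the fitted line passes through the point of means)
a_hat = y_bar - b_hat * x_bar

print(f"Y^ = {a_hat:.3f} + {b_hat:.3f} X")
```

Any other line through these points has a larger sum of squared errors than the line (a^, b^), which is exactly the least squares property.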

Software for regression analysis. Many software packages are coded to derive the least squares regression line. These packages provide various kinds of parameter estimates for alternative models of relationships, such as economic relationships. Spreadsheet packages such as Excel can also perform basic regression analysis.

Let's look at some data for a product sold in a Chinese market: the quantity sold and the price associated with each quantity. [Table: columns Observation, Quantity, Price; the numeric values appear on the slide.]

Now, let's use the regression command in Excel to find the least squares estimates of demand. The regression tool is found under TOOLS / Data Analysis / Regression in Excel: click on TOOLS, then click Data Analysis, and then click Regression (in newer versions of Excel, Data Analysis appears on the Data tab). A dialog box is displayed. You are asked to provide the cell address range for Y, the dependent variable, and the cell address range for X (you can have multiple independent variables, so you can input a matrix of X variables). If the label for each variable is included at the top of the columns containing the Y variable and the X variables, you can include the labels in your cell address ranges; just check the Labels box.

You can have the regression algorithm compute and plot the residuals, Y - Y^, where Y^ = a^ + b^X is the fitted value; compute standardized residuals; and adjust the output that appears in the analysis of variance table.
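The residual output Excel produces can be mimicked directly. This is a minimal sketch with made-up data; note that the simple standardization shown (residual divided by the residual standard deviation) is close to, but not necessarily identical to, Excel's "Standard Residuals".

```python
# Residuals Y - Y^ for a fitted line. Data are illustrative, not the slides' values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 9.0, 7.0, 6.0, 4.0, 3.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
a_hat = y_bar - b_hat * x_bar

fitted = [a_hat + b_hat * x for x in xs]          # Y^ = a^ + b^X
residuals = [y - f for y, f in zip(ys, fitted)]   # e = Y - Y^

# Simple standardization: residual over the residual standard deviation
# s = sqrt(SSE / (n - 2)). Excel's exact denominator may differ slightly.
s = (sum(e ** 2 for e in residuals) / (n - 2)) ** 0.5
standardized = [e / s for e in residuals]
```

A useful sanity check: with an intercept in the model, the least squares residuals always sum to zero.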

Once the data are loaded into the Regression tool by populating the dialog box, click OK. The output of the regression provides regression statistics and an analysis of variance table, which gives information about the estimates a^ and b^ and their significance relative to zero. A parameter that is insignificant means the corresponding independent variable X has no detectable influence on the dependent variable Y in the hypothesized relationship Y = a + bX.

Let's go back to our demand data. We had 12 observations on the quantity purchased and the price in yuan for each quantity purchased. Twelve observations is quite minimal for statistically robust estimates, but it is sufficient for this simple example problem.

The data are given below. From economic theory we anticipate that the demand relationship will yield Y = a + bX with the estimate for b satisfying b < 0, i.e., downward-sloping demand. [Table: columns Observation, Quantity, Price; the numeric values appear on the slide.]

Input the data. Input the quantity purchased by entering the cell range of the quantity data into the Y input window, and input the price data by entering its cell range into the X input window. One can click in each window and then drag down the cells in the respective Y or X column of data to input the cell ranges. Our data are set up with labels in the first row of each cell range, so we check Labels and the names of the variables will be printed in the analysis of variance table in the resulting output. Then click OK to get the regression estimates.

Here we show our data and then the output of the regression analysis on the data: regression statistics such as R², adjusted R², and the standard error of the estimate, and the analysis of variance table giving us the estimate of the intercept term a^ and the slope of the demand curve b^ (rounded values shown in the output).

We also get the sums-of-squares summary and the F-test, which tests whether the model as a whole is significant, i.e., whether Y and X are related in the manner estimated: downward-sloping demand and a positive Y-axis intercept.

Similarly, we obtain the standard error of each estimated parameter, σ_a^ and σ_b^, and the t-ratios t_a^ = a^/σ_a^ and t_b^ = b^/σ_b^. The t-ratio of a parameter estimate is the ratio of the value of the estimate to its standard error. When the t-ratio is large in absolute value, one can be confident that the true parameter is not equal to zero, i.e., that there is a relationship between X and Y. A t-ratio greater than about 2 in absolute value (1.96 in large samples) is the usual benchmark for significance at the 5 percent level.
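The t-ratio arithmetic is easy to reproduce. This sketch uses made-up demand-style data (six points rather than the slides' 12 observations) and the standard simple-regression formulas: residual variance s² = SSE/(n - 2) and se(b^) = sqrt(s²/Sxx).

```python
# t-ratio for the slope estimate: t_b = b^ / se(b^).
# Hypothetical price (x) and quantity (y) data, not the slides' values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 9.0, 7.0, 6.0, 4.0, 3.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
a_hat = y_bar - b_hat * x_bar

# Residual variance s^2 = SSE / (n - 2), then se(b^) = sqrt(s^2 / Sxx)
sse = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
s2 = sse / (n - 2)
se_b = (s2 / s_xx) ** 0.5

t_b = b_hat / se_b
print(f"b^ = {b_hat:.4f}, se(b^) = {se_b:.4f}, t = {t_b:.2f}")
```

Here the slope is negative (demand slopes down) and its t-ratio is far beyond 2 in absolute value, so the slope would be judged significantly different from zero.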

P-values are also reported. These are a more precise measure of statistical significance of each estimated parameter. The p-value for price is approximately 0.00007, meaning there is only a 7 in 100,000 chance that the true parameter b (the coefficient of price) is actually zero, i.e., insignificant in explaining movements in the quantity purchased in our estimated demand equation. The lower the p-value, the more confident you are in the particular estimate. Usually p-values of 0.05 or lower are considered low enough for a researcher to be confident in the value of the estimated parameter.

The estimated R² = (explained variation)/(total variation) = (sum of squares regression)/(sum of squares total). Its range is 0 ≤ R² ≤ 1, and the closer R² is to 1, the better the overall fit of the estimated regression equation to the actual data and the relationship of Y to X. So R² measures goodness of fit. Adjusted R² = 1 - (1 - R²)(n - 1)/(n - k), where n is the total number of observations and k is the number of estimated parameters; n - k is the residual degrees of freedom after conducting the regression analysis. The adjusted R² penalizes regressions with only a few degrees of freedom (estimating numerous coefficients with small n).
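Both goodness-of-fit measures follow directly from the sums of squares. A minimal sketch, again with hypothetical data in place of the slides' values:

```python
# R^2 and adjusted R^2 from the sums of squares, with hypothetical data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 9.0, 7.0, 6.0, 4.0, 3.0]

n, k = len(xs), 2  # k = number of estimated parameters (a and b)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
b_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
a_hat = y_bar - b_hat * x_bar

sse = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
sst = sum((y - y_bar) ** 2 for y in ys)                            # total

r2 = 1 - sse / sst                           # = SSR / SST
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)    # penalizes low degrees of freedom
```

Because the adjustment multiplies (1 - R²) by (n - 1)/(n - k) > 1, adjusted R² is always below R² whenever k > 1.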

An alternative measure of goodness of fit is the F-statistic, which measures the variation explained by the regression relative to the unexplained variation: the greater the F-statistic, the better the overall fit of the regression line through the actual data. The R² measure of goodness of fit gives no rule for how high the measure should be to indicate a good fit; the F-statistic does not suffer from this shortcoming. Looking at Significance F tells us the significance of the F-statistic. Here the value is approximately 0.0007, meaning there is only a 0.07% chance the model fits the data purely by accident. The lower the significance value of the F-statistic, the more confident one is in the estimated regression model.
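The F-statistic comes from the same sums of squares, and in a simple regression (one X variable) it equals the square of the slope's t-ratio, which gives a handy cross-check. Data below are again hypothetical:

```python
# F-statistic: explained variation per model df over unexplained variation
# per residual df. Hypothetical data, not the slides' values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [10.0, 9.0, 7.0, 6.0, 4.0, 3.0]

n, k = len(xs), 2
x_bar, y_bar = sum(xs) / n, sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
b_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
a_hat = y_bar - b_hat * x_bar

sse = sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys))
sst = sum((y - y_bar) ** 2 for y in ys)
ssr = sst - sse  # explained (regression) sum of squares

f_stat = (ssr / (k - 1)) / (sse / (n - k))

# Cross-check: in simple regression, F = t_b^2.
t_b = b_hat / ((sse / (n - 2)) / s_xx) ** 0.5
```

A very large F (with a correspondingly tiny Significance F) says the regression explains far more variation than would be expected by chance.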