Fundamentals of regression analysis

Fundamentals of regression analysis
Obid A. Khakimov

The essence of OLS
The main logic of OLS is to find the parameters of the regression that yield the minimum sum of squared errors. OLS is based on a number of assumptions about e, the error term, and X, the regressors; the primary reason for these assumptions is that we do not know how the data were generated.
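
As a minimal sketch of this logic (synthetic data, numpy only; all names below are illustrative, not from the slides), the parameters that minimize the sum of squared errors can be computed directly from the normal equations:

import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
e = rng.normal(size=n)                        # error term e: mean zero, constant variance
y = 2.0 + 0.5 * x + e                         # true model: y = b0 + b1*x + e

X = np.column_stack([np.ones(n), x])          # design matrix with an intercept column
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # solves the normal equations X'X b = X'y
residuals = y - X @ beta_hat
print(beta_hat)                               # close to [2.0, 0.5]
print(residuals @ residuals)                  # the minimized sum of squared errors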

Assumptions
Linearity: the relationship between the independent and dependent variables is linear.
Full rank: there is no exact linear relationship among the independent variables.
Exogeneity of the independent variables: the error term of the regression is not a function of the independent variables.
Homoscedasticity and no autocorrelation: the error term has zero mean, constant variance, and is uncorrelated across observations.
Normality of the error term: the errors are normally distributed (needed for exact small-sample inference).
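
A small simulated check, assuming a textbook data-generating process (everything below is illustrative, not from the slides); because we generate the errors ourselves, the last three assumptions can be verified directly:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
e = rng.normal(size=n)        # iid N(0, 1) errors by construction
y = 1.0 + 2.0 * x + e         # linear in parameters (linearity assumption)

print(np.corrcoef(x, e)[0, 1])              # near 0: errors unrelated to X (exogeneity)
print(e[x < 0].var(), e[x >= 0].var())      # similar values: constant variance
print(stats.shapiro(e).pvalue)              # large p-value: normality not rejected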

Presentation of the regression analysis results
Are the signs of the estimated coefficients in accordance with theory?
Are the estimated coefficients statistically significant?
How well does the regression model explain the variation in the dependent variable?
Does the model satisfy the assumptions of the CLNRM (classical normal linear regression model)?
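
A hedged sketch of how these questions map onto standard regression output, using statsmodels on simulated data (the demand-equation variables are hypothetical, chosen only so that theory predicts the signs):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
price = rng.uniform(1, 10, size=n)
income = rng.uniform(20, 100, size=n)
# hypothetical demand equation: quantity falls with price, rises with income
quantity = 50 - 3.0 * price + 0.5 * income + rng.normal(scale=5, size=n)

X = sm.add_constant(np.column_stack([price, income]))
model = sm.OLS(quantity, X).fit()
print(model.params)      # signs as theory predicts? (negative on price, positive on income)
print(model.pvalues)     # are the coefficients statistically significant?
print(model.rsquared)    # how much variation in the dependent variable is explained?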

Classical theory of statistical inference
Statistical inference is concerned with how we draw conclusions about the large population from which a sample is selected. Estimation and hypothesis testing constitute the two branches of inference.

Hypothesis test
How reliable are the estimates? Hypothesis testing can take two forms: confidence interval estimation or a test of significance. The reliability of a point estimate is measured by its standard error.

Confidence interval estimation
Instead of relying on the point estimate alone, we construct an interval around it, say within two or three standard errors, that will include the true parameter value with, say, 95% confidence:
Pr[ β̂2 − t(α/2)·se(β̂2) ≤ β2 ≤ β̂2 + t(α/2)·se(β̂2) ] = 1 − α
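
A minimal sketch of the interval above for a slope coefficient, assuming a simple two-variable regression on simulated data (names illustrative):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 50
x = rng.normal(size=n)
y = 1.0 + 0.8 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - 2)                       # estimated error variance
se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])    # standard error of the slope
t_crit = stats.t.ppf(0.975, df=n - 2)              # t(alpha/2) with n-2 df
print(beta_hat[1] - t_crit * se, beta_hat[1] + t_crit * se)  # 95% interval for beta2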

Linear models: typical linear models in matrix form
y = Xβ + ε, where y is the n×1 vector of observations on the dependent variable, X is the n×k matrix of regressors, β is the k×1 vector of parameters, and ε is the n×1 vector of errors.

Logic of OLS
OLS seeks the estimator that minimizes the sum of squared residuals; the solution is β̂ = (X'X)⁻¹X'y, which under the classical assumptions is BLUE (best linear unbiased and efficient).
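
The derivation this slide summarizes, reconstructed in standard textbook notation (LaTeX; this is the usual argument, not necessarily the slide's exact formulas):

\min_{\beta} \; S(\beta) = e'e = (y - X\beta)'(y - X\beta)

\frac{\partial S}{\partial \beta} = -2X'y + 2X'X\beta = 0
\quad\Longrightarrow\quad
\hat{\beta} = (X'X)^{-1}X'y

Under the classical assumptions, the Gauss-Markov theorem then establishes that this estimator is BLUE.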

Multicollinearity: reasons
The data collection process.
Constraints on the model or on the population being sampled.
Model specification.
An over-determined model (more explanatory variables than observations).

Perfect vs. less than perfect
Perfect multicollinearity is the case when two or more independent variables form an exact linear relationship. Less-than-perfect multicollinearity is the case when two or more independent variables form an approximate, but not exact, linear relationship.
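
A minimal numpy sketch of the difference (synthetic data): with an exact linear relationship the design matrix loses rank and X'X cannot be inverted, while a near-exact relationship leaves X'X invertible but ill-conditioned:

import numpy as np

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
x2_perfect = 2.0 * x1                                 # exact linear function of x1
x2_near = 2.0 * x1 + rng.normal(scale=0.01, size=n)   # almost, but not exactly

X_perfect = np.column_stack([np.ones(n), x1, x2_perfect])
X_near = np.column_stack([np.ones(n), x1, x2_near])

print(np.linalg.matrix_rank(X_perfect))   # 2 < 3: X'X is singular, no unique estimator
print(np.linalg.matrix_rank(X_near))      # 3: invertible, but ill-conditioned
print(np.linalg.cond(X_near.T @ X_near))  # huge condition number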

Practical consequences
The OLS estimators remain BLUE, but large variances and covariances make precise estimation difficult. Large variances produce wide confidence intervals, so individual coefficients tend to appear statistically insignificant. Although t-stats are low, R-square might be very high. The estimators and their variances are very sensitive to small changes in the data.

H0: all slope coefficients are simultaneously zero. Ha: not all slope coefficients are simultaneously zero. Because of the low t-stats we cannot reject the null for any individual coefficient, yet because of the high R-square the F-value will be very high and rejecting H0 jointly will be easy. This conflict between the individual t-tests and the joint F-test is a classic symptom of multicollinearity.

Detection
Multicollinearity is a question of degree; it is a feature of the sample, not of the population. How to detect it:
High R-square but low t-stats.
High correlation coefficients among the independent variables.
Auxiliary regressions.
High VIF (variance inflation factor), as sketched below.
Eigenvalues and the condition index.
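
A sketch of the VIF check using statsmodels' variance_inflation_factor on simulated data (the regressors and the severity of their correlation are illustrative):

import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(5)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # highly correlated with x1
x3 = rng.normal(size=n)                   # unrelated regressor

X = np.column_stack([np.ones(n), x1, x2, x3])
for i in range(1, X.shape[1]):            # skip the intercept column
    print(variance_inflation_factor(X, i))  # VIF far above 10 flags x1 and x2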

Auxiliary regression
H0: the variable Xi is not collinear. Run a regression in which one X is the dependent variable and the other X's are the regressors, and obtain its R-square. Then compute
F = (R² / (k − 2)) / ((1 − R²) / (n − k + 1)), with df_num = k − 2 and df_denom = n − k + 1,
where k is the number of explanatory variables including the intercept and n is the sample size. If the F-statistic exceeds the critical F, the Xi variable is collinear. Rule of thumb (Klein): if the R-square of an auxiliary regression is higher than the overall R-square, multicollinearity may be troublesome.
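
A sketch of this test on simulated data (variable names and the strength of the collinearity are illustrative); it regresses one suspect X on the others and plugs the auxiliary R-square into the F formula above:

import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(6)
n = 200
x2 = rng.normal(size=n)
x3 = x2 + rng.normal(scale=0.1, size=n)   # suspect regressor, near-collinear with x2
x4 = rng.normal(size=n)

# auxiliary regression: regress the suspect X on the remaining X's
aux = sm.OLS(x3, sm.add_constant(np.column_stack([x2, x4]))).fit()
r2 = aux.rsquared

k = 4                                      # explanatory variables incl. the intercept
F = (r2 / (k - 2)) / ((1 - r2) / (n - k + 1))
F_crit = stats.f.ppf(0.95, dfn=k - 2, dfd=n - k + 1)
print(F, F_crit, F > F_crit)               # F above the critical value: x3 is collinear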

What to do?
Do nothing.
Combine cross-sectional and time-series data.
Transform the variables (differencing, ratio transformations).
Obtain additional data or new observations.

Assumption: homoscedasticity, or equal variance of u_i
[Figure: conditional density f(u) of the error term at each value of X, with identical spread around the regression line.]

Reasons for heteroscedasticity:
Error-learning models.
Higher variability in an independent variable may induce higher variability in the dependent variable.
Spatial correlation.
Data collection biases.
Presence of extreme observations (outliers).
Incorrect specification of the model.
Skewness in the distribution of one or more regressors.
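
As a closing sketch (simulated data; the Breusch-Pagan test is one standard detection tool, added here as an assumption rather than taken from the slides), heteroscedasticity produced by the second mechanism above is easy to generate and to detect:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(7)
n = 300
x = rng.uniform(1, 10, size=n)
u = rng.normal(scale=0.5 * x)     # error variance grows with x: heteroscedastic
y = 2.0 + 1.5 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(res.resid, X)
print(lm_pval)                    # small p-value: homoscedasticity rejected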