Regression Example Using Pop Quiz Data

Second Pop Quiz
At my former school (Irvine), I gave a “pop quiz” to my econometrics students. The quiz consisted of 10 questions. The first five questions were trivia-type questions. The second five questions tested TV knowledge. The last question asked students to report GPAs.

First Five Questions
Who is the Secretary of Defense?
Who is the Speaker of the House?
What is the capital of Brazil?

Second Five Questions
On “The Simpsons,” who owns the Kwik-E-Mart?
On “Malcolm in the Middle,” what is the name of Malcolm’s older brother?
Who recently (not so recent any more) left “The West Wing”?
On “ER,” who is the doctor from Croatia?
On “Everybody Loves Raymond,” what does Raymond do for a living?

My Favorite Answers
Who is the Speaker of the House? George Bush
Who is the Croatian from ER? Toni Kukoc
What is your GPA? You don’t even want to know.

My Favorite Answers (Cont’d)
What is the capital of Brazil? Irvine
Who is Malcolm’s older brother? Justin
Who recently left the West Wing? Michael J. Fox

Regression Example
Compute the number correct for each set of 5.
Match the number correct with the midterm score.
Only include those quizzes with some answers.
See if the number correct is correlated with midterm performance.
Expected signs: first five (+), second five (-)?

[Stata output: regression of midterm on right_1 and right_2; F(2, 54) = 2.88; the other numeric entries are missing from the transcript.]

The number correct on the first 5 questions is a significant predictor of the midterm score. (Every additional question answered correctly is associated with a .82 point increase in the midterm score.) This coefficient is statistically significant at the 5 percent level, but not at the 1 percent level.

The number correct on the second 5 (TV questions) is not a significant predictor of the midterm score. It seems to have no predictive power, and its t-statistic is very low.

[Stata output as on the previous slide: regression of midterm on right_1 and right_2; F(2, 54) = 2.88.]

Less than 10 percent of the variation in midterm scores is explained by variation in the number of questions answered correctly.
Prediction: What is the expected midterm score for someone getting 0 questions correct? Ans: Just the intercept (_cons).
Prediction: What is the expected score for getting 2 correct on each section? Ans: _cons + (coefficient on right_1)(2) + (coefficient on right_2)(2).

[Stata output as on the previous slides, with the t-statistic for right_2 blanked out (XXXX).]

Fill in the missing t-statistic. The missing value is the t-statistic under the null that the coefficient on right_2 = 0. So, the t-statistic is (coefficient on right_2 – 0)/.2917 = .357.
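The t-statistic calculation can be sketched in Python. This is a minimal illustration and not part of the original slides: the function name `t_stat` is hypothetical, the right_1 inputs (.8169 and .3513) are the ones quoted on the confidence-interval slide, and 2.00 is the approximate 5 percent critical value for 54 degrees of freedom used there.

```python
def t_stat(coef, std_err, null_value=0.0):
    """t-statistic for the null hypothesis that the coefficient equals null_value."""
    return (coef - null_value) / std_err

# Approximate two-sided 5% critical value for t(54), as quoted on the slides.
CRIT_5PCT = 2.00

# right_1: coefficient and standard error quoted on the confidence-interval slide.
t_right_1 = t_stat(0.8169, 0.3513)

# right_2: the slides report this t-statistic directly.
t_right_2 = 0.357

print("right_1:", round(t_right_1, 3), "significant at 5%:", abs(t_right_1) > CRIT_5PCT)
print("right_2:", t_right_2, "significant at 5%:", abs(t_right_2) > CRIT_5PCT)
```

Under these inputs the t-statistic for right_1 comes out a little above 2.3, which exceeds 2.00, while the .357 for right_2 does not. That matches the slides' reading of the two coefficients.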

[Stata output as on the previous slides, with the Total sum of squares blanked out (xxxxxxxxxx).]

Fill in the missing TSS. Since TSS = ESS + RSS, TSS is the sum of the Model and Residual entries in the SS column.

[Stata output as on the previous slides, with R-squared blanked out (xxxxxx).]

Fill in the missing R-squared value. R-squared is defined as the Model sum of squares (ESS) divided by TSS. So, R-squared = ESS / TSS = .0963.
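As a cross-check, the R-squared of .0963 can be tied back to the F(2, 54) = 2.88 shown in the output, using the standard identity F = (R²/k) / ((1 − R²)/(n − k − 1)). The Python sketch below is an illustration only; the function names are hypothetical, and the sums of squares themselves are not available in the transcript, so `r_squared` is defined but not applied to them.

```python
def r_squared(ess, tss):
    """R-squared as the model (explained) sum of squares over the total sum of squares."""
    return ess / tss

def f_from_r2(r2, k, df_resid):
    """Overall F statistic implied by R-squared, k slope coefficients, and residual df."""
    return (r2 / k) / ((1.0 - r2) / df_resid)

# R-squared from the slide, 2 regressors (right_1, right_2), 54 residual degrees of freedom.
f_stat = f_from_r2(0.0963, k=2, df_resid=54)
print(round(f_stat, 2))
```

The result rounds to 2.88, in line with the F statistic reported in the Stata output.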

[Stata output as on the previous slides, with the 95% confidence interval for right_1 blanked out (xxxxxxxx xxxxxxxx).]

Fill in the missing confidence interval. The 5 percent critical value from the t distribution with 57 – 3 = 54 degrees of freedom is (approximately) 2.00. So, the lower part of the interval is .8169 – 2.00(.3513) = .1141, and the upper part of the interval is .8169 + 2.00(.3513).
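The interval arithmetic can be reproduced with a short Python sketch. It is an illustration under the slide's own rounded inputs (coefficient .8169, standard error .3513, critical value 2.00); the function name `conf_interval` is hypothetical.

```python
def conf_interval(coef, std_err, crit):
    """Symmetric confidence interval: coef plus or minus crit * std_err."""
    half_width = crit * std_err
    return coef - half_width, coef + half_width

# Inputs as quoted on the slide for right_1.
lower, upper = conf_interval(0.8169, 0.3513, 2.00)
print(round(lower, 4), round(upper, 4))
```

With these rounded inputs the bounds come out at about .1143 and 1.5195. The lower bound is close to the slide's .1141; the small gap presumably reflects rounding in the quoted figures. The interval excludes zero, which agrees with the coefficient being significant at the 5 percent level.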

[Stata output as on the previous slides: regression of midterm on right_1 and right_2; F(2, 54) = 2.88.]

Given the results in the table, how could you estimate the variance parameter, σ²? Sounds like an interesting test question...
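One standard answer, offered here as an assumption about what the slide has in mind rather than as the author's stated solution: under the usual OLS assumptions, σ² is estimated by the residual sum of squares divided by the residual degrees of freedom. In the Stata table this is the Residual entry in the MS column, and it equals the square of Root MSE. With n = 57 observations and k = 2 regressors, the divisor is 54.

```latex
\hat{\sigma}^2 = \frac{RSS}{n - k - 1} = \frac{RSS}{54} = (\text{Root MSE})^2
```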