PLAN
1. Correlation and regression with two variables
2. Using Excel to run regressions
3. Regression with three or more variables
4. The Japanese judiciary example: data problems and on/off variables
5. Practice using company financial data
6. The theory of social capital
7. Two studies of social capital: small and large datasets
8. Practice using state social data
PLAN: Second Day
1. Intro
2. Using Excel to run regressions
3. The Japanese judiciary example: data problems and on/off variables
4. Practice using state social data
5. Practice using company financial data
Good References on Econometrics in the Courtroom
Frank Fisher, Multiple Regression in Legal Proceedings, 80 Columbia Law Review 702 (1980)
Daniel Rubinfeld, Econometrics in the Courtroom, 85 Columbia Law Review 1048 (1985)
Heteroskedasticity
The basic assumption is that every data point has the same error variance. Our model is PAY = a + b*IQ + e, and any divergence of an observation from the line PAY = a + b*IQ arises from its error term. Thus,
PAY1 = a + b*IQ1 + e1
PAY2 = a + b*IQ2 + e2
PAY3 = a + b*IQ3 + e3,
and all of these errors are assumed to have the same variance, which we estimate from the observed data. But suppose you had 100 observations and knew that the first 50 had very high measurement error. Then the error variance is higher for those observations. When the error variance differs across observations, that is called heteroskedasticity.
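To make the idea concrete, here is a minimal Python sketch (not part of the original slides, which use Excel). It simulates 100 PAY/IQ observations in which the first 50 have a much larger error variance, fits one ordinary regression to all of them, and compares the residual variance in the two halves. Every number in it (a = 20, b = 0.5, the error standard deviations of 15 and 3, the IQ distribution) is an assumption chosen only for illustration.

```python
import numpy as np

def simulate_pay(n=100, n_noisy=50, a=20.0, b=0.5,
                 sd_noisy=15.0, sd_clean=3.0, seed=0):
    """Simulate PAY = a + b*IQ + e with a larger error s.d. in the first n_noisy rows.

    All parameter values are illustrative assumptions, not estimates from real data.
    """
    rng = np.random.default_rng(seed)
    iq = rng.normal(100.0, 15.0, size=n)
    sd = np.where(np.arange(n) < n_noisy, sd_noisy, sd_clean)
    e = rng.normal(0.0, 1.0, size=n) * sd
    return iq, a + b * iq + e

def ols(x, y):
    """Ordinary least squares for y = a + b*x; returns the array [a_hat, b_hat]."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

iq, pay = simulate_pay()
a_hat, b_hat = ols(iq, pay)
resid = pay - (a_hat + b_hat * iq)

# Sample variance of the residuals in each half of the data.
var_first_50 = resid[:50].var(ddof=1)
var_last_50 = resid[50:].var(ddof=1)
print("residual variance, first 50:", round(var_first_50, 1))
print("residual variance, last 50: ", round(var_last_50, 1))
```

If the equal-variance assumption held, the two printed variances would be similar. In this simulated data the first is far larger, which is the pattern the slide calls heteroskedasticity.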
Heteroskedasticity II
Suppose you had 100 observations and knew that the first 50 had very high measurement error, so the error variance is higher for those. You could just throw that data away, but then you lose information. A better approach is to estimate the error variance of the first 50 observations separately from that of the last 50. Then, when you run your regression, weight the first 50 observations, the badly measured ones, less heavily (weighted least squares). In effect, the computer tries harder to make the regression line pass close to your last 50, accurately measured, data points, and doesn't worry as much about being close to the first 50, poorly measured, data points.
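The weighting procedure described above can be sketched in Python as follows. This is one simple two-step version, assuming you already know which 50 observations are the noisy ones: fit an ordinary regression, estimate the residual variance in each group, then refit with each observation weighted by 1 / (its group's variance). The simulated data and the helper names fit_line and two_step_wls are hypothetical, introduced only for this illustration.

```python
import numpy as np

def fit_line(x, y, weights=None):
    """Least squares fit of y = a + b*x; with weights, minimizes sum(w * resid**2)."""
    X = np.column_stack([np.ones_like(x), x])
    if weights is None:
        weights = np.ones_like(y)
    sw = np.sqrt(weights)
    # Scaling each row by sqrt(w) turns weighted least squares into ordinary least squares.
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def two_step_wls(x, y, n_noisy=50):
    """Estimate each group's error variance from OLS residuals, then refit with 1/variance weights."""
    a0, b0 = fit_line(x, y)
    resid = y - (a0 + b0 * x)
    var_noisy = resid[:n_noisy].var(ddof=1)
    var_clean = resid[n_noisy:].var(ddof=1)
    weights = np.where(np.arange(len(y)) < n_noisy, 1.0 / var_noisy, 1.0 / var_clean)
    return fit_line(x, y, weights), weights

# Simulated data (assumed values): PAY = 20 + 0.5*IQ + e, first 50 rows measured badly.
rng = np.random.default_rng(1)
iq = rng.normal(100.0, 15.0, size=100)
error_sd = np.where(np.arange(100) < 50, 15.0, 3.0)
pay = 20.0 + 0.5 * iq + rng.normal(0.0, 1.0, size=100) * error_sd

ols_coef = fit_line(iq, pay)
wls_coef, weights = two_step_wls(iq, pay)
print("OLS (a, b):", ols_coef)
print("WLS (a, b):", wls_coef)
```

The noisy observations stay in the regression, so no information is thrown away, but each one counts for much less than an accurately measured one. In Excel, the same effect can be obtained by multiplying every column of a row (including a column of 1s for the intercept) by the square root of that row's weight and running the regression without a constant.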