Review: Bivariate Regression
- What is the criterion that OLS uses to “fit” a line to your data?
- What is a parameter? A parameter estimate?
- What are independent variables, or rhs (right-hand-side) variables? Dependent variables?
- What is the slope? The intercept? The error term (in the population) or the residual (in the sample)?
Review: Bivariate Regression
- Review of notation: slope, intercept, error (estimated or sample vs. true or population)
- Two possible consequences of violating an OLS assumption: “bias” (what does that mean?) and inflated / deflated standard errors (what does that mean?)
Review: Bivariate Regression
Assumptions:
- No measurement error
- Specification: include all relevant rhs variables, no irrelevant rhs variables, linear relationship
- Is this likely what our data look like?
- Homoskedastic error terms (no heteroskedasticity)
- No autocorrelation
Bivariate Regression: Robustness
- A discussion about standard errors, p-values, and statistical significance
- Confidence intervals. What are they?
- A confidence interval is a range constructed so that, across repeated samples, it would contain the true parameter a pre-specified percentage of the time.
Bivariate Regression: Robustness
- The wider the confidence interval, the less certain you are of the estimate. (A relatively wide confidence interval means that if you gathered another sample, you could not be confident that your new slope estimate would be close to the one you have from this sample.)
- The wider the confidence interval...
- The higher the p-value (farther away from .05 or .01)...
- The less statistically significant the results...
- The less confident you are that there is a non-zero effect of the independent variable on the dependent variable.
Bivariate Regression: Robustness
- The narrower the confidence interval, the more “robust”, “efficient”, “stable” the results are.
- If you were to gather an infinite number of samples from your population, and calculate a slope estimate from each one, you would not expect the slope estimate to change much from sample to sample.
Bivariate Regression: Robustness
- The standard error and variance of the slope estimate are closely related concepts. The larger the standard error, the less confident you are in your results (and the wider your confidence interval). Recall the image of the seesaw.
Bivariate Regression: Robustness
Let’s start with the variance of the residual. The formula for the variance of the residual (or the estimated variance of the error term):
$$s_e^2 = \frac{\sum_{i=1}^{n} (e_i - \bar{e})^2}{n - 2}$$
Bivariate Regression: Robustness
Note two elements of that equation:
- First, what (by assumption) is the average residual?
- And second, why are we subtracting 2 from our sample size?
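Bivariate Regression: Robustness
What is the variance of the slope estimate?
$$\widehat{\mathrm{Var}}(b) = \frac{s_e^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$
Here $s_e^2$ is the variance of the residuals.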
Bivariate Regression: Robustness
- Note the numerator of the variance of the slope estimate b: it taps into the variance of the residuals (or, how well the data “fit” your estimated line).
- Note the denominator of the variance of the slope estimate b: it taps into the range or variance of X.
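As a minimal sketch of the two quantities just described, the Python below fits a bivariate OLS line to a small made-up data set, then computes the residual variance (dividing by n − 2) and the estimated variance of the slope. The data values and the function names `ols_fit`, `residual_variance`, and `slope_variance` are illustrative assumptions, not part of the slides.

```python
def ols_fit(x, y):
    """Return (a, b): OLS intercept and slope for a bivariate regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residual_variance(x, y, a, b):
    """Sum of squared residuals divided by n - 2.

    The mean residual is zero for an OLS fit with an intercept, so the
    squared deviations from the mean are just the squared residuals.
    """
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return sum(e ** 2 for e in residuals) / (len(x) - 2)


def slope_variance(x, s2):
    """Residual variance divided by the sum of squared deviations of X."""
    x_bar = sum(x) / len(x)
    return s2 / sum((xi - x_bar) ** 2 for xi in x)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = ols_fit(x, y)
s2 = residual_variance(x, y, a, b)
var_b = slope_variance(x, s2)
print(a, b, s2, var_b)
```

With these made-up numbers the slope is about 0.6, the residual variance about 0.8, and the slope variance about 0.08.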
Bivariate Regression: Robustness
So, if the data fit your line well...
- The numerator of that equation is reduced.
- The variance of the slope estimate b is reduced; your results are more stable.
- The confidence interval for β is narrower.
- Your results are more statistically significant; your p-value is relatively low.
- You are more likely to reject the null hypothesis that β=0.
Bivariate Regression: Robustness
And, if you have relatively good variance in X...
- The denominator of that equation is increased.
- The variance of the slope estimate b is reduced; your results are more stable.
- The confidence interval for β is narrower.
- Your results are more statistically significant; your p-value is relatively low.
- You are more likely to reject the null hypothesis that β=0.
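The two effects described on the last two slides can be illustrated with a short sketch. It assumes the slope variance is the residual variance divided by the sum of squared deviations of X; the numbers and the name `slope_variance` are hypothetical.

```python
def slope_variance(residual_var, x):
    """Estimated variance of the slope: residual variance / sum of squared X deviations."""
    x_bar = sum(x) / len(x)
    return residual_var / sum((xi - x_bar) ** 2 for xi in x)


# Hypothetical X values: same mean, different spread.
x_narrow = [2, 2.5, 3, 3.5, 4]
x_wide = [1, 2, 3, 4, 5]

# More variance in X (larger denominator) with the same fit.
var_narrow = slope_variance(0.8, x_narrow)
var_wide = slope_variance(0.8, x_wide)

# Better fit (smaller numerator) with the same X values.
var_tight_fit = slope_variance(0.2, x_wide)

print(var_narrow, var_wide, var_tight_fit)
```

Either change shrinks the variance of b, which is what narrows the confidence interval.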
Bivariate Regression: Robustness
- The equation above is for the variance of your estimated slope b.
- Your computer printouts will generally give you the standard error of b.
- How do we calculate a standard error based on a variance?
Bivariate Regression: Robustness
The confidence interval for β, built around the estimate b, is analogous to the variance and standard error, as noted above:
$$b \pm t_{\alpha/2} \cdot \mathrm{se}(b)$$
Bivariate Regression: Robustness
So, the confidence interval is the slope estimate plus / minus the t-value * the standard error of the slope estimate.
- What is α? (It is 100% − CL. Our CL is predetermined.)
- Why are we dividing α by 2?
- Where can we find the t-value?
- How do we interpret a confidence interval?
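A minimal sketch of that calculation follows, assuming a 95% confidence level and a sample of five observations. The slope estimate, its variance, and the sample size are hypothetical; the critical value 3.182 is the standard t-table entry for α/2 = .025 with 3 degrees of freedom (n − 2 for a bivariate regression).

```python
import math

b = 0.6          # hypothetical slope estimate
var_b = 0.08     # hypothetical estimated variance of b
n = 5            # hypothetical sample size
df = n - 2       # two parameters (intercept and slope) are estimated
t_crit = 3.182   # t-table value for alpha/2 = .025 and df = 3

se_b = math.sqrt(var_b)          # standard error is the square root of the variance
lower = b - t_crit * se_b
upper = b + t_crit * se_b
includes_zero = lower <= 0 <= upper

print(f"95% CI with {df} df: ({lower:.3f}, {upper:.3f}); includes zero: {includes_zero}")
```

Here the interval runs from roughly −0.30 to 1.50, so it includes zero even though the point estimate is positive.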
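Bivariate Regression: T values
Recall the central limit theorem, which said that for any population with known mean μ and known variance σ², random samples of size n can be drawn, and the means of these samples will be approximately normally distributed (for large n):
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$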
Bivariate Regression: T values
We use the t distribution in probability testing. Suppose X is some random variable with a true mean of μ and a true variance of σ². Of course, in “real life”, we never know these “true” values. We estimate μ with the sample mean $\bar{X}$. And we don’t know σ², so we estimate it with s². So instead of saying that we approximate a normal distribution, we say...
$$\frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$$
Bivariate Regression: T values
As n gets larger, the t distribution gets closer to the N(0,1) distribution; the mean of the t distribution is always 0, and as n increases, the variance of the t distribution shrinks to 1.
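To see how quickly the variance shrinks toward 1, here is a small sketch using the standard result that a t distribution with ν degrees of freedom has variance ν / (ν − 2) when ν > 2. The function name `t_variance` is hypothetical.

```python
def t_variance(df):
    """Variance of a t distribution with df degrees of freedom (finite only for df > 2)."""
    if df <= 2:
        raise ValueError("the variance is only finite for df > 2")
    return df / (df - 2)


# The variance starts well above 1 and approaches 1 as df grows.
for df in (3, 5, 10, 30, 100, 1000):
    print(df, round(t_variance(df), 4))
```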
Bivariate Regression: T values
Our t value is calculated by setting μ to a hypothesized value (usually 0), and then taking our sample estimate, subtracting the hypothesized value, and dividing by the standard error. Note that this corresponds to the formula below (although instead of the mean of X, we will be using “b” as our sample / estimated slope).
$$\frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$$
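Applied to a slope, the same calculation might look like the sketch below. The estimate and its variance are hypothetical values, and `t_statistic` is an illustrative name.

```python
import math


def t_statistic(estimate, std_error, hypothesized=0.0):
    """(estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error


b = 0.6                    # hypothetical slope estimate
se_b = math.sqrt(0.08)     # hypothetical standard error of b
t_value = t_statistic(b, se_b)

print(round(t_value, 4))
```

With a hypothesized value of 0, this reduces to b divided by its standard error.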
Bivariate Regression: Hypothesis Testing
- If our 1−α confidence interval includes zero, then we do not reject (we “fail to reject”) our null hypothesis H0: β=0 at the 1−α confidence level (2-tailed test).
- Of course, if our 1−α CI does not include zero, then we reject H0: β=0 in favor of H1: β ≠ 0 at the 1−α confidence level.
Bivariate Regression: Prob-Values
- Prob-values (p-values) for slope coefficients are analogous to confidence intervals. Most computer packages will report a p-value for each slope coefficient.
- The universal decision rule indicates that we reject H0: β=0 with (1−α) confidence if p-value < α.
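Bivariate Regression: Prob-Values

| Pre-determined CI | 1−α | α | Reported p-value | Is p-value < α? | Conclusion |
|---|---|---|---|---|---|
| 90% | 1−.10 | .10 | .0374 | Yes | Reject H0 |
| 95% | 1−.05 | .05 | .0374 | Yes | Reject H0 |
| 98% | 1−.02 | .02 | .0374 | No | Fail to reject H0 |
| 99% | 1−.01 | .01 | .0374 | No | Fail to reject H0 |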
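The decision rule behind this table can be sketched in a few lines. The p-value of .0374 and the four α levels come from the slide; the function name `decide` is an assumption for illustration.

```python
def decide(p_value, alpha):
    """Apply the decision rule: reject H0 when the p-value is below alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


p_value = 0.0374  # reported p-value from the slide

decisions = {alpha: decide(p_value, alpha) for alpha in (0.10, 0.05, 0.02, 0.01)}

for alpha, conclusion in decisions.items():
    print(f"alpha = {alpha:.2f}: {conclusion}")
```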
Bivariate Regression: One-tailed versus two-tailed tests
- We use one-tailed tests when we have a directional hypothesis.
- One-tailed tests make parameter estimates more significant (when the estimate falls in the hypothesized direction), because you are restricting H1 to a narrower set of possibilities.
- In confidence intervals, the α remains the same, because you’ve picked a pre-determined confidence level, but you can think of the p-value (the area under the curve that lies beyond the t-value) as being halved.
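As a rough sketch of the halving, the function below converts a two-tailed p-value into a one-tailed one. It assumes a symmetric test distribution and a null hypothesis of β=0; the name `one_tailed_p` and the example numbers are hypothetical.

```python
def one_tailed_p(two_tailed_p, estimate, hypothesized_sign):
    """Convert a two-tailed p-value to a one-tailed p-value.

    hypothesized_sign is +1 for H1: beta > 0 and -1 for H1: beta < 0.
    The p-value is halved only when the estimate lies in the hypothesized
    direction; otherwise almost all of the area counts against H1.
    """
    in_hypothesized_direction = (estimate > 0) == (hypothesized_sign > 0)
    if in_hypothesized_direction:
        return two_tailed_p / 2
    return 1 - two_tailed_p / 2


# Hypothetical positive slope estimate with a two-tailed p-value of .0374.
p_expected_direction = one_tailed_p(0.0374, 0.6, +1)
p_wrong_direction = one_tailed_p(0.0374, 0.6, -1)
print(p_expected_direction, p_wrong_direction)
```

The second case is worth noticing: a one-tailed test does not help when the estimate has the opposite sign from the one hypothesized.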
Bivariate Regression: Summary
- We are estimating slopes and intercepts, and so we talk about the degree of confidence we have, based on our sample slopes and intercepts.
- That concept of “confidence”, “robustness”, “efficiency”, “stability” is part of inferential statistics.
- And, in general, a better fit of the data to the model, and more variance in the explanatory / independent variables, tends to make the findings more robust.
Bivariate Regression: Summary
- This makes sense: if you only ask a couple of people who they are voting for in the Democratic / Republican primaries, you will not have much variance, and you would not be confident if you tried to generalize to a larger population.
- And research problems where there isn’t much variance in the independent variables (or dependent variable), and where the dependent variable is a “rare event”, are just inherently difficult to predict (although there are ways to weight the observations so that one can address those issues).
Bivariate Regression: Summary
- Likewise, we are always going to be thinking about two possible problems. We can have deflated or inflated standard errors if we are violating OLS assumptions (so, our results are more or less significant than they would otherwise be).
- Or our results are biased, which means that the estimated slope would not average out to the true slope, even if one collected an infinite number of samples and an infinite number of estimated slopes.
Bivariate Regression: Summary
- These concepts of confidence and bias also carry over to all inferential methods.
- And, keep in mind that there is a difference between statistical significance (as signaled by p-values or t-values or confidence intervals) and the magnitude of b.
Bivariate Regression: Summary
- You may have a very small effect, but it is “statistically significant” because it is very robust (remember what goes into the t: the value of b, divided by the standard error of b).
- Or, you may have a very large effect, but cannot conclude that it is different from 0, because it is not very robust; you’re not that sure it would be large if you collected a different sample.
- These concepts, too, carry over across methods. It is very important to interpret both statistical significance and magnitude, and to recognize they are not the same.
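A quick numerical illustration of the difference between magnitude and significance, using made-up slope estimates and standard errors (none of these numbers come from the slides):

```python
def t_value(b, se_b):
    """t value for H0: beta = 0 is the slope estimate over its standard error."""
    return b / se_b


# A tiny effect that is estimated very precisely.
small_but_precise = t_value(0.02, 0.005)

# A large effect that is estimated very imprecisely.
large_but_noisy = t_value(5.0, 4.0)

print(small_but_precise, large_but_noisy)
```

The tiny effect has a t value near 4, while the large effect has a t value of only 1.25, so the smaller effect is the more statistically significant of the two.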
Bivariate Regression: Residuals
True model:
$$y_i = \alpha + \beta x_i + \varepsilon_i$$
Which is estimated with:
$$y_i = a + b x_i + e_i$$
- $\varepsilon_i$ is the true error term for observation i.
- $e_i$ is the estimated error (residual) for observation i.
Bivariate Regression: Residuals
So,
$$e_i = y_i - (a + b x_i)$$
Or: e_i = observed Y − predicted Y
Think of the error term in the population as not an error, but as a disturbance or a stochastic shock, whose deviation from the “true” population line is due to randomness.
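The residual calculation can be sketched directly from that definition. The data are made up, and the intercept and slope are the OLS estimates for those made-up data, which is why the residuals sum to (approximately) zero.

```python
# Hypothetical data and the OLS estimates that go with them.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

# Predicted Y from the estimated line, then residual = observed - predicted.
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

for xi, yi, pi, ei in zip(x, y, predicted, residuals):
    print(f"x={xi}  observed={yi}  predicted={pi:.1f}  residual={ei:+.1f}")
```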
Bivariate Regression: R²
- Notice that observed Y − mean of Y is the total deviation of Y_i from the mean of Y.
- Notice that predicted Y − mean of Y is the deviation of Y_i from the mean of Y explained by the OLS regression line.
- And notice that observed Y − predicted Y is the remaining unexplained deviation of Y_i from the mean of Y (error).
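Bivariate Regression: R²
(O = observed Y, P = predicted Y, M = mean of Y)

| Case (i) | Ordering | Total (observed − mean) | Explained (predicted − mean) | Unexplained / Error (observed − predicted) |
|---|---|---|---|---|
| 1 | P < O < M | | | |
| 2 | O < P < M | | | |
| 3 | P < M < O | | | |
| 4 | O < M < P | | | |
| 5 | M < P < O | | | |
| 6 | M < O < P | | | |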
Bivariate Regression: R²
Of course, in any sample we have n data points, so we’ll have:
- n total deviations
- n explained deviations
- n unexplained deviations
Bivariate Regression: R²
- Suppose we square each individual total, explained, and unexplained deviation.
- And then we sum up all of the squared total deviations, do the same for all the squared explained deviations, and the same for all the squared unexplained deviations.
- We would see that: Sum of the Squared Total Deviations = Sum of the Squared Explained Deviations + Sum of the Squared Unexplained Deviations
Bivariate Regression: R²
Or, TSS = RSS + ESS, where:
- TSS = Total Sum of Squares
- RSS = Regression Sum of Squares (Explained Sum of Squares)
- ESS = Error Sum of Squares (Residual Sum of Squares)
Bivariate Regression: R²
- And R² = RSS / TSS
- (Note that this is the same as 1 − ESS / TSS: one minus the proportion of the total deviation (TSS) of Y from the mean that is “unexplained” by OLS.)
- So, if R² = .34, we can say that 34% of the total variation in Y has been accounted for by the OLS regression of Y on X.
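The decomposition and the two equivalent forms of R² can be checked numerically. This sketch uses made-up data and follows the slides' naming, where RSS is the regression (explained) sum of squares and ESS is the error sum of squares; note that some textbooks swap those two abbreviations.

```python
# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Bivariate OLS fit.
x_bar = sum(x) / n
y_bar = sum(y) / n
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a = y_bar - b * x_bar
predicted = [a + b * xi for xi in x]

# Sums of squared total, explained, and unexplained deviations.
tss = sum((yi - y_bar) ** 2 for yi in y)
rss = sum((pi - y_bar) ** 2 for pi in predicted)
ess = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r2 = rss / tss
print(tss, rss, ess, r2, 1 - ess / tss)
```

For these numbers TSS is 6, RSS is about 3.6, and ESS is about 2.4, giving an R² of about 0.6 either way.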
Bivariate Regression: R²
- Why is R² useful?
- What are the limits of R²?
- It is not really a measure of the magnitude of the effect.
- It is a measure of correlation, and so it depends in part on the standard deviation of X and Y, and cannot be compared across samples.
- Models with high R² are not necessarily “good”, and models with low R² are not necessarily “bad”.
- R² can be biased, particularly in small samples.
- And it can be a reflection of the number of variables on the right-hand side (although there are ways to account for this, such as adjusted R²).
Bivariate Regression: R²
- The bottom line? R² is a measure of goodness of fit, and as such can be useful. It is not a measure of how good your results are.
- And, when you think about it, what the R² is doing is telling you how well the data fit the line: how good your prediction is compared to just using the mean. The mean isn’t a great predictor of Y, so the utility of the R² is limited.
Bivariate Regression: Standard Error of Estimate