Introduction to Statistics: Political Science (Class 2) Central Limit Theorem, T-statistics, and using split sample analysis and multivariate regression.

1 Introduction to Statistics: Political Science (Class 2) Central Limit Theorem, T-statistics, and using split sample analysis and multivariate regression to deal with confounds

2 Today… A review of what standard errors and T-statistics tell us Multivariate regression

3 The goal of statistical analysis? We want to know: *true* “population” mean or relationship What we have: sample of the units we are interested in Thus we estimate the mean or relationship –What is an estimate?

4 Actually we estimate 2 things Estimate of mean or relationship –We know how to get this (calculate the mean or find the best fit line) Estimate of uncertainty –Often (typically?): How confident can we be that a mean or relationship is not zero –We can’t measure our uncertainty directly (we’re uncertain – duh!)

5 In repeated sampling (if we redrew over and over and over and recalculated)… – the average of the estimates will be centered on the population (“true”) mean –the distribution of estimates will be approximately normal… The Central Limit Theorem

6 Like this This width depends on: 1. Variance in population (more → wider) 2. Number of cases sampled (more → narrower)
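
The two claims on this slide can be checked with a small simulation. The sketch below is only an illustration under made-up assumptions: a normally distributed population with mean 5.0, and the helper name `sampling_distribution` is hypothetical. It redraws samples many times and reports where the sample means are centered and how widely they spread, next to the textbook value sd / sqrt(n).

```python
import random
import statistics

def sampling_distribution(pop_sd, n, draws=1000, pop_mean=5.0, seed=0):
    """Redraw `draws` samples of size n and return the list of sample means.

    The normal population and its parameters are assumptions for illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.gauss(pop_mean, pop_sd) for _ in range(n)])
        for _ in range(draws)
    ]

# Baseline, then more population variance, then more cases sampled.
for pop_sd, n in [(1.5, 100), (3.0, 100), (1.5, 400)]:
    means = sampling_distribution(pop_sd, n)
    center = statistics.fmean(means)      # should sit near the population mean
    spread = statistics.stdev(means)      # width of the sampling distribution
    theory = pop_sd / n ** 0.5            # standard error of the mean
    print(f"sd={pop_sd}, n={n}: center={center:.3f}, "
          f"spread={spread:.3f}, theory={theory:.3f}")
```

Doubling the population standard deviation should roughly double the spread, while quadrupling the sample size should roughly halve it.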

7 Coin toss

8 Mean ideology of the American public How would you rate yourself on the following scale? 1.Very Liberal 2.Liberal 3.Somewhat Liberal 4.Middle of the Road 5.Somewhat Conservative 6.Conservative 7.Very Conservative If we were omniscient (or could ask every single person) we would know that the true average is 5.0 but we’re not/we can’t… Instead we call 100 people at random… and then we do that again and again…

9 Estimating Mean Ideology Columns: Sample, Mean, SE, LB (Mean - 2 SEs), UB (Mean + 2 SEs). In any given sample we would be about 95% confident that the true population mean was somewhere within this range

10 One Standard Error 5.0 Another way to think about this is that 95% of the time, our estimates of the mean will be within about +/- two standard errors of the population value
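
The "call 100 people at random, again and again" exercise can be sketched as follows. The response shares in `WEIGHTS` are hypothetical, chosen only so that the population mean on the 7-point scale is exactly 5.0 as on the slide; the function names are likewise made up. Each simulated poll reports a mean, a standard error, and the mean plus or minus two standard errors, and the last step counts how often that range contains the true mean.

```python
import random
import statistics

SCALE = [1, 2, 3, 4, 5, 6, 7]          # 1 = Very Liberal ... 7 = Very Conservative
WEIGHTS = [2, 5, 8, 18, 25, 27, 15]    # hypothetical shares (percent), not real data
TRUE_MEAN = sum(s * w for s, w in zip(SCALE, WEIGHTS)) / sum(WEIGHTS)

def one_poll(rng, n=100):
    """Return (mean, SE, lower bound, upper bound) for one random sample of n people."""
    sample = rng.choices(SCALE, weights=WEIGHTS, k=n)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return mean, se, mean - 2 * se, mean + 2 * se

def coverage(polls=2000, seed=1):
    """Share of polls whose mean +/- 2 SE range contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(polls):
        _, _, lb, ub = one_poll(rng)
        hits += lb <= TRUE_MEAN <= ub
    return hits / polls

rng = random.Random(1)
for i in range(1, 4):
    mean, se, lb, ub = one_poll(rng)
    print(f"sample {i}: mean={mean:.2f} SE={se:.2f} LB={lb:.2f} UB={ub:.2f}")
print(f"share of ranges containing {TRUE_MEAN}: {coverage():.3f}")
```

The coverage share should land close to 0.95, which is the sense in which any single range earns "about 95%" confidence.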

11 Same idea with regression coefficient If we were able to redraw new samples over and over and re-estimate β… Typically (always for our purposes here) we’re testing whether a coefficient = 0 Coef SE Coef T P Democracy Scores Constant So T can be thought of as: how many SEs the coefficient is from 0
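
To make "T is the number of SEs from 0" concrete, here is a minimal sketch of a bivariate regression computed by hand. The data are simulated with a made-up true slope of 0.5 (they are not the slide's democracy data), and `bivariate_ols` is a hypothetical helper name.

```python
import random

def bivariate_ols(x, y):
    """Return (slope, SE of slope, t-statistic) for the line y = b0 + b1*x + u."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used to estimate the error variance.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return slope, se, slope / se

rng = random.Random(2)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]   # simulated, true slope 0.5

slope, se, t = bivariate_ols(x, y)
print(f"coef={slope:.3f}  SE={se:.3f}  T={t:.2f}  (T = coef / SE)")
```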

12 0 If the true relationship was 0 (no relationship), getting an estimated coefficient with a T-value with an absolute value greater than by chance would be extremely unlikely (about 1 in 1,000,000,000,000,000,000,000,000,000,000) So we can be confident in rejecting the null hypothesis (What’s the null? Why do we set things up this way?) T = 11.34T =

13 1 v. 2-tailed tests 1-tailed: You have strong prior expectations about direction of relationship (if relationship turns out to be in the other direction you can’t reject the null – even w/a large t-statistic) 2-tailed: No strong priors about direction of relationship – more conservative test
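
The difference between the two tests shows up directly in the p-values. The sketch below uses the standard normal distribution as a large-sample stand-in for the t distribution, which is an assumption that is reasonable only when the sample is large; the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # large-sample approximation to the t distribution

def p_two_tailed(t):
    """Chance of a T at least this far from 0 in either direction, if the null is true."""
    return 2 * (1 - STD_NORMAL.cdf(abs(t)))

def p_one_tailed(t, expected_sign=1):
    """Chance of a T at least this far in the expected direction, if the null is true."""
    return 1 - STD_NORMAL.cdf(expected_sign * t)

for t in (1.7, 1.96, -3.0):
    print(f"T={t:>5}: two-tailed p={p_two_tailed(t):.4f}, "
          f"one-tailed p (expecting positive)={p_one_tailed(t):.4f}")
```

With T = 1.7 the one-tailed p-value falls below 0.05 while the two-tailed one does not, which is why the two-tailed test is the more conservative choice. With T = -3.0 and a positive expectation, the one-tailed p-value is close to 1, so the null cannot be rejected despite the large t-statistic.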

14 Causal relationships Identifying associations is nice, but usually we want to identify causality Two primary threats –Reverse causation (we’ll table this for now and talk about it in a few weeks) –Confounding variables Need to rule out alternative explanations

15 Bush was particularly unpopular at the end of his presidency… How much did bad feelings about Bush help Obama? Feelings about Obama Feelings about Bush ?

16 Measuring “reverse coattails” effect …I'll read the name of a person and I'd like you to rate that person using something we call the feeling thermometer. Ratings between 50 degrees and 100 degrees mean that you feel favorable and warm toward the person. Ratings between 0 degrees and 50 degrees mean that you don't feel favorable toward the person and that you don't care too much for that person. You would rate the person at the 50 degree mark if you don't feel particularly warm or cold toward the person. Bivariate regression Υ = β 0 + β 1 X + u SO… Obama FT = β 0 + β 1 (Bush FT) + u

17 Obama FT = (-0.43*Bush FT) Coef. SE T P-value Bush FT Constant R-squared = 0.203

18 What else might explain this (strong!) relationship? Other factors that might affect evaluations of both Obama and Bush?

19 Party Identification? Obama Feeling Thermometer Bush Feeling Thermometer Party Identification

20 Generally speaking, do you usually think of yourself as a Democrat, a Republican, an Independent, or what? -3 = Strong Republican -2 = Weak Republican -1 = Lean Republican 0 = Independent 1 = Lean Democrat 2 = Weak Democrat 3 = Strong Democrat

21 Party Identification → FTs Predict Bush Feeling Thermometer Coef. SE T P-value Party Identification Constant Predict Obama Feeling Thermometer Coef. SE T P-value Party Identification Constant

22 Accounting for a confound by splitting the sample… Among Democrats: –Mean evaluation of Bush: 24.7 –Mean evaluation of Obama: 79.2 Among Republicans: –Mean evaluation of Bush: 65.9 –Mean evaluation of Obama: 35.5 Let’s see what happens when we run separate regressions for Democrats and Republicans…

23 Model with all respondents Obama FT = (-0.43*Bush FT)
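
A rough sketch of why splitting the sample matters, using simulated respondents rather than the survey data. All numbers are assumptions: group means loosely echo the slide (Democrats cool toward Bush and warm toward Obama, Republicans the reverse), the within-party effect of Bush FT is set to -0.15, and ratings are not clipped to the 0 to 100 range. The helper names are hypothetical.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

def simulate_group(rng, n, bush_mean, obama_mean, effect=-0.15):
    """Simulate (Bush FT, Obama FT) pairs for one party; all parameters are made up."""
    rows = []
    for _ in range(n):
        bush = bush_mean + rng.gauss(0, 15)
        obama = obama_mean + effect * (bush - bush_mean) + rng.gauss(0, 15)
        rows.append((bush, obama))
    return rows

def fit(rows):
    return slope([b for b, _ in rows], [o for _, o in rows])

rng = random.Random(4)
dems = simulate_group(rng, 1000, bush_mean=25, obama_mean=79)
reps = simulate_group(rng, 1000, bush_mean=66, obama_mean=35)

pooled = fit(dems + reps)   # all respondents together
dem_slope = fit(dems)       # Democrats only
rep_slope = fit(reps)       # Republicans only
print(f"all respondents: {pooled:.2f}")
print(f"Democrats only:  {dem_slope:.2f}")
print(f"Republicans only: {rep_slope:.2f}")
```

In this simulation the pooled slope is much steeper than either within-party slope, because the pooled line also picks up the gap between the two parties.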

24 Party ID as Confound Obama Feeling Thermometer (Y) Bush Feeling Thermometer (X) Party Identification (Z) We only want to give Bush FT explanatory “credit” for this part of the relationship Not this part

25 Multivariate Regression Υ = β 0 + β 1 X 1 + β 2 X 2 + u Obama FT = β 0 + β 1 (Bush FT) + β 2 (Party Identification) + u (party identification: -3 = strong Republican; 3 = strong Democrat)

26 Language: relationship between X 1 and Y controlling for X 2 (OR holding X 2 constant) (more precisely: “controlling for the linear relationship between X 2 and Y”) Multivariate Regression Coef. St.Err T P Bush FT Party Identification Constant

27 Party Affiliation Bush Feeling Thermometer Obama Feeling Thermometer Party Affiliation only gets “credit” for this part of the overlap Bush FT only gets “credit” for this part of the overlap No variable gets “credit” for this part, (but it does affect the R-squared) Bivariate regression: Bush FT gets “credit” for all of this overlap
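
One way to see what "credit" means here is to strip out of Bush FT everything that party identification predicts, and then regress Obama FT on what is left. The sketch below does that on simulated data (all parameters are made up, with a true Bush FT effect of -0.15 and a party effect of 7) and compares the result with the bivariate slope and with a direct two-predictor least-squares solution. Function names are hypothetical.

```python
import random

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def two_predictor_slopes(x1, x2, y):
    """Solve the centered normal equations for y = b0 + b1*x1 + b2*x2 + u."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Simulated respondents: party ID drives both thermometers (the confound).
rng = random.Random(3)
pid = [rng.randint(-3, 3) for _ in range(2000)]   # -3 strong Rep ... 3 strong Dem
bush = [45 - 7 * p + rng.gauss(0, 15) for p in pid]
obama = [65 + 7 * p - 0.15 * b + rng.gauss(0, 15) for p, b in zip(pid, bush)]

# Bivariate: Bush FT gets credit for all of its overlap with Obama FT.
_, bivariate = fit_line(bush, obama)

# Controlling for party ID: keep only the part of Bush FT that party ID
# does not predict, then regress Obama FT on that leftover part.
a, c = fit_line(pid, bush)
bush_resid = [b - (a + c * p) for b, p in zip(bush, pid)]
_, controlled = fit_line(bush_resid, obama)

b1, b2 = two_predictor_slopes(bush, pid, obama)
print(f"bivariate Bush FT coefficient:      {bivariate:.3f}")
print(f"Bush FT coefficient, residualized:  {controlled:.3f}")
print(f"Bush FT coefficient, multivariate:  {b1:.3f}")
print(f"Party ID coefficient, multivariate: {b2:.3f}")
```

The residualized and multivariate coefficients on Bush FT agree, and both are much closer to zero than the bivariate coefficient.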

28 Obama FT = β 0 + β 1 (Bush FT) + β 2 (Party Identification) + u Getting predicted values Coef. St.Err T P Bush FT Party Identification Constant

29 Obama FT = (-.165)(Bush FT) (Party Identification) + u What does the coefficient on the constant mean? Expected Value for a Strong Democrat who gave Bush a feeling thermometer rating of 50? Getting predicted values Coef. St.Err T P Bush FT Party Identification Constant
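
Predicted values come from plugging a respondent's values into the fitted equation. In the sketch below only the Bush FT coefficient (-0.165) comes from the slide; the constant and the party identification coefficient did not survive in the transcript, so `CONSTANT = 60.0` and `B_PARTY = 8.0` are hypothetical placeholders, and the printed numbers are illustrative only.

```python
B_BUSH = -0.165     # from the slide
CONSTANT = 60.0     # hypothetical placeholder, not the estimated value
B_PARTY = 8.0       # hypothetical placeholder, not the estimated value

def predict_obama_ft(bush_ft, party_id):
    """Predicted Obama FT; party_id runs from -3 (strong Republican) to 3 (strong Democrat)."""
    return CONSTANT + B_BUSH * bush_ft + B_PARTY * party_id

# The constant is the prediction when every explanatory variable is 0:
# an Independent (party_id = 0) who rated Bush at 0.
print(predict_obama_ft(bush_ft=0, party_id=0))

# A Strong Democrat (party_id = 3) who rated Bush at 50.
print(predict_obama_ft(bush_ft=50, party_id=3))
```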

30 Notes and Next Time No Class on Tuesday Remember to look at the homework assignment in time to get TA office hour help before it’s due next Thursday! Next time: –R-squared –Non-continuous explanatory variables –Joint significance of variables (F-tests)

