
2 S010Y: Answering Questions with Quantitative Data
Class 12/III.4: Conducting Residual Analysis
© Willett, Harvard University Graduate School of Education, 6/14/2016, S010Y/C11 – Slide 1

Research Is A Partnership Of Questions And Data

What Types Of Data Are Collected? “Continuous” Data and “Categorical” Data.
What Kinds Of Question Can Be Asked Of Those Data?

Questions That Require Us To Describe Single Features of the Participants
“Continuous” Data:
- How tall are class members, on average?
- How many hours a week do class members report that they study?
- …. ?
“Categorical” Data:
- How many members of the class are women?
- What proportion of the class is fulltime?
- …. ?

Questions That Require Us To Examine Relationships Between Features of the Participants
“Continuous” Data:
- Do people who say they study for more hours also think they’ll finish their doctorate earlier?
- Are computer literates less anxious about statistics?
- …. ?
“Categorical” Data:
- Are men more likely to study part-time?
- Are women more likely to enroll in CCE?
- …. ?

3 Slide 2

Having examined the “smooth” with regression analysis, let’s examine the “rough” with residual analysis …

Here are the PC-SAS data input statements that you’ve come to know and love:

    OPTIONS Nodate Pageno=1;
    TITLE1 'A010Y: Answering Questions with Quantitative Data';
    TITLE2 'Class 11/Handout 1: Dissecting Relationships Between Continuous Variables';
    TITLE3 'The Infamous Wallchart Data';
    TITLE4 'Data in WALLCHT.txt';

    *--------------------------------------------------------------------------------*
      Input data, name and label variables in the dataset
    *--------------------------------------------------------------------------------*;
    DATA WALLCHT;
      INFILE 'C:\DATA\A010Y\WALLCHT.txt';
      INPUT STATE $ TCHRSAL STRATIO PPEXPEND HSGRADRT;
      LABEL TCHRSAL  = '1988 Average Teacher Salary'
            STRATIO  = '1988 Student/Teacher Ratio'
            PPEXPEND = '1988 Expenditure/Student'
            HSGRADRT = '1988 Statewide H.S. Graduation Rate';

Here’s the OLS regression analysis, using PROC REG, that you’ve seen before (with one additional line, the OUTPUT statement, that we will discuss later):

    *--------------------------------------------------------------------------------*
      Representing the nature of the relationship of HSGRADRT and STRATIO
    *--------------------------------------------------------------------------------*;
    PROC REG DATA=WALLCHT;
      TITLE5 'OLS Regression of H.S. Graduation Rate on Student/Teacher Ratio';
      MODEL HSGRADRT = STRATIO;
      OUTPUT OUT=DIAGNOSE R=RAWRESID P=PREDVAL;

And here’s the standard scatterplot of the HSGRADRT vs. STRATIO relationship:

    PROC PLOT DATA=WALLCHT;
      TITLE5 'Plot of H.S. Graduation Rates against Student/Teacher Ratios';
      PLOT HSGRADRT*STRATIO / HAXIS = 10 TO 25 BY 5 VAXIS = 50 TO 100 BY 10;

4 Slide 3

Here’s the regression output that you’ve seen before, and which specifies the fitted regression line:

Dependent Variable: HSGRADRT  1988 Statewide H.S. Graduation Rate

Parameter Estimates

Variable    Label                         DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   Intercept                      1             93.69187          7.95093     11.78     <.0001
STRATIO     1988 Student/Teacher Ratio     1             -1.12140          0.45516     -2.46     0.0174

These “Parameter Estimates” provide the fitted trend line as the following fitted model:

Predicted value of HSGRADRT = (93.69) + (-1.12)(STRATIO)

Here, 93.69 is the intercept and -1.12 is the slope.

5 Slide 4

Here’s the fitted regression model that you recognize:

Predicted value of HSGRADRT = (93.69) + (-1.12)(STRATIO)

The fitted equation is telling us PROC REG’s best prediction for HSGRADRT at every value of STRATIO. For instance:

1. When STRATIO = 13.3 (the minimum value of STRATIO),
   Predicted value of HSGRADRT = (93.69) + (-1.12)(13.3) = 93.69 – 14.90 = 78.8
2. When STRATIO = 24.7 (the maximum value of STRATIO),
   Predicted value of HSGRADRT = (93.69) + (-1.12)(24.7) = 93.69 – 27.66 = 66.0

Plot these values to obtain the fitted trend line.
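
The two hand calculations above can be reproduced with a few lines of code. This is only an illustrative sketch in Python rather than PC-SAS; the function name `predict_hsgradrt` is made up for the example, and the coefficients are the rounded values used on the slide (93.69 and -1.12), not the full-precision PROC REG estimates.

```python
# Rounded coefficients from the fitted model on the slide.
INTERCEPT = 93.69
SLOPE = -1.12


def predict_hsgradrt(stratio):
    """Predicted graduation rate at a given student/teacher ratio (hypothetical helper)."""
    return INTERCEPT + SLOPE * stratio


# The minimum and maximum observed values of STRATIO.
for stratio in (13.3, 24.7):
    print(f"STRATIO = {stratio}: predicted HSGRADRT = {predict_hsgradrt(stratio):.1f}")
```

Joining the two predicted points with a straight line gives the fitted trend line.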

6 Slide 5

This provides us with the “smooth” – where’s the “rough”? …

[Figure: PROC PLOT scatterplot of 1988 Statewide H.S. Graduation Rate (vertical axis, 50 to 100) against 1988 Student/Teacher Ratio (horizontal axis, 10 to 25), with the fitted trend line drawn through the points (13.3, 78.8) and (24.7, 66.0).]

Now, to examine the rough … Let’s pick a few states, and compare our predictions of HS graduation rate to the actual observed values. We call this the “analysis of residuals” …

7 Slide 6

Here’s the “rough” for Minnesota …

[Figure: the same scatterplot of 1988 Statewide H.S. Graduation Rate against 1988 Student/Teacher Ratio, with Minnesota’s observed point (17.1, 90.9) and its predicted value marked on the fitted line.]

How about Minnesota?
Observed values of the outcome and the predictor: STRATIO = 17.1, & HSGRADRT = 90.9
Predicted value of HSGRADRT, obtained from fitted regression line: (93.69) + (-1.12)(17.1) = 93.69 – 19.15 = 74.5

Minnesota graduated a higher percentage of its High-School Seniors than we would have predicted, given its student/teacher ratio.

8 Slide 7

Here’s the “rough” for Hawaii …

[Figure: the same scatterplot of 1988 Statewide H.S. Graduation Rate against 1988 Student/Teacher Ratio, with Hawaii’s observed point (21.6, 69.1) and its predicted value marked on the fitted line.]

How about Hawaii?
Observed values of the outcome and the predictor: HSGRADRT = 69.1, & STRATIO = 21.6
Predicted value of HSGRADRT: (93.69) + (-1.12)(21.6) = 93.69 – 24.19 = 69.5

Hawaii graduated about the percentage of its High-School Seniors that we would have predicted, given its student/teacher ratio.

9 Slide 8

Here’s the “rough” for New York State …

[Figure: the same scatterplot of 1988 Statewide H.S. Graduation Rate against 1988 Student/Teacher Ratio, with New York State’s observed point (15.2, 62.3) and its predicted value marked on the fitted line.]

How about New York State?
Observed values of the outcome and the predictor: HSGRADRT = 62.3, & STRATIO = 15.2
Predicted value of HSGRADRT: (93.69) + (-1.12)(15.2) = 93.69 – 17.02 = 76.7

New York State graduated a much smaller percentage of its High-School seniors than we would have predicted, based on its student/teacher ratio.

10 Slide 9

On a scatterplot with a fitted regression line, the “vertical distance” between the observed value of HSGRADRT and its predicted value is called the residual:

State             Residual Computation         Conclusion: State graduated HS seniors at a rate that is …
Minnesota         (90.90 – 74.54) =  16.36     … better than predicted, based on STRATIO
Hawaii            (69.10 – 69.50) =  -0.40     … about as predicted, based on STRATIO
New York State    (62.30 – 76.67) = -14.37     … worse than predicted, based on STRATIO
etc.

Residuals can be informative and useful:
- Residuals represent individual deviations from the average trend.
- They tell us about HSGRADRT, while taking “into account” or “controlling for” STRATIO. They tell us whether states are doing “better” or “worse” than we would have predicted, given our knowledge of their student/teacher ratio.
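
The residual computations in the table can be checked with a short script. This is a sketch in Python, not part of the PC-SAS handout; the dictionary name `OBSERVED` and the helper `residual` are invented for the illustration, and the rounded coefficients (93.69 and -1.12) are used, so results can differ in the second decimal place from the full-precision PROC REG values.

```python
# Rounded coefficients from the fitted model.
INTERCEPT = 93.69
SLOPE = -1.12

# Observed (STRATIO, HSGRADRT) pairs quoted on the preceding slides.
OBSERVED = {
    "Minnesota": (17.1, 90.9),
    "Hawaii": (21.6, 69.1),
    "New York State": (15.2, 62.3),
}


def residual(stratio, hsgradrt):
    """Residual = observed value minus predicted value (hypothetical helper)."""
    predicted = INTERCEPT + SLOPE * stratio
    return hsgradrt - predicted


for state, (stratio, hsgradrt) in OBSERVED.items():
    print(f"{state}: residual = {residual(stratio, hsgradrt):.2f}")
```

A positive residual means the state did better than the trend line predicts; a negative residual means it did worse.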

11 Slide 10

We don’t have to compute the residuals and predicted values by hand … You can ask PC-SAS to compute the residuals for you, and to output them into a diagnostic dataset, for you to explore.

    *------------------------------------------------------------------------*
      Representing the nature of the relationship of HSGRADRT and STRATIO
    *------------------------------------------------------------------------*;
    PROC REG DATA=WALLCHT;
      TITLE5 'OLS Regression of H.S. Graduation Rate on Student/Teacher Ratio';
      MODEL HSGRADRT = STRATIO;
      OUTPUT OUT=DIAGNOSE R=RAWRESID P=PREDVAL;

The three options on the OUTPUT statement work as follows:
- OUT = DIAGNOSE: the OUT option tells PC-SAS that you want to create an OUTput dataset called DIAGNOSE.
- R = RAWRESID: the R option tells PC-SAS that you want to put “raw residuals” into the new output dataset, and call them RAWRESID.
- P = PREDVAL: the P option tells PC-SAS that you also want to put the predicted values into the new output dataset, and call them PREDVAL.

12 Slide 11

Once the residuals and predicted values are output to the DIAGNOSE dataset, you can take a look …

You can use PROC UNIVARIATE to explore the sample distribution of the raw residuals across the states, and PROC PLOT to plot the raw residuals against the predictor:

    *-------------------------------------------------------------------------------*
      Examining the distribution of the raw residuals
    *-------------------------------------------------------------------------------*;
    PROC UNIVARIATE PLOT DATA=DIAGNOSE;
      TITLE5 'Univariate descriptive statistics on the Raw Residuals';
      VAR RAWRESID;
      ID STATE;
    PROC PLOT DATA=DIAGNOSE;
      TITLE5 'Plot of the Raw Residuals against the Values of the Predictor, STRATIO';
      PLOT RAWRESID*STRATIO / HAXIS = 10 TO 25 BY 5 VREF = 0;

You can use PROC SORT to sort the states by the value of their raw residual, and then use PROC PRINT to list them all out for inspection, along with the name of the state, and the observed and predicted values of HSGRADRT:

    *-------------------------------------------------------------------------------*
      Reranking the States based on the value of their raw residuals
    *-------------------------------------------------------------------------------*;
    PROC SORT DATA=DIAGNOSE;
      BY DESCENDING RAWRESID;
    PROC PRINT LABEL DATA=DIAGNOSE;
      TITLE5 'Listing of State Observed, Predicted and Residual Graduation Rates';
      VAR STATE HSGRADRT PREDVAL RAWRESID;

13 Slide 12

Here are some of the univariate descriptive statistics on the residuals …

Variable: RAWRESID (Residual)

N                 50            Sum Weights        50
Mean              0             Sum Observations   0
Std Deviation     7.38040638    Variance           54.4703983

Basic Statistical Measures

Location                  Variability
Mean      0.00000         Std Deviation           7.38041
Median   -0.27000         Variance               54.47040
Mode      .               Range                  32.56358
                          Interquartile Range     8.69773

Quantile       Estimate
100% Max      16.384021
95%           12.101925
75% Q3         4.760352
50% Median    -0.269997
25% Q1        -3.937376
5%           -11.733883
0% Min       -16.179560

Extreme Observations

-----------Lowest-----------     -----------Highest----------
Value      STATE   Obs           Value      STATE   Obs
-16.1796   FL        9           10.8684    WY       50
-14.3466   NY       32           11.3262    MT       26
-11.7339   AZ        3           12.1019    ND       34
-11.7217   GA       10           13.4066    UT       44
-11.5460   LA       18           16.3840    MN       23

- Sample mean of the raw residuals is exactly zero!
- Sample standard deviation of the raw residuals is 7.4. This number can be quite useful!
- Listing of “extreme observations” is useful for identifying states whose observed values of HSGRADRT are wildly different from their predicted values.
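
The zero mean is not a coincidence: whenever an OLS line is fitted with an intercept, the residuals sum to zero. The sketch below illustrates this in Python with only the three states whose STRATIO and HSGRADRT values are quoted on the earlier slides. It is an assumption-laden toy: a line fitted to three states is not the 50-state PROC REG fit, so its slope and intercept differ from -1.12 and 93.69, and all variable names are invented for the example.

```python
# (STRATIO, HSGRADRT) for Minnesota, Hawaii and New York State, from the slides.
stratio = [17.1, 21.6, 15.2]
hsgradrt = [90.9, 69.1, 62.3]

n = len(stratio)
x_bar = sum(stratio) / n
y_bar = sum(hsgradrt) / n

# Closed-form OLS estimates for a one-predictor model with an intercept.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(stratio, hsgradrt))
sxx = sum((x - x_bar) ** 2 for x in stratio)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Raw residuals: observed minus predicted.
residuals = [y - (intercept + slope * x) for x, y in zip(stratio, hsgradrt)]
mean_residual = sum(residuals) / n

# The mean is zero up to floating-point rounding error.
print(f"mean residual = {mean_residual:.10f}")
```

Because the intercept is chosen so that the line passes through the point of means, positive and negative residuals always cancel.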

14 Slide 13

Here’s the stem-and-leaf plot and boxplot of the residuals …

Stem   Leaf        #   Boxplot
 16    4           1      |
 14                       |
 12    14          2      |
 10    93          2      |
  8    646         3      |
  6    111         3      |
  4    89          2   +-----+
  2    2779938     7   |     |
  0    6722        4   |  +  |
 -0    6566442     7   *-----*
 -2    9870641     7   +-----+
 -4    73          2      |
 -6    16          2      |
 -8    808         3      |
-10    775         3      |
-12                       |
-14    3           1      |
-16    2           1      |
       ----+----+----+--

Actually, for the p-values that were computed in the regression analysis to be correct, the residuals must be normally distributed:
- You can use stem-and-leaf plots and boxplots to check roughly if this assumption holds in your analysis … see S-030.

15 Slide 14

Here are the individual states, ordered by their residuals …

         H.S.          Predicted
         Graduation    Value of
STATE    Rate          HSGRADRT     Residual
MN       90.9          74.5160       16.3840
------------------------------------------------ +2 sd
UT       79.4          65.9934       13.4066
ND       88.3          76.1981       12.1019
MT       87.3          75.9738       11.3262
WY       88.3          77.4316       10.8684
IA       85.8          76.1981        9.6019
WI       84.9          75.5252        9.3748
NE       85.4          76.7588        8.6412
------------------------------------------------ +1 sd
CT       84.9          78.7773        6.1227
OH       79.6          73.5067        6.0933
WA       77.1          71.0396        6.0604
ID       75.4          70.4789        4.9211
NV       75.8          71.0396        4.7604
KS       80.2          76.4224        3.7776
SD       79.6          76.3102        3.2898
PE       78.4          75.5252        2.8748
AL       74.9          72.0489        2.8511
AR       77.2          74.5160        2.6840
IN       76.3          73.6189        2.6811
MI       73.6          71.3761        2.2239
IL       75.6          74.4038        1.1962
CO       74.7          73.5067        1.1933
WV       77.3          76.6466        0.6534
VT       78.7          78.1044        0.5956
OR       73.0          73.1703       -0.1703
HI       69.1          69.4697       -0.3697
MD       74.1          74.5160       -0.4160
NJ       77.4          77.9923       -0.5923
NM       71.9          72.4975       -0.5975
MO       74.0          75.5252       -1.5252
NH       74.1          75.7495       -1.6495
CA       65.9          68.0119       -2.1119
TN       69.3          71.7125       -2.4125
ME       74.4          76.9831       -2.5831
OK       71.7          74.7403       -3.0403
MA       74.4          78.1044       -3.7044
VA       71.6          75.4131       -3.8131
DL       71.7          75.6374       -3.9374
KY       69.0          73.2824       -4.2824
MS       66.9          72.6096       -5.7096
NC       66.7          73.2824       -6.5824
RI       69.8          76.8709       -7.0709
------------------------------------------------ -1 sd
AK       65.5          74.2917       -8.7917
TX       65.3          74.2917       -8.9917
SC       64.6          74.4038       -9.8038
LA       61.4          72.9460      -11.5460
GA       61.0          72.7217      -11.7217
AZ       61.1          72.8339      -11.7339
NY       62.3          76.6466      -14.3466
------------------------------------------------ -2 sd
FL       58.0          74.1796      -16.1796

Which are the truly extraordinary states?
- If the residuals are normally distributed, then the truly extraordinary states may be those that lie more than ±2 standard deviations (= ± 2×7.4) from the mean.
- Recall that the mean of the residuals is zero.
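
The ±2 standard deviation rule can be applied mechanically to the listed residuals. The following is an illustrative Python sketch, not part of the PC-SAS handout: the residuals are transcribed from the listing on this slide (state codes kept exactly as printed), while the names `RESIDUALS`, `extraordinary`, `within_1sd` and `within_2sd` are invented for the example. It also counts how many states fall within ±1 and ±2 standard deviations, which gives a rough check against the roughly 68% and 95% expected under normality.

```python
import statistics

# Raw residuals by state, transcribed from the listing above.
RESIDUALS = {
    "MN": 16.3840, "UT": 13.4066, "ND": 12.1019, "MT": 11.3262, "WY": 10.8684,
    "IA": 9.6019, "WI": 9.3748, "NE": 8.6412, "CT": 6.1227, "OH": 6.0933,
    "WA": 6.0604, "ID": 4.9211, "NV": 4.7604, "KS": 3.7776, "SD": 3.2898,
    "PE": 2.8748, "AL": 2.8511, "AR": 2.6840, "IN": 2.6811, "MI": 2.2239,
    "IL": 1.1962, "CO": 1.1933, "WV": 0.6534, "VT": 0.5956, "OR": -0.1703,
    "HI": -0.3697, "MD": -0.4160, "NJ": -0.5923, "NM": -0.5975, "MO": -1.5252,
    "NH": -1.6495, "CA": -2.1119, "TN": -2.4125, "ME": -2.5831, "OK": -3.0403,
    "MA": -3.7044, "VA": -3.8131, "DL": -3.9374, "KY": -4.2824, "MS": -5.7096,
    "NC": -6.5824, "RI": -7.0709, "AK": -8.7917, "TX": -8.9917, "SC": -9.8038,
    "LA": -11.5460, "GA": -11.7217, "AZ": -11.7339, "NY": -14.3466, "FL": -16.1796,
}

# Sample standard deviation (n - 1 denominator), about 7.4 as on the earlier slide.
sd = statistics.stdev(list(RESIDUALS.values()))

# The mean residual is zero, so the cutoffs are simply +/- 2 * sd.
extraordinary = [state for state, r in RESIDUALS.items() if abs(r) > 2 * sd]

within_1sd = sum(1 for r in RESIDUALS.values() if abs(r) <= sd)
within_2sd = sum(1 for r in RESIDUALS.values() if abs(r) <= 2 * sd)

print(f"sd = {sd:.2f}")
print("beyond 2 sd:", extraordinary)  # MN and FL
print(f"within 1 sd: {within_1sd} of {len(RESIDUALS)}")
print(f"within 2 sd: {within_2sd} of {len(RESIDUALS)}")
```

Only Minnesota and Florida fall outside the ±2 sd band; New York State comes close but stays inside it.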

16 Slide 15

[Figure: PROC PLOT scatterplot of 1988 Statewide H.S. Graduation Rate (vertical axis, 50 to 100) against 1988 Student/Teacher Ratio (horizontal axis, 10 to 25).]

An Enhanced Conclusion …

In our investigation of state-level aggregate statistics, the average percentage of seniors graduating from High School is related to the average student/teacher ratio in the state. With state-wide high-school graduation rate (HSGRADRT) as outcome and state-wide student/teacher ratio (STRATIO) as predictor, the trend-line estimated by OLS regression analysis has a slope of –1.12 (p = 0.0174). This suggests that two states whose student/teacher ratios differ by 1 student per teacher will tend to have graduation rates that differ by 1.12 percentage points, with states that enjoy lower student/teacher ratios having higher high-school graduation rates …

However, not all states follow the average trend. Some states graduate high-school seniors at rates considerably different from those predicted from knowledge of their student/teacher ratios. In particular, Minnesota has a very large positive residual, indicating that its high-school graduation rate is much higher than we would expect, based on its student/teacher ratio. Florida, on the other hand, has a very large negative residual, indicating that it is graduating high-school seniors at a rate that is much lower than we would anticipate …

