
1 Using evidence-based decision trees instead of formulas to identify at-risk readers Sharon Koon Yaacov Petscher Barbara R. Foorman Florida Center For Reading Research

2 This information is being provided as part of a presentation administered by the Regional Educational Laboratory Southeast. Information and materials mentioned or shown during this presentation are provided as resources and examples for the viewer's convenience. Their inclusion is not intended as an endorsement by the Regional Educational Laboratory Southeast or its funding source, the Institute of Education Sciences (Contract ED-IES-12-C-0011). In addition, the instructional practices and assessments discussed or shown in these presentations are not intended to mandate, direct, or control a State’s, local educational agency’s, or school’s specific instructional content, academic achievement system and assessments, curriculum, or program of instruction. State and local programs may use any instructional content, achievement system and assessments, curriculum, or program of instruction they wish.

3 REL Southeast http://ies.ed.gov/ncee/edlabs/projects/

4 Overview Educators need to understand how students are identified as at risk for reading problems. Early warning systems (EWS), or diagnostic systems, can be used for this purpose. EWS provide opportunities for interventions that may prevent an anticipated negative outcome (e.g., retention, dropping out). Methods for developing EWS include logistic regression (LR) and classification and regression tree (CART) analysis. While the CART model is an emerging tool in the field of education, limited research exists on its comparability to LR when using multivariate assessments of reading to screen for reading difficulties.

5 Research question This study used data from a sample of students in grades 1 and 2 in Florida public schools during the 2012/13 academic year to answer the following research question: How do CART models compare with logistic regression models in predicting poor performance on the reading comprehension subtest of the Stanford Achievement Test?

6 Logistic Regression (LR)

7 LR (continued) Log-odds for the likelihood of achieving one of the two categories of the selected outcome (e.g., pass/fail a test) can be transformed into probabilities. When there are one or two predictors, contingency tables present a simple way to display the classification rules based on probabilities. Example probability table (Test 1 scores in rows, Test 2 scores in columns):

Test 1   242    243    244    245    246
348      0.15   0.30   0.60   0.80
349      0.15   0.30   0.60   0.80
350      0.30   0.60   0.80
351      0.60   0.80
352      0.80
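The log-odds-to-probability transformation is the standard logistic function. A minimal sketch (the intercept and slopes here are hypothetical, for illustration only, not estimates from this study):

```python
import math

def logit_to_probability(logit):
    """Transform a log-odds value into a probability: p = 1 / (1 + e^(-logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical coefficients for a two-predictor model (illustration only).
INTERCEPT, B_TEST1, B_TEST2 = -30.0, 0.05, 0.04

def pass_probability(test1_score, test2_score):
    """Predicted probability of passing, given scores on Test 1 and Test 2."""
    logit = INTERCEPT + B_TEST1 * test1_score + B_TEST2 * test2_score
    return logit_to_probability(logit)
```

Evaluating `pass_probability` over a grid of score pairs produces a probability lookup table like the one on this slide.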

8 LR (continued) Sample 2 x 2 classification table:

                       Outcome assessment
Screening assessment   Fail                 Pass
At risk                A: True positive     B: False positive
Not at risk            C: False negative    D: True negative

Measures of classification accuracy:
– Sensitivity: the proportion of true positives; A/(A+C)
– Specificity: the proportion of true negatives; D/(D+B)
– Positive predictive power: the proportion of students who are identified as at risk on the screening assessment and fail the outcome assessment; A/(A+B)
– Negative predictive power: the proportion of students identified as not at risk on the screening assessment who pass the outcome assessment; D/(C+D)
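These four measures can be computed directly from the cell counts of the 2 x 2 table; a small sketch:

```python
def classification_accuracy(a, b, c, d):
    """Compute classification accuracy measures from a 2 x 2 table.

    a: true positives  (at risk on screen, fail outcome)
    b: false positives (at risk on screen, pass outcome)
    c: false negatives (not at risk on screen, fail outcome)
    d: true negatives  (not at risk on screen, pass outcome)
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (d + b),
        "positive_predictive_power": a / (a + b),
        "negative_predictive_power": d / (c + d),
        "overall_proportion_correct": (a + d) / (a + b + c + d),
    }
```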

9 LR (continued) Primary Advantages Widely accepted approach to predicting future performance on similar samples of students. Offers tests of statistical significance of the predictors. Primary Disadvantage Regression equations are often difficult for practitioners to understand and apply.

10 Classification and Regression Tree (CART) CART uses a set of “if-then” clauses, in the form of a decision tree, to classify individuals; these rules can easily be applied by practitioners. Example: For two measures, each with a score range of 100-1000, it may be observed that students who score less than 244 on Test 1 and less than 350 on Test 2 are identified as at risk.
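The example rule translates directly into code (the two tests and cut points are the hypothetical ones from this slide):

```python
def at_risk(test1_score, test2_score):
    """Decision rule from the slide's example: a student scoring below 244 on
    Test 1 and below 350 on Test 2 is classified as at risk."""
    return test1_score < 244 and test2_score < 350
```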

11 CART (continued) The CART analysis searches each predictor variable for the optimal split, the one that best separates the sample into mutually exclusive subsamples called nodes. A variable can appear many times. The nodes form either a rectangular box (terminal node) or a circle (nonterminal node). A circle splits again when the predictor variables still differentiate students and no stopping rule (e.g., minimum split size) has been reached.
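The split search can be sketched using Gini impurity as the node-separation criterion (a common default for classification trees). This toy version scans every candidate cut point on a single predictor:

```python
def gini(labels):
    """Gini impurity of a set of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(scores, labels):
    """Try every candidate cut point on one predictor and return the
    (threshold, impurity) pair that minimizes the size-weighted Gini
    impurity of the two resulting nodes."""
    n = len(scores)
    best = (None, float("inf"))
    for cut in sorted(set(scores)):
        left = [y for x, y in zip(scores, labels) if x < cut]
        right = [y for x, y in zip(scores, labels) if x >= cut]
        if not left or not right:
            continue  # a split must produce two nonempty nodes
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (cut, impurity)
    return best
```

A full CART analysis repeats this search across all predictors at every node, which is why the same variable can appear many times in one tree.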

12 CART (continued) Model specifications:
– Minimum split: specifies the minimum number of cases that must exist in a node in order for a split to be attempted.
– Minimum complexity parameter (cp): specifies the minimum decrease in the overall lack of fit that must result from an additional split. Fit is measured by the model’s relative error, which is equivalent to 1 − R² (Steinberg, 2013).
– Loss matrix: used to weight classification errors differently. To increase the negative predictive power, false negatives would be specified as more costly. The default is equal weight.
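The effect of a loss matrix can be sketched at the level of a single node: the predicted class is the one that minimizes expected misclassification cost, so raising the false-negative cost pushes borderline nodes toward an at-risk label (this sketch uses the study's outcome coding, 0 = at risk and 1 = not at risk):

```python
def node_prediction(labels, cost_fn=1.0, cost_fp=1.0):
    """Label a terminal node by the class that minimizes misclassification cost.

    labels:  0/1 outcomes of the cases in the node (0 = at risk, 1 = not at risk)
    cost_fn: cost of a false negative (labeling an at-risk student 'not at risk')
    cost_fp: cost of a false positive; the default is equal weight.
    """
    n_at_risk = sum(1 for y in labels if y == 0)
    n_not_at_risk = len(labels) - n_at_risk
    cost_if_labeled_not_at_risk = cost_fn * n_at_risk   # misses the at-risk cases
    cost_if_labeled_at_risk = cost_fp * n_not_at_risk   # flags the not-at-risk cases
    return 0 if cost_if_labeled_at_risk <= cost_if_labeled_not_at_risk else 1
```

With equal weights a majority not-at-risk node is labeled "not at risk"; doubling `cost_fn` flips the same node to "at risk", which is how the grade 2 loss matrix raised negative predictive power.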

13 CART (continued) CART models are predominantly found in medical applications (e.g., diagnosis and prognosis of illnesses) because the method is well suited to generating clinical decision rules. While more common in the medical literature, the CART model has begun to emerge in educational research.

14 CART (continued) Primary Advantages Operates on the original scale of the test scores (instead of odds, log-odds, or predicted probabilities). Results can be communicated using decision trees. Not sensitive to outliers in the data or to collinearity between variables. Models complex interactions among predictors that may be difficult to detect in a regression framework. Primary Disadvantage Complex trees can be difficult to interpret.

15 Data Large school district in Florida Students were tested on both the Florida Assessment for Instruction in Reading – Florida Standards (FAIR-FS) and Stanford Achievement Test Series, Tenth Edition (SAT-10) in 2012/13 as a part of a linking study conducted under a Reading for Understanding assessment grant. 986 students in grade 1 and 887 students in grade 2 from this archival dataset were included in this analysis.

16 Measures: Predictors FAIR-FS : – word reading (grades 1 and 2) – word building (grade 1) – spelling (grade 2) – vocabulary pairs (grades 1 and 2) – following directions (grades 1 and 2) Developmental scale scores range from 200 to 800, with a mean of 500 and a standard deviation of 100.

17 Measures: Outcome SAT-10 – reading comprehension subtest – Percentile scores on the SAT-10 were dichotomized so that scores at or above the 40th percentile were coded as 1 for “not at risk” and scores below the 40th percentile were coded as 0 for “at risk.”
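The outcome coding described above can be sketched as:

```python
def code_outcome(sat10_percentile):
    """Dichotomize a SAT-10 reading comprehension percentile score:
    1 = 'not at risk' (at or above the 40th percentile), 0 = 'at risk'."""
    return 1 if sat10_percentile >= 40 else 0
```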

18 Preliminary steps Deletion of univariate and multivariate outliers Data imputation (less than 10% missing in each grade) – The mean of 20,000 imputed values was used (CART requires a single imputed file) Each grade file was split into calibration (80%) and validation (20%) data sets. – Calibration = “training” data set – Validation = “testing” data set
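The 80/20 calibration/validation split can be sketched as a generic random partition (this is an illustration, not the study's exact procedure):

```python
import random

def calibration_validation_split(records, train_fraction=0.8, seed=0):
    """Randomly partition records into a calibration ('training') set and a
    validation ('testing') set, here 80/20 as in the study."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```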

19 Grade 1 correlations Grade 1 correlations between measures (n = 780):

                          1      2      3      4      5
1 SAT-10                  1
2 Following Directions   .41**   1
3 Vocabulary Pairs       .52**  .51**   1
4 Word Reading           .64**  .36**  .53**   1
5 Word Building          .53**  .45**  .51**  .61**   1

** Correlation is significant at the 0.01 level (2-tailed).

20 Grade 2 correlations Grade 2 correlations between measures (n = 706):

                          1      2      3      4      5
1 SAT-10                  1
2 Following Directions   .40**   1
3 Spelling               .60**  .33**   1
4 Vocabulary Pairs       .48**  .40**  .44**   1
5 Word Reading           .62**  .36**  .78**  .46**   1

** Correlation is significant at the 0.01 level (2-tailed).

21 LR steps Based on the correlations between the individual FAIR-FS tests and the dichotomized SAT-10 variable, the FAIR-FS test scores were entered into the logistic regression in order of correlational magnitude. FAIR-FS tests that added at least 2% unique variance beyond the tests already in the model, as measured by the Nagelkerke pseudo R-squared, were retained for the final classification model. Analyses were conducted using SPSS Statistics 21.

22 CART steps CART models assessed the individual performance of each FAIR-FS test, at every available cut point, in classifying students into risk and no-risk categories. Several specifications were used to limit the number of splits, including: – minimum split size = 3 – complexity parameter (model specific; identified using cross-validation statistics; grade 1 = .02; grade 2 = .016). The intention was to build and prune trees that met a negative predictive power criterion of .85. To accomplish this, revisions to the model in grade 2 included the specification of a loss matrix. Analyses were conducted using the R rpart package.

23 CART steps (cp values) [Plots of cross-validation relative error by cp value for grade 1 and grade 2.] A recommended minimum standard is the value of the complexity parameter that results in a cross-validation relative error less than one standard error above the minimum cross-validation relative error (Therneau, Atkinson, & Ripley, 2013).
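This one-standard-error rule can be sketched over an rpart-style cp table of (cp, cross-validation relative error, standard error) rows; the numbers below are made up for illustration:

```python
def one_se_cp(cp_table):
    """Apply the 1-SE rule: among rows whose cross-validation relative error
    is within one standard error of the minimum, choose the largest cp
    (i.e., the simplest acceptable tree).

    cp_table: list of (cp, xerror, xstd) tuples, as in rpart's cptable.
    """
    best = min(cp_table, key=lambda row: row[1])   # row with minimum xerror
    threshold = best[1] + best[2]                  # minimum xerror + one SE
    eligible = [row for row in cp_table if row[1] <= threshold]
    return max(eligible, key=lambda row: row[0])[0]
```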

24 Results: Grade 1

Logistic regression: Logit = −21.749 + 0.032 × (word reading score) + 0.006 × (word building score) + 0.009 × (vocabulary pairs score). (Following directions was dropped from the model, ΔR² < .02.) Nagelkerke pseudo R² = .72. Logits are transformed to probabilities, and the probabilities are used to establish risk classification (e.g., < .5 may be considered at risk).

CART: Variable importance: word reading, vocabulary pairs, word building, following directions. R² = .64.
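The grade 1 equation can be applied directly (coefficients are those reported on this slide; the .5 cut is the illustrative threshold mentioned there):

```python
import math

def grade1_risk_probability(word_reading, word_building, vocabulary_pairs):
    """Predicted probability of being 'not at risk' (outcome coded 1) from the
    grade 1 logistic regression model; values below .5 indicate risk."""
    logit = (-21.749
             + 0.032 * word_reading
             + 0.006 * word_building
             + 0.009 * vocabulary_pairs)
    return 1.0 / (1.0 + math.exp(-logit))
```

For example, a student at the FAIR-FS scale mean of 500 on all three tests has a logit of 1.751 and a predicted probability of about .85, well above the .5 threshold, so the student would be classified as not at risk.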

25 Results: Grade 2

Logistic regression: Logit = −21.984 + 0.017 × (word reading score) + 0.011 × (spelling score) + 0.007 × (vocabulary pairs score) + 0.007 × (following directions score). Nagelkerke pseudo R² = .70. Logits are transformed to probabilities, and the probabilities are used to establish risk classification (e.g., < .5 may be considered at risk).

CART: Variable importance: word reading, spelling, vocabulary pairs, following directions. R² = .71.

26 Summary by model Validation results (n = 206):

Model                  Sensitivity  Specificity  Positive    Negative    Overall
                                                 predictive  predictive  proportion
                                                 power       power       correct
Grade 1
  CART                     .92          .90         .79         .96         .90
  Logistic regression      .87          .90         .78         .94         .89
Grade 2
  CART                     .82          .86         .73         .92         .85
  Logistic regression      .84          .88         .76         .92         .87

27 Implications The CART results were found to be comparable to those of logistic regression, with both methods yielding negative predictive power greater than the recommended standard of .90. Given this comparability, the CART model may be more appealing in an educational context due to the ease with which its results can be communicated to practitioners.

28 Limitations The specifications used in this study were designed to meet or exceed negative predictive power of 0.85, while maintaining acceptable levels of sensitivity and specificity. Using a different measure of classification accuracy may affect the results in this report by favoring one method over the other.

29 Future research The use of CART models in EWS could be studied to determine whether school staff find them easier to use than logistic regression. As noted in the literature review, CART and LR may be used as complementary approaches, since each method presents both advantages and disadvantages and provides different tools to the researcher.

30 Thank you! E-mail: skoon@fcrr.org Website: rel-se.fsu.edu

