
1 Building the NCSC Summative Assessment: Towards a Stage-Adaptive Design
Sarah Hagge, Ph.D., and Anne Davidson, Ed.D., McGraw-Hill Education CTB
CCSSO, New Orleans, LA, June 25, 2014
Copyright © 2014 CTB/McGraw-Hill LLC.

2 Overview
• Rationale for stage-adaptive test
• Proposed stage-adaptive design
• Overview of pilot testing: plan and goals
• Summary of results from Pilot Phase I
• Main findings and next steps

3 Rationale for Stage-Adaptive Test
• Targeted to student proficiency levels
• Improved precision of student test scores
• Reduced total testing time
• Reduced testing burden for students and teacher test administrators

4 Proposed Stage-Adaptive Design
• All students will receive tests with the same content distribution
• Tests will be adaptive based on tiers and item difficulty
– All students receive the same or a similar first stage, or testlet, of items
– Students will receive a second stage of items of lower, higher, or about the same difficulty based on their performance on the first stage of the test

5 Example of a Stage-Adaptive Design
• Stage 1: moderate difficulty, taken by all students
• Stage 2A: lower difficulty, for lower-performing students from Stage 1
• Stage 2B: higher difficulty, for higher-performing students from Stage 1
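The routing logic described above can be sketched in a few lines. This is a minimal illustration, not the NCSC algorithm: the cut scores `LOW_CUT` and `HIGH_CUT` and the function name `route_stage2` are hypothetical placeholders, and the three-way branch follows the slide-4 description (lower, higher, or about the same difficulty).

```python
# Illustrative sketch of two-stage routing; cut scores are invented, not NCSC values.
LOW_CUT = 0.40   # proportion correct below which a student routes to the lower testlet
HIGH_CUT = 0.70  # proportion correct at or above which a student routes to the higher testlet

def route_stage2(stage1_scores):
    """Given a list of 0/1 item scores from Stage 1, return a Stage 2 testlet label."""
    pct_correct = sum(stage1_scores) / len(stage1_scores)
    if pct_correct < LOW_CUT:
        return "2A"  # lower difficulty
    if pct_correct >= HIGH_CUT:
        return "2B"  # higher difficulty
    return "2M"      # about the same (moderate) difficulty
```

In an operational design the cut scores would come from simulation studies rather than fixed percent-correct values.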

6 Overview of Pilot Testing

7 Purpose of Pilot Testing
• Collect information necessary to support development and refinement of the NCSC summative assessment design
• Pilot Phase 1: item tryout (Spring 2014)
– Generate student performance data
– Investigate administration conditions
– Understand how the items are functioning
– Investigate the proposed item scoring processes and procedures
• Pilot Phase 2: test forms (Fall 2014)
– Investigate the adaptive algorithm
– Collect form and student performance data

8 Broad Goals
• Try out items
• Evaluate items
• Understand administration policies
• Understand administration processes
– Computer-based system
– Accommodations
• Investigate building an IRT scale
• Develop the stage-adaptive design specification

9 ELA Content and Forms
• Grades 3–8 and 11
• 8 forms per grade
– Four reading passages: two literary and two informational
– Foundational items in Grades 3 and 4
– 22–35 items per form
– One passage at each of the four tiers
– Selected-response and dichotomously scored constructed-response items

10 Math Content and Forms
• Grades 3–8 and 11
• 8 forms per grade
– 25 items per form
– Each form contained a mix of all four item tiers
– Content distribution percentages similar across the 8 forms
– Selected-response and dichotomously scored constructed-response items

11 Initial Analysis
• Demographic characteristics of the student sample
– Descriptive statistics (e.g., gender, ethnicity) were collected for the sample of students.
– A learner characteristics inventory was used to collect profile information about the students who participated.
– Accommodations data were collected prior to administration, along with whether each eligible student used the accommodation.
• Form-level results
• Classical item analysis
• Tier analysis
• Item response time

12 Flagging Criteria for Item Reviews
• Classical item analysis
– Low p-value: < 0.50 (note that Tier 1 items have two answer choices)
– High p-value: > 0.90
– Low point-biserial correlation: < 0.20
– High option point-biserial correlation: > 0.05
– Omit rate: > 5%
• Tier reversals (Tier 1 p-value < Tier 4 p-value)
• Key checks (distractor analysis)
• Survey and student interaction study results
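The flagging criteria above are simple threshold checks, so they can be expressed directly in code. This sketch assumes the inputs are already computed per item; the function name `flag_item` and the argument `max_option_pb` (the largest point-biserial among the distractor options) are illustrative, not names from the NCSC analysis.

```python
# Hypothetical helper applying the slide's flagging thresholds to one item's statistics.
def flag_item(p_value, point_biserial, max_option_pb, omit_rate):
    """Return the list of review flags an item trips; rates are proportions (0-1)."""
    flags = []
    if p_value < 0.50:
        flags.append("low p-value")
    if p_value > 0.90:
        flags.append("high p-value")
    if point_biserial < 0.20:
        flags.append("low point-biserial")
    if max_option_pb > 0.05:
        flags.append("high option point-biserial")
    if omit_rate > 0.05:
        flags.append("high omit rate")
    return flags
```

An item with p-value 0.45, point-biserial 0.15, a distractor point-biserial of 0.10, and a 6% omit rate would trip four of the five flags.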

13 Pilot Phase I Results

14 Summary of Student Counts
• 3,832 students overall took ELA (8 forms per grade)
• 3,703 students overall took Math (8 forms per grade)

15 Summary of Descriptive Statistics

Subgroup    | Category                     | N    | %
Gender*     | Male                         | 3329 | 64.8
            | Female                       | 1811 | 35.2
Ethnicity** | White                        | 2690 | 52.1
            | Asian                        |  159 |  3.1
            | Hawaiian or Pacific Islander |   88 |  1.7
            | Indian or Alaska Native      |  205 |  4.0
            | Hispanic                     | 1296 | 25.1
            | African American             |  697 | 13.5

16 Summary of Accommodations

Subgroup                | Category | N    | %
Assistive Presentation  | Needs    |  278 |  5.4
                        | Used     |  107 |  2.1
Assistive Response      | Needs    |  457 |  8.9
                        | Used     |  191 |  3.7
Braille Form            | Needs    | **   |
                        | Used     | **   |
Large Print Form        | Needs    |  229 |  4.4
                        | Used     |   82 |  1.6
Paper Version           | Needs    |  512 |  9.9
                        | Used     |  349 |  6.8
Read or Reread          | Needs    | 4471 | 86.6
                        | Used     | 2930 | 56.8
Text to Speech          | Needs    | 1263 | 24.5
                        | Used     |  582 | 11.3
Scribe                  | Needs    | 1103 | 21.4
                        | Used     |  446 |  8.6
Speech to Text          | Needs    |  338 |  6.5
                        | Used     |   86 |  1.7
Sign Interpretation     | Needs    |   98 |  1.9
                        | Used     |   40 |  0.8
No Accommodation Needed | Needs    | 2069 | 40.1
                        | Used     | 1429 | 27.7

17 ELA Form-Level Results
Note. * Forms included all ELA items except the extended Writing prompt.
Note. Cronbach alpha coefficients ranged from 0.56 to 0.90 on ELA forms.

18 Math Form-Level Results
Note. Cronbach alpha coefficients ranged from 0.31 to 0.83 on math forms.
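The Cronbach alpha coefficients reported on the two form-level slides are the standard internal-consistency statistic, alpha = (k / (k - 1)) * (1 - sum of item variances / total-score variance) for k items. A minimal sketch of that computation (not the NCSC analysis code; the function name and use of population variances are assumptions for illustration):

```python
# Illustrative Cronbach's alpha for dichotomously scored items.
# scores: one list of 0/1 item scores per student, all the same length.
def cronbach_alpha(scores):
    k = len(scores[0])  # number of items

    def pvar(xs):
        """Population variance of a list of numbers."""
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [pvar([student[i] for student in scores]) for i in range(k)]
    total_var = pvar([sum(student) for student in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Perfectly consistent response patterns yield alpha = 1; patterns with no shared variance across items yield alpha near 0, which is why short or heterogeneous forms can show values like the 0.31 lower bound on the math forms.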

19 Classical Item Results
• Range of item p-values
– 0.05 to 0.95
– p-value standard deviation of 0.11 to 0.23, depending on test form
– Very few items with low or high p-values
• Item omit rates less than 3% across all items
• Majority of flagged items had a low point-biserial or a high option point-biserial

20 Tier Results: Mean p-Values

21 Discussion and Next Steps

22 Main Findings
• Evidence that content is appropriate for students in the Phase I Pilot sample
– Range of p-values
– Relatively few items flagged for high or low p-values
– Item omit rates and not-reached rates of 3% or less
– Form percent-correct range of approximately 45–70%
• Evidence that tiers are functioning according to design at an aggregate level
– Tier 1 easier than the other three tiers
– Tiers 2, 3, and 4 tended to show a pattern of difficulty ranging from least to most difficult
• Evidence that the item bank can support forms at different difficulty levels
– Items exhibit a range of p-values

23 Next Steps
• Investigate IRT scaling on forms with higher N counts
• Conduct item- and form-level analysis by student subgroups
• Conduct simulation studies of the adaptive design
• Pilot Phase 2
– Field-test items to obtain statistics for the operational item bank
– Evaluate the stage-adaptive design
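Simulation studies of an adaptive design typically generate synthetic item responses from an IRT model and then check how examinees route through the stages. As a toy illustration only (the slide does not specify a model; the Rasch/1PL form and all names below are assumptions):

```python
import math
import random

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch (1PL) model,
    for a student with ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def simulate_responses(theta, difficulties, rng):
    """Draw one 0/1 response per item difficulty for a simulated student."""
    return [1 if rng.random() < rasch_p(theta, b) else 0 for b in difficulties]
```

Repeating `simulate_responses` over a distribution of abilities lets an analyst estimate routing rates and score precision for candidate Stage 2 cut scores before any operational administration.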

24 Thank you!

