The use of asynchronously scored items in adaptive test sessions. Marty McCall, Smarter Balanced Assessment Consortium. 2015 CCSSO NCSA, San Diego, CA.

Smarter Balanced Theory of Action

Assessments produce evidence of student performance on challenging tasks that evaluate the Common Core State Standards (CCSS)... They emphasize deep knowledge of core concepts and ideas within and across the disciplines—along with analysis, synthesis, problem solving, communication, and critical thinking—thereby requiring a focus on complex performances as well as on specific concepts, facts, and skills (Smarter Balanced (2010). Theory of Action. p. 1).

Nature of tasks under CCSS and ECD

Influence of cognitive psychology
– Interest in mental processes and models
– Purpose of task is to provide evidence confirming or refuting hypotheses about what a student knows
– Evidence Centered Design
– Tasks exemplify desired learning

Integration of skills into complex tasks
– Core foundational concepts building across years
– Tasks demand use of several skills and concepts
– Emphasis on communication and insight

6 Key Components of Evidence-Centered Design

1. Define the domain
2. Define claims to be made
3. Define assessment targets
4. Define evidence required
5. Develop Task Models
6. Develop Items or Performance Tasks

Approved English Language Arts Claims

Overall Claim
– Reading
– Writing
– Speaking and Listening
– Research/Inquiry

Claims provide overall test and reporting structure
– Targets are nested within claims

Test design
– Assessments have an adaptive and a non-adaptive component

[Diagram labels: Adaptive session; PT; C1: Reading (Literary, Informational); C2: Writing; C3: Listening; C4: Research]

Reading Targets – info and lit

– CENTRAL IDEAS
– REASONING & EVIDENCE
– KEY DETAILS
– WORD MEANINGS
– ANALYSIS WITHIN OR ACROSS TEXTS
– TEXT STRUCTURES & FEATURES
– LANGUAGE USE

Writing Targets

– WRITE/REVISE BRIEF TEXTS - Narrative strategies
– WRITE/REVISE BRIEF TEXTS - Organizing ideas
– WRITE/REVISE BRIEF TEXTS - Provide support for opinions
– COMPOSE FULL TEXTS
– EDIT/CLARIFY

Require written responses

– WRITE BRIEF TEXTS - Narrative strategies
– WRITE BRIEF TEXTS - Organizing ideas
– WRITE BRIEF TEXTS - Provide support for opinions
– CENTRAL IDEAS
– REASONING & EVIDENCE

Test design: adaptive session
– Every adaptive test component must have:
  1 written response for a WRITE BRIEF TEXTS target
  2 written responses in reading, 1 for a literary passage and 1 for an informational passage, each addressing either CENTRAL IDEAS or REASONING & EVIDENCE

How do these fit in a CAT session?

– Chosen adaptively, based on best information.
– Standalone written response items are chosen like any other polytomous item, using the best expected information of the item as a whole.
– Passages are chosen from passage sets of items. The passage with the highest expected information value is selected from the set of passages with a written response item.
– No change to the ability estimate is made when the written response is submitted, because it has not yet been scored.
– The test proceeds adaptively from that point.
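The selection rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the consortium's engine. It assumes a generalized partial credit model (GPCM) for polytomous items, and it assumes that a passage's expected information is the sum of its items' information at the current ability estimate. All function names, field names, and parameter values are hypothetical.

```python
import math

def gpcm_probs(theta, a, steps):
    """Category probabilities under an assumed GPCM (scores 0..len(steps))."""
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))  # cumulative logit for each score
    m = max(z)
    e = [math.exp(v - m) for v in z]       # subtract max for numerical stability
    total = sum(e)
    return [v / total for v in e]

def gpcm_information(theta, a, steps):
    """Fisher information of the whole item: a^2 times the score variance."""
    p = gpcm_probs(theta, a, steps)
    mean = sum(k * pk for k, pk in enumerate(p))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(p))
    return a * a * var

def pick_item(theta, items):
    """Standalone items: take the one with the most information at theta."""
    return max(items, key=lambda it: gpcm_information(theta, it["a"], it["steps"]))

def pick_passage(theta, passages):
    """Passages: only those containing a written response item are eligible;
    among them, take the passage with the highest summed information."""
    eligible = [p for p in passages if any(it["written"] for it in p["items"])]
    return max(
        eligible,
        key=lambda p: sum(gpcm_information(theta, it["a"], it["steps"])
                          for it in p["items"]),
    )

# Hypothetical item pool and passage pool, for illustration only.
items = [
    {"id": "easy", "a": 1.0, "steps": [-2.0, -1.5]},
    {"id": "hard", "a": 1.0, "steps": [1.0, 1.5]},
]
passages = [
    {"id": "P1", "items": [{"a": 1.5, "steps": [0.0], "written": False}]},
    {"id": "P2", "items": [{"a": 1.0, "steps": [0.0], "written": False},
                           {"a": 1.0, "steps": [-0.5, 0.5], "written": True}]},
    {"id": "P3", "items": [{"a": 1.0, "steps": [3.0, 3.5], "written": True}]},
]

theta_now = 0.0  # current ability estimate; not updated by the unscored response
chosen_passage = pick_passage(theta_now, passages)
chosen_item = pick_item(1.25, items)
```

Because the written response is not scored during the session, `theta_now` would simply be carried forward unchanged to the next selection step.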

How are they scored and combined with other responses?

– Responses are scored separately
– Human or AI scoring
– Scored responses are combined with CAT and PT responses to make up the complete test event
– Overall test scores are calculated from all scored responses using IRT parameters
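The final scoring step can be illustrated with a small Python sketch. It assumes a generalized partial credit model and a maximum-likelihood ability estimate found by grid search; the slides only say that IRT parameters are used, so the model, the estimator, and every name and parameter value below are assumptions made for illustration.

```python
import math

def gpcm_log_prob(theta, a, steps, score):
    """Log-probability of the observed score under an assumed GPCM."""
    z = [0.0]
    for b in steps:
        z.append(z[-1] + a * (theta - b))
    m = max(z)
    log_norm = m + math.log(sum(math.exp(v - m) for v in z))
    return z[score] - log_norm

def estimate_theta(responses, lo=-4.0, hi=4.0, n=801):
    """Grid-search maximum-likelihood estimate over all scored responses."""
    best_theta, best_ll = lo, -math.inf
    for i in range(n):
        theta = lo + (hi - lo) * i / (n - 1)
        ll = sum(gpcm_log_prob(theta, r["a"], r["steps"], r["score"])
                 for r in responses)
        if ll > best_ll:
            best_theta, best_ll = theta, ll
    return best_theta

# Hypothetical responses scored by machine during the adaptive session.
cat_responses = [
    {"a": 1.0, "steps": [0.0], "score": 1},
    {"a": 1.0, "steps": [0.0], "score": 0},
]
# Hypothetical written response, scored later by a human or an AI engine.
deferred_responses = [
    {"a": 1.0, "steps": [0.0, 1.0], "score": 2},
]

theta_during_session = estimate_theta(cat_responses)
theta_final = estimate_theta(cat_responses + deferred_responses)
```

The point of the sketch is the last line: the overall score is computed once, from the complete test event, after the asynchronously scored responses arrive.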

AI Scoring

Issues
– AI scoring is controversial, particularly with higher ed and in the ELA educator community
– Reliability for short answers is not quite there, although improving rapidly
– AI engines aren’t built to be a real-time plug-in

Promising results
– Some items (but not others) can be scored reliably enough to be used on summative tests.
– If you want to use AI engines, have the AI experts involved in task development. AI engines can aid in finding exemplars & outliers
– Can achieve high accuracy when essays are scored once by hand and once by AI, higher than with 2 humans

Some risks

– The SA items tend to be difficult
– Long administration time
– Expensive to administer and score
– Don’t contribute to ongoing score estimate
– Easy stand-alone items may be overexposed
– Items embedded in passages will often be mismatched informationally

Some Advantages

– Everyone is in favor of getting out of the MC box.
– Allows the kind of skill integration promoted in current instruction
– For high-stakes tests, prevents the proxy from becoming the objective.

Thank you for your attention

Questions? Contact marty.mccall@smarterbalanced.org
