Published by Annis Young. Modified over 9 years ago.
1
A State Perspective on Enhancing Assessment & Accountability Systems through Systematic Integration of Computer Technology Joseph A. Martineau, Ph.D. Vincent J. Dean, Ph.D. Michigan Department of Education Presentation at the tenth annual Maryland Assessment Conference October 2010
2
The Michigan Stage Michigan offers an interesting perspective ◦ Pilot in 2006 ◦ Pilot in 2011 (English Language Proficiency) ◦ Pilot in 2012 (Alternate Assessments) ◦ Pilots leading up to operational adoption of SMARTER/Balanced Assessment Consortium products in 2014/15 ◦ Constitutional amendment barring unfunded mandates
3
The National Stage Survey of state testing directors (+D.C.) ◦ 50 responses + one investigation via state department of education website ◦ 7 of 51 states have no CBT initiatives ◦ 44 of 51 states have current CBT initiatives, including: Operational online assessment Pilot online assessment Plans for moving online
4
The National Stage, continued… Survey of state testing directors (+D.C.) ◦ CBT initiatives include Teacher entry of student responses online Student entry of responses online P&P replication CAT AI scoring MC via internet, CR via paper and pencil General populations (grade level and end of course) Special populations (eases infrastructure concerns) Modified Alternate English language proficiency Online repository and scoring of portfolio materials Item banks for flexible unit-specific interim assessment ◦ Initiatives vary widely and are piecemeal for the most part
5
The National Stage, continued… Survey of state testing directors (+D.C.) ◦ Of 44 states with some initiative 26 states currently administer large-scale general populations assessments online 15 states have plans to begin (or expand) online administration of large-scale general populations assessments 12 states currently administer special populations assessments online 3 states have plans to begin (or expand) online administration of special populations assessments
6
The National Stage, continued… Survey of state testing directors (+D.C.) ◦ Of 44 states with some initiative 7 states currently use Artificial Intelligence (AI) scoring of constructed response items 4 states currently use Computer Adaptive Testing (CAT) technology for general populations assessment, with one more moving in that direction soon 0 states currently use CAT technology for special populations assessment 10 states offer online interim/benchmark assessments 10 states offer online item banks accessible to teachers for creating “formative”/interim/benchmark assessments tailored to unique curricular units
7
The National Stage, continued… Survey of state testing directors (+D.C.) ◦ Of 44 states with some initiative 6 states offer computer based testing (CBT) options on general populations assessment as an accommodation for special populations 4 states report piloting and administration of innovative item types (e.g. flash-based modules providing mathematical tools such as protractors, rulers, compasses) 16 states offer End of Course (EOC) tests online, or are implementing online EOC in the near future 6 states report substantial failure of large-scale online testing, resulting in cessation of computer based testing Some have recovered and are moving back online Others have no plans to return to online testing
8
The National Stage, continued… Development of the Common Core State Standards (CCSS) ◦ Content standards (not a test) English Language Arts (K-12) Mathematics (K-12) ◦ Developed with backing from 48 states ◦ Adoption tally Adopted in full by 39 states Adoption declined in 5 states Adoption expected by remaining 6 states by end of 2011
9
The National Stage, continued… Assessment Consortia ◦ Race to the Top Assessment Competition ◦ Development of infrastructure and content for a common assessment measuring CCSS in English Language Arts and Mathematics ◦ Two consortia SMARTER/Balanced Assessment Consortium (SBAC) Partnership for Assessment of Readiness for College and Careers (PARCC)
10
The National Stage, continued… The consortia: ◦ SMARTER/Balanced 31 states 17 governing states CAT beginning in 2014-2015 ◦ PARCC 26 states 11 governing states CBT beginning in 2014-15
11
Consortia Membership
12
The National Stage, Summary State efforts have been, with few exceptions, piecemeal by… ◦ Program ◦ Content area ◦ Grade level ◦ Type of assessment (summative, interim, formative) ◦ Population (general, modified, alternate) Most states are… ◦ Involved in some kind of pilot or operational use ◦ Intending to be operational on a large scale by 2014-2015 ◦ Experiencing budget crises… ◦ That make transitions difficult ◦ That make efficiencies of technology integration critical A strong need to take a systems look at how to integrate computer technology into assessment and accountability systems Technology integration is a significant opportunity to provide a platform that connects all initiatives
13
The Organizing Framework for this Paper From… ◦ Martineau, J. A., & Dean, V. J. (in press). Making Assessment Relevant to Students, Teachers, and Schools. In V. Shute & B. J. Becker (Eds.), Innovative Assessment for the 21st Century: Supporting Educational Needs. New York, NY: Springer-Verlag. ◦ Figure 1
17
Entry Points
18
Outcomes
19
The Organizing Framework for this Paper, continued… With a comprehensive system in place, it is possible to identify all the places where integration of technology will enable and enhance the system Components identified with bold outlines on the next slide
21
Starting from the Bottom Up Professional Development Current lack of pre-service and in-service balanced assessment training Need for rapid scale up to millions of educators on a small budget
22
Technology Integration into Pre- and In-Service Professional Development Scaling up is only feasible with integral use of technological tools High-quality online courses Social networking among educators Live tele-coaching Electronic (graphic, audio, video) capture for distance streaming of materials, plans, and instructional practice vignettes over high speed networks To facilitate discussion regarding instructional practice between Candidates and instructor/coach Candidates and mentor Mentors and instructor/coach For example, repurposing Idaho’s special portfolio submission system for educator training
23
Moving to Content & Process Standards Start with a limited set of high school exit standards based on college and career readiness From that, develop K-12 content/process standards in a logical progression to college and career readiness Based on the learning progressions and K-12 content/process standards, develop model instructional materials
24
Model Instructional Materials Clearinghouse Develop online clearinghouse of materials for model curriculum and instructional units ◦ Lesson plans ◦ Lesson materials ◦ Video vignettes of high quality instructional practices based on those units ◦ Flexible platform to accept user submission in a variety of formats ◦ User moderated ratings of submission quality
25
Moving to Assessment Practices Before actually moving into assessment practices, it is important to classify content standards in three ways: ◦ Timing On-demand, time limited On-demand, not time limited Feedback-looped ◦ Task type Selected response Short constructed response Extended constructed response Performance events ◦ Setting Classroom only Classroom and secure Based on these classifications, several types of assessment take place
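The three-way classification above can be sketched as a small data structure. This is only an illustration and not part of the original presentation: the class names (`Timing`, `TaskType`, `Setting`, `ContentStandard`), the helper `group_by_timing`, and the sample standard codes are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Timing(Enum):
    ON_DEMAND_TIME_LIMITED = "on-demand, time limited"
    ON_DEMAND_NOT_TIME_LIMITED = "on-demand, not time limited"
    FEEDBACK_LOOPED = "feedback-looped"


class TaskType(Enum):
    SELECTED_RESPONSE = "selected response"
    SHORT_CONSTRUCTED_RESPONSE = "short constructed response"
    EXTENDED_CONSTRUCTED_RESPONSE = "extended constructed response"
    PERFORMANCE_EVENT = "performance event"


class Setting(Enum):
    CLASSROOM_ONLY = "classroom only"
    CLASSROOM_AND_SECURE = "classroom and secure"


@dataclass(frozen=True)
class ContentStandard:
    code: str            # hypothetical identifier for a content standard
    timing: Timing
    task_type: TaskType
    setting: Setting


def group_by_timing(standards):
    """Group standard codes by their timing classification."""
    groups = {}
    for standard in standards:
        groups.setdefault(standard.timing, []).append(standard.code)
    return groups


# Hypothetical sample standards, for illustration only.
SAMPLE_STANDARDS = [
    ContentStandard("MATH.1", Timing.ON_DEMAND_TIME_LIMITED,
                    TaskType.SELECTED_RESPONSE, Setting.CLASSROOM_AND_SECURE),
    ContentStandard("ELA.7", Timing.FEEDBACK_LOOPED,
                    TaskType.EXTENDED_CONSTRUCTED_RESPONSE, Setting.CLASSROOM_ONLY),
    ContentStandard("MATH.2", Timing.ON_DEMAND_TIME_LIMITED,
                    TaskType.SHORT_CONSTRUCTED_RESPONSE, Setting.CLASSROOM_AND_SECURE),
]
```

Tagging each standard once along all three dimensions would let later steps (classroom assessment, portfolio submission, secure online testing) select the standards that apply to them instead of re-classifying each time.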
26
Assessment Practices, continued… Start with model classroom materials and tools Initial development of model materials, vignettes, strategies, and tools sets the stage for…
27
Educator submissions to populate online clearinghouse of materials for model classroom assessment practice units ◦ Summative assessment materials ◦ Formative assessment vignettes, strategies, and tools ◦ Flexible platform to accept user submission in a variety of formats ◦ User moderated ratings of submission quality Non-secure item bank generated by educators ◦ Platform supports various item types ◦ User moderated ratings of submission quality ◦ Large enough that security is not a concern Empirically designed MC items Fully customizable
28
Which in Turn Leads to… Implementation of formative assessment practices enhanced by technological aids, such as ◦ Response devices (e.g., clickers, tablet computers, phones) ◦ Rapid response to teacher queries over online systems ◦ Remote response to formative queries (e.g. rural areas and cyberschools)
29
Which in Turn Leads to… Selection or development of summative classroom assessments ◦ On-demand micro-benchmark (small unit) assessments ◦ From non-secure item bank generated by educators ◦ Customizable to fit specific lesson plans/curricular documents ◦ Instant reporting for diagnostic/instructional intervention purposes ◦ Inform targeted professional development in real time ◦ RESULTS NOT used for large-scale accountability purposes (belong to the schools and teachers)
30
With High-Quality Classroom Assessment Practices in Place Large-scale assessment now makes sense, with three types of large-scale assessment
31
Large-Scale Assessment, continued… Start with classroom-based For content standards best measured using “feedback-looped” tasks ◦ Meaning content standards (likely higher order) that are best accomplished with a feedback cycle between teacher and student
32
Portfolio Development & Submission, continued… Creation of portfolio includes scannable materials, electronic documents, and/or audio/video of student performance Submitted via a secure online portfolio repository (e.g., Idaho’s alternate assessment portfolio submission site) Unlikely to be scorable using AI, therefore, scored on a distributed online scoring system that prevents teachers from scoring their own students’ portfolios (e.g., Idaho’s alternate assessment portfolio scoring site) Can be scored both for final product and development over time
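The constraint that teachers never score their own students' portfolios can be sketched as a simple assignment rule. This is a minimal illustration, not a description of any actual scoring site: the function `assign_scorers`, the identifiers, and the least-loaded tie-breaking rule are all assumptions.

```python
def assign_scorers(portfolios, scorers):
    """Assign each portfolio to a scorer who is not the student's own teacher.

    portfolios: dict mapping portfolio id -> id of the student's own teacher
    scorers:    list of teacher ids available as scorers
    Returns a dict mapping portfolio id -> assigned scorer id.
    """
    load = {scorer: 0 for scorer in scorers}  # portfolios assigned so far
    assignment = {}
    for portfolio_id, own_teacher in sorted(portfolios.items()):
        # Exclude the student's own teacher from the candidate pool.
        eligible = [s for s in scorers if s != own_teacher]
        if not eligible:
            raise ValueError(f"no eligible scorer for portfolio {portfolio_id}")
        # Assumed balancing rule: pick the least-loaded eligible scorer,
        # breaking ties by scorer id so the result is deterministic.
        chosen = min(eligible, key=lambda s: (load[s], s))
        assignment[portfolio_id] = chosen
        load[chosen] += 1
    return assignment
```

With portfolios `{"p1": "t1", "p2": "t1", "p3": "t2"}` and scorers `["t1", "t2", "t3"]`, no portfolio is routed back to the teacher who submitted it, and the work is spread evenly across the three scorers.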
33
Moving to Secure Online Testing For content standards that do not require “feedback-looped” tasks Dynamic online CAT assessments ◦ Based on dynamically selected clusters of content standards covered in instructional units ◦ Scaled to the same scale as the end-of-year assessment, with cut scores for mastery/proficiency ◦ Can move students on to higher grade level content once mastery/proficiency of all grade level content is demonstrated through unit assessments ◦ What Race to the Top Assessment Competition calls “Through-Course Assessment”
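As a rough illustration of what "adaptive" means in a CAT, the sketch below picks the next item for a student from a calibrated bank. The presentation does not name a measurement model; the Rasch (one-parameter logistic) model, the maximum-information selection rule, and all names and difficulty values here are assumptions for illustration.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the (assumed) Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


def select_next_item(theta, item_bank, administered):
    """Return the id of the unadministered item that is most informative
    at the current ability estimate, or None if the bank is exhausted.

    item_bank:    dict mapping item id -> difficulty (hypothetical values)
    administered: set of item ids the student has already seen
    """
    candidates = {i: b for i, b in item_bank.items() if i not in administered}
    if not candidates:
        return None
    return max(candidates, key=lambda i: item_information(theta, candidates[i]))
```

Under the Rasch model, information peaks where item difficulty equals the ability estimate, so this rule amounts to choosing the remaining item whose difficulty is closest to the student's current estimate. An operational CAT would add content balancing across the selected clusters of standards and exposure control, which are omitted here.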
34
Moving to Secure Online Testing ◦ What Race to the Top Assessment Competition calls “Through-Course Assessment” ◦ Provides advance look at trajectory toward proficiency ◦ Provides multiple opportunities to demonstrate proficiency ◦ More equitable for high-stakes accountability purposes ◦ Useful for mid-year correction in instructional practice (e.g. Response to Intervention) ◦ Useful for placement purposes of newly arrived students ◦ Useful for differentiated instruction ◦ Anticipate increased educator motivation (because of timely information)
35
Moving to Secure Online Testing Beyond traditional CAT/CBT AI Scoring of constructed response items Technology enhanced items Performance tasks/events (through simulations) Gaming type items
36
Moving to Secure Online Testing For three groups of students… 1. Initial scaling and calibration group 2. Ongoing randomly selected validation groups (to validate that students proficient on all required unit tests retain proficiency at the end of the year) 3. Students who do not achieve proficiency on all required unit tests Final opportunity to demonstrate overall proficiency if proficiency was in question on any single unit assessment Allows for the elimination of a single end-of-year test for most students
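The three-group rule above can be written as a short decision function. This is a sketch under stated assumptions: the function name, the parameters, and the choice to treat a missing unit result as "not proficient" are hypothetical and not taken from the presentation.

```python
def needs_end_of_year_test(required_units, unit_results,
                           in_calibration_group=False,
                           in_validation_sample=False):
    """Decide whether a student sits the end-of-year test.

    required_units: iterable of unit names the student must pass
    unit_results:   dict mapping unit name -> True if proficiency was shown
    """
    # Groups 1 and 2: calibration and randomly selected validation students
    # take the test regardless of their unit results.
    if in_calibration_group or in_validation_sample:
        return True
    # Group 3: any required unit without demonstrated proficiency.
    # Assumption: a unit with no recorded result counts as not proficient.
    return not all(unit_results.get(unit, False) for unit in required_units)
```

A student proficient on every required unit and not sampled for calibration or validation is exempt, which is how the design removes the single end-of-year test for most students.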
37
Scoring Maximize objective scoring by ◦ Automated scoring of objective items ◦ AI scoring of extended written response items, technology enhanced items, and performance tasks wherever possible ◦ Distributed hand-scoring of tasks not scorable using AI
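The scoring priorities above amount to a routing rule: automated scoring first, AI scoring where it is feasible, and distributed hand-scoring for the rest. The sketch below is illustrative only; the item-type labels, the `ai_scorable` flag, and the route names are hypothetical.

```python
# Hypothetical item-type labels, for illustration only.
OBJECTIVE_TYPES = {"selected_response"}
AI_CANDIDATE_TYPES = {
    "extended_written_response",
    "technology_enhanced",
    "performance_task",
}


def scoring_route(item_type, ai_scorable=False):
    """Return which scoring channel a response should be sent to."""
    if item_type in OBJECTIVE_TYPES:
        return "automated"
    # AI scoring is used "wherever possible": only when the item type is a
    # candidate for it and an AI scoring model is actually available.
    if item_type in AI_CANDIDATE_TYPES and ai_scorable:
        return "ai"
    # Everything else falls back to distributed hand-scoring by educators.
    return "distributed_hand_scoring"
```

For example, a performance task without a trained AI scoring model would be routed to distributed hand-scoring, as would a portfolio entry.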
38
Distributed Scoring as Professional Development Human scorers taken from ranks of educators ◦ Online training on hand-scoring ◦ Online certification as a hand-scorer ◦ Online monitoring of rater performance ◦ Validation hand-scoring of samples of AI-scored tasks Our experience with teacher-led scoring and range-finding indicates that it is some of the best professional development that we provide to educators
39
Reporting For the most part, reports are difficult to read and poorly used Need online reporting of all scores for all stakeholders, including: ◦ Policymakers (aggregate) ◦ Administrators (aggregate and individual) ◦ Teachers (aggregate and individual) ◦ Parents (aggregate and individual) ◦ Students (individual)
40
Reporting Portal Reporting portal needs to be able to integrate reports from classroom metrics all the way to large-scale secure assessment metrics
41
Reporting Portal Reporting cycles depend on the item types and application of AI scoring. ◦ Immediate where possible ◦ Expedited hand-scoring (shifting funding focus from printing, shipping, and scanning to on-demand hand-scoring)
42
Where the Rubber Hits the Road This is a nice system design (if we do say so ourselves), but what are the impediments to implementation? Infrastructure ◦ LEA hardware and bandwidth capacity ◦ Assessment vendor capacity ◦ Moving from piecemeal components to an integrated, coherent system ◦ Development of educator-moderated clearinghouses ◦ Development of educator-moderated item bank
43
Where the Rubber Hits the Road Security ◦ The more high-stakes the system, the more likely security breaches become ◦ Critical need for training on user roles ◦ Critical need for training on data use, since data will become much more readily available across the board ◦ Security controls versus open-source and maximal access
44
Where the Rubber Hits the Road Funding ◦ Very high initial startup investment ◦ Dual systems during development and initial implementation ◦ Ramping up LEA technology systems to be capable of working within the system
45
Where the Rubber Hits the Road Sustainability ◦ Requires perpetual investment in administration ◦ Development is only the start (e.g. sustainability concerns regarding RTTT-funded assessment consortia) ◦ Requires early success and public understanding of the benefits of the system weighed against ongoing costs ◦ Recurring hardware/software technology upgrade costs for LEAs ◦ Recurring hardware/software technology maintenance costs for central IT systems
46
Where the Rubber Hits the Road Local Control ◦ This kind of system is only possible to create with significant funding and local buy-in ◦ No single state (let alone district) could afford the cost of development and implementation ◦ Consortia are imperative to creating such a system Consortia can tend toward self-perpetuation rather than serving their members Consortia cannot ignore local nuances Consortia cannot ignore reasonable needs for flexibility Consortia must monitor and maximize member investment
47
Where the Rubber Hits the Road Building an appetite for online systems ◦ Implementation may occur piecemeal, but should be undertaken within a framework for a coherent and complete system ◦ Each piece when implemented needs to be implemented in such a way that local educators and policymakers see a positive impact on the educational system, e.g., Immediate turnaround of results Connection between family and school Improved instructional practice Facilitation of differentiated instruction
48
Recommendations for Future Directions System has the potential to make us data-rich and analysis-poor ◦ Build local (SEA and LEA) capacity for appropriate analysis (possibly through re-defining positions that might be eliminated through consortia services) ◦ New practices (e.g. through-course, innovative item types, AI scoring) will require a significant research and validation agenda, including Equating Comparability Standard setting
49
Recommendations for Future Directions System has the potential to make educators and students data rich ◦ Portfolios of assessment results and products as evidence of students’ college and career readiness ◦ Portfolios of assessment results and products as evidence of teacher classroom practices and effectiveness
50
Recommendations for Future Directions Financial incentives from ARRA/RTTT have provided the impetus for some of these initiatives to get started Sustainability needs to be a focus both within and across states To maximize cross-state focus, we recommend continued significant funding of initiatives through ESEA reauthorization, Enhanced Assessment Grants, and other competitive/formula funding opportunities
51
Recommendations for Future Directions Scoring of competitive consortium applications should be weighted toward… ◦ The development of integrated systems across all aspects of assessment & accountability ◦ Significant and rigorous research, development, and evaluation of the validity and impact (intended and unintended consequences) of system development and implementation Formula funding should stipulate collaboration in system development Use of formula funding guarantees… ◦ Continued focus on students with the greatest needs ◦ Access to quality systems for states without strong resources for writing competitive grants
52
Contact Information Joseph A. Martineau, Ph.D. ◦ Director of Assessment & Accountability ◦ martineauj@michigan.gov Vincent J. Dean, Ph.D. ◦ State Assessment Manager ◦ deanv@michigan.gov Michigan Department of Education