
Scoring Provincial Large-Scale Assessments
María Elena Oliveri, University of British Columbia
Britta Gundersen-Bryden, British Columbia Ministry of Education
Kadriye Ercikan, University of British Columbia

Objectives
Describe and discuss:
– The five steps used to score provincial large-scale assessments (LSAs)
– Advantages and challenges associated with different scoring models (e.g., centralized versus decentralized)
– Lessons learned in British Columbia when switching from a centralized to a decentralized scoring model

Scoring Provincial Large-Scale Assessments
LSAs are administered to collect data to:
– evaluate the efficacy of school systems
– guide policy-making
– make decisions regarding improving student learning
An accurate scoring process, examined in relation to the purposes of the test and the decisions the assessment data are intended to inform, is key to obtaining useful data from these assessments.

Accuracy in Scoring
Accurate and meaningful scores depend on the degree to which scoring rubrics:
(1) appropriately and accurately identify relevant aspects of responses as evidence of student performance,
(2) are accurately implemented, and
(3) are consistently applied across examinees.
Uniformity in scoring LSAs is central to achieving comparability of students' responses: it ensures that differences in results are attributable to differences in examinees' performance rather than to biases introduced by differing scoring procedures.
A five-step process is typically used.

Step One: "Test Design Stage"
Design test specifications that:
– match the learning outcomes or construct(s) assessed
– include the weights and number of items needed to assess each intended construct (a minimal sketch of such a blueprint follows below)
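
To make the idea of a test specification concrete, here is a minimal sketch in Python of what such a blueprint might look like. The construct names, weights, and item counts are invented for illustration and are not drawn from any actual provincial assessment.

# Hypothetical test blueprint: each assessed construct is given a weight
# (its share of the total score) and the number of items used to assess it.
blueprint = {
    "reading":  {"weight": 0.40, "items": 24},
    "writing":  {"weight": 0.20, "items": 2},
    "numeracy": {"weight": 0.40, "items": 30},
}

# The weights across constructs should account for the entire test.
assert abs(sum(c["weight"] for c in blueprint.values()) - 1.0) < 1e-9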

Step Two: "Scoring Open-Response Items"
Decide which model to use to score open-response items:
– Centralized models are directly supervised by provincial Ministries or Departments of Education in a central location
– Decentralized models often take place across several locations and involve a considerably greater number of teachers; they are typically used for scoring medium- to low-stakes LSAs

Step Three: "Preparing Training Materials"
Identify common tools to train scorers, including:
– exemplars of students' work demonstrating each of the scale points in the scoring rubric
– materials illustrating potential biases arising in the scoring process (e.g., differences in scores given to handwritten versus typewritten essays)

Step Four: "Training of Scorers"
Training occurs prior to scoring and can recur during the session itself, especially if the session spans more than one day.
A "train the trainer" approach is often used:
– a small cadre of more experienced team leaders is trained first; they then train the scorers who will actually score the responses
Team leaders often make the final judgement call when assigned scores differ from the exemplars.
This reinforces common standards and consistency in the assignment of scores, leading to fair and accurate scores.

Step Five: "Monitoring Scores"
Includes checks of inter-marker reliability, wherein a sample of papers is re-scored to check consistency in scoring across raters (a sketch of such a check follows below).
This may also serve as a re-training or "re-calibration" activity, with raters discussing their scores and the rationales for their scoring decisions.
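
Inter-marker reliability on a double-scored sample is often summarized with agreement statistics. Below is a minimal sketch in Python, assuming two raters score the same papers on a common rubric; the scores are invented for illustration, and operational programs may use other indices (e.g., weighted kappa or exact-agreement rates).

from collections import Counter

def percent_agreement(r1, r2):
    # Share of responses on which both raters assigned the same score.
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohens_kappa(r1, r2):
    # Chance-corrected agreement between two raters (Cohen's kappa).
    n = len(r1)
    p_o = percent_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Expected agreement if each rater scored independently at their
    # observed marginal rates.
    p_e = sum(c1[s] * c2[s] for s in set(r1) | set(r2)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical scores on a 1-4 rubric for ten double-scored papers.
first_rater  = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
second_rater = [3, 2, 3, 3, 1, 2, 4, 4, 2, 3]
print(f"percent agreement = {percent_agreement(first_rater, second_rater):.2f}")
print(f"Cohen's kappa     = {cohens_kappa(first_rater, second_rater):.2f}")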

The Foundation Skills Assessment
The Foundation Skills Assessment (FSA) is used as a case study to illustrate the advantages and challenges of switching from a centralized to a decentralized scoring model.
The FSA assesses Grade 4 and 7 students' skills in reading, writing, and numeracy.
Several changes were made to the FSA in 2008 in response to stakeholders' demands for more meaningful LSAs that inform classroom practice.

Changes to the FSA
– Earlier administration: moved from May to February
– Online administration of closed-response sections
– Parents or guardians received their child's open-response test portions and a summary statement of reading, writing, and numeracy skills
– Scoring changed from a centralized to a decentralized model
– The Ministry held "train the trainer" workshops to prepare school district personnel to organize and conduct local scoring sessions
– School districts could decide how to conduct scoring sessions: score individually, in pairs, or in groups; double-score only a few, some, or all of the responses

Advantages of a Decentralized Model
Professional development:
– A decentralized model allowed four times as many teachers to work with scoring rubrics and exemplars
– Educators were able to develop a deeper understanding of provincial standards and expectations for student achievement
– If scorers are educators, they may later apply their knowledge of rubrics and exemplars in their classroom practice and school environments, and consider the performance of their own students in a broader provincial context

Advantages of a Decentralized Model
Earlier return of test results and earlier feedback to teachers, students, and the school:
– More immediate feedback may help improve student learning and guide teaching
– Data inform teachers about students' strengths and areas for improvement in relation to provincial standards
– Results may be helpful in writing school plans and targeting the areas on which particular schools may focus

Challenges of a Decentralized Scoring Model
Increased difficulty associated with:
– less time allocated to implementing cross-check procedures
– decreased standardization of the scoring instructions given to raters
– increased costs (a higher number of teachers scoring)
– reduced training time

Potential Solutions
– Provide teachers with adequate training time (e.g., one to two days of training prior to scoring the assessments)
– Increase discussion among teachers, which may involve reviewing exemplars that fall between scale points in the rubric
– Use table leaders (e.g., teachers with prior scoring experience)
– Re-group teachers to work through difficulties or uncertainties related to the scoring process

Final Note
Closer collaboration between educators and Ministries and Departments of Education may lead to improved tests, as educators bring their professional experience of how students learn in the classroom to bear on test design itself.
Strong alignment among the overall purposes of the test, the test design, and the scoring model used may add value to score interpretation and the subsequent use of assessment results.

Thank you