Stephen C. Court, Educational Research and Evaluation, LLC. A Presentation at the First International Conference on Instructional Sensitivity, Achievement and Assessment Institute, University of Kansas, Lawrence, Kansas, November 13-15, 2013.

Similar presentations

Assessing Student Performance
Performance Assessment
Creating and Using Rubrics for Evaluating Student Work Prepared by: Helen Jordan Mathematics Coordinator Normandie Avenue Elementary.
Classroom Instruction That Works Robert Marzano, Debra Pickering, and Jane Pollock August 19, 2008.
Testing for Tomorrow Growth Model Testing Measuring student progress over time.
Wynne Harlen. What do you mean by assessment? Is there assessment when: 1. A teacher asks pupils questions to find out what ideas they have about a topic.
PORTFOLIO.
Summative Assessment Kansas State Department of Education ASSESSMENT LITERACY PROJECT1.
NIET Teacher Evaluation Process
Chapter 1 What is listening?
Designing Instruction Objectives, Indirect Instruction, and Differentiation Adapted from required text: Effective Teaching Methods: Research-Based Practice.
Using Multiple Choice Tests for Assessment Purposes: Designing Multiple Choice Tests to Reflect and Foster Learning Outcomes Terri Flateby, Ph.D.
Uses of Language Tests.
C R E S S T / U C L A Improving the Validity of Measures by Focusing on Learning Eva L. Baker CRESST National Conference: Research Goes to School Los Angeles,
What does the Research Say About... POP QUIZ!!!. The Rules You will be asked to put different educational practices in order from most effective to least.
Assessment in Language Teaching: part 2 Today’s # 24.
ICT Integration in the Curriculum Planning for. Section One: Course participants will be shown how to:  Assess their school’s current situation in relation.
Reading Maze General Outcome Measurement TES Data Meeting April, rd and 4 th Grades Cynthia Martin, ARI Reading Coach.
Creating Assessments with English Language Learners in Mind In this module we will examine: Who are English Language Learners (ELL) and how are they identified?
Overall Teacher Judgements
Leveling the Playing Field With Technology A New Approach to Differentiated Instruction.
Building Effective Assessments. Agenda: Brief overview of Assess2Know content development; Assessment building pre-planning; Cognitive factors; Building.
Interpreting Assessment Results using Benchmarks Program Information & Improvement Service Mohawk Regional Information Center Madison-Oneida BOCES.
Curriculum and Learning Omaha Public Schools
Evaluating the Vermont Mathematics Initiative (VMI) in a Value Added Context H. ‘Bud’ Meyers, Ph.D. College of Education and Social Services University.
Jasmine Carey CDE Psychometrician Interpreting Science and Social Studies Assessment Results September 2014.
Closing the loop: Providing test developers with performance level descriptors so standard setters can do their job. Amanda A. Wolkowitz, Alpine Testing.
Assessment Practices That Lead to Student Learning Core Academy, Summer 2012.
Measuring Complex Achievement
NRTs and CRTs Group members: Camila, Ariel, Annie, William.
Reporting progress and achievement: Creating informative and manageable technology reports Selena Hinchco.
Week 5 Lecture 4. Lecture’s objectives  Understand the principles of language assessment.  Use language assessment principles to evaluate existing tests.
Classroom Evaluation & Grading Chapter 15. Intelligence and Achievement Intelligence and achievement are not the same Intelligence and achievement are.
+ The continuum Summative assessment Next steps. Gallery Walk – the Bigger Picture Take one post it of each of the 3 colours. Walk around a look at the.
Grading and Analysis Report For Clinical Portfolio 1.
Preliminary Data: Not a Final Accountability Document1 SAISD TAKS Performance Board Work Session June 2004 Office of Research, Evaluation,
Staff Science PD Session Two. Programme 1. Today’s session 2. Opening Experiment 3. Science: What is it all about? 4. The Science Exemplars 5. Where to.
Assessment and Testing
NECAP Results and Accountability A Presentation to Superintendents March 22, 2006.
Standards Based Report Cards Albert D. Lawton Intermediate School.
Georgia will lead the nation in improving student achievement. 1 Georgia Performance Standards Day 3: Assessment FOR Learning.
Backwards Design. Activity-Oriented Teaching Many teachers engage in “activity-oriented” teaching.
Introduction to Item Analysis Objectives: To begin to understand how to identify items that should be improved or eliminated.
1 Scoring Provincial Large-Scale Assessments María Elena Oliveri, University of British Columbia Britta Gundersen-Bryden, British Columbia Ministry of.
Qatar Comprehensive Educational Assessment (QCEA) 2008: Discussion Session For QCEA Support.
LITERACY-BASED DISTRICT-WIDE PROFESSIONAL DEVELOPMENT Aiken County Public School District January 15, 2016 LEADERS IN LITERACY CONFERENCE.
English-Language Arts Content Standards for California By Ashleigh Boni & Christy Pryde By Ashleigh Boni & Christy Pryde.
COURSE AND SYLLABUS DESIGN
White Pages Team Grey Pages Facilitator Team & Facilitator Guide for School-wide Reading Leadership Team Meetings Elementary.
Presentation to the Nevada Council to Establish Academic Standards Proposed Math I and Math II End of Course Cut Scores December 22, 2015 Carson City,
KHS PARCC/SCIENCE RESULTS Using the results to improve achievement Families can use the results to engage their child in conversations about.
Dept. of Community Medicine, PDU Government Medical College,
Outcomes: By the end of our sessions, participants will have… an understanding of how VAL-ED is used as a data point in developing professional development.
Inspiring today’s children for tomorrow’s world Early Years Foundation Stage Assessment Procedure 2016.
What does the Research Say About . . .
TEKS-Based Instruction
Assessment and Reporting Without Levels February 2016
Quarterly Meeting Focus
Update on Data Collection and Reporting
Iowa Teaching Standards & Criteria
Interpreting Science and Social Studies Assessment Results
Curriculum 2.0: Standards-Based Grading and Reporting
PLCs Professional Learning Communities Staff PD
TOPIC 4 STAGES OF TEST CONSTRUCTION
EDUC 2130 Quiz #10 W. Huitt.
Presentation transcript:

Stephen C. Court, Educational Research and Evaluation, LLC. A Presentation at the First International Conference on Instructional Sensitivity, Achievement and Assessment Institute, University of Kansas, Lawrence, Kansas, November 13-15, 2013

Instructional sensitivity may reside not only in items and tasks. While crunching the numbers for the results I shared yesterday, I noticed several data patterns that suggested areas for future research. Here are some of them.

Typically, developers select a wide range of item difficulties for the test overall and even within subscales. That spread was desirable in a norm-referenced testing (NRT) context because it helped to separate students. But for purposes of school accountability and teacher evaluation, we need to revisit this notion. If we use standards-based (CRT) selection criteria – all items of similar difficulty – do the indicator and total-test results become more sensitive to instruction?
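To make the contrast concrete, here is a minimal sketch in Python using a hypothetical item pool: the NRT-style selection samples across the full range of p-values, while the CRT-style selection keeps every item within a narrow difficulty band around a common target.

```python
# Minimal sketch, hypothetical item pool: NRT-style selection spreads item
# difficulties widely; CRT-style selection keeps them near a common target.
import numpy as np

rng = np.random.default_rng(0)
pool_p = rng.uniform(0.20, 0.95, size=200)  # p-values for a simulated item pool

# NRT-style: draw items from across the full difficulty range to spread students out
nrt_items = np.sort(rng.choice(pool_p, size=40, replace=False))

# CRT-style: keep only items whose difficulty falls in a narrow band around a target
target, band = 0.70, 0.05
crt_items = pool_p[np.abs(pool_p - target) <= band]

print("NRT difficulty range:", round(nrt_items.min(), 2), "to", round(nrt_items.max(), 2))
print("CRT difficulty range:", round(crt_items.min(), 2), "to", round(crt_items.max(), 2))
```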

Typically, items that are too easy (p-value > .85 or .90) are deleted from final test forms. Yet these are the items that tend to discriminate best between effective and less effective instruction… At the classroom level and above, the most poorly taught groups of students tend to be the ones who get the most items wrong. Conversely, the best taught groups tend to earn the highest scores. RQ: Is there an “ideal” p-value for items used on evaluative tests?
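As a rough illustration, the sketch below simulates two sets of classrooms (the response probabilities and group labels are hypothetical) and computes, for each item, the overall p-value alongside a simple “instructional discrimination” index: the difference in proportion correct between well-taught and poorly taught groups. Even very easy items can show a sizeable difference.

```python
# Minimal sketch, simulated data: item p-values and a simple "instructional
# discrimination" index (well-taught minus poorly taught proportion correct).
import numpy as np

rng = np.random.default_rng(1)
n_per_group = 100

# Hypothetical probabilities of a correct response; the easy items stay easy
# overall but still separate the two groups.
p_well = np.array([0.95, 0.92, 0.88, 0.75, 0.60])
p_poor = np.array([0.80, 0.74, 0.70, 0.62, 0.50])

well = rng.binomial(1, p_well, size=(n_per_group, len(p_well)))
poor = rng.binomial(1, p_poor, size=(n_per_group, len(p_poor)))

overall_p = np.vstack([well, poor]).mean(axis=0)     # classic item difficulty (p-value)
instr_disc = well.mean(axis=0) - poor.mean(axis=0)   # sensitivity to instruction

for i, (p, d) in enumerate(zip(overall_p, instr_disc), start=1):
    print(f"Item {i}: p = {p:.2f}, well-taught minus poorly taught = {d:+.2f}")
```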

In Kansas, a popular report is the instructional planning graph.

For each indicator, use ROC analysis to set optimum cut scores that maximize instructional sensitivity. For example, given an indicator measured by four items, what cut score (number correct) is most sensitive to instruction?
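One way this could be sketched, with simulated data (the well-taught versus poorly taught labels and the score distributions are hypothetical): treat the instructional criterion as the outcome, run the number-correct scores through an ROC analysis, and pick the cut score that maximizes Youden's J (true-positive rate minus false-positive rate).

```python
# Minimal sketch, simulated data: pick the number-correct cut score on a
# four-item indicator that best separates well-taught from poorly taught
# groups, using ROC analysis and Youden's J.
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(2)
n_students = 200

# Hypothetical criterion: 1 = well taught, 0 = poorly taught
taught_well = rng.integers(0, 2, size=n_students)

# Number correct on a four-item indicator, shifted upward for well-taught students
num_correct = rng.binomial(4, np.where(taught_well == 1, 0.80, 0.55))

fpr, tpr, thresholds = roc_curve(taught_well, num_correct)
youden_j = tpr - fpr
best_cut = thresholds[np.argmax(youden_j)]
print(f"Most instructionally sensitive cut score: {best_cut:.0f} of 4 correct")
```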

Examine how item, indicator, and total-test sensitivity varies across N-size, AYP or CCR subgroups, and other disaggregations of interest. Also, check concordance between confirmatory (intended) and exploratory (actual) factor patterns: Same number of factors? Do items load onto the intended factors? Greater concordance → greater sensitivity.
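A minimal sketch of that concordance check, assuming simulated responses and a hypothetical two-indicator blueprint: fit an exploratory factor analysis, find which factor each item loads on most strongly, and compare that against the intended assignment.

```python
# Minimal sketch, simulated responses and a hypothetical blueprint:
# does each item load most strongly on the factor it was written to measure?
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n_students, n_items = 500, 8
intended_factor = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # blueprint: items 1-4 vs. items 5-8

# Responses driven by two latent abilities plus noise
ability = rng.normal(size=(n_students, 2))
responses = ability[:, intended_factor] + rng.normal(scale=0.8, size=(n_students, n_items))

efa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0).fit(responses)
loadings = efa.components_.T                 # items x factors
actual_factor = np.abs(loadings).argmax(axis=1)

# EFA factor labels are arbitrary, so check both possible matchings
concordance = max(
    (actual_factor == intended_factor).mean(),
    (actual_factor == 1 - intended_factor).mean(),
)
print(f"Items loading on their intended factor: {concordance:.0%}")
```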

Who remembers the formula for the standard error of measurement (SEM) or for the standard error of the mean? Do you know where to find each formula? Do you understand what they are, when each is applicable, and how to use them? An analogous problem may exist for an elementary or middle school student trying to keep straight the difference between the area and the circumference of a circle.
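For reference, a small sketch that places the two formulas side by side; the score standard deviation, reliability, and group size below are hypothetical values chosen only to show how differently the two errors behave.

```python
# Minimal sketch, hypothetical values: the standard error of measurement (SEM)
# describes uncertainty in one student's observed score, while the standard
# error of the mean describes uncertainty in a group's average score.
import math

sd_scores = 10.0    # standard deviation of observed scores
reliability = 0.90  # test reliability (e.g., coefficient alpha)
n_students = 25     # group size

sem = sd_scores * math.sqrt(1 - reliability)    # SEM = SD * sqrt(1 - r_xx)
se_mean = sd_scores / math.sqrt(n_students)     # SE_mean = SD / sqrt(n)

print(f"Standard error of measurement: {sem:.2f}")      # 3.16 with these values
print(f"Standard error of the mean:    {se_mean:.2f}")  # 2.00 with these values
```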

In turn, if constructs were redefined to involve inquiry… and if tests were administered in an “enriched” environment… Would teachers widen the curriculum and teach procedural and inquiry skills rather than just drilling-and-killing declarative knowledge for the sake of the test? Might exchanging the sterile testing environment for an enriched one diminish incentives to “teach the test”? Further, would such an authentic approach be more engaging for students in terms of (a) learning and (b) being tested? Would achievement gaps narrow with respect to test results and actual learning? Would such an approach spawn an upward spiral of increased success, self-efficacy, and motivation?

We need more research into items, tasks, reading passages, and other stimuli… but also into all other aspects of testing, instruction, and educational excellence. Just as important: we need to communicate our research findings not only among ourselves (the measurement, assessment, and research community) but also to policy-makers, practitioners, and stakeholders at the local, state, and national levels.