Enhancing the Technical Quality of the North Carolina Testing Program: An Overview of Current Research Studies
Nadine McBride, NCDPI
Melinda Taylor, NCDPI
Carrie Perkis, NCDPI

Overview
Comparability
Consequential validity
Other projects on the horizon

Comparability
Previous Accountability Conference presentations provided early results
Research funded by an Enhanced Assessment Grant from the US Department of Education
Focused on the following topics:
–Translations
–Simplified language
–Computer-based
–Alternative formats

What is Comparability?
Not just "same score":
Same content coverage
Same decision consistency
Same reliability & validity
Same other technical properties (e.g., factor structure)
Same interpretations of test results, with the same level of confidence
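
To make these properties concrete, here is a minimal sketch (simulated data, a hypothetical 40-item blueprint and cut score, not the NCDPI analysis) that compares two of them, internal-consistency reliability and proficiency classification rates, between a general form and a test variation.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an examinees-by-items matrix of 0/1 scores."""
    k = items.shape[1]
    return (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                            / items.sum(axis=1).var(ddof=1))

def simulate_form(n_examinees: int, difficulties: np.ndarray, rng) -> np.ndarray:
    """Rasch-like simulation: P(correct) depends on ability minus item difficulty."""
    theta = rng.normal(0, 1, size=(n_examinees, 1))
    return rng.binomial(1, 1 / (1 + np.exp(-(theta - difficulties))))

rng = np.random.default_rng(0)
difficulties = rng.normal(0, 1, 40)               # same 40-item blueprint for both forms
general = simulate_form(2000, difficulties, rng)  # general assessment sample
variation = simulate_form(150, difficulties, rng) # smaller test-variation sample

cut = 24  # hypothetical raw-score proficiency cut
for name, scores in (("general", general), ("variation", variation)):
    total = scores.sum(axis=1)
    print(f"{name:9s} alpha={cronbach_alpha(scores):.2f} "
          f"proficient={(total >= cut).mean():.1%}")
```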

Goal
Develop and evaluate methods for determining the comparability of scores from test variations to scores from the general assessments.
It should be possible to draw the same inferences, with the same level of confidence, from variations of the same test.

Research Questions
What methods can be used to evaluate score comparability?
What types of information are needed to evaluate score comparability?
How do different methods compare in the types of information about comparability they provide?

Products
Comparability Handbook:
–Current Practice: State Test Variations
–Procedures for Developing Test Variations and Evaluating Comparability
–Literature Reviews
–Research Reports
–Recommendations: Designing Test Variations; Evaluating Comparability of Scores

Results – Translations
Replication methodology is helpful when faced with small samples and widely different proficiency distributions:
–Gauge variability due to sampling (random) error
–Gauge variability due to distribution differences
Multiple methods for evaluating structure are helpful
Effect size criteria are helpful for DIF
Congruence between structural & DIF results
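
Below is a minimal sketch of the kind of analysis these bullets point to: Mantel-Haenszel DIF with the ETS delta categories as one possible effect size criterion, plus bootstrap replications to gauge variability due to sampling error in a small sample. The simulated data, score strata, and thresholds are illustrative assumptions, not the study's actual procedures.

```python
import numpy as np

def mh_delta(item, strata, group):
    """Mantel-Haenszel common odds ratio on the ETS delta scale
    (group 0 = reference/general form, group 1 = focal/translated form)."""
    num = den = 0.0
    for s in np.unique(strata):
        m = strata == s
        a = np.sum(m & (group == 0) & (item == 1))  # reference correct
        b = np.sum(m & (group == 0) & (item == 0))  # reference incorrect
        c = np.sum(m & (group == 1) & (item == 1))  # focal correct
        d = np.sum(m & (group == 1) & (item == 0))  # focal incorrect
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    return -2.35 * np.log(num / den)

def ets_category(delta):
    """ETS A/B/C classification, often used as an effect size criterion for DIF."""
    return "A" if abs(delta) < 1.0 else ("B" if abs(delta) < 1.5 else "C")

rng = np.random.default_rng(1)
n = 300                                       # small sample, as in the translation context
group = rng.integers(0, 2, n)
total = rng.integers(10, 41, n)               # matching variable (total score)
p = 1 / (1 + np.exp(-((total - 25) / 5 - 0.4 * group)))  # simulated DIF against focal group
item = rng.binomial(1, p)
strata = total // 5                           # coarse score strata

delta = mh_delta(item, strata, group)
print(f"delta = {delta:.2f}, ETS category {ets_category(delta)}")

# Replication: bootstrap resamples to gauge variability due to sampling (random) error
reps = [mh_delta(item[i], strata[i], group[i])
        for i in (rng.integers(0, n, n) for _ in range(200))]
print(f"bootstrap SD of delta = {np.std(reps):.2f}")
```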

Results – Simplified Language
Development procedures that are carefully documented and followed, and that focus on maintaining the item construct, can support comparability arguments.
Linking/equating approaches can be used to examine and/or establish comparability.
Comparing item statistics using the non-target group can provide information about comparability.
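
As an illustration of the linking/equating point, the sketch below applies a simple linear (mean-sigma) transformation under an assumed random-groups design; the score distributions and the choice of linking method are hypothetical, not the approach actually used for the simplified-language forms.

```python
import numpy as np

def linear_link(from_scores: np.ndarray, to_scores: np.ndarray):
    """Linear (mean-sigma) transformation that places scores from one form
    onto the scale of another, assuming randomly equivalent groups."""
    m_from, s_from = from_scores.mean(), from_scores.std(ddof=1)
    m_to, s_to = to_scores.mean(), to_scores.std(ddof=1)
    return lambda x: m_to + (s_to / s_from) * (x - m_from)

rng = np.random.default_rng(2)
simplified = np.round(rng.normal(30, 6, 800))  # raw scores, simplified-language form
general = np.round(rng.normal(32, 7, 800))     # raw scores, general form

to_general = linear_link(simplified, general)
print(to_general(np.array([20, 30, 40])))      # simplified-form scores on the general scale
```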

Results – Computer-based
Propensity score matching produced results similar to those from studies using within-subjects samples.
The propensity score method provides a viable alternative to the difficult-to-implement repeated measures study.
The propensity score method is sensitive to group differences. For instance, the method performed better when 8th- and 9th-grade groups were matched separately.
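
The sketch below shows the general logic of a propensity score matching mode study on simulated data: estimate each student's probability of testing online from covariates, match online examinees to similar paper examinees, and compare scores in the matched groups. The covariates, logistic model, and greedy nearest-neighbor rule are illustrative assumptions, not NCDPI's procedure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 1200
covariates = rng.normal(size=(n, 3))       # e.g., prior achievement, SES proxy, grade
# Mode selection depends on the covariates, so raw group means are confounded
online = rng.binomial(1, 1 / (1 + np.exp(-covariates[:, 0])))
score = 50 + 5 * covariates[:, 0] + 1.0 * online + rng.normal(0, 3, n)

# 1. Estimate propensity scores: P(computer-based | covariates)
ps = LogisticRegression().fit(covariates, online).predict_proba(covariates)[:, 1]

# 2. Greedy nearest-neighbor matching (with replacement): each online examinee
#    is paired with the paper examinee having the closest propensity score
online_idx = np.flatnonzero(online == 1)
paper_idx = np.flatnonzero(online == 0)
matched = np.array([paper_idx[np.argmin(np.abs(ps[paper_idx] - ps[i]))]
                    for i in online_idx])

# 3. Mode effect estimated on the matched samples
naive = score[online == 1].mean() - score[online == 0].mean()
matched_effect = score[online_idx].mean() - score[matched].mean()
print(f"naive difference  : {naive:.2f}")
print(f"matched difference: {matched_effect:.2f}")  # closer to the simulated +1.0 effect
```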

Results – Alternative Formats
The burden of proof is much heavier for this type of test variation.
A study based on students eligible for the general test can provide some, but not solid, evidence of comparability.
Judgment-based studies combined with empirical studies are needed to evaluate comparability.
More research is needed on methods for evaluating what constructs each test type is measuring.

Lessons Learned
It takes a village…
–Cooperative effort of SBE, IT, districts, and schools to implement special studies
–Researchers to conduct studies and evaluate results
–Cooperative effort of researchers and TILSA members to review study design and results
–Assessment community to provide insight and explore new ideas

Consequential Validity
What is consequential validity?
–An amalgamation of evidence regarding the degree to which use of test results has social consequences
–Consequences can be both positive and negative, intended and unintended

Whose Responsibility?
Role of the test developer versus the test user?
Responsibilities and roles are not clearly defined in the literature.
A state may be designated as both a test developer and a test user.

Test Developer Responsibility
Generally responsible for…
–Intended effects
–Likely side effects
–Persistent unanticipated effects
–Promoted use of scores
–Effects of testing

Test Users’ Responsibility
Generally responsible for…
–Use of scores: the further from the intended uses, the greater the responsibility

Role of Peer Review
Element 4.1:
–For each assessment, including the alternate assessment, has the state documented the issue of validity… with respect to the following categories: (g) has the state ascertained whether the assessment produces intended and unintended consequences?

Study Methodology
Focus Groups
–Conducted in five regions across the state
–Led by NC State’s Urban Affairs
–Completed in December 2009 and January 2010
–Input from teachers and administrative staff
–Included large, small, rural, urban, and suburban schools

Study Methodology
Survey Creation
–Drafts currently modeled after surveys conducted in other states
–However, most of those were conducted 10+ years ago
–Surveys will be finalized after focus group results are reviewed

Study Methodology
Survey Administration
–Testing coordinators to receive survey notification
–Survey to be available from late March to April

Study Results
Stay tuned!
–We hope to make the report publicly available on the DPI testing website

Other Research Projects
Trying out different item types
Item location effects
Auditing

Contact Information
Nadine McBride, Psychometrician
Melinda Taylor, Psychometrician
Carrie Perkis, Data Analyst