Review of AERA/APA/NCME Test Standards Revision

Review of AERA/APA/NCME Test Standards Revision. Barbara S. Plake, University of Nebraska-Lincoln, Co-Chair, Committee for Revision of Test Standards

Joint Committee Members: Lauress Wise, Co-Chair; Barbara Plake, Co-Chair; Linda Cook, ETS; Fritz Drasgow, University of Illinois; Brian Gong, NCIEA; Laura Hamilton, RAND Corporation; Jo-Ida Hansen, University of MN; Joan Herman, UCLA; Michael Kane, Bar Examiners

Joint Committee Members: Michael Kolen, University of Iowa; Antonio Puente, UNC-Wilmington; Paul Sackett, University of MN; Nancy Tippins, Valtera Corporation; Walter (Denny) Way, Pearson; Frank Worrell, Univ of CA-Berkeley

Scope of Revision: Based on comments each organization received from its invitation to comment. Summarized by the Management Committee (Wayne Camara, Chair, APA; Suzanne Lane, AERA; David Frisbie, NCME) in consultation with the Co-Chairs.

Four Substantive Areas for Revisions: Technology; Accountability; Workplace; Access. Plus attention to format issues.

Theme Teams: Working teams; cross-team collaborations; chapter leaders. Focusing on bringing content related to the themes into chapters in coherent and meaningful ways.

Presentation: Four Substantive Areas. Access – Linda Cook; Accountability – Brian Gong; Technology – Denny Way; Workplace – Laurie Wise

Format Issues: Organization of chapters; consideration of ways to identify “Priority Standards”; more parallelism between chapters; tone; complexity; technical language

Timeline: First meeting January 2009; three-year process for completing text of revision; open comment/organization reviews; projected publication Summer 2012

Revising our Test Standards: Access for All Examinee Populations. Presentation to the 2009 Annual Meeting of the American Educational Research Association, San Diego, CA. Linda Cook, ETS

Overview: Standards related to Access appear throughout many of the chapters but are concentrated in Chapter 9 (Testing Individuals of Diverse Linguistic Backgrounds) and Chapter 10 (Testing Individuals with Disabilities). Comments on Access were received by the Management Committee and summarized for the committee charge.

Elements of the Charge: Five of the elements of the charge focused on accommodations/modifications: impact/differentiation of accommodation and modification; appropriateness for ELL and EWD; appropriateness for a variety of groups, e.g., pre-K, older populations; flagging; comparability/validity. One element focused on adequacy and comparability of translations. One element focused on Universal Design.

Key Access Issues Included in our Charge - 1: Impact/differentiation of accommodations/modifications. Appropriate ways to determine or establish the impact of accommodations/modifications on inferences, interpretations, and uses of scores. How do you differentiate clearly between what is an accommodation and what is a modification?

Key Access Issues Included in our Charge - 2: Appropriateness of accommodations for English-language learners and examinees with disabilities. Selecting the appropriate accommodation for the individual: Who should select the accommodation? What evidence should the selection be based on? Administering the appropriate accommodation: What evidence is available to determine impact on test scores, given the purpose of the test? How effective is the accommodation? Alternative assessments/modified achievement standards.

Key Access Issues Included in our Charge - 3: Appropriateness of accommodations for a wider variety of groups: pre-K; older populations. The number of older adults with cognitive impairments is rising, and they are tested to determine mental status changes. There are many complexities associated with testing this population: combined effects of medical problems, medication side effects, multiple sensory deficits, and testing environment.

Key Access Issues Included in our Charge - 4: Flagging. Current treatment needs to be updated to reflect changes in practice since the 1999 Standards. Most testing organizations no longer flag. Decisions about flagging should be based on empirical evidence.

Key Access Issues Included in our Charge - 5: Comparability and validity of inferences made based on scores from accommodated or modified tests. Foundational issues such as comparability and validity need to be addressed in foundational chapters. If sample sizes do not support analyses such as DIF, other evidence of validity should be pursued.

Key Access Issues Included in our Charge - 6: Adequacy and comparability of translations (language to language and language to symbol, e.g., Braille). Evidence is needed to demonstrate adequacy of translation and comparability of scores from translated tests. Fluency, rather than primary language, should be used to describe the target population for a test. Quality of translation/adaptation needs to be emphasized. Interaction of language proficiency and construct needs to be considered.

Key Access Issues Included in our Charge - 7: Universal Design. The 1999 Standards focus too much on accommodations and modifications and not enough on building accessibility features into the design and development process.

Revising our Test Standards: Issues for Accountability. Presentation to the 2009 Annual Meeting of the American Educational Research Association, San Diego, CA. Brian Gong, Center for Assessment

Overview: There has been a dramatic expansion of the use of tests for various forms of accountability and other uses related to educational policy-setting. The Joint Committee has been charged with considering how these uses in accountability should impact revisions to the Standards. As with the other themes, comments on the Standards that related to accountability were compiled by the Management Committee and summarized in their charge to the Joint Committee.

Overview: Standards related to accountability currently appear throughout; accountability also is especially relevant to Chapter 13 (Educational Testing and Assessment) and Chapter 15 (Testing in Program Evaluation and Public Policy). Under No Child Left Behind, there has been a dramatic increase in the use of tests for accountability. In such cases, test results have important consequences for third parties such as school administrators and teachers, although not always for the examinees themselves. Federal peer review procedures have required assurances of reliability and validity that often go beyond requirements of the current Test Standards. Attention to the overall technical quality of tests and score interpretation is required. High school tests are used as a graduation requirement and there have been questions about how the current Standards should be interpreted in these cases. In general, the validity and reliability of individual and aggregated scores used for accountability purposes need to be addressed.

Key Accountability Topics Included in our Charge: Validity and reliability requirements; issues with scores, scaling, and equating; policy and practice; formative and interim assessments

1. Validity and Reliability Requirements: Use of a single test as the sole source of high stakes decisions (e.g., graduation, promotion), including whether scores resulting from retesting or repeat testing are sufficient to count as using more than one score for such decisions. How test alignment studies should be documented and used to demonstrate the validity of score interpretations regarding mastery of required content standards.

1. Validity, Reliability, and Reporting Requirements - continued Provide additional guidance on score accuracy, especially when used to classify individuals or groups into performance regions or other bands on a score scale. Validity and reliability requirements for reporting individual or aggregate performance on subscales (skills or diagnostics) and for instructing users in appropriate interpretations of such scores or data (e.g., as they impact between or within student and school comparisons, validity considerations in subscore interpretation). Incorporating error estimates and interpretive guidance in score reports, including subscores and diagnostic reporting for individuals and groups.

2. Issues with Scores, Scaling, and Equating: Growth modeling, gain scores, and other methods of estimating aggregated performance or growth based on individual or school/district performance and characteristics. Issues or requirements when linking assessments (e.g., concordances, linkages, and equating).

3. Policy and Practice How to balance privacy concerns for individual examinees, teachers, and administrators while meeting information needs for policy-makers. Issues related to the appropriate role of practice and test preparation, especially in contrast to admissions testing or credentialing.

4. Addressing Formative and Interim Assessments: Distinguishing among commercial formative and benchmark assessments (as well as item banks), their appropriate uses, and validation evidence required in interpreting scores from them.

Revising our Test Standards: Technological Advances. Presentation to the 2009 Annual Meeting of the American Educational Research Association, San Diego, CA. Denny Way, Pearson

Overview: Technological advances are changing the way tests are delivered, scored, and interpreted and, in some cases, the nature of the tests themselves. The Joint Committee has been charged with considering how technological advances should impact revisions to the Standards. As with the other themes, comments on the Standards that related to technology were compiled by the Management Committee and summarized in their charge to the Joint Committee.

Key Technology Issues Included in our Charge: Reliability & validity of innovative item formats; validity issues associated with the use of automated scoring algorithms and of automated score reports and interpretations; security issues for tests delivered over the internet; issues with web-accessible data, including data warehousing

Resources for Consideration: Guidelines for Computer-Based Testing, Copyright 2002, Association of Test Publishers (ATP); International Guidelines on Computer-Based and Internet Delivered Testing, Copyright 2005, International Test Commission (ITC)

Reliability & Validity of Innovative Item Formats: What special issues exist for innovative items with respect to access and elimination of bias against particular groups? How might the Standards reflect these issues? What steps should the Standards suggest with regard to “usability” of innovative items? What issues will emerge over the next five years related to innovative items/test formats that need to be addressed by the Standards?

Automated Scoring Algorithms What level of documentation/disclosure is appropriate and tolerable for automated scoring developers/vendors? What sorts of evidence seem most important for demonstrating the validity and “reliability” of automated scoring systems? What issues will emerge over the next five years related to automated scoring systems that need to be addressed by the standards?

Automated Score Reports and Interpretation: Use of computers for score interpretation; “actionable” reports (e.g., routing students and teachers to instructional materials and lesson plans based on test results)

Security Issues for Tests Delivered over the Internet: Two aspects of this topic are of concern, namely protecting privacy and threats to validity due to breach of security. Protecting examinee privacy. Considerations likely to affect standards related to test administration and responsibilities of test users.

Web-Accessible Data, including Data Warehousing: Applicability of general technology standards? Security; interoperability. Revision to commentary vs. drafting additional standards.

Revising our Test Standards: Issues for Work-Place Testing. Presentation to the 2009 Annual Meeting of the American Educational Research Association, San Diego, CA. Laurie Wise, HumRRO

Overview: Standards for testing in the work place are currently covered in Chapter 14 (one of the testing application chapters). Work-place testing includes employment testing as well as licensure, certification, and promotion testing. Comments on standards related to work place testing were received by the Management Committee and summarized in their charge to the Joint Committee.

Key Work-Place Testing Issues Included in our Charge: Validity and reliability requirements for certification, licensure, and promotion tests. Issues when tests are administered only to small populations of job incumbents. Requirements for tests for new, innovative job positions that do not have incumbents or job history to provide validity evidence. Assuring access to licensure, certification, and promotion tests for examinees with disabilities that may limit participation in regular testing sessions. Differential requirements for certification and licensure and employment tests.

1. Validity and Reliability Requirements: Some specific issues are documenting and communicating the validity and reliability of pass-fail decisions in addition to the underlying scores; how cut-offs are determined; and how validity and reliability information is communicated to relevant stakeholders.

2. Issues with Small Examinee Populations, including: alternatives to statistical tools for item screening; assuring fairness; assuring technical accuracy; alternatives to empirical validity evidence; maintaining comparability of scores from different test forms

3. Requirements for New Jobs: Issues include identifying test content; establishing passing scores; assessing reliability; demonstrating validity

4. Assuring Access to Employment Testing: See also separate presentation on fairness. Issues include determining appropriate versus inappropriate accommodations; relating testing accommodations to accommodations available in the work place

5. Certification and Licensure versus Employment Testing: Currently, two sections in the same chapter. Examples of relevant issues: differences in how test content is identified and validated; differences in test score use; who oversees testing (private company versus professional board/organization)