Update on the Revisions to the Standards for Educational and Psychological Testing: Overview (2010 Annual Meeting of the NCME, Denver, Colorado, May 1, 2010)


1. Update on the Revisions to the Standards for Educational and Psychological Testing: Overview
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Michael Kolen, University of Iowa

2. Joint Committee Members
Lauress Wise, Co-Chair, HumRRO
Barbara Plake, Co-Chair, University of Nebraska-Lincoln
Linda Cook, ETS
Fritz Drasgow, University of Illinois
Brian Gong, NCIEA
Laura Hamilton, RAND Corporation
Jo-Ida Hansen, University of Minnesota
Joan Herman, UCLA

3. Joint Committee Members
Michael Kane, ETS
Michael Kolen, University of Iowa
Antonio Puente, UNC-Wilmington
Paul Sackett, University of Minnesota
Nancy Tippins, Valtera Corporation
Walter (Denny) Way, Pearson
Frank Worrell, University of California, Berkeley

4. Scope of the Revision
Based on comments each organization received from its invitation to comment
Summarized by the Management Committee in consultation with the Co-Chairs
Wayne Camara, Chair, APA
Suzanne Lane, AERA
David Frisbie, NCME

5. Five Identified Areas for the Revisions
Access/Fairness
Accountability
Technology
Workplace
Format issues

6. Theme Teams
Working teams
Cross-team collaborations
Chapter leaders
Focus on bringing theme-related content into chapters in coherent and meaningful ways

7. Presentations: Five Identified Areas and Discussant
Fairness – Joan Herman
Accountability – Laura Hamilton
Technology – Denny Way
Workplace – Laurie Wise
Format and Publication Options – Barbara Plake
Discussant – Steve Ferrara, NCME Liaison to the Joint Committee

8. Timeline
First meeting: January 2009
Three-year process for completing the text of the revision
Release of a draft revision following the December 2010 Joint Committee meeting
Open comment / organization reviews
Projected publication: Summer 2012

9. Revision of the Standards for Educational and Psychological Testing: Fairness
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Joan Herman, CRESST/UCLA

10. Overview
1999 Approach to Fairness
Committee Charge
Revision Response

11. 1999 Approach
Standards related to fairness appear throughout many chapters
Concentrated attention in:
Chapter 7: Fairness in Testing and Test Use
Chapter 8: Rights and Responsibilities of Test Takers
Chapter 9: Testing Individuals of Diverse Linguistic Backgrounds
Chapter 10: Testing Individuals with Disabilities

12. Committee Charge
Five elements of the charge focused on accommodations/modifications:
Impact/differentiation of accommodation and modification
Appropriate selection/use for ELL and EWD
Attention to other groups, e.g., pre-K, older populations
Flagging
Comparability/validity
One element focused on adequacy and comparability of translations
One element focused on Universal Design

13. Revision Response
Fairness is fundamental to test validity: include as a foundation chapter
Fairness and access are inseparable
The same principles of fairness and access apply to all individuals, regardless of specific subgroup
From three chapters to a single chapter describing core principles and standards
Examples drawn from ELs, EWD, and other groups (young children, aging adults, etc.)
Comments point to applications for specific groups
Special standards retained where appropriate (e.g., test translations)

14. Overview of the Fairness Chapter
Section I: General Views of Fairness
Section II: Threats to the Fair and Valid Interpretations of Test Scores
Section III: Minimizing Construct-Irrelevant Components Through the Use of Test Design and Testing Adaptations
Section IV: The Standards

15. Four Clusters of Standards
1. Use test design, development, administration, and scoring procedures that minimize barriers to valid test interpretations for all individuals.
2. Conduct studies to examine the validity of test score inferences for the intended examinee population.
3. Provide appropriate accommodations to remove barriers to the accessibility of the construct measured by the assessment and to the valid interpretation of the assessment scores.
4. Guard against inappropriate interpretations, uses, and/or unintended consequences of test results for individuals or subgroups.

16. Revision of the Standards for Educational and Psychological Testing: Accountability
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Laura Hamilton, RAND Corporation

17. Overview
Use of tests for accountability has expanded
Most notably in education, but also in other areas such as behavioral health
Facilitated by the increasing availability of data and analysis tools
Recent and impending federal and state initiatives will likely lead to further expansion
Under NCLB, or new pay-for-performance programs, tests often have consequences for individuals other than the examinees
Use of test scores in policy and program evaluations continues to be widespread
Reinforced by groups that fund and evaluate research (e.g., IES, What Works Clearinghouse)

18. Organization of Accountability Material
The chapter on policy uses of tests focuses on the use of aggregate scores for accountability and policy
The chapter on educational testing addresses student-level accountability (e.g., promotional gates, high school exit exams) and interim assessment
Validity, reliability, and fairness standards in earlier chapters apply to accountability testing as well

19. Some Key Accountability Issues Included in Our Charge
1. Calculation of accountability indices using composite scores at the level of the institution or individual
Institutional level (e.g., conjunctive and disjunctive rules for combining scores)
Individual level (e.g., teacher value-added modeling)
2. Issues related to validity, reliability, and reporting of individual and aggregate scores
3. Test preparation
4. Interim assessments

20. 1. Accountability Indices
Most test-based accountability systems require calculation of indices using a complex set of rules
Advances in data systems and statistical methodology have led to more sophisticated indices to support causal inferences
E.g., teacher and principal value-added measures
Consequences attached to these measures are growing increasingly significant
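The conjunctive and disjunctive rules for combining scores mentioned in the charge can be sketched in a few lines. This is a minimal illustration only; the subtests and cut scores below are invented, not drawn from any actual accountability system:

```python
# Hypothetical cut scores for two invented subtests.
CUT_SCORES = {"reading": 60, "math": 65}

def conjunctive_pass(scores):
    """Conjunctive rule: pass only if every subtest meets its cut score."""
    return all(scores[s] >= cut for s, cut in CUT_SCORES.items())

def disjunctive_pass(scores):
    """Disjunctive rule: pass if at least one subtest meets its cut score."""
    return any(scores[s] >= cut for s, cut in CUT_SCORES.items())

student = {"reading": 72, "math": 58}
print(conjunctive_pass(student))  # False: math is below its cut
print(disjunctive_pass(student))  # True: reading meets its cut
```

The same examinee profile can pass under one rule and fail under the other, which is why the Standards revision presses for clear documentation of the combination rules behind any index.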

21. 2. Validity, Reliability, and Reporting Requirements
Accountability indices should be subjected to validation related to intended purposes
Error estimates should be incorporated into score reports, including those that provide subscores and diagnostic guidance for individuals or groups
Reports should provide clear, detailed information on the rules used to create aggregate scores or indices
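One common way to incorporate an error estimate into a score report is to surround the observed score with a band based on the standard error of measurement (SEM). A minimal sketch, with an invented scale score and SEM:

```python
def score_band(observed, sem, z=1.96):
    """Approximate 95% band around an observed score: observed +/- z * SEM."""
    return (observed - z * sem, observed + z * sem)

# Hypothetical report line for a scale score of 250 with SEM = 10.
lo, hi = score_band(observed=250, sem=10)
print(f"Score 250, 95% band: {lo:.1f} to {hi:.1f}")
```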

22. 2. Validity, Reliability, and Reporting Requirements (cont.)
Guidance should be provided for the interpretation of scores from subgroups
Describe exclusion rules, accommodations, and modifications
Address error stemming from small subgroups
Explain the contribution of subgroup performance to the accountability index
Teachers and other users should be given assistance to ensure appropriate interpretation and use of information from tests
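The point about error stemming from small subgroups can be illustrated with the standard error of a subgroup mean, which grows as the group shrinks. The score SD and group sizes below are invented for illustration:

```python
import math

def se_of_mean(sd, n):
    """Standard error of a subgroup's mean score: SD / sqrt(n)."""
    return sd / math.sqrt(n)

# With a hypothetical score SD of 15, compare a large subgroup to small ones.
for n in (400, 25, 5):
    print(f"n={n}: SE of mean = {se_of_mean(sd=15, n=n):.2f}")
```

A subgroup of 5 carries roughly nine times the error of a subgroup of 400, which is why reports need to flag small-n results.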

23. 3. Test Preparation
High-stakes testing raises concerns about inappropriate test preparation
Users should take steps to reduce the likelihood of test preparation that undermines validity
Help administrators and teachers understand what kinds of preparation are appropriate and desirable
Design tests and testing systems to limit the likelihood of harmful test preparation
Consequences of accountability policies should be monitored

24. 4. Addressing Interim Assessments
Interim assessments are common but take many different forms
Some are produced by commercial publishers; others are home-grown
They vary in the extent to which they provide formative feedback vs. benchmarking to end-of-year tests
Need to determine which of these tests should be subject to the Standards
Requirements for validity and reliability depend in part on how scores are used
If used for high-stakes decisions such as placement, evidence of validity for that purpose should be provided
Systems that provide instructional guidance should include a rationale and evidence to support it

25. Revision of the Standards for Educational and Psychological Testing: Technology
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Denny Way, Pearson

26. Overview
Technological advances are changing the way tests are delivered, scored, and interpreted, and in some cases the nature of the tests themselves
The Joint Committee has been charged with considering how technological advances should affect revisions to the Standards
As with the other themes, comments on the standards related to technology were compiled by the Management Committee and summarized in its charge to the Joint Committee

27. Key Technology Issues Included in Our Charge
Reliability and validity of innovative item formats
Validity issues associated with the use of:
Automated scoring algorithms
Automated score reports and interpretations
Security issues for tests delivered over the internet
Issues with web-accessible data, including data warehousing

28. Reliability and Validity of Innovative Item Formats
What special issues exist for innovative items with respect to access and the elimination of bias against particular groups? How might the standards reflect these issues?
What steps should the standards suggest with regard to the "usability" of innovative items?
What issues will emerge over the next five years related to innovative items and test formats that need to be addressed by the standards?

29. Automated Scoring Algorithms
What level of documentation/disclosure is appropriate and tolerable for automated scoring developers and vendors?
What sorts of evidence seem most important for demonstrating the validity and "reliability" of automated scoring systems?
What issues will emerge over the next five years related to automated scoring systems that need to be addressed by the standards?

30. Expert Panel Input
To address issues related to innovative item formats and automated scoring algorithms, we convened a panel of experts from the field and solicited their advice
Invited members made presentations on these topics and discussed the associated issues with the Joint Committee

31. Highlights of Technology Panel Input
Test development and simulations
Rationale / validity argument
Usability studies / field testing
Security and fairness
Timed tasks and processing speed
Innovative clinical assessments and faking (effort assessment)

32. Highlights of Technology Panel Input
Disclosure of automated scoring algorithms: differing viewpoints
Disclose everything in great detail (use patents to protect proprietary IP) vs. provide sufficient documentation for other experts to confirm the validity of the process
Possible compromise: expert review under conditions of nondisclosure
Quality assurance: importance of "independent calibrations"

33. Automated Score Reports and Interpretation
Use of the computer for score interpretation
"Actionable" reports (e.g., routing students and teachers to instructional materials and lesson plans based on test results)
Documentation of the rationale
Supporting validity evidence

34. Revision of the Standards for Educational and Psychological Testing: Workplace Testing
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Laurie Wise, Human Resources Research Organization (HumRRO)

35. Overview
Standards for testing in the workplace are currently covered in Chapter 14 (one of the testing application chapters).
Workplace testing includes employment testing as well as licensure, certification, and promotion testing.
Comments on standards related to workplace testing were received by the Management Committee and summarized in its charge to the Joint Committee.
Comments suggested areas for extending or clarifying testing standards but did not suggest major revisions to existing standards.

36. Key Workplace Testing Issues Included in Our Charge
1. Validity and reliability requirements for certification and licensure tests.
2. Issues when tests are administered only to small populations of job incumbents.
3. Requirements for tests for new, innovative job positions that do not have incumbents or a job history to provide validity evidence.
4. Assuring access to licensure and certification tests for examinees with disabilities that may limit participation in regular testing sessions.
5. Differential requirements for certification and licensure tests versus employment tests.

37. 1. Validity and Reliability Requirements for Certification
Some specific issues:
Documenting and communicating the validity and reliability of pass-fail decisions in addition to the underlying scores
How cut-offs are determined
How validity and reliability information is communicated to relevant stakeholders
A key change is the need to focus on pass-fail decisions
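As one illustration of what "reliability of pass-fail decisions" can mean in practice, the probability that measurement error flips an examinee across the cut score can be estimated from the SEM under a normal-error assumption. This is a hedged sketch of just one approach, and the cut score and SEM below are hypothetical:

```python
from statistics import NormalDist

def misclassification_prob(true_score, cut, sem):
    """Probability the observed score falls on the wrong side of the cut,
    assuming normally distributed measurement error around the true score."""
    dist = NormalDist(mu=true_score, sigma=sem)
    if true_score >= cut:
        return dist.cdf(cut)        # true passer observed below the cut
    return 1.0 - dist.cdf(cut)      # true failer observed at or above the cut

# An examinee whose true score sits 5 points above a cut of 70, with SEM = 5,
# still has roughly a 16% chance of an observed fail.
print(round(misclassification_prob(true_score=75, cut=70, sem=5), 3))
```

The risk is concentrated near the cut score, which is why documentation of how cut-offs are determined matters for stakeholders.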

38. 2. Issues with Small Examinee Populations
Including:
Alternatives to statistical tools for item screening
Assuring fairness
Assuring technical accuracy
Alternatives to empirical validity evidence
Maintaining comparability of scores from different test forms
The key concern is the appropriate use of expert judgment

39. 3. Requirements for New Jobs
Issues include:
Identifying test content
Establishing passing scores
Assessing reliability
Demonstrating validity
Key here is also the appropriate use of expert judgment

40. 4. Assuring Access to Certification and Licensure Testing
See also the separate presentation on fairness
Issues include:
Determining appropriate versus inappropriate accommodations
Relating testing accommodations to accommodations available in the workplace

41. 5. Certification and Licensure versus Employment Testing
Currently two sections in the same chapter
Examples of relevant issues:
Differences in how test content is identified
Differences in validation strategies
Differences in test score use
Who oversees testing
The goal is to increase coherence in the approach to these two related uses of tests

42. Revision of the Standards for Educational and Psychological Testing: Format and Publication
2010 Annual Meeting of the NCME, Denver, Colorado
May 1, 2010, 4:05–6:05 p.m.
Barbara Plake, University of Nebraska-Lincoln

43. Format Issues
Organization of chapters
Consideration of ways to identify "Priority Standards"
More parallelism between chapters
Tone
Complexity
Technical language

44. Organization of Chapters: 1999 Testing Standards
Three sections
Foundation: Validity, Reliability, Test Development, Scaling & Equating, Administration & Scoring, Documentation
Fairness: Fairness, Test Takers' Rights and Responsibilities, Disabilities, Linguistic Minorities
Applications: Test Users, Psychological, Educational, Workplace, Policy

45. Revised Test Standards: Possible Chapter Organization
Section 1: Validity, Reliability, Fairness
Section 2: Test Design and Development, Scaling & Equating, Test Administration & Scoring, Documentation, Test Takers, Test Users
Section 3: Psychological, Educational, Workplace, Policy and Accountability

46. Possible Ways to Identify "Priority Standards"
Clustering of standards into thematic topics
Overarching standards / guiding principles
Application chapters
Connection of standards to previous standards

47. More Parallelism Across Chapters
Cross-team collaborations
Content editor with psychometric expertise
Structural continuity

48. Publication Options
Management Committee responsibility
The goal is electronic access
Pursuing options for Kindle, etc.
Concerns about retaining integrity and financial support for future revision efforts

