Standard Setting
Different names for the same thing: standard, passing score, cut score, cutoff score, mastery level, benchmark.


Standards have existed for as long as there have been tests.
Canada: 50   China: 60   America: 70   France: 2/3

Examinee-Centered Methods:
Borderline Group Method
Contrasting Groups Method
Problem: how to identify the borderline examinees.

Test-Centered Methods:
Nedelsky Method (1954)
Angoff Method (1971)
IDEA Method (2004)
Problem: judgmental errors.

Test-Centered Methods:
Conceptualization of examinee competency
Judgmental item analysis
Aggregating item difficulty estimates
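The aggregation step for a test-centered method can be sketched in a few lines. This is a minimal illustration with entirely hypothetical judge ratings, not data from any actual standard-setting study: in an Angoff-style procedure, each judge estimates the probability that a minimally competent examinee answers each item correctly, and the cut score is the sum, over items, of the mean rating across judges.

```python
# Hypothetical Angoff-style ratings: judge -> per-item probability estimates
# that a minimally competent examinee answers each item correctly.
ratings = {
    "judge_A": [0.80, 0.60, 0.70, 0.50, 0.90],
    "judge_B": [0.70, 0.65, 0.75, 0.45, 0.85],
    "judge_C": [0.75, 0.55, 0.65, 0.55, 0.95],
}

n_items = len(next(iter(ratings.values())))

# Mean rating per item across judges.
item_means = [
    sum(judge[i] for judge in ratings.values()) / len(ratings)
    for i in range(n_items)
]

# The cut score is the expected raw score of the borderline examinee.
cut_score = sum(item_means)
print(round(cut_score, 2))
```

In practice this aggregation is typically repeated over iterative rounds, with judges revising their estimates after seeing empirical item difficulties and each other's ratings.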

Training of Judges:
Conceptualization of minimum competency
The use of a standard-setting method
Intrajudge inconsistency
Iterative process
Documentation

What do experts say?
"We have come to realize that there is no objectively correct way to set standards. But we have also come to realize that there is nothing wrong with using judgments appropriately." (Zieky, 1995, p. 5)
"Determination of a minimum acceptable performance always involves some rather arbitrary and not wholly satisfactory decisions." (Ebel, 1972, p. 492)

"Researchers agree with Glass (1978) that standards are all arbitrary. But they reject the notion that being arbitrary, in the sense of being judgmental, is a fatal flaw." (Zieky, 1994, p28) "If competence is a continuous variable, there is clearly no point on the continuum that would separate students into the competent and the incompetent. A right answer does not exist, except perhaps in the minds of those providing the judgments." (Jaeger, 1989, p.492)

1. All standard-setting is judgmental.
2. Standard-setting leads to errors of classification.
3. Standard-setting is and will remain controversial.
4. There is no purely absolute standard.
5. There is no one right method.
6. Choosing judges is more important than choosing methods.
7. Standard-setting is a process.

Failed Standard-Setting Exercises
Due to legal matters:
Tenured teachers cannot be decertified.
Contracted teachers cannot be decertified.
Candidates for becoming teachers can be decertified.
Due to psychometric matters:
Practice analysis failed to support the job-relatedness of the test.
Teachers' concerns about the objective of the test were not addressed.
Items did not reflect the judgment of the content committee.
Teachers were excluded from the standard-setting process.
The cut score was changed without justification.

Defensible Standard-Setting Steps
Subject matter experts are asked to review test items.
A sensitivity review checks for bias against certain groups.
Documentation of the standard-setting process includes:
Description of the subject matter experts
Selection criteria and procedures
Standard-setting methods and their justifications
Training procedures
Independent evidence that the cut score is "reasonable"
Indices of reliability, item analysis information, distractor analysis
Intrajudge and interjudge consistency evidence (e.g., split half)
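The interjudge consistency evidence mentioned above can be as simple as a correlation between judges' per-item ratings. The sketch below uses hypothetical ratings and a plain Pearson correlation: values near 1 indicate that two judges rank the items' difficulty for the borderline examinee similarly.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-item Angoff ratings from two judges.
judge_a = [0.80, 0.60, 0.70, 0.50, 0.90]
judge_b = [0.70, 0.65, 0.75, 0.45, 0.85]
print(round(pearson(judge_a, judge_b), 2))
```

A low interjudge correlation, or a judge whose round-two ratings disagree with their own round-one ratings (intrajudge inconsistency), would be flagged and addressed during judge training.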

Steps and Procedures in Developing an Achievement Test
1. Define domain content
Intelligence tests and theories: 20%
Personality tests: 20%
Item characteristics: 10%
Reliability: 20%
Validity: 15%
Test development: 15%

Table of specifications (cognitive level):
Intelligence tests and theories: High 5%, Low 15%, Total 20%
Personality tests: High 5%, Low 15%, Total 20%
Item characteristics: High 5%, Low 5%, Total 10%
Reliability: High 15%, Low 5%, Total 20%
Validity: High 10%, Low 5%, Total 15%
Test development: High 5%, Low 10%, Total 15%

2. Write Items
Normally done by subject teachers.
3. Item Analysis
Item variance
Item difficulty
Item discrimination
4. Setting Standards
5. Test Reliability
Test-retest
Parallel form
Split-half (internal consistency)
6. Test Validity
Various validity issues
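The item-analysis and reliability statistics named above are straightforward to compute from a matrix of scored responses. The sketch below uses a small hypothetical 0/1 score matrix: item difficulty is the proportion correct (so the variance of a dichotomous item is p*q), discrimination is approximated crudely as the difference in proportion correct between high- and low-scoring examinees, and split-half reliability correlates odd- and even-item half scores with the Spearman-Brown correction.

```python
from math import sqrt

# Hypothetical 0/1 score matrix: rows = examinees, columns = items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
n, n_items = len(scores), len(scores[0])

# Item difficulty p = proportion correct; variance of a 0/1 item is p*q.
p = [sum(row[i] for row in scores) / n for i in range(n_items)]
item_var = [pi * (1 - pi) for pi in p]

# Crude item discrimination: p(top half) - p(bottom half),
# with examinees ranked by total score.
ranked = sorted(scores, key=sum, reverse=True)
k = n // 2
top, bottom = ranked[:k], ranked[-k:]
disc = [
    sum(r[i] for r in top) / k - sum(r[i] for r in bottom) / k
    for i in range(n_items)
]

# Split-half reliability: correlate odd- and even-item half scores,
# then apply the Spearman-Brown correction for full test length.
half1 = [sum(row[0::2]) for row in scores]
half2 = [sum(row[1::2]) for row in scores]
m1, m2 = sum(half1) / n, sum(half2) / n
cov = sum((a - m1) * (b - m2) for a, b in zip(half1, half2))
r_half = cov / (
    sqrt(sum((a - m1) ** 2 for a in half1))
    * sqrt(sum((b - m2) ** 2 for b in half2))
)
reliability = 2 * r_half / (1 + r_half)
print(round(reliability, 2))
```

Operational item analysis would use point-biserial discrimination and coefficient alpha, but the logic is the same: compute per-item statistics from the score matrix, then a consistency index for the whole test.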

"Anyone who expects to discover the "real" passing score by any of these approaches, or any other approach, is doomed to disappointment, for a "real" passing score does not exist to be discovered. All any examining authority that must set passing scores can hope for, and all any of their examinees can ask, is that the basis for defining the passing score be defined clearly, and that the definition be as rational as possible." (Ebel, 1972, p.496) "At a minimum, standard-setting procedures should include a balancing of absolute judgments and direct attention to passing rates." (Shepard, 1980, p.463)

Uses of Standards
Exhortation: to exhort, encourage, or urge students, schools, and the public to exert more or different kinds of effort to achieve established standards of performance.
Exemplification of Goals: to provide clear specifications of the achievement levels that students are expected to attain.
Accountability: in the U.S., schools (not students) can receive rewards or sanctions depending on the progress achieved in meeting performance standards.
Certification of Achievement and Mastery

"The thing that hath been, it is that which shall be; and that which is done is that which shall be done: and there is no new thing under the sun." Ecclesiastes (1:9).