SETTING & MAINTAINING EXAM STANDARDS Raja C. Bandaranayake

NORM- & CRITERION-REFERENCED STANDARDS

CRITERION-REFERENCED
- Absolute
- Not related to peer performance
- Standard set prior to the exam
- Referenced to a defined level of performance

NORM-REFERENCED
- Relative
- Based on peer performance
- Varies with each group
- Cut-off point not related to competence

NEDELSKY (1954) METHOD: Example
Consider N judges and n MCQ items of the 1-in-5 type.
Judge A identifies 2 options in item 1 as those which a minimally competent examinee should eliminate as incorrect.
The MPL for that item for Judge A: MPL_A1 = 1/(5-2) = 1/3.
Similarly, in item 2 he identifies 3 options, giving MPL_A2 = 1/(5-3) = 1/2.
He repeats this process for each item.
The exam MPL for Judge A: MPL_A = MPL_A1 + MPL_A2 + MPL_A3 + ... + MPL_An.
Similarly, Judge B's MPL (MPL_B) is determined.
The MPL for the exam (= cut-off score) is: (MPL_A + MPL_B + MPL_C + ... + MPL_N) / N.
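A minimal sketch of the Nedelsky arithmetic (Python; the function name and the second judge's data are illustrative, not from the slide):

```python
# Nedelsky (1954) cut-off: each judge marks, per item, how many of the
# 5 options a minimally competent examinee could eliminate as wrong.

def nedelsky_cutoff(eliminated_per_judge, n_options=5):
    judge_mpls = []
    for eliminated in eliminated_per_judge:
        # Item MPL = chance of guessing among the options that remain.
        item_mpls = [1.0 / (n_options - e) for e in eliminated]
        # Judge's exam MPL = sum of the item MPLs (expected raw score).
        judge_mpls.append(sum(item_mpls))
    # Exam cut-off = mean of the judges' MPLs.
    return sum(judge_mpls) / len(judge_mpls)

# Judge A eliminates 2 options on item 1 and 3 on item 2 (as in the
# slide); a hypothetical Judge B eliminates 1 and 2.
print(nedelsky_cutoff([[2, 3], [1, 2]]))  # ~0.71 raw marks on 2 items
```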

ANGOFF (1971) METHOD: Example
N judges consider 100 minimally competent examinees taking an MCQ exam of n items.
Judge A estimates that, of these examinees, 50 should answer item 1 correctly, 20 item 2, 70 item 3, and so on to item n.
The MPL for Judge A: MPL_A = (0.5 + 0.2 + 0.7 + ... + x_n) / n x 100 = (say) A%.
Similarly, for Judges B, C, D, E, ..., N, the MPLs would be B%, C%, D%, E%, ..., N%, respectively.
The MPL (cut-off score) for the exam is: (A% + B% + C% + D% + E% + ... + N%) / N.
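The Angoff computation is a straightforward average of averages; a sketch (Python, hypothetical function name):

```python
# Angoff (1971) cut-off: each judge estimates, per item, the proportion
# of minimally competent examinees who would answer correctly.

def angoff_cutoff(estimates_per_judge):
    judge_mpls = []
    for estimates in estimates_per_judge:
        # Judge's MPL = mean estimated proportion, as a percentage.
        judge_mpls.append(sum(estimates) / len(estimates) * 100)
    # Exam cut-off = mean of the judges' MPLs.
    return sum(judge_mpls) / len(judge_mpls)

# Judge A's estimates for the slide's first three items:
print(angoff_cutoff([[0.5, 0.2, 0.7]]))  # ~46.67 (%)
```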

EBEL (1972) METHOD: Example
Assume that Judge A assigns the items in a 200-item MCQ test to the cells of a "relevance-by-difficulty" matrix, as follows. He then estimates the percentage of items in each cell that a minimally competent examinee should be able to answer correctly (shown within the cell). Each cell also shows the product of these two values.

              EASY              MEDIUM            HARD
ESSENTIAL     15 x 100% = 1500  25 x 80% = 2000   10 x 60% = 600
IMPORTANT     20 x 80%  = 1600  40 x 60% = 2400   20 x 50% = 1000
ACCEPTABLE    10 x 50%  = 500   25 x 40% = 1000    5 x 10% = 50
QUESTIONABLE  10 x 30%  = 300   15 x 20% = 300     5 x 0%  = 0

EBEL (1972) METHOD - contd. Example
The MPL for Judge A is then:
MPL_A = (1500 + 2000 + 600 + 1600 + 2400 + 1000 + 500 + 1000 + 50 + 300 + 300 + 0) / 200 = 56.25%
Similarly, the MPLs for Judges B (MPL_B), C (MPL_C), D (MPL_D), ..., N (MPL_N) are determined.
The MPL for the exam (cut-off score) is: (MPL_A + MPL_B + MPL_C + MPL_D + ... + MPL_N) / N.
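The whole Ebel calculation is a weighted average; a sketch assuming each judge's matrix is flattened into (item count, expected % correct) pairs (Python; names are illustrative):

```python
# Ebel (1972) MPL: weight each cell's expected % correct by its item count.

def ebel_mpl(cells):
    total_items = sum(count for count, _ in cells)
    weighted_sum = sum(count * pct for count, pct in cells)
    return weighted_sum / total_items  # MPL as a percentage

# Judge A's 200-item relevance-by-difficulty matrix from the slide:
judge_a = [
    (15, 100), (25, 80), (10, 60),  # essential: easy, medium, hard
    (20, 80),  (40, 60), (20, 50),  # important
    (10, 50),  (25, 40), (5, 10),   # acceptable
    (10, 30),  (15, 20), (5, 0),    # questionable
]
print(ebel_mpl(judge_a))  # 56.25 (%)
```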

PROPOSED EBEL MODIFICATION

          EASY             MEDIUM            HARD
ESSENT.    6 x 100% = 600  12 x 80% = 960     7 x 50% = 350
IMPORT.   12 x 80%  = 960  24 x 60% = 1440   19 x 40% = 760
ACCEPT.    5 x 60%  = 300  12 x 50% = 600     3 x 10% = 30

MPL = (600 + 960 + 350 + 960 + 1440 + 760 + 300 + 600 + 30) / 100 = 6000/100 = 60%

HOFSTEE METHOD
[Figure: plot of failure rate (%) against cut-off score (%), marking fmin, fmax, cmin, cmax and the points A and B.]

HOFSTEE METHOD: Example
A plot of cut-off scores for a given exam against the resulting failure rates is given.
cmin = 40%, cmax = 45%, fmin = 10%, fmax = 20%
A = the point (cmin, fmax); B = the point (cmax, fmin)
Line AB intersects the curve at a cut-off score of 42.5%.
Thus the operational cut-off score = 42.5%.
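The compromise point can also be found numerically: build the failure-rate curve from the observed scores and locate its crossing with line AB. A sketch (Python/NumPy; the function name and simulated scores are illustrative, not from the slides):

```python
import numpy as np

def hofstee_cutoff(scores, cmin, cmax, fmin, fmax, steps=1000):
    scores = np.asarray(scores)
    cuts = np.linspace(cmin, cmax, steps)
    # Failure rate (%) that each candidate cut-off would produce.
    fail = np.array([(scores < c).mean() * 100 for c in cuts])
    # Line AB from A = (cmin, fmax) to B = (cmax, fmin).
    line = fmax + (fmin - fmax) * (cuts - cmin) / (cmax - cmin)
    # Operational cut-off: where the failure curve crosses line AB.
    return cuts[np.argmin(np.abs(fail - line))]

rng = np.random.default_rng(0)
simulated_scores = rng.normal(55, 10, 500)  # illustrative distribution
print(hofstee_cutoff(simulated_scores, cmin=40, cmax=45, fmin=10, fmax=20))
```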

CUT-OFF SCORE FOR 1-IN-5 MCQ [FRACS PART 1]
- Probability of guessing (1 in 5) = 20%
- 'Total ignorance' score = 20%
- Maximum possible score = 100%
- Effective range of scores = 20% to 100%
- Mid-point of this range = 60%
- Additional factor (as a postgraduate exam) = 5%
- Nominal cut-off score (60% + 5%) = 65%
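The same arithmetic generalizes to any 1-in-k MCQ format; a small sketch (Python; the 5% loading is the slide's own figure for a postgraduate exam, not a general rule):

```python
def nominal_cutoff(n_options, extra_pct=0.0):
    chance = 100.0 / n_options              # 'total ignorance' score
    midpoint = chance + (100 - chance) / 2  # mid-point of effective range
    return midpoint + extra_pct

print(nominal_cutoff(5, extra_pct=5))  # 65.0 for a 1-in-5 PG exam
```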

CUT-OFF SCORES: “MARKER QUESTIONS”
1. Comparison of exam scores
- Mean score in this exam: 56.7%
- Average exam mean score over the last 4 years: 59.4%
- Thus the mean score in this exam is 2.7% lower.
- Assuming this candidate group is of the same standard as those of the last 4 years, this exam is 2.7% harder.

CUT-OFF SCORES: “MARKER QUESTIONS” - contd.
2. Comparison of “marker” scores
- Mean score in this exam on previously used questions (N = 162): 62.5%
- Mean score on the same questions when each was last used: 60.5%
- Thus, compared with previous candidates, this group scored (62.5 - 60.5)% = 2.0% higher on these items.
- Thus this group of candidates is 2.0% better than previous groups.

CUT-OFF SCORES: “MARKER QUESTIONS” - contd.
3. Estimating examination difficulty
- It is therefore expected that their mean score in this exam would be 2.0% higher.
- But their mean score in this exam is 2.7% lower.
- Thus this exam is really (2.0 + 2.7)% = 4.7% harder.

CUT-OFF SCORES: “MARKER QUESTIONS” - contd.
4. Determining the cut-off score
- The cut-off level for an average exam is 65.0%.
- Thus the cut-off level for this exam should be (65 - 4.7)% = 60.3%.
- Cut-off score = 60.3%
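The four steps reduce to one adjustment of the average cut-off; a sketch using the slides' worked figures (Python, hypothetical function name):

```python
def adjusted_cutoff(exam_mean, historical_mean,
                    marker_mean_now, marker_mean_then,
                    average_cutoff=65.0):
    # Step 1: this exam's mean vs. the historical average (2.7% lower).
    raw_gap = historical_mean - exam_mean
    # Step 2: cohort gain on reused "marker" items (2.0% better).
    cohort_gain = marker_mean_now - marker_mean_then
    # Step 3: true extra difficulty of this exam (2.7 + 2.0 = 4.7%).
    extra_difficulty = raw_gap + cohort_gain
    # Step 4: lower the average cut-off by that amount.
    return average_cutoff - extra_difficulty

print(adjusted_cutoff(56.7, 59.4, 62.5, 60.5))  # ~60.3 (%)
```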

HOFSTEE CURVE
[Figure: Hofstee curve of failure rate (%) against cut-off score (%).]