
1 Standard Setting Methods with High Stakes Assessments
Barbara S. Plake
Buros Center for Testing, University of Nebraska

2 Setting Passing Scores
- Essential for making high stakes decisions
- Must ensure that qualified candidates pass
- Must ensure that unqualified candidates fail
- 70% correct is NOT the right answer!
- “Standard Setting” -- setting the “standard” or “passing score”

3 Approaches
- Empirically based
  – Regression
  – Contrasting groups/Borderline groups
  – Norm-based
- Test based
  – Judgmental
  – Test and candidate based

4 Empirically based methods
- Need to know status of candidate (worthy of passing or not)
- More likely in classroom settings
- Not likely the case in licensure settings
- Norm-based
  – Not tied to the KSAs needed to function effectively/safely in the profession
  – Capricious and arbitrary

5 Test Based
- KSAs form basis for test content
- Focus on target candidate
  – MCC
  – JQC

6 Assessment Tasks
- Multiple choice questions
  – Good content coverage
  – Efficient scoring
  – Can measure higher order reasoning if well constructed

7 Constructed Response
- More directly related to target skill?
- Some differences by candidate
- Time consuming to administer and score
- Increases costs

8 Judgmental Task
- How will the minimally qualified candidate (MCC) perform on the tasks in the test?
- Need qualified, well trained judges
  – Often subject matter experts (SMEs)
  – Need to modify SMEs’ perception to focus on entry-level performance
  – Feedback

9 Decision Rules
- Compensatory
  – Performance on the total is what matters
  – Weaknesses in one area can be compensated by strengths in another
  – Higher reliability

10 Decision Rules
- Conjunctive
  – Passing scores set on parts of the test
  – Candidates must pass all parts in order to pass the test
  – Sometimes candidates are allowed to “bank” passed parts
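
To make the contrast between the two decision rules concrete, here is a minimal sketch (hypothetical part scores and part cutpoints; none of these values come from the presentation) showing how a compensatory rule and a conjunctive rule can reach different decisions for the same candidate.

```python
# Hypothetical example: one candidate's scores and assumed cutpoints on three test parts.
part_scores = {"part_A": 34, "part_B": 22, "part_C": 30}
part_cuts = {"part_A": 28, "part_B": 25, "part_C": 24}

# Compensatory rule: only the total matters, so strength in one part
# can offset weakness in another.
total_cut = sum(part_cuts.values())                         # 77
compensatory_pass = sum(part_scores.values()) >= total_cut  # 86 >= 77 -> True

# Conjunctive rule: every part must be passed on its own.
conjunctive_pass = all(part_scores[p] >= part_cuts[p] for p in part_cuts)  # part_B fails -> False

print(compensatory_pass, conjunctive_pass)  # True False
```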

11 Test Based Methods
- Multiple choice questions
  – Angoff Method
  – Yes/No Extension
  – Bookmark

12 Test Based Methods
- Constructed Response
  – Analytical Judgment
  – Paper selection

13 Angoff “Method”
- SMEs estimate the probability that a hypothetical, randomly selected MCC will be able to answer each question correctly.
- Sum of an SME’s estimates = that SME’s passing score
- Average across SMEs = recommended passing score
- Range of probable values (SEE)
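
As an illustration of the arithmetic on this slide, here is a minimal sketch with an invented panel and invented ratings; the simple standard error of the mean of the SME cutscores is used here as a stand-in for the SEE that defines the range of probable values.

```python
from statistics import mean, stdev

# Hypothetical Angoff ratings: rows = SMEs, columns = items.
# Each value is the estimated probability that the MCC answers the item correctly.
ratings = [
    [0.60, 0.75, 0.40, 0.85, 0.55],  # SME 1
    [0.65, 0.70, 0.50, 0.80, 0.60],  # SME 2
    [0.55, 0.80, 0.45, 0.90, 0.50],  # SME 3
]

sme_cuts = [sum(r) for r in ratings]           # each SME's passing score (sum of probabilities)
cutpoint = mean(sme_cuts)                      # recommended passing score
see = stdev(sme_cuts) / len(sme_cuts) ** 0.5   # one simple way to express the SEE

print(f"SME passing scores: {sme_cuts}")
print(f"Recommended cutpoint: {cutpoint:.2f}")
print(f"Range of probable values (+/- 1 SEE): {cutpoint - see:.2f} to {cutpoint + see:.2f}")
```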

14 Angoff Variations
- Multiple rounds of ratings
- Feedback in between rounds
  – SME results
  – Candidate performance (p-values, % passing)

15 Criticisms of Angoff Methods
- Cognitively challenging
- An “impossible task”
- “Fatally flawed” (NRC report)
- Research has shown that ratings are consistent across years and raters
- Need strong training/discussion of KSAs of MCCs

16 Yes/No Variation
- SMEs estimate whether or not the MCC will be able to answer the item correctly (Y/N)
  – Response probability
  – More likely than not (.50)
  – Fairly certain (.67)
- Add the Ys to get the SME’s passing score
- Average across SMEs = recommended passing score
- Cutpoint +/- SEE (1 or 2)
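
A minimal sketch of the Yes/No tally, again with invented judgments. The agreed response probability (.50 or .67) only changes how SMEs decide Yes or No; the arithmetic is the same either way.

```python
from statistics import mean

# Hypothetical Yes/No judgments: rows = SMEs, columns = items.
# True means the SME judged the MCC would answer the item correctly
# at the agreed response probability.
judgments = [
    [True,  True,  False, True,  False, True],   # SME 1
    [True,  False, False, True,  True,  True],   # SME 2
    [True,  True,  True,  True,  False, False],  # SME 3
]

sme_cuts = [sum(j) for j in judgments]  # count of Yes responses per SME
cutpoint = mean(sme_cuts)               # recommended passing score

print(sme_cuts, cutpoint)               # [4, 4, 4] 4
```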

17 Yes/No Variation
- More popular with SMEs
- Feedback not necessarily needed
- Quicker to implement

18 Bookmark Method
- Often used with IRT calibrated items, but not necessary
- Test questions ordered from easy to hard
- Response probability
- Insert the bookmark between pages where the MCC’s probability of a correct response dips below the response probability

19 Bookmark Method
- Number of items preceding the bookmark is the SME’s passing score
- Often little discussion of the KSAs of the MCC
- Multiple small groups
- Discussion between rounds
- Multiple rounds; data usually aren’t shared until the 2nd or 3rd round

20 Bookmark Method
- Results often shown graphically across rounds
- Frequently convergence occurs after the 1st round
- Average across SMEs = recommended cutpoint
- SEE formula; cutpoint +/- SEE (1 or 2)
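
A minimal sketch of the bookmark arithmetic, with invented bookmark placements for a hypothetical ordered item booklet (easy to hard, one item per page); as above, the standard error of the mean is used as a simple stand-in for the SEE.

```python
from statistics import mean, stdev

# Hypothetical bookmark placements in a 40-item ordered booklet (easy -> hard).
# Each value is the page where an SME placed the bookmark, so the items
# preceding that page form the SME's passing score.
bookmark_pages = [27, 25, 29, 26, 28]

sme_cuts = [page - 1 for page in bookmark_pages]  # items preceding the bookmark
cutpoint = mean(sme_cuts)                         # recommended cutpoint
see = stdev(sme_cuts) / len(sme_cuts) ** 0.5

print(f"Cutpoint: {cutpoint:.1f} items; +/- 2 SEE: "
      f"{cutpoint - 2 * see:.1f} to {cutpoint + 2 * see:.1f}")
```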

21 Constructed Response Tasks
- Extended Angoff
- Analytical Judgment

22 Extended Angoff
- SMEs estimate how many of the total points available for the task will be earned by the MCC.
- Cutpoint is determined in a fashion similar to Angoff: sum points for each SME, average across SMEs.
- Range of probable values
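
The arithmetic mirrors the Angoff sketch shown earlier, except the judgments are points on constructed-response tasks rather than item probabilities; the values below are invented for illustration.

```python
from statistics import mean

# Hypothetical Extended Angoff judgments: for each constructed-response task,
# each SME estimates how many points the MCC would earn.
points_estimates = [
    [6, 7, 5],  # SME 1's estimates for tasks 1-3
    [7, 6, 6],  # SME 2
    [5, 7, 4],  # SME 3
]

sme_cuts = [sum(est) for est in points_estimates]  # each SME's passing score
cutpoint = mean(sme_cuts)                          # recommended cutpoint
print(sme_cuts, round(cutpoint, 2))                # [18, 19, 16] 17.67
```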

23 Analytical Judgment
- SMEs see prescored candidate responses (but scores aren’t revealed)
- Task is to sort candidate responses into performance categories
  – Clearly passing
  – Passing
  – Not Passing

24 Analytical Judgment
- Clearly Passing responses are set aside
- Candidate responses in the Passing and Not Passing categories are ordered from lowest performance to highest
- Top responses in the Not Passing category are identified (usually 3)
- Lowest responses in the Passing category are identified (usually 3)

25 Analytical Judgment
- Average across these 6 papers is the SME’s passing score
- Feedback provided on SME passing scores
- Round 2
- Cutpoint is the average across SMEs’ passing scores
- Range of probable values
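
A minimal sketch of the borderline-paper averaging described on these slides, using invented operational scores (which the SMEs would not see). The panel cutpoint would then be the average of the SME passing scores, as on the slide above.

```python
from statistics import mean

# Hypothetical operational scores for the papers one SME sorted into the
# Passing and Not Passing categories; Clearly Passing papers are set aside.
passing_scores = sorted([14, 16, 12, 15, 13])      # papers sorted into "Passing"
not_passing_scores = sorted([8, 10, 9, 11, 7])     # papers sorted into "Not Passing"

# Usually 3 papers from each side of the borderline are identified.
borderline = not_passing_scores[-3:] + passing_scores[:3]
sme_cut = mean(borderline)                         # this SME's passing score

print(borderline, sme_cut)  # [9, 10, 11, 12, 13, 14] 11.5
```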

26 Paper Selection
- Exemplar candidate work is selected for each score point (typically 2 per score point)
- The SMEs’ task is to pick the two papers that best represent the work of the MCC
- Scores are not revealed to SMEs
- Average of the SME’s selected papers = that SME’s passing score
- Average across SMEs = cutpoint
- Range of probable values
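
A minimal sketch of the paper selection arithmetic, with invented picks and hidden exemplar scores.

```python
from statistics import mean

# Hypothetical scores (not shown to SMEs) of each SME's two selected papers.
selected_paper_scores = [
    (3, 4),  # SME 1's two picks
    (4, 4),  # SME 2
    (3, 3),  # SME 3
]

sme_cuts = [mean(pair) for pair in selected_paper_scores]  # each SME's passing score
cutpoint = mean(sme_cuts)                                  # recommended cutpoint
print(sme_cuts, cutpoint)                                  # [3.5, 4, 3] 3.5
```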

27 Who Makes the Final Decision?
- Each approach yields a cutpoint and a “range of probable values”
- This information should be communicated to the policy makers for their final decision.
- Standard setting methods only yield a range of consistent, defensible cutpoints
- The final decision is a policy matter!

28 Providing Validity Evidence
- What evidence is useful in supporting the results of the standard setting process?
- This evidence should be gathered to have available in case of a legal challenge.
- Responsibility of the test developer to provide at least procedural validity evidence.
- Collateral evidence could be part of a long-term validity research program

29 Procedural Evidence
- SMEs
  – Representative of profession
  – Qualifications
  – Confidentiality
  – Conflict of interest statements
  – Cannot teach preparation classes or sit for examination

30 Training
- Did the SMEs understand the method?
- Was sufficient time allotted to training?
- Did the SMEs have a clear conceptualization of the MCC?
- Did they understand the purpose of the standard setting procedure?
- Do they understand that the final decision will be based on their work, but not dictated by it?

31 Practice
- Was enough time devoted to practice?
- Were the practice materials sufficiently similar to the operational materials?
- Did the SMEs feel they had a reasonable opportunity to ask questions and receive clarifications?
- Did they understand the feedback information?

32 Operational
- Was enough time devoted to their work (across rounds)?
- How confident did the SMEs feel about their ratings (across rounds)?
- How useful/influential was the feedback?
- Did the facilities support their work?

33 Overall
- Confidence that the method used will result in an appropriate minimum passing score?
- Was the workshop handled in a professional manner?
- Was the workshop well organized?
- Opportunity for comments

34 Main Point
- Many methods, all aimed at providing a structured and reasoned approach to identifying
  – Cutpoint
  – Range of probable values
  – Procedural validity evidence

35 Match of Method to Assessment
- Method selected should be appropriate for the assessment (MCQ, constructed response)
- Logistically feasible
- Published in peer-reviewed journals?
- Should be replicable
- Multiple methods? Multiple panels?

36 Purpose of Presentation
- Provide an orientation to current standard setting methods
- Provide background on the processes and procedures needed to conduct a professional (and legally defensible) standard setting workshop.

37 Thank You
- I am honored to have been asked to share my expertise in this area
- I hope the presentation has been useful and meaningful
- The best outcome for me is if it raised your awareness of methods and issues in standard setting.

