Rebecca Sleeper July 2011

Statistical analysis of test-taker performance on specific exam items
Qualitative evaluation of adherence to optimal item-writing principles
– Review for mis-keyed answers, ambiguous language, more than one right answer, and degree of agreement with course materials presented

Difficulty score
– Number of students (or proportion of students, usually expressed as %) answering correctly
▪ Inverse relationship: the higher the diff score, the “easier” the question (more students got it right)
Point biserial
– Correlation between
▪ Dichotomous or binary variable: “right” (1) or “wrong” (0) answer on a given question
▪ Continuous variable: student’s performance on the test as a whole (their % score)

Value expressed within boundaries of -1 to +1
– Also sometimes expressed on a 100-pt scale
What does the correlation tell you?
– Assumptions:
▪ Students with high exam % should have a higher probability of answering any given item correctly
▪ Students with low exam % should have a lower probability of answering any given item correctly
– A (+) point biserial suggests these relationships are true
▪ The higher the number, the stronger the correlation
– A (–) point biserial suggests the opposite
▪ Students with low exam % were answering an item correctly more frequently than students with a high exam %
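The difficulty score and point biserial described above can be computed directly from a column of 0/1 item responses and the matching exam scores. The following is a minimal sketch, not the method used by any particular item-analysis package: the function names and the sample data are hypothetical, and it assumes the exam score used is each student's overall % (including the item itself).

```python
from math import sqrt


def difficulty(item_responses):
    """Proportion of test takers who answered the item correctly (0.0 to 1.0)."""
    return sum(item_responses) / len(item_responses)


def point_biserial(item_responses, exam_scores):
    """Pearson correlation between 0/1 item responses and overall exam scores."""
    n = len(item_responses)
    mean_x = sum(item_responses) / n
    mean_y = sum(exam_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_responses, exam_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_responses)
    var_y = sum((y - mean_y) ** 2 for y in exam_scores)
    if var_x == 0 or var_y == 0:
        # Undefined when every student gave the same response or got the same score
        return float("nan")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: 8 students, 1 = right, 0 = wrong, plus their exam % scores
item = [1, 1, 1, 0, 1, 0, 0, 1]
scores = [92, 88, 75, 60, 81, 55, 64, 70]

print(difficulty(item))              # 0.625, i.e. a diff score of 62.5%
print(point_biserial(item, scores))  # positive: high scorers tended to get it right
```

In this made-up sample the students who answered correctly also scored higher overall, so the point biserial comes out positive. Reversing the right and wrong answers would flip its sign.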

Summary Statistics
Columns: Item, N, Diff (% Corr), % answering correctly (Upper 25%, Lower 25%), Discrim, Mean, Median, Standard Deviation
BS-P % 0.00% 41.14%
BS-P2-5-2b % % 49.54%
BS-P % 0.00% 49.86%
BS-P4-1-9b % 0.00% 48.79%
BS-P3-1-1b % % 49.66%
BS-P % % 46.65%
BS-P2-10-1b % % 49.66%
BS-P % 0.00% 23.00%
BS-P4-1-16b % % 50.05%
BS-P % 0.00% 41.14%
BS-P % % 29.30%
BS-P % % 47.86%
BS-P4-1-11b % % 40.21%
BS-P3-1-2b % 0.00% 48.69%
BS-P % % 48.47%
BS-P % % 44.95%
BS-P % % 41.73%
BS-P % % 23.82%

Diff score
– In general, the majority of test takers should answer correctly (>50%)
▪ A lower diff score with a high point biserial may indicate a good question that is simply difficult; you may want some of these on your test instrument
– A diff score at or near 0 suggests something’s wrong!
Point biserial
– Should be (+)
– Opinion varies; target values of >0.2 or >0.3 are commonly cited
▪ Most sources suggest 0.30 or higher (30 or higher on the 100-pt scale) is ideal
– A point biserial at or near 0 suggests a “give away”
Questions that do not meet these criteria are targets for review
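The criteria above can be applied mechanically to shortlist items for review. This is a rough sketch only: the function name, the reason strings, and the default thresholds (50% for the diff score, 0.30 for the point biserial) are assumptions taken from the guidelines above, and the stricter or looser 0.2 target can be passed in instead.

```python
def review_reasons(diff_pct, point_biserial, min_diff_pct=50.0, min_pb=0.30):
    """Return the reasons an item misses the criteria; an empty list means none.

    diff_pct is the diff score as a percentage (0 to 100) and
    point_biserial is on the -1 to +1 scale. Thresholds are adjustable.
    """
    reasons = []
    if diff_pct <= min_diff_pct:
        reasons.append("majority did not answer correctly")
    if point_biserial < 0:
        reasons.append("negative point biserial")
    elif point_biserial < min_pb:
        reasons.append("point biserial below target")
    return reasons


# Hypothetical items: (diff score %, point biserial)
print(review_reasons(72.0, 0.41))  # []
print(review_reasons(40.0, 0.45))  # ['majority did not answer correctly']
print(review_reasons(85.0, -0.10)) # ['negative point biserial']
```

A flagged item is a candidate for qualitative review, not an automatic deletion. For example, the second item here is hard but discriminates well, which may be exactly the kind of difficult question worth keeping.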

Diff score (%) | Point biserial >0.30 | Point biserial | Point biserial 0.0 – | Negative Point biserial
0 – 30 | Review | Review / Toss | Toss
31 – 50 | Keep (but is a “toughie”) | Review | Review / Toss | Toss
51 – 80 | Keep | Keep / Review | Review?
81 – 100 | Keep | Keep (but is a “gimmie”) | Review?

