Rebecca Sleeper July 2011

• Statistical
  – Analysis of test taker performance on specific exam items
• Qualitative
  – Evaluation of adherence to optimal item-writing principles
  – Review for mis-keyed answers, ambiguous language, more than one right answer, and degree of agreement with the course materials presented

• Difficulty score
  – Number of students (or proportion of students, usually expressed as %) answering correctly
    ▪ Inverse relationship: the higher the diff score, the “easier” the question (more students got it right)
• Point biserial
  – Correlation between
    ▪ A dichotomous or binary variable: “right” (1) or “wrong” (0) answer on a given question
    ▪ A continuous variable: the student’s performance on the test as a whole (their % score)

• Value expressed within boundaries of -1 to +1
  – Also sometimes expressed on a 100-pt scale
• What does the correlation tell you?
  – Assumptions:
    ▪ Students with high exam % should have a higher probability of answering any given item correctly
    ▪ Students with low exam % should have a lower probability of answering any given item correctly
  – A (+) point biserial suggests these relationships are true
    ▪ The higher the number, the stronger the correlation
  – A (–) point biserial suggests the opposite
    ▪ Students with low exam % answered the item correctly more frequently than students with high exam %
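The two statistics can be sketched in a few lines of Python. This is a minimal illustration, not the scoring software's method: the function names and the eight-student data set are hypothetical, and the exam % used here includes the item being analyzed.

```python
from math import sqrt

def difficulty(item_scores):
    """Diff score: percent of test takers answering the item correctly."""
    return 100.0 * sum(item_scores) / len(item_scores)

def point_biserial(item_scores, exam_percents):
    """Pearson correlation between a 0/1 item score and the exam % score."""
    n = len(item_scores)
    mean_x = sum(item_scores) / n
    mean_y = sum(exam_percents) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, exam_percents))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in exam_percents)
    if var_x == 0 or var_y == 0:
        # Undefined when everyone got the item right (or wrong),
        # or when every student has the same exam score.
        return float("nan")
    return cov / sqrt(var_x * var_y)

# Hypothetical data: 1 = right, 0 = wrong, paired with each student's exam %.
item = [1, 1, 1, 0, 1, 0, 0, 0]
exam = [95, 88, 82, 75, 70, 64, 58, 50]

print(difficulty(item))            # diff score on a 0-100 scale
print(point_biserial(item, exam))  # positive: stronger students got it right
```

Multiplying the point biserial by 100 gives the 100-pt version. Flipping every answer in `item` flips the sign of the result, which is the (–) case described above.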

Summary Statistics: % answering correctly

Item         N    Diff      Upper   Lower   Discrim  Mean    Median   Standard
                  (% Corr)  25%     25%                               Deviation
BS-P4-1-12   233  21.46     42.37   11.86   30.51    21.46%  0.00%    41.14%
BS-P2-5-2b   233  57.51     69.49   40.68   28.81    57.51%  100.00%  49.54%
BS-P3-2-4    233  45.06     49.15   28.81   20.34    45.06%  0.00%    49.86%
BS-P4-1-9b   233  38.63     28.81   45.76   -16.95   38.63%  0.00%    48.79%
BS-P3-1-1b   233  56.65     76.27   30.51   45.76    56.65%  100.00%  49.66%
BS-P4-1-2    233  68.24     86.44   57.63   28.81    68.24%  100.00%  46.65%
BS-P2-10-1b  233  56.65     79.66   30.51   49.15    56.65%  100.00%  49.66%
BS-P2-5-1    233  5.58      6.78    11.86   -5.08    5.58%   0.00%    23.00%
BS-P4-1-16b  233  52.36     88.14   27.12   61.02    52.36%  100.00%  50.05%
BS-P4-1-5    233  21.46     47.46   11.86   35.59    21.46%  0.00%    41.14%
BS-P4-1-13   233  90.56     100.00  76.27   23.73    90.56%  100.00%  29.30%
BS-P2-2-4    233  64.81     84.75   45.76   38.98    64.81%  100.00%  47.86%
BS-P4-1-11b  233  79.83     98.31   61.02   37.29    79.83%  100.00%  40.21%
BS-P3-1-2b   233  38.20     69.49   22.03   47.46    38.20%  0.00%    48.69%
BS-P2-2-1    233  62.66     72.88   55.93   16.95    62.66%  100.00%  48.47%
BS-P4-1-3    233  72.10     91.53   50.85   40.68    72.10%  100.00%  44.95%
BS-P2-3-2    233  77.68     96.61   52.54   44.07    77.68%  100.00%  41.73%
BS-P2-2-10   233  93.99     100.00  86.44   13.56    93.99%  100.00%  23.82%
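In every row of the table, Discrim equals Upper 25% minus Lower 25% (for BS-P4-1-12: 42.37 - 11.86 = 30.51). A rough sketch of that calculation follows. The function name and data are hypothetical, and the group size is an assumption: it is rounded up here, which matches groups of 59 out of 233 (25/59 = 42.37%), but how the scoring software breaks ties between equal exam scores is not known.

```python
from math import ceil

def discrimination_index(item_scores, exam_percents, fraction=0.25):
    """Return (% correct in upper group, % correct in lower group, difference)."""
    n = len(item_scores)
    k = ceil(n * fraction)  # assumed group size: top/bottom 25%, rounded up
    # Rank students from highest to lowest exam score.
    order = sorted(range(n), key=lambda i: exam_percents[i], reverse=True)
    upper = [item_scores[i] for i in order[:k]]
    lower = [item_scores[i] for i in order[-k:]]
    pct_upper = 100.0 * sum(upper) / k
    pct_lower = 100.0 * sum(lower) / k
    return pct_upper, pct_lower, pct_upper - pct_lower

# Hypothetical data: 1 = right, 0 = wrong, paired with each student's exam %.
item = [1, 1, 1, 0, 1, 0, 1, 0]
exam = [95, 88, 82, 75, 70, 64, 58, 50]

print(discrimination_index(item, exam))  # (100.0, 50.0, 50.0)
```

A negative difference, as for BS-P4-1-9b and BS-P2-5-1, means the lower group outperformed the upper group on that item.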

• Diff score
  – In general, the majority of test takers should answer correctly (>50%)
    ▪ A lower diff score with a high point biserial may still indicate a good question, just a difficult one; you may want some of these on your test instrument
  – A diff score at or near 0 suggests something’s wrong!
• Point biserial
  – Should be (+)
  – Opinion varies; target values >0.2 or >0.3 are commonly cited
    ▪ Most sources suggest 0.30 or higher (30 or higher on the 100-pt scale) is ideal
  – A point biserial at or near 0 suggests a “give away”
• Questions that do not meet these criteria are targets for review
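These criteria can be turned into a simple screening pass. The sketch below is one possible reading, not a fixed rule: the function name is hypothetical, and the cutoffs (majority correct at >50%, point biserial of at least 0.30) are the defaults assumed here, so adjust them to the target your program prefers.

```python
def review_reasons(diff_pct, point_biserial, min_diff=50.0, min_pb=0.30):
    """List the reasons an item should be reviewed (empty list = meets criteria)."""
    reasons = []
    if diff_pct <= min_diff:
        reasons.append("majority did not answer correctly")
    if point_biserial < 0:
        reasons.append("negative point biserial")
    elif point_biserial < min_pb:
        reasons.append("point biserial below target")
    return reasons

# Hypothetical (diff score %, point biserial) pairs.
print(review_reasons(68.0, 0.35))   # []
print(review_reasons(21.5, 0.35))   # ['majority did not answer correctly']
print(review_reasons(38.5, -0.17))  # two reasons
```

An item flagged only for a low diff score may still be worth keeping if its point biserial is high, as noted above.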

Diff score (%)  Point biserial >0.30          Point biserial 0.15 – 0.30  Point biserial 0.0 – 0.149  Negative point biserial
0 – 30          Review                        Review / Toss               Toss
31 – 50         Keep (but is a “toughie”)     Review                      Review / Toss               Toss
51 – 80         Keep                          Keep / Review               Review?
81 – 100        Keep                          Keep (but is a “gimmie”)    Review?

