Item Analysis: Improving Multiple Choice Tests Crystal Ramsay September 27, 2011 Schreyer Institute for Teaching Excellence
This workshop is designed to help you do three things: interpret the statistical indices provided by the university's Scanning Operations; differentiate between well-performing and poorly performing items; and make decisions about poorly performing items.
We give tests for 4 primary reasons: to find out if students learned what we intended; to separate those who learned from those who didn't; to increase learning and motivation; and to gather information for adapting or improving instruction.
Multiple-choice items are composed of 4 basic components: the stem, the options, the key, and the distracters. Example item: "The rounded filling of an internal angle between two surfaces of a plastic molding is known as the A. rib. B. fillet. C. chamfer. D. gusset plate."
An item analysis focuses on 4 major pieces of information provided in the test score report: test score reliability, item difficulty, item discrimination, and distracter information.
Test score reliability is an index of the likelihood that scores would remain consistent over time if the same test were administered repeatedly to the same learners. Reliability coefficients range from .00 to 1.00. Ideal score reliabilities are >.80. Higher reliabilities = less measurement error. Now look at the test score reliability from your exam.
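For instructors who score tests themselves, a reliability coefficient can be estimated by hand. The sketch below uses the Kuder-Richardson formula 20 (KR-20), a common choice for items scored right/wrong; which coefficient the Scanning Operations report actually uses is not stated here, so treat KR-20 as an assumption. The function name `kr20` and the sample data are hypothetical.

```python
def kr20(scores):
    """Estimate KR-20 reliability.

    scores: one list per student, each holding 0/1 item scores
    (1 = correct). Assumes every student has a score for every item.
    """
    n_students = len(scores)
    n_items = len(scores[0])

    # Proportion of students answering each item correctly.
    p = [sum(s[i] for s in scores) / n_students for i in range(n_items)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    # Population variance of the students' total scores.
    totals = [sum(s) for s in scores]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students
    if var_total == 0:
        raise ValueError("All students have the same total score.")

    return (n_items / (n_items - 1)) * (1 - sum_pq / var_total)


# Hypothetical 5-student, 4-item test.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(sample), 2))  # about 0.8
```

With so few students and items the estimate is unstable; the example only illustrates the arithmetic.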
Item Difficulty is the percentage of students who answered an item correctly. It is represented in the Response Table as KEY-% and ranges from 0% to 100%. [Response Table, Form A. Columns: Item No., Omit, A, B, C, D, E, Key-%, Item Effect. Keys shown: C, A, C.]
Easier items have higher item difficulty values. More difficult items have lower item difficulty values. [Two Response Table excerpts, Form A, with the same columns. Keys shown: C, A, E and D, D, D.]
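The definition above can be sketched in a few lines: for each item, count the students whose response matches the key and express it as a percentage. The function name `item_difficulty` and the sample responses are hypothetical.

```python
def item_difficulty(responses, key):
    """Return the percentage of students who answered each item correctly.

    responses: one list per student with the option chosen for each item.
    key: the correct option for each item, in the same order.
    """
    n_students = len(responses)
    return [
        100.0 * sum(1 for r in responses if r[i] == correct) / n_students
        for i, correct in enumerate(key)
    ]


# Hypothetical 4-student, 3-item test.
key = ["C", "A", "E"]
responses = [
    ["C", "A", "E"],
    ["C", "A", "B"],
    ["C", "B", "D"],
    ["D", "C", "A"],
]
print(item_difficulty(responses, key))  # [75.0, 50.0, 25.0]
```

Here the first item is the easiest (75% correct) and the third is the most difficult (25% correct).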
What counts as an ideal item difficulty statistic depends on 2 factors: the number of alternatives for each item, and your reason for asking the question.
Sometimes we include very easy or very difficult items on purpose. Did I deliberately pose difficult items to challenge my students' thinking? Did I deliberately pose easy items to test basic information or to boost students' confidence?
Now look at the item difficulties from your exam. Which items were easier for your students? Which items were more difficult?
Item Discrimination is the degree to which students with high overall exam scores also got a particular item correct. It is represented in the Response Table as Item Effect because it tells how well an item performed. It ranges from -1.00 to 1.00 and should be >.2. [Response Table, Form A. Columns: Item No., Omit, A, B, C, D, E, Key-%, Item Effect. Keys shown: C, A, C.]
A well-performing item compared with a poorly performing item. [Two Response Table excerpts, Form A. Columns: Item No., Omit, A, B, C, D, E, Key-%, Item Effect. Surviving values: keys E and D, and an item effect of 0.46.]
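One common way to quantify item discrimination is the point-biserial correlation between getting an item right (1) or wrong (0) and the total exam score. Whether the report's Item Effect is computed exactly this way is an assumption; the sketch only illustrates the idea of a statistic that runs from -1.00 to 1.00. The function name `point_biserial` and the data are hypothetical.

```python
import math


def point_biserial(item_scores, total_scores):
    """Pearson correlation between 0/1 item scores and total exam scores."""
    n = len(item_scores)
    mean_item = sum(item_scores) / n
    mean_total = sum(total_scores) / n

    cov = sum(
        (x - mean_item) * (y - mean_total)
        for x, y in zip(item_scores, total_scores)
    )
    sd_item = math.sqrt(sum((x - mean_item) ** 2 for x in item_scores))
    sd_total = math.sqrt(sum((y - mean_total) ** 2 for y in total_scores))

    # Undefined when everyone got the item right (or wrong),
    # or when all total scores are identical.
    if sd_item == 0 or sd_total == 0:
        return float("nan")
    return cov / (sd_item * sd_total)


# Hypothetical data: five students' total scores.
totals = [9, 8, 7, 4, 3]
good_item = [1, 1, 1, 0, 0]  # high scorers got it right
bad_item = [0, 0, 0, 1, 1]   # low scorers got it right

print(point_biserial(good_item, totals) > 0.2)  # True
print(point_biserial(bad_item, totals) < 0)     # True
```

An item that everyone answers correctly has no variance, so its correlation is undefined; this matches the point that very easy or very difficult items discriminate poorly.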
What counts as an ideal item discrimination statistic depends on 3 factors: item difficulty, test heterogeneity, and item characteristics.
Item difficulty: Very easy or very difficult items will have poor ability to discriminate among students. Yet very easy or very difficult items may still be necessary to sample content taught.
Test heterogeneity: A test that assesses many different topics will have a lower correlation with any one content-focused item. Yet a heterogeneous item pool may still be necessary to sample content taught.
Item quality: A poorly written item will have little ability to discriminate among students, and there is no substitute for a well-written item or for testing what you teach!
Now look at the item effects from your exam. Which items on your exam performed well? Did any items perform poorly?
Distracter information can be analyzed to determine which distracters were effective and which ones were not. [Response Table, Form A. Columns: Item No., Omit, A, B, C, D, E, Key-%, Item Effect. Keys shown: C, A, C.] Now look at the distracter information for items from your exam. What can you conclude about them?
Whether to retain, revise, or eliminate items depends on item difficulty, item discrimination, distracter information, and your instruction. Ultimately, it's a judgment call that you have to make.
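A simple way to look at distracters without a score report is to tally how many students chose each option. The sketch below also lists distracters that nobody chose, which is one possible signal of an ineffective distracter; that criterion, the function names, and the data are illustrative assumptions rather than rules from the report.

```python
from collections import Counter


def option_counts(responses, item_index, options="ABCDE"):
    """Count how many students chose each option for one item."""
    counts = Counter(r[item_index] for r in responses)
    return {opt: counts.get(opt, 0) for opt in options}


def unused_distracters(responses, item_index, key, options="ABCDE"):
    """List the wrong options that no student selected."""
    counts = option_counts(responses, item_index, options)
    return [opt for opt in options if opt != key and counts[opt] == 0]


# Hypothetical responses of six students to a single four-option item.
responses = [["B"], ["B"], ["C"], ["B"], ["A"], ["B"]]

print(option_counts(responses, 0, options="ABCD"))
# {'A': 1, 'B': 4, 'C': 1, 'D': 0}
print(unused_distracters(responses, 0, key="B", options="ABCD"))
# ['D']
```

Omitted answers (for example an empty string) are simply not counted under any option in this sketch.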
What if I have a relatively short test or I give a test in a small class? I might not use the testing service for scoring. Is there a way I can understand how my items worked? Yes.
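Number of students in the top and bottom thirds choosing each option (an asterisk marks the key; counts are listed in option order, and empty cells from the original table are not shown). Item 1 (A, B*, C, D): Top 1/3: 10; Bottom 1/3: 1, 4, 3, 2. Item 2 (A*, B, C, D): Top 1/3: 8, 2; Bottom 1/3: 7, 3. Item 3 (A, B, C*, D): Top 1/3: 5, 1, 4; Bottom 1/3: 2, 4, 4. Item 4 (A*, B, C, D): Top 1/3: 10; Bottom 1/3: 9, 1. From: Suskie, L. (2009). Assessing student learning: A common sense guide (2nd ed.). San Francisco: Jossey-Bass. 1. Which item is the easiest? 2. Which item shows negative (very bad) discrimination? 3. Which item discriminates best between high and low scores? 4. In Item 2, which distracter is most effective? 5. In Item 3, which distracter must be changed?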
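A table like the one above can be built by hand or with a short script: rank students by total score, take the top and bottom thirds, and tally the options each group chose for an item. The sketch below assumes ties are broken by the order students appear in; the function name `top_bottom_counts` and the data are hypothetical and are not the Suskie figures.

```python
from collections import Counter


def top_bottom_counts(responses, totals, item_index, options="ABCD"):
    """Tally option choices for the top and bottom thirds by total score.

    responses: one list per student with the option chosen for each item.
    totals: each student's total exam score, in the same order.
    """
    # Student indices ordered from highest to lowest total score.
    order = sorted(range(len(totals)), key=lambda s: totals[s], reverse=True)
    third = len(order) // 3
    if third == 0:
        raise ValueError("Need at least three students.")

    def tally(group):
        counts = Counter(responses[s][item_index] for s in group)
        return {opt: counts.get(opt, 0) for opt in options}

    return tally(order[:third]), tally(order[-third:])


# Hypothetical class of six students answering one item (key: B).
totals = [10, 9, 6, 5, 2, 1]
responses = [["B"], ["B"], ["A"], ["B"], ["C"], ["D"]]

top, bottom = top_bottom_counts(responses, totals, item_index=0)
print(top)     # {'A': 0, 'B': 2, 'C': 0, 'D': 0}
print(bottom)  # {'A': 0, 'B': 0, 'C': 1, 'D': 1}
```

In this made-up item, both top-third students chose the key and neither bottom-third student did, which is the pattern of an item that discriminates well.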
Even after you consider reliability, difficulty, discrimination, and distracters, there are still a few other things to think about: multiple course sections, student feedback, and other item types.
Resources. For an excellent resource on item analysis: eport/itemanalysis.php. For a more extensive list of item-writing tips: Choice%20Item%20Writing%20Guidelines%20- %20Haladyna%20and%20Downing.pdf and c_tips.pdf. For a discussion about writing higher-level multiple choice items: dford.pdf.