Dan Thompson, Oklahoma State University Center for Health Sciences. Evaluating Assessments: Utilizing ExamSoft's item-analysis to better understand student performance and our exams.




Similar presentations
An Introduction to Test Construction

Item Analysis.
FACULTY DEVELOPMENT PROFESSIONAL SERIES OFFICE OF MEDICAL EDUCATION TULANE UNIVERSITY SCHOOL OF MEDICINE Using Statistics to Evaluate Multiple Choice.
MCR Michael C. Rodriguez Research Methodology Department of Educational Psychology.
Rebecca Sleeper, July. Statistical: analysis of test taker performance on specific exam items. Qualitative: evaluation of adherence to optimal.
Reliability and Validity checks S-005. Checking on reliability of the data we collect: compare over time (test-retest), item analysis, internal consistency.
Reliability Definition: The stability or consistency of a test. Assumption: True score = obtained score +/- error Domain Sampling Model Item Domain Test.
Using Test Item Analysis to Improve Students’ Assessment
Using Multiple Choice Tests for Assessment Purposes: Designing Multiple Choice Tests to Reflect and Foster Learning Outcomes Terri Flateby, Ph.D.
Item Analysis: A Crash Course Lou Ann Cooper, PhD Master Educator Fellowship Program January 10, 2008.
Constructing Exam Questions Dan Thompson & Brandy Close OSU-CHS Educational Development-Clinical Education.
Test Construction Processes: 1- Determining the function and the form 2- Planning (content: table of specification) 3- Preparing (knowledge and experience)
Item Analysis What makes a question good??? Answer options?
Lesson Seven: Item Analysis. Contents: item analysis; item difficulty (item facility).
© 2008 McGraw-Hill Higher Education. All rights reserved. CHAPTER 16 Classroom Assessment.
Item Analysis Prof. Trevor Gibbs. Item Analysis After you have set your assessment: How can you be sure that the test items are appropriate?—Not too easy.
Lesson Nine Item Analysis.
Multiple Choice Test Item Analysis Facilitator: Sophia Scott.
ANALYZING AND USING TEST ITEM DATA
Office of Institutional Research, Planning and Assessment January 24, 2011 UNDERSTANDING THE DIAGNOSTIC GUIDE.
Designing and evaluating good multiple choice items Jack B. Monpas-Huber, Ph.D. Director of Assessment & Student Information.
Part #3 © 2014 Rollant Concepts, Inc. Assembling a Test
LECTURE 06B BEGINS HERE THIS IS WHERE MATERIAL FOR EXAM 3 BEGINS.
Induction to assessing student learning Mr. Howard Sou Session 2 August 2014 Federation for Self-financing Tertiary Education 1.
Test item analysis: When are statistics a good thing? Andrew Martin Purdue Pesticide Programs.
Field Test Analysis Report: SAS Macro and Item/Distractor/DIF Analyses
The Genetics Concept Assessment: a new concept inventory for genetics Michelle K. Smith, William B. Wood, and Jennifer K. Knight Science Education Initiative.
Chapter 7 Item Analysis In constructing a new test (or shortening or lengthening an existing one), the final set of items is usually identified through.
Dr. Majed Wadi MBChB, MSc Med Edu. Objectives To discuss the concept of vetting process To describe the findings of literature review regarding this process.
Multiple Choice Question Design Karen Brooks & Barbara Tischler Hastie.
Techniques to improve test items and instruction
Group 2: 1. Miss. Duong Sochivy 2. Miss. Im Samphy 3. Miss. Lay Sreyleap 4. Miss. Seng Puthy 1 ROYAL UNIVERSITY OF PHNOM PENH INSTITUTE OF FOREIGN LANGUAGES.
Lab 5: Item Analyses. Quick Notes Load the files for Lab 5 from course website –
How to Perform Simple Manual Item Analysis Dr. Belal Hijji, RN, PhD January 18, 2012.
Grading and Analysis Report For Clinical Portfolio 1.
1 Item Analysis - Outline 1. Types of test items A. Selected response items B. Constructed response items 2. Parts of test items 3. Guidelines for writing.
Writing Multiple Choice Questions. Types Norm-referenced –Students are ranked according to the ability being measured by the test with the average passing.
What is a "good" (or "bad") multiple-choice item? (Trade secrets from a professional) 1. Balance of content level 2. Format of stem. Psychology and applied.
Fair and Appropriate Grading
Introduction to Item Analysis Objectives: To begin to understand how to identify items that should be improved or eliminated.
Review: Alternative Assessments Alternative/Authentic assessment Real-life setting Performance based Techniques: Observation Individual or Group Projects.
Tests and Measurements
Utilizing Item Analysis to Improve the Evaluation of Student Performance. Mihaiela Ristei Gugiu, Central Michigan University.
Psychometrics: Exam Analysis David Hope
Reviewing, Revising and Writing Effective Social Studies Multiple-Choice Items. Writing and scoring test questions based on Ohio's Academic Content Standards.
Dept. of Community Medicine, PDU Government Medical College,
Main achievement outcomes continued: performance on mathematics and reading (minor domains) in PISA 2006, including performance by gender.
Norm Referenced Your score can be compared with others 75 th Percentile Normed.
Reviewing, Revising and Writing Mathematics Multiple-Choice Items. Writing and scoring test questions based on Ohio's Academic Content Standards.
Copyright © Springer Publishing Company, LLC. All Rights Reserved. DEVELOPING AND USING TESTS – Chapter 11 –
Items analysis Introduction Items can adopt different formats and assess cognitive variables (skills, performance, etc.) where there are right and.
Exam Analysis Camp Teach & Learn May 2015 Stacy Lutter, D. Ed., RN Nursing Graduate Students: Mary Jane Iosue, RN Courtney Nissley, RN Jennifer Wierworka,
Professor Jim Tognolini
Using Data to Drive Decision Making:
ARDHIAN SUSENO CHOIRUL RISA PRADANA P.
Assessment Instruments and Rubrics Workshop Series
assessing scale reliability
Classroom Analytics.
UMDNJ-New Jersey Medical School
Incorporating Active Learning in Foundational Courses
Greg Miller Iowa State University
Test Development Test conceptualization Test construction Test tryout
Assessment for Learning — Using ExamSoft for Formative Assessment
Constructing Exam Questions
Dept. of Community Medicine, PDU Government Medical College,
Using statistics to evaluate your test Gerard Seinhorst
Lies, Damned Lies & Statistical Analysis for Language Testing
Analyzing test data using Excel Gerard Seinhorst
EDUC 2130 Quiz #10 W. Huitt.
Tests are given for 4 primary reasons.
Presentation transcript:

Dan Thompson, Oklahoma State University Center for Health Sciences
Evaluating Assessments: Utilizing ExamSoft's item-analysis to better understand student performance and our exams

Session Objectives
At the completion of this session, participants will be able to:
- Identify the purpose of each item-analysis measure
- Evaluate individual exam items and overall exams using the appropriate statistics
- Improve exam items based on prior statistical performance
- Utilize exam data to construct well-balanced exams

Item-Analysis Indicates:
- Quality of the exam as a whole
- Question difficulty
- The results of high and low performers
- How each answer choice performs
- The correlation of exam takers' performance on each item with their overall exam results

Exam Performance: KR20
- Scale of 0.00 – 1.00
- Evaluates the performance and quality of the overall exam
- As scores increase, the exam is considered more consistent and reliable
- Licensure exams are expected to maintain KR20 scores > 0.8
- For our class sizes, the goal should be KR20 scores > 0.7 (see the computational sketch below)
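For readers who want to check these numbers by hand, KR20 is simple to compute from a scored response matrix. Below is a minimal Python sketch, assuming a small hypothetical 0/1 score matrix; the function and data are illustrative, not ExamSoft's implementation.

```python
# Minimal KR20 sketch: rows = students, columns = dichotomously scored items.
# KR20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / var(total scores))
import numpy as np

def kr20(responses: np.ndarray) -> float:
    """Kuder-Richardson Formula 20 for a 0/1 item-response matrix."""
    n_items = responses.shape[1]
    p = responses.mean(axis=0)          # proportion correct per item
    q = 1.0 - p                         # proportion incorrect per item
    totals = responses.sum(axis=1)      # each student's total score
    total_var = totals.var(ddof=0)      # population variance of totals
    return (n_items / (n_items - 1)) * (1.0 - (p * q).sum() / total_var)

# Hypothetical 5-student, 4-item exam (1 = correct, 0 = incorrect)
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
])
print(f"KR20 = {kr20(scores):.2f}")
```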

Question Performance: Diff (p)
- Item difficulty (p-value); scale of 0.0 – 1.0
- For a general bell curve, you want these items to be in the 0.3 – 0.8 range (items in this range should have higher discrimination indexes)
- 0.3 and below is a very difficult question
- 0.8 and above is considered a very easy question

Upper/Lower (27%)
- The p-value of the upper 27% and the lower 27% of the class on that exam (see the sketch below)
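The difficulty statistics can be sketched the same way. The 27% split follows the slide; `difficulty_stats` and its field names are hypothetical, not ExamSoft's API.

```python
import numpy as np

def difficulty_stats(responses: np.ndarray, item: int, frac: float = 0.27):
    """p-value of one item overall and within the top/bottom 27% of scorers."""
    totals = responses.sum(axis=1)            # total score per student
    order = np.argsort(totals)                # indices from lowest to highest scorer
    n = max(1, int(round(frac * len(totals))))
    lower, upper = order[:n], order[-n:]
    return {
        "p": float(responses[:, item].mean()),            # overall item difficulty
        "p_upper": float(responses[upper, item].mean()),  # upper 27% p-value
        "p_lower": float(responses[lower, item].mean()),  # lower 27% p-value
    }

# e.g. difficulty_stats(scores, item=2) using the matrix from the KR20 sketch
```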

Question Performance: Discrimination Index
- A comparative analysis of the upper and lower 27%
- Disc. index = upper 27% p-value - lower 27% p-value (see the sketch below)
- Scale of -1.00 to 1.00 (the higher the better)
- 0.3 and above are good discriminators; lower positive values are fair but may need review
- 0 = no discrimination (e.g., all exam takers selected the correct answer)
- Any negative value is considered a flawed item and should be removed or revised; it indicates the lower performers scored better than the high performers on that item
- A low disc. index is appropriate for mastery questions
- Above 0.4 realistically indicates the difference between top and bottom performers; it illustrates what the higher performers know that the lower performers do not
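Given the upper and lower p-values, the discrimination index is just their difference. A sketch reusing the hypothetical `difficulty_stats` helper from the previous block:

```python
def discrimination_index(responses, item, frac=0.27):
    """Disc. index = p(upper 27%) - p(lower 27%); ranges from -1.00 to 1.00."""
    stats = difficulty_stats(responses, item, frac)  # helper from the sketch above
    return stats["p_upper"] - stats["p_lower"]

# Interpretation per the slide: >= 0.3 good, near 0 none, negative = flawed item.
```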

Question Performance: Point Biserial
- Measures the correlation between exam takers' responses on an individual question and how they performed on the overall exam
- Scale of -1.00 to 1.00
- A higher point biserial indicates that exam takers who performed well on the exam also performed well on that specific item, and exam takers who performed poorly on the exam also performed poorly on that item

Question Performance: Point Biserial continued
- A negative point biserial indicates negative correlation; this question should definitely be reviewed
- When the point biserial is low (near 0), there is little correlation between performance on this item and performance on the overall exam
- Mastery items lead to low point biserials, as most students answer correctly (a computational sketch follows)

Average Answer Time
- The time students spent answering this specific question
- 72 seconds per question
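As a rough illustration, the point biserial can be computed as the Pearson correlation between the 0/1 item column and the total score. The sketch below excludes the item from the total, a common correction; the slides do not say whether ExamSoft applies it.

```python
import numpy as np

def point_biserial(responses: np.ndarray, item: int) -> float:
    """Correlation between one 0/1 item and the rest-of-test score.

    Returns nan if every student answered the item the same way.
    """
    item_scores = responses[:, item]
    rest_totals = responses.sum(axis=1) - item_scores  # exclude the item itself
    return float(np.corrcoef(item_scores, rest_totals)[0, 1])
```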

Response Distribution: Distractor Analysis
- We can identify which distractors are doing what we intend
Helpful guidelines:
- Quality over quantity: use as many plausible distractors as can be written for that question
- Item-analysis is not affected much by the number of answer choices, as long as there are at least three plausible options in total

Response Distribution: Helpful guidelines continued
- Check the item distribution of the top and bottom 27%
- Start with items that have a high discrimination index
- This will indicate which distractors were most confusing to each group of students (a tabulation of this split is sketched below)
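One way to see which distractors confuse which group is to tabulate answer choices separately for the top and bottom 27%. A sketch, assuming raw answer letters per student; the data layout and function name are hypothetical:

```python
from collections import Counter
import numpy as np

def distractor_table(answers, totals, frac=0.27):
    """Answer-choice counts for the upper vs. lower 27% of scorers.

    answers: chosen option per student for one item, e.g. ['A', 'C', 'B', ...]
    totals:  matching array of overall exam scores
    """
    order = np.argsort(totals)                     # low scorers first
    n = max(1, int(round(frac * len(answers))))
    lower = Counter(answers[i] for i in order[:n])
    upper = Counter(answers[i] for i in order[-n:])
    return {"upper 27%": dict(upper), "lower 27%": dict(lower)}
```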

Question Intention
It is important to factor in the intention of each question when reviewing the item-analysis.
Mastery questions should have:
- Low biserials
- Low discrimination indexes
- High p-values
Discrimination questions should have:
- High biserials
- High discrimination indexes
- Lower p-values

Examples [item-analysis screenshots]: a mastery item; an item where review is needed

Examples [item-analysis screenshots]: an item needing removal/revision; a discrimination question

Conclusion
- KR20 > 0.8
- Discrimination index > 0.3
- Point biserial: the higher the better, ideally > 0.2
- P-value depends on the intention of the question
- Check distractor stats to make sure the distractors are doing what they are intended to do
- Utilize previous exams' item-analysis when constructing new exams; this helps create well-balanced exams (a simple filtering sketch follows)
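These rules of thumb could be folded into a simple review filter when assembling the next exam. A sketch using the slide's thresholds; the function and its messages are illustrative, not an ExamSoft feature:

```python
def flag_item(disc_index, point_biserial, mastery=False):
    """Apply the slide's rules of thumb to one item's statistics."""
    flags = []
    if disc_index < 0 or point_biserial < 0:
        flags.append("negative statistic: remove or revise the item")
    elif not mastery:  # mastery items are expected to have low values
        if disc_index < 0.3:
            flags.append("discrimination index < 0.3: review")
        if point_biserial < 0.2:
            flags.append("point biserial < 0.2: review")
    return flags

# e.g. flag_item(0.12, 0.05) -> both review flags; flag_item(0.0, 0.1, mastery=True) -> []
```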

References
Dewald, Aaron. An Introduction to Item Analysis [video]. Retrieved from http://learn.examsoft.com/video/numbers-everywhere-introduction-item-analysis
Haladyna, T. M., & Downing, S. M. (1989). Validity of a taxonomy of multiple-choice item-writing rules. Applied Measurement in Education, 2(1), 51-78.
Starks, Jason. Exam Quality Through Use of a Psychometric Analysis: A Primer. Retrieved from file:///C:/Users/dpthomp/Downloads/Exam_Quality_Through_Use_of_Psychometric_Analysis_A_Primer.pdf

Thank you!!! Questions/Comments??