Fundamentals of Assessment and Grading
Alice Chuang, MD
Department of Obstetrics and Gynecology, University of North Carolina-Chapel Hill, Chapel Hill, NC
AOE Basic Teaching Skills Curriculum / APGO Clerkship Directors' School
April 16, 12:00 PM, Bondurant G010

Neither I nor my spouse has any financial interests to disclose related to this talk.

Objectives
- Understand reliability and validity
- Contrast formative and summative evaluation
- Compare and contrast norm-referenced and criterion-referenced assessments
- Improve delivery of feedback
- Understand the NBME exam
- Be familiar with different testing formats, their uses, and their limitations

Terminology
- Validity: Are we measuring what we think we're measuring?
  - Content: Does the instrument measure the depth and breadth of the content of the course? Does it inadvertently measure something else?
  - Construct: Do the evaluation criteria or grading construct allow for true measurement of the knowledge, skills, or attitudes taught in the course? Is any part of the grading construct irrelevant?
  - Criterion: Does the outcome correlate with true competencies? Does it relate to important current or future events? Is the assessment relevant to future performance?

Examples: Validity
- Content: a summative ob/gyn test that covers only obstetrics (the breadth of the course is not measured)
- Construct: you allow students to use their textbook for a knowledge-based multiple-choice test of foundational information on prenatal care (the test now measures look-up skill rather than the knowledge taught)
- Criterion: New Coke v. Old Coke (taste-test results did not predict the outcome that mattered in the real world)

Terminology
- Reliability: Are our measurements consistent? The score should be the same no matter when the assessment is taken, who scores it, or when it is scored.
  - Interrater reliability: Is a student's score consistent between evaluators?
  - Intrarater reliability: Is a student's score consistent with the same rater, even when rated under different circumstances?
  - Scoring rubric: a standardized method of grading used to increase interrater and intrarater reliability

Examples
In general, if you repeat the same assessment, will you get the same answer?
- Interrater: three individuals are asked to go to the beach and estimate how many seagulls they see from 6-7 AM, and they come up with widely different counts (200, 800, ...).
- Intrarater: a particular food critic always gives low scores for food quality when the server is female.
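
A rubric is one fix; another is to quantify agreement directly. One standard statistic (not named on these slides; added here for illustration) is Cohen's kappa, which corrects raw percent agreement between two raters for the agreement expected by chance. A minimal Python sketch with made-up clerkship grades:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same students."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of students given the same grade by both.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater assigned grades independently,
    # keeping their own grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / n**2
    return (observed - expected) / (1 - expected)

# Two attendings grade the same ten students (hypothetical data).
attending_1 = ["pass", "pass", "honors", "fail", "pass",
               "honors", "pass", "pass", "fail", "honors"]
attending_2 = ["pass", "honors", "honors", "fail", "pass",
               "pass", "pass", "pass", "fail", "honors"]
print(f"kappa = {cohens_kappa(attending_1, attending_2):.2f}")
# Prints kappa = 0.68; 1.0 would be perfect agreement, 0 chance-level.
```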

Example: Show Choir Audition Rubric

Singing skills
- Poor candidate (0 points): sings with as much expression as a wet noodle; cannot identify which tune the candidate is singing; cannot identify the lyrics of the song secondary to poor pronunciation
- Fair candidate (1 point): minimally expressive; pitch significantly off on occasion; diction unclear at times
- Good candidate (2 points): very expressive; sings on pitch most of the time with minor errors; diction clear most of the time
- Superior candidate (3 points): artistically expressive; sings on pitch; diction clear

Dancing skills
- Poor (0): has two left feet; unable to learn new steps and continues to dance like MC Hammer despite different choreography being demonstrated
- Fair (1): missteps despite multiple attempts; no artistic expression in dance moves; unable to learn new choreography after three demonstrations
- Good (2): occasionally missteps, but overall dance steps are accurate; adapts to choreography fairly rapidly
- Superior (3): quick and nimble; dances artistically; able to learn new choreography quickly

Enthusiasm for show choir
- Poor (0): freely admits not knowing what GLEE is
- Fair (1): endorses enjoyment of GLEE, but unable to identify a favorite character
- Good (2): has watched 70% of GLEE episodes
- Superior (3): has seen every episode of GLEE; all GLEE albums confirmed in iTunes library; has been to GLEE LIVE each summer

Formative v. summative assessments
- Formative: ongoing assessment, designed to help improve the educational program as well as learner progress
- Summative: designed to evaluate overall student performance at the end of an educational phase, and to evaluate the effectiveness of teaching

Examples
- Formative: a short multiple-choice exam written in-house that is pass/fail; answers are reviewed with the class at the end of the testing session
- Summative: the NBME subject exam

Formative v. summative assessments
- ED30: The directors of all courses and clerkships must design and implement a system of formative and summative evaluation of student achievement in each course and clerkship. Those responsible for the evaluation of student performance should understand the uses and limitations of various test formats, the purposes and benefits of criterion-referenced vs. norm-referenced grading, reliability and validity issues, formative vs. summative assessment, etc.

Formative v. summative assessments
- ED31: Each student should be evaluated early enough during a unit of study to allow time for remediation.
- ED32: Narrative descriptions of student performance and of non-cognitive achievement should be included as part of evaluations in all required courses and clerkships where teacher-student interaction permits this form of assessment.

Uses for assessments

|                            | Formative                           | Summative                    |
|----------------------------|-------------------------------------|------------------------------|
| Purpose                    | Feedback for learning               | Certification/grading        |
| Breadth of scope           | Narrow focus on specific objectives | Broad focus on general goals |
| Scoring                    | Explicit feedback                   | Overall performance          |
| Learner affective response | Little anxiety                      | Moderate to high anxiety     |
| Target audience            | Learner                             | Society                      |

Characteristics of feedback

Effective feedback is given with the goal of improvement. It is:
- timely
- honest
- respectful
- clear
- issue-specific
- objective
- supportive
- motivating
- action-oriented
- solution-oriented

Destructive feedback is unhelpful. It is:
- accusatory
- personal
- judgmental
- subjective
It also undermines the self-esteem of the receiver, leaves the issue unresolved, and leaves the receiver unsure how to proceed.

Feedback… from APGO/CREOG 2011
- "When you…"
- "You give the impression…"
- "I would stop…"
- "I would recommend… instead"

Norm-referenced v. criterion-referenced assessments
- Norm-referenced
  - Purpose is to classify students in order of achievement from low to high
  - Allows comparisons among students
  - May not give accurate information regarding students' absolute abilities
  - Half of the students should score above the midpoint score and half should score below it

(Ricketts C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.)

Norm-referenced v. criterion-referenced assessments
- Criterion-referenced
  - Purpose is to evaluate students' knowledge and skills against a pre-determined goal performance level
  - Gives information about a student's achievement of specific objectives
  - It should be possible for everyone to earn a passing score

(Ricketts C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.)

Examples
- Norm-referenced: soccer tryouts where 11 players are chosen out of 40
- Criterion-referenced: the test for a driver's license
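
A toy grading sketch (hypothetical scores and cut-offs, not from the slides) makes the contrast concrete: a criterion-referenced pass goes to anyone who clears a fixed standard, while a norm-referenced pass goes to a fixed share of the cohort regardless of raw performance.

```python
# Hypothetical raw exam scores (percent correct) for eight students.
scores = {"A": 91, "B": 84, "C": 78, "D": 76, "E": 72, "F": 70, "G": 65, "H": 52}

# Criterion-referenced: pass anyone meeting a pre-determined standard.
# In principle, everyone could pass (or everyone could fail).
CUT_SCORE = 75
criterion_pass = sorted(s for s, x in scores.items() if x >= CUT_SCORE)

# Norm-referenced: pass the top half of the cohort, whatever the raw
# scores are. Exactly half pass by construction.
ranked = sorted(scores, key=scores.get, reverse=True)
norm_pass = sorted(ranked[: len(ranked) // 2])

print("criterion-referenced pass:", criterion_pass)  # ['A', 'B', 'C', 'D']
print("norm-referenced pass:     ", norm_pass)       # ['A', 'B', 'C', 'D']
# The lists match only by coincidence here: subtract 10 points from every
# score and the criterion-referenced list shrinks to ['A'], while the
# norm-referenced list stays the same four students.
```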

Norm-referenced v. criterion-referenced assessments
- Be sure your assessment is appropriately norm-referenced or criterion-referenced, and that it is designed with this in mind.
- Most assessments in medical education are criterion-referenced.
- Norm-referenced tests should emphasize variability; criterion-referenced tests should emphasize accuracy on the tested material.

NBME
- Exams
  - Developed by committees and content experts
  - Same protocol used to build Step 1 and Step 2
- In general, subject exams are provided to:
  - all 130 LCME-accredited medical schools in the US
  - 8 Canadian medical schools
  - 8 osteopathic medical schools
  - 22 international medical schools

NBME
- Scaled to have a mean of 70 and SD of 8, based on 9000 first-time test takers from 80+ schools who took the exam as an end-of-clerkship exam.
- Scores do not reflect the percentage of questions answered correctly.
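
What does a scaled score imply? A rough Python sketch, assuming an approximately normal score distribution (an assumption made for illustration; NBME publishes exact norms tables, and the quarter-specific norms on the next slide differ from whole-year norms):

```python
from statistics import NormalDist

# Subject-exam scale from the slide: mean 70, SD 8 in the reference group
# of first-time, end-of-clerkship examinees.
nbme = NormalDist(mu=70, sigma=8)

for score in (60, 70, 78, 86):
    pct = nbme.cdf(score) * 100
    print(f"scaled score {score}: ~{pct:.0f}th percentile of the reference group")
# scaled score 60: ~11th percentile, 70: ~50th, 78: ~84th, 86: ~98th.
# Against fourth-quarter examinees only, the same 60 sits near the
# 2nd percentile (next slide), since later-quarter cohorts score higher.
```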

NBME: What do those scores mean?
A score of 60 in the fourth quarter means that 2% of the examinees in the fourth quarter scored 60 or below!
[Norms table: percentile equivalents of scaled scores (93 or above, downward), broken out by total year and by quarters Q1-Q4; cell values not recoverable from the transcript.]

NBME: Academic purpose for exam

| Purpose                   | %  |
|---------------------------|----|
| Advanced placement        | 5  |
| Course/clerkship          | 95 |
| Year-end                  | 12 |
| Make-up                   | 21 |
| Minimal competence        | 44 |
| Identify at-risk students | 23 |
| Practice for USMLE        | 47 |
| Promotion requirement     | 37 |
| Review course             | 1  |
| Student self-assessment   | 26 |
| Other                     | 4  |

Total responses: 78

NBME: Weight given the subject exam
[Table: distribution of the weight clerkships give the subject exam, in bands from 1-10% through >50%; most cell values not recoverable, but 0 responding schools weighted it above 50%.]
Total number responding: 70

NBME 2008 Clerkship Survey Results

| Assessment/Evaluation Method            | Ob/gyn (%) |
|-----------------------------------------|------------|
| Computer case simulations               | 0.5        |
| Subject exam                            | 30         |
| School's MCQ exam                       | 9          |
| Observation and evaluation by residents | 28         |
| Observation and evaluation by faculty   | 26         |
| Oral exam                               | 14         |
| OSCE                                    | 12         |
| Peer evaluation                         | 1          |
| Standardized patient exam               | 3          |
| Other                                   | 18         |

Total number responding: 81

NBME
- 2004 and 2009 surveys of performance guidelines across clerkships
- Recommend setting an absolute rather than a relative standard for performance
- Angoff procedure: item-based; judges estimate the proportion of minimally proficient examinees who would answer each question correctly (see the sketch below)
- Hofstee method: judges set the minimum and maximum acceptable passing scores and failure rates, which are then plotted against the exam's distribution of scores and failure rates
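
A minimal sketch of the Angoff arithmetic (hypothetical judge ratings): averaging each item's estimates across judges and summing over items gives the expected raw score of a borderline examinee, which becomes the passing standard.

```python
# Hypothetical Angoff ratings: rows are judges, columns are items 1-5.
# Each entry is the judged probability that a minimally proficient
# (borderline) examinee answers that item correctly.
judge_estimates = [
    [0.8, 0.6, 0.9, 0.5, 0.7],  # judge 1
    [0.7, 0.5, 0.9, 0.6, 0.6],  # judge 2
    [0.9, 0.6, 0.8, 0.5, 0.7],  # judge 3
]

n_judges = len(judge_estimates)
n_items = len(judge_estimates[0])

# Mean estimate per item across judges, then summed over items:
# the expected raw score of a borderline examinee.
item_means = [sum(judge[i] for judge in judge_estimates) / n_judges
              for i in range(n_items)]
cut_score = sum(item_means)
print(f"Angoff cut score: {cut_score:.1f} of {n_items} items "
      f"({100 * cut_score / n_items:.0f}%)")
# Prints: Angoff cut score: 3.4 of 5 items (69%)
```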

Testing Formats
- Multiple-choice exam (MCQ)
- Objective structured clinical examination (OSCE)
- Oral examination
- Direct observation
- Simulation
- Standardized patient
- Patient/procedure log
- Medical record reviews
- Written essay questions

(Casey et al. To the point: reviews in medical education – the Objective Structured Clinical Examination. AJOG, Jan 2009.)

Testing format: MCQ
- Use distractors that could plausibly represent the correct answer
- Use a question format, not a complete-the-statement format
- Emphasize higher-level thinking, not strict memorization
- Keep option length consistent within a question
- Balance the placement of the correct answer
- Use correct grammar
- Avoid clues to the correct answer
- Highly reliable and valid for assessing knowledge

Testing format: OSCE
- Examinees rotate through a circuit of stations (5-10 minutes each)
- One-on-one examination (with an examiner or a trained or simulated patient)
- List of criteria for successful completion of each station
- Each station tests a specific skill or competency
- Good for examining higher-order skills, clinical skills, and technical skills
- Requires a large amount of resources

Testing format: Oral Exam
- Portfolio-based: similar to the case-based portion of oral boards
- Poor inter-rater and intra-rater reliability
- Scores are higher when performances are scored live versus on video
- Teaching students how to do better on the oral exam does not improve scores, but practicing oral exams does
- A mock public oral exam improves performance
- Limitations
  - Halo effect: the grade reflects not only performance on the exam but also previous experience
  - Subconscious consensus grading: examiners take subconscious cues from each other

(Burch & Seggie, 2008; Kearney et al, 2002; Burchard et al, 2007; Jacobsohn et al, 2006)

Testing format: Oral Exam
- Is an oral exam justified? Is there an advantage?
- Does the material lend itself to open questioning?
- How will communication skills and delivery of information be graded? Will only content be graded?
- Is the examiner experienced? Will he/she skew grades in any way?
- How will you prepare students for the exam?
- Is there enough time to examine every student adequately?
- How much prompting/assistance is allowed? How much time will you allow for "thinking"? How will you ensure consistency in these areas for all examinees?

Testing format: Direct observation
- Formalized criteria
- Various observers
- True-to-life clinical setting (versus simulated)
- Numerical scores
- Comment-anchored
- Improve reliability with multiple perspectives
- Consider 360-degree evaluation (including self, patient, and other staff members)

Testing format

|                      | MCQ | OSCE | Direct obs | Oral exam |
|----------------------|-----|------|------------|-----------|
| Formative            | Y   | Y    | Y          | Y         |
| Summative            | Y   | Y    | Y          | Y         |
| Norm-referenced      | Y   | N    | N          | N         |
| Criterion-referenced | Y   | Y    | Y          | Y         |

[The original slide also rated each format on content, construct, and criterion validity and on reliability; those ratings did not survive transcription.]

General rules of thumb
Be sure your assessment:
- Provides reliable data
- Provides valid data
- Provides valuable data
- Is feasible
- Can be incorporated into the systems in place (hospital, clinic, curriculum, etc.)
- Is consistent with course objectives
- Utilizes multiple instruments, multiple assessors, and multiple points of assessment
- Aligns with pre-specified criteria
- Is fair

(Lynch and Swing. Key Considerations for Selecting Assessment Instruments and Implementing Assessment Systems. ACGME.)

References
Bond LA. Norm- and criterion-referenced testing. Practical Assessment, Research & Evaluation 1996;5(2).
Burch VC, Seggie JL. Use of a structured interview to assess portfolio-based learning. Med Educ 2008;42.
Burchard K, et al. Is it live or is it Memorex? Student oral examinations and the use of video for additional scoring. Am J Surg 2007;193.
Casey et al. To the point: reviews in medical education – the Objective Structured Clinical Examination. AJOG, Jan 2009.
Jacobsohn E, Kock PA, Avidan M. Poor inter-rater reliability on mock anesthesia oral examinations.
Kearney RA, et al. The inter-rater and intra-rater reliability of a new Canadian oral examination format in anesthesia is fair to good. Can J Anesth 2002;49(3).
Lynch and Swing. Key Considerations for Selecting Assessment Instruments and Implementing Assessment Systems. ACGME.
Metheny WP, Espey EL, Bienstock J, et al. To the point: medical education reviews - evaluation in context: assessing learners, teachers, and training programs. Am J Obstet Gynecol 2005;192(1).
Moskal BM, Leydens JA. Scoring rubric development: validity and reliability. Practical Assessment, Research & Evaluation 2000;7(10).
Ricketts C. A plea for the proper use of criterion-referenced tests in medical assessment. Med Educ, Vol 43, Issue 12.

References
14 Rules for Writing Multiple-Choice Questions. Brigham Young University 2001 Annual Conference. Accessed at http://testing.byu.edu/info/handbooks/14%20Rules%20for%20Writing%20Multiple-Choice%20Questions.pdf
Formative vs. Summative Assessments. Classroom Assessment.
NBME 2008 Clinical Clerkship Director Survey Results.
Objective Structured Clinical Examination. Wikipedia.
Reliability and Validity. Classroom Assessment.
Carlson M. Talk about teaching: significant issues in oral examinations. Concordia College, Moorhead, MN.