Measuring Outcomes Geoffrey T. Miller Associate Director, Research and Curriculum Development Division of Prehospital and Emergency Healthcare Gordon Center for Research in Medical Education University of Miami Miller School of Medicine
Session aims Discuss the importance of outcomes evaluation and challenges to traditional assessments Discuss the importance of validity, reliability and feasibility as it relates to assessment Discuss types of assessments and their application in healthcare education
A little terminology… Assessment and evaluation are often used interchangeably However for our purposes… Assessment = learner outcomes Evaluation = course/program outcomes
Why is assessment important?
Because… assessment: “Drives learning” Allows measures of individual and programmatic progress Fundamental to outcomes- or competency-based education Assures public that providers are competent Credentialing, privileging, licensure, board certification – high stakes for practitioner and patient/society All involve assessment of competence
Formula for the effective use of simulation Training Resources × Trained Educators × Curricular Institutionalization = Effective Simulation-based Healthcare Education Issenberg, SB. The Scope of Simulation-based Healthcare Education. Simulation in Healthcare. 2006.
Formula for effective outcomes measurement Defined Outcomes × Instruments & Trained Evaluators × Appropriate Simulator = Effective Outcomes Measurement
What are some challenges to traditional methods of assessment for healthcare providers?
Challenges in traditional assessments Ethical issues: “using” real pts (substitutes) Invasive procedures (patient safety) Sensitive tasks (cultural concerns, pt modesty) Problems using cadaveric tissue models Animal welfare issues
Challenges in traditional assessments Real patients for evaluation of physical exam skills Feasibility issues for large-scale examinations Standardized, perceived fairness issues in high-stakes settings Standardized patients (SPs) improve reliability, but validity issues exist: cannot mimic many physical findings
Challenges in traditional assessments Wide range of clinical problems, including rare and critical events Availability Cost Reliability, validity, feasibility
Developing outcome measurements
“Any road will get you there, when you don’t know where you are going”
Curricula development Analysis Define expected outcomes Design Development Implementation Evaluation
Defining outcomes Learners are more likely to achieve competency and mastery of skills if the outcomes are well defined and appropriate for the level of skill training Define clear benchmarks for learners to achieve Plain goals with tangible, measurable objectives Start with the end-goal in mind and the assessment metrics, then the content will begin to develop itself
Curricula/assessment process Teaching and Learning Curricular Development Define Outcomes +/- Refinement Assessment and Evaluation
Use of assessments in healthcare simulation Information Demonstration Practice Feedback Remediation Measurement Diagnosis Rosen, MA et al. Measuring Team Performance in Simulation-Based Training: Adopting Best Practices for Healthcare. Simulation in Healthcare 3:2008;33–41.
Preparing assessments What should be assessed? Every aspect of curriculum considered essential and/or has significant designated teaching time Should be consistent with learning outcomes that are established as the competencies students should master/perform at a given phase of study
Blueprinting Global Objective: Recognize a potential terrorist incident and initiate incident operations. UM-ERT Module Obj. 2.3: Recognize and describe scene hazards and appropriate personal protective measures. Florida Objective(s): Tier 1: I (L), III (D), (F), (N), IV (J), V (A), (D), VI (B). Learning Opportunity: Lecture, Tabletop, Video, Exercise, Skill. Assessment: MCQ (Pre, Post), OSCE. Table entries (cell positions not recoverable): X; 5, 23; 6, 19, 20.
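A blueprint like the one above can be kept as a small data structure so that gaps between teaching and assessment are easy to spot. The sketch below is illustrative only: the objective IDs, the field names, and the `uncovered` helper are hypothetical, and the entries do not reproduce the actual UM-ERT blueprint.

```python
# Hypothetical blueprint: each objective maps to the learning opportunities
# that teach it and the assessments that test it. All entries are made up.
blueprint = {
    "obj_2.3": {"learning": ["Lecture", "Tabletop"], "assessment": ["MCQ", "OSCE"]},
    "obj_2.4": {"learning": ["Video"], "assessment": []},
    "obj_2.5": {"learning": [], "assessment": ["MCQ"]},
}

def uncovered(blueprint, field):
    """Return the objectives that have no entry under the given field."""
    return sorted(obj for obj, row in blueprint.items() if not row[field])

not_assessed = uncovered(blueprint, "assessment")
not_taught = uncovered(blueprint, "learning")
print("No assessment:", not_assessed)
print("No learning opportunity:", not_taught)
```

An objective that appears in either list signals a mismatch between what is taught and what is tested.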
Clinical competence and performance “Competent performance” = requires acquisition of basic knowledge, skills & attitudes Competence = Application of specific KSAs Performance = “Translation of competence into action” “Can they do it? Do they do it?”
Possible outcome competencies Patient care Medical knowledge Practice-based learning and improvement Interpersonal and communication skills Professionalism Systems-Based Practice Knowledge Skills Attitudes
Knowledge competencies Cognitive knowledge (factual) Recall Comprehension Application Analysis Synthesis Evaluation
Skill competencies Skills Knowledge Communication Physical Exam Procedures Informatics Self Learning Time Management Problem Solving Skills
Attitude competencies Knowledge Attitudes Behavior Teamwork Professionalism Key Personal Qualities Motivation Attitudes Skills
Continuous process Knowledge Attitudes Skills
Relating Miller’s pyramid of competence to learning and assessment
Miller’s Pyramid of Competence Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance, Academic Medicine, 65:9, S63-S67.
Teaching and Learning “Knows” Opportunity Reading / Independent Study Lecture Computer-based Colleagues / Peers Does Shows Knows How Knows
Assessment of “Knows” Does Shows Knows How Knows Factual Tests
The Tools of “Knows” Multiple Choice Questions (MCQs) Short Answer True / False Matching (extended) Constructed Response Questions
Example - MCQ Learning Opportunity: Information Input (Facts) FACT “Wheezes are continuous, musical, whistling sounds during difficult breathing such as in asthma, croup and other respiratory disorders.” Assessment: Factual Output (Answers) Q. Whistling sounds associated with an asthmatic patient are called? A. Rales B. Rhonchi C. Wheezes D. Vesicular ANSWER: C. Wheezes
Computer-based model [video clip] Choose the best description of the patient’s finding: A. Myoclonus B. Partial Seizure C. Tic D. Fasciculations E. Tremor
Teaching and Learning - “Knows How” Opportunity Problem-based Ex. Tabletop Exercises Direct Observation Mentors Does Shows Knows How Knows
Assessment of “Knows how” Does Shows Knows How Knows Clinical Context Based Tests
The Tools of “Knows How” Multiple-choice question Essay Short answer Oral interview
Example – Clinical Context MCQ 64-year-old man, no past medical Hx, 1 week of intermittent headache, double vision, R pupil dilated. Which of the following is most likely the patient’s problem? A. Migraine B. Myasthenia gravis C. Multiple Sclerosis D. Ischemic Stroke E. Cerebral aneurysm
Teaching and Learning - “Shows” Opportunity Skill-based Exercises Repetitive practice Small Group Role Playing Does Shows Knows How Knows
Assessment of “Shows” Performance Assessment Does Shows Knows How
The Tools of “Shows” Objective Structured Clinical Examination (OSCE) Standardized Patient-based
Variables in Clinical Assessment Examiner Student Clinical Assessment Patient Control as many variables as possible
Teaching and Learning - “Does” Shows Knows How Knows Learning Opportunity Experience
Assessment of “Does” Performance Assessment Does Shows Knows How Knows
The Tools of “Does” Undercover / Stealth / Incognito Standardized Patient-based Video Portfolio Service ratings (customer satisfaction)
Influences on clinical performance Does System related Individual related Competence Cambridge Model for delineating performance and competence Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance, Medical Education, 36:901-909.
Assessment types Choose the appropriate assessment method: Formative Summative Self Peer
Assessment Formative Assessment: Lower stakes; one of several, over time of course or program; may be evaluative, diagnostic, or prescriptive; often results in remediation or progression to next level. Summative Assessment: Higher stakes; generally final of course or program; primary purpose is performance measurement; often results in a “Go, No-Go” outcome.
Formative assessment example
Assessments - self Encourages responsibility for the learning process, fosters skills in making judgments as to whether work is of an acceptable standard – it improves performance. Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers) Students must be aware of standards required for competent performance.
Individual self-learning and assessment
Assessments - peer Enables learners to hone their ability to work with others and their professional insight Enables faculty to obtain a view of students they do not see An important part of peer assessment is for students to justify the marks they award to others Justification can also be used as a component when faculty evaluate attitudes and professionalism.
Assessments - standard setting Should be set to determine competence Enables certification to be documented, accountable and defensible Appropriately set standards for an assessment will pass those students who are truly competent Standards should be neither so low that incompetent students pass (false positives) nor so high that competent students fail (false negatives).
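The trade-off between false positives and false negatives can be made concrete with a small sketch. Everything here is an assumption for illustration: the scores are invented, the `classification_errors` helper is hypothetical, and "truly competent" is treated as known, which is rarely the case in practice.

```python
# Hypothetical data: (exam score, truly competent?) for six students.
students = [(55, False), (62, False), (68, True), (74, False), (81, True), (90, True)]

def classification_errors(students, cut_score):
    """Count false positives and false negatives at a given pass mark."""
    false_pos = sum(1 for score, competent in students
                    if score >= cut_score and not competent)
    false_neg = sum(1 for score, competent in students
                    if score < cut_score and competent)
    return false_pos, false_neg

for cut in (60, 70, 85):
    fp, fn = classification_errors(students, cut)
    print(f"cut score {cut}: {fp} false positive(s), {fn} false negative(s)")
```

With this invented data, raising the cut score reduces false positives and increases false negatives, which is the tension the standard has to balance.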
Assessments - standard setting Those responsible in setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material
Assessments - standard setting Standards should be set around a core curriculum that includes the knowledge, skills and attitudes required of all students When setting a standard the following should be considered: What is assessed must reflect the core curriculum Students should be expected to reach a high standard in the core components of the curriculum (For instance an 80-90% pass mark for the important core and 60-80% for the less important aspects.) Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part of the curriculum
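One possible way to apply separate pass marks to core and non-core components is sketched below. The specific thresholds are assumptions picked from the ranges quoted above, and the names `PASS_MARKS`, `passes`, and `may_progress` are hypothetical.

```python
# Pass marks are assumptions chosen from the quoted ranges
# (80-90% for core, 60-80% for non-core); they are not prescribed values.
PASS_MARKS = {"core": 0.80, "non_core": 0.60}

def passes(scores):
    """scores maps component -> (points earned, points available)."""
    results = {}
    for component, (earned, available) in scores.items():
        results[component] = earned / available >= PASS_MARKS[component]
    return results

def may_progress(scores):
    """Mastery of the core is required before moving to the next phase."""
    return passes(scores)["core"]

example = {"core": (42, 50), "non_core": (11, 20)}
print(passes(example))
print(may_progress(example))
```

Here the student meets the core standard (84%) but not the non-core one (55%), so the sketch allows progression while still flagging the non-core gap.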
Choosing appropriate assessment methods When choosing the assessment instrument, the following should be answered: Is it valid? Is it reliable? Is it feasible?
Assessments - validity Are we measuring what we are supposed to be measuring? Use the appropriate instrument for the knowledge, skill, or attitude you are testing The major types of validity should be considered (content, predictive, and face)
Assessments - reliability Does the test consistently measure what it is supposed to be measuring? Types of reliability: Inter-rater (consistency over raters) Test-retest (consistency over time) Internal consistency (over different items/forms)
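Internal consistency is commonly summarized with Cronbach's alpha. The slides do not name a statistic, so treat this as one standard option rather than the presenter's method. The item scores below are invented.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per examinee."""
    k = len(item_scores)
    # Total score per examinee, summed across items.
    totals = [sum(per_examinee) for per_examinee in zip(*item_scores)]
    item_var = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

# Hypothetical scores: three items, five examinees.
items = [
    [2, 3, 3, 4, 5],
    [1, 3, 4, 4, 5],
    [2, 2, 3, 5, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2))
```

Values closer to 1 suggest that the items rank examinees in a similar way. The function needs at least two items and some variation in total scores.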
Reliability as Consistency Archer 1 hits bull’s-eye every time. Archer 2 hits outer ring in same spot every time. NB: same analogy used in validity section Both archers are equally reliable.
Inter-rater Reliability Multiple judges code independently using the same criteria Reliability = raters code same observations into same classification Examples Medical record reviews Clinical skills Oral examinations
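Agreement between two raters can be quantified as raw percent agreement or as Cohen's kappa, which corrects for agreement expected by chance. The slides do not specify a statistic, so kappa is offered here as one common choice. The ratings are invented.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of observations that both raters put in the same category."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement; undefined if expected agreement is 1."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if both raters assigned categories independently.
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical pass/fail ratings of eight clinical skills stations.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]
agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 2))
```

In this invented example the raters agree on 75% of stations, but kappa is noticeably lower because much of that agreement could arise by chance.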
Factors Influencing Reliability Test length Longer tests give more reliable scores Group homogeneity The more heterogeneous the group, the higher the reliability Objectivity of scoring The more objective the scoring, the higher the reliability
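The effect of test length on reliability is often estimated with the Spearman-Brown prophecy formula. The slide does not cite it, so it is included here as a standard illustration. The starting reliability of 0.6 is an invented value.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by length_factor.

    Assumes the added items are comparable to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

current = 0.6  # hypothetical reliability of the existing test
for factor in (1, 2, 3):
    print(factor, round(spearman_brown(current, factor), 2))
```

Doubling a test with reliability 0.6 gives a predicted reliability of 0.75, and each further increase in length yields a smaller gain.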
Validity is accuracy Archer 1 hits bull’s-eye every time. Archer 2 hits outer ring in same spot every time. Both archers are equally reliable, but validity = quality of archer’s hits. The problem in educational measurement is that we can seldom “see” the target: attitudes, knowledge, and professionalism are “invisible targets”, and validity claims are based on inferences and extrapolations from indirect evidence that we have hit the target.
Reliability and Validity Reliable and Valid Reliable, not valid Not reliable, not valid
Improving reliability and validity Base assessment on: outcome/objectives → event triggers → observable behavior → behavioral rating → assess against competence Define: Low-medium-high performance Use of rubric or rating metric Use (video) training examples of performance Employ quality assurance/improvement system
Assessments - feasibility Is the administration and taking of the assessment instrument feasible in terms of time and resources The following questions should be considered: How long will it take to construct the instrument? How much time will be involved with the scoring process? Will it be relatively easy to interpret the scores and produce the results? Is it practical in terms of organization? Can quality feedback result from the instrument? Will the instrument indicate to the students the important elements within the course? Will the assessment have a beneficial effect in terms of student motivation, good study habits and positive career aspirations?
Practicality Number of students to be assessed Time available for the assessment Number of staff available Resources/equipment available Special accommodations
Assessment instruments
Assessments - instruments Be aware of the types of assessment instruments available as well as the advantages and disadvantages of each It is important, if feasible, to use more than one assessment instrument and more than one assessor when looking at skills and attitudes
Assessments – knowledge instruments Objective tests (short answer, true/false, matching, multiple choice) Objective Structured Clinical Examinations (OSCEs) Constructed response questions Rating scales (used on clerkships)
Assessments – skill instruments Objective tests (Simulation based) OSCEs Constructed response questions Critical reading papers (interpreting literature) Checklists Rating Scales Portfolios (self-evaluation, time management)
Assessments – attitude instruments Portfolios Essays / Modified essay questions OSCEs Checklists Rating scales Patient management problems Short/long case assessments
Assessment Metrics Procedural or Check List assessment Global Rating assessment
Assessment Metrics Procedural or Check List assessment. Version 1: BCLS (Y / N): Open Airway; Check Breathing. Version 2: BCLS (Y / N): Open Airway (< 5 sec of LOC); Check Breathing (< 5 sec of Airway). Version 3: BCLS (Y / N / A*), Rating Score +1 / -1: Open Airway; Check Breathing. *Assist
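A checklist of this kind can be scored mechanically once a scoring rule is fixed. The slide shows a +1 / -1 rating score but does not say how each mark is scored, so the rule below (Y = +1, N = -1, A for assist = 0) is purely an assumption. `POINTS` and `score_checklist` are hypothetical names.

```python
# Assumed scoring rule (not stated on the slide): Y = +1, N = -1, A (assist) = 0.
POINTS = {"Y": 1, "N": -1, "A": 0}

def score_checklist(observations):
    """observations maps checklist item -> 'Y', 'N', or 'A'."""
    return sum(POINTS[mark] for mark in observations.values())

# Hypothetical observation of one learner on the two BCLS items.
observed = {"Open Airway": "Y", "Check Breathing": "A"}
total = score_checklist(observed)
print(total)
```

Whatever rule is chosen, writing it down before the assessment keeps the checklist objective across evaluators.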
Assessment Metrics Global Rating assessment. Version 1: Code Blue (P / F): CPR and ACLS. Version 2: Code Blue (Pts.): CPR <1 (low) - 5 (Hi)> points; ACLS <1 (low) - 5 (Hi)> points. Version 3: Code Blue (H / M / L), Rating Score +1 / -1: CPR; ACLS.
Review Assessment drives learning Clearly define the desired outcome, ensure that it can be measured Consider the “threats” to the effectiveness of the measurement Feedback to individual candidates Feedback to training programs
Questions and discussion