Download presentation
Published byHortense Fowler Modified over 9 years ago
1
Evaluating Outcomes Marshall (Mark) Smith, MD, PhD
Director, Simulation and Innovation Banner Health
2
Acknowledgement and Appreciation
Lecture today is adapted from presentation given at last Laerdal SUN meeting by Geoffrey T. Miller, and some of the slides today are used/modified from that presentation. Geoffrey T. Miller Associate Director, Research and Curriculum Development Division of Pehospital and Emergency Healthcare Gordon Center for Research in Medical Education University of Miami Miller School of Medicine
3
The wisdom of a group is greater than that of a few experts…
4
Session Aims Discuss the importance and use of outcomes evaluation and challenges to traditional assessments Discuss some learning models that facilitate developing assessments Discuss the importance of validity, reliability and feasibility as it relates to assessment Discuss types of assessments and their application in healthcare education
5
Evaluation Systematic determination of merit, worth, and significance of something or someone using criteria against a set of standards. Wikipedia, 2009
6
Assessment Educational Assessment is the process of documenting, usually in measurable terms, knowledge, skills, attitudes and beliefs. Assessment can focus on the individual learner,…… Wikipedia, 2009
7
Assessment vs Evaluation
Assessment is about the progress and achievements of the individual learners Evaluation is about the learning program as a whole Tovey, 1997
8
So why measure anything?
Why not just teach?
9
Measurement What is measured, improves
You Can't Manage What You Don't Measure You tend to improve what you measure For years I have seen instructors that want to “teach” what they think is important, but without any concept of what is really “learned”!
10
Measurement Promotes learning
Allows evaluation of individuals and learning programs Basis of outcomes- or competency-based education Documentation of competencies
11
Measurements in Future
Credentialing Privileging Licensure Board certification High stake assessments for practitioners All involve assessment of competence
12
What are the challenges today of traditional methods of measurement/assessment for healthcare providers?
13
Challenges in traditional assessments
Using actual (sick) patients for evaluation of skills Cannot predict nor schedule clinical training events Compromise of quality of patient care, safety Privacy concerns Patient modesty Cultural issues Prolongation of care (longer procedures, etc)
14
Challenges in traditional assessments
Challenges with other models Cadaveric tissue models Animal models / labs
15
Challenges in traditional assessments
Feasibility issues for large-scale examinations Standardized, perceived fairness issues in high-stakes settings Standardized patients (SPs) improve reliability, but validity issues exist: cannot mimic many physical findings
16
Challenges in traditional assessments
Wide range of clinical problems, including rare and critical events Availability Financial cost Adequate resources Reliability, validity, feasibility
17
Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results 1994 reaction
18
Kirkpatrick's Four Levels of Evaluation
1. Reaction Measures only one thing – learners perception Not indicative of any skills, performance Success is critical to success of program Relevance to learner important 1994 reaction
19
Kirkpatrick's Four Levels of Evaluation
2. Learning This is where learner changes Requires pre and post testing Evaluation at this step is through learner assessment First level to measure change in learner! 1994 reaction
20
Kirkpatrick's Four Levels of Evaluation
3. Performance (Behavior) Action that is performed Consequence of behavior is performance Traditionally involves measurement in the workplace Transfer of learning from classroom to work environment 1994 reaction
21
Kirkpatrick's Four Levels of Evaluation
4. Results Clinical and quality outcomes Difficult to measure in healthcare Perhaps better in team training Often ROI that management wants 1994 reaction
22
Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results Increasing complexity Increasing difficulty to measure, time consuming Increasing value! First three measures soft Sometimes management wants soft data…for retention Second level really measures KSA
23
Kirkpatrick's Four Levels of Evaluation
Reaction Learning Performance Results 1994 reaction
24
How do we start to develop outcome measurements
26
Learning Outcomes Goals Course Objectives Projected Outcomes
Expected Outcomes
27
Development of Curricula
Analysis Clearly define and clarify desired outcomes* Design Development Implementation Evaluation ADDIE
28
Defining Assessments Outcomes are general, objectives are specific and support outcomes If objectives are clearly defined and written, questions and assessments nearly write themselves
29
Defining Outcomes Learners are more likely to achieve competency and mastery of skills if the outcomes are well defined and appropriate for the level of skill training Define clear benchmarks for learners to achieve Clear goals with tangible, measurable objectives Start with the end-goal in mind and the assessment metrics, then the content will begin to develop itself
30
Role of Assessment in Curricula Design
Course Teaching and learning Assessment and evaluation Refine Learner and Course Outcomes Modify curricula/assessments Assess learners
31
Use of assessments in healthcare simulation
Information Demonstration Practice Feedback Remediation Measurement Diagnosis Rosen, MA et al. Measuring Team Performance in Simulation-Based Training: Adopting Best Practices for Healthcare. Simulation in Healthcare 3:2008;33–41.
32
Preparing assessments
What should be assessed? Any part of curriculum considered essential and/or has significant designated teaching time Should be consistent with learning outcomes that are established as the competencies for learners Consider weighted assessments
33
Clinical competence and performance
Competent performance requires acquisition of basic knowledge, skills & attitudes Competence = Application of specific KSAs Performance = Translation of competence into action
34
Three Types of Learning (Learning Domains)
Bloom's Taxonomy Cognitive: mental skills (Knowledge) Psychomotor: manual or physical skills (Skills) Affective: growth in feelings or emotional areas (Attitude) Committee of colleges in 1956, led by Benjamin bloom
35
Three Types of Learning
Bloom's Taxonomy Cognitive = Knowledge K Psychomotor = Skills S Affective = Attitude A
36
Bloom’s Taxonomy – Knowledge
37
Bloom’s Taxonomy – Knowledge
38
Bloom’s Taxonomy – Knowledge
39
Bloom’s Taxonomy – Knowledge
40
Bloom’s Taxonomy – Knowledge
41
The Anti – Blooms…
42
Bloom’s Taxonomy – Skills
Bloom’s committee did not propose a compilation of the psychomotor domain model, but others have since.
43
Bloom’s Taxonomy – Attitude
Five Major Categories Receiving phenomena Responding to Phenomena Valuing Organization Internalizing values
44
Knowledge Competencies
Cognitive knowledge (factual) Recall Comprehension Application Analysis Synthesis Evaluation
45
Skill competencies Skills Knowledge Communication Physical Exam
Procedures Informatics Self Learning Time Management Problem Solving Skills
46
XAttitude competencies
Knowledge Attitudes Behavior Teamwork Professionalism Key Personal Qualities Motivation Attitudes Skills
47
Continuous process Knowledge Attitudes Skills
48
Possible Outcome Competencies (GME Based)
Patient care Medical knowledge Practice-based learning and improvement Interpersonal and communication skills Professionalism Systems-Based Practice Knowledge Skills Attitudes
49
So we know what we want to measure, but how do we do that?
51
Miller’s Pyramid of Competence
Moving Up the Pyramid: Assessing Performance in the Clinic More that 18 years ago, George Miller introduced a framework for the assessment of medical students and residents, “Miller’s Pyramid”1 (Figure 1). In the accompanying address to the Association of American Medical Colleges, he advocated the evaluation of learners for their skills and abilities in the 2 top cells of the pyramid, in the domains of action, or performance, reflecting clinical reality. Miller argued that the demonstration of competence in these higher domains strongly implies that a student has already acquired the prerequisite knowledge, or Knows, and the ability to apply that knowledge, or Knows How, that make up the base of the pyramid. Basic clinical skills (Shows How) are those that can be measured in an examination situation such as an objective structured clinical examination (OSCE). However, the professionalism and motivation required to continuously apply these in the real setting (Does) must be observed during actual patient care. View larger version: In this page In a new window Figure 1. Miller’s pyramid1. From Academic Medicine 1990;65:S63–7; with permission from Wolters Kluwer Health. The component that Miller argued is the most vital aspect of measurement, what the learner does in clinical practice, has been the most difficult to capture. Almost 2 decades later, we are still struggling with the need to develop reliable and valid methods of assessing learners in the clinical setting. In the meantime, there have been many advances in the lower echelons of the pyramid. In the domains of Know and Know How, the Medical Council of Canada2 and the National Board of Medical Examiners3 have made great strides in the art of the multiple choice examination, the Key Feature examination, and computer-based, adaptive examination. In a pair of landmark publications, Tamblyn, et al have provided good evidence of the predictive validity of these assessments to outcomes in clinical practice4,5. The OSCE examination has become so ubiquitous that it has been claimed to define the expectations of practice itself6. In the name of reliability, the standardized patient has overtaken the real patient for the purposes of certifying examinations7. However, as outlined in the article by Susan Humphrey-Murto, et al in this issue of The Journal, the quality of assessment in the clinical setting lags far behind8. They state that the most frequently used instrument, the Intraining Evaluation Report (ITER), is completed by the resident or clerkship director, who may have had little personal experience of the learner, and at a time removed from many of those observations. This leads to a migration towards the center of the ubiquitous Likert scale, as the director is reluctant to label the student as being either exceptional or substandard in any specific item. Given that these forms are usually completed at the end of the rotation, and lack information clearly anchored to the performance of the learner, they have little value as a formative, or feedback, instrument that might contribute to the student’s education. Norcini, J. J BMJ 2003;326: Copyright ©2003 BMJ Publishing Group Ltd.
52
Miller’s Pyramid Top two cells of the pyramid, in the domains of action, or performance, reflect clinical reality The professionalism and motivation required to continuously apply these in the real setting must be observed during actual patient care.
53
Miller’s Pyramid Top two levels most difficult to measure
Quality of assessment in the clinical setting lags far behind In training Evaluation Reports and Likert Scale Little value as formative, or feedback, instrument that might contribute to the learner’s education
54
Miller’s Pyramid of competence for learning and assessment
55
Miller’s Pyramid of Competence
Does Shows Knows How Knows Behavior Cognition Miller GE. The Assessment of Clinical Skills / Competence / Performance, Academic Medicine, 65:9, S63-S67.
56
Teaching and Learning “Knows”
Opportunity Reading / Independent Study Lecture Computer-based Colleagues / Peers Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance, Academic Medicine, 65:9, S63-S67.
57
Assessment of “Knows” Factual Tests Does Shows Knows How Knows
Miller GE. The Assessment of Clinical Skills / Competence / Performance
58
Assessment Tools for “Knows”
Multiple Choice Questions (MCQs) Short Answer True / False Matching (extended) Constructed Response Questions
59
Teaching and Learning - “Knows How”
Opportunity Problem-based Ex. Tabletop Exercises Direct Observation Mentors Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance
60
Assessment of “Knows how”
Does Shows Knows How Knows Clinical Context Based Tests Miller GE. The Assessment of Clinical Skills / Competence / Performance
61
The Tools of “Knows How”
Multiple-choice question Essay Short answer Oral interview
62
Teaching and Learning - “Shows”
Opportunity Skill-based Exercises Repetitive practice Small Group Role Playing Does Shows Knows How Knows Miller GE. The Assessment of Clinical Skills / Competence / Performance
63
Assessment of “Shows” Performance Assessment Does Shows Knows How
Miller GE. The Assessment of Clinical Skills / Competence / Performance
64
The Tools of “Shows” Objective Structured Clinical Examination (OSCE)
Standardized Patient-based
65
Variables in Clinical Assessment
Examiner Student Clinical Assessment Patient CONTROL VARIABLES as much as possible…
66
Teaching and Learning - “Does”
Shows Knows How Knows Learning Opportunity Experience Miller GE. The Assessment of Clinical Skills / Competence / Performance
67
Assessment of “Does” Performance Assessment Does Shows Knows How Knows
Miller GE. The Assessment of Clinical Skills / Competence / Performance
68
The Tools of “Does” Undercover / Stealth / Incognito Standardized Patient-based Videos of performance Portfolio of learner Patient Satisfaction Surveys
69
Influences on clinical performance
Does System related Individual related Competence Cambridge Model for delineating performance and competence Rethans JJ, et al. The relationship between competence and performance: implications for assessing practice performance, Medical Education, 36:
70
Reliability and Validation
72
Assessments - Reliability
Does the test consistently measure what it is supposed to be measuring? Types of reliability: Inter-rater (consistency over raters) Test-retest (consistency over time) Internal consistency (over different items/forms)
73
Inter-rater Reliability
Multiple judges code independently using the same criteria Reliability = raters code same observations into same classification Examples Medical record reviews Clinical skills Oral examinations
74
Factors Influencing Reliability
Test length Longer tests give more reliable scores Group homogeneity The more heterogeneous the group, the higher the reliability Objectivity of scoring The more objective the scoring, the higher the reliability
75
Assessments - Validity
Are we measuring what we are supposed to be measuring Use the appropriate instrument for the knowledge, skill, or attitude you are testing The major types of validity should be considered Face Content Construct
76
Validity is accuracy Both archers are equally reliable
Archer 1 hits bulls eye every time Archer 2 hits outer ring in same spot every time The problem in educational measurement is that we can seldom “see” the target, e.g., attitudes, knowledge, professionalism as targets are “invisible targets” and validity claims are based on inferences and extrapolations from indirect evidence that we have hit the target. Both archers are equally reliable Validity = quality of archer’s hits
77
Reliability and Validity
Reliable and Valid Reliable, not valid Not reliable, not valid
78
Improving reliability and validity
Base assessment on outcome/objectives- event triggers- observable behavior- behavioral rating-assess Define: Low-medium-high performance Use of rubric or rating metric Use (video) training examples of performance Employ quality assurance/improvement system
79
Assessments types Choose the appropriate assessment method: Formative
Summative Self Peer
80
Assessment Formative Assessment Summative Assessment Lower stakes
One of several, over time of course or program May be evaluative, diagnostic, or prescriptive Often results in remediation or progression to next level Summative Assessment Higher stakes Generally final of course or program Primary purpose is performance measurement Often results in a “Go, No-Go” outcome
81
Assessments - self Encourages responsibility for the learning process, fosters skills in making judgments as to whether work is of an acceptable standard – it improves performance. Most forms of assessment can be adapted to a self-assessment format (MCQs, OSCEs, and short answers) Students must be aware of standards required for competent performance.
82
Assessments - peer Enables learners to hone their skills in their ability to work with others and professional insight Enables faculty to obtain a view of students they do not see An important part of peer assessment is for students to justify the marks they award to others Justification can also be used as a component when faculty evaluates attitudes and professionalism.
83
Assessments – Setting Standards
Should be set to determine competence Enables certification to be documented, accountable and defensible Appropriately set standards for an assessment will pass those students who are truly competent Standards should not be two low (false positives) to pass those who are incompetent, nor too high (false negative) to fail those who are competent.
84
Assessments – Setting Standards
Those responsible in setting standards must also have a direct role in teaching students at the level being examined and assist in providing examination material
85
Assessments – Setting Standards
Standards should be set around a core curriculum that includes the knowledge, skills and attitudes required of all students When setting a standard the following should be considered: What is assessed must reflect the core curriculum Students should be expected to reach a high standard in the core components of the curriculum (For instance an 80-90% pass mark of for the important core and 60-80% for the less important aspects.) Students should be required to demonstrate mastery of the core in one phase of the curriculum before moving on to the next part of the curriculum
86
Assessments - Feasibility
Is the administration and taking of the assessment instrument feasible in terms of time and resources The following questions should be considered: How long will it take to construct the instrument? How much time will be involved with the scoring process? Will it be relatively easy to interpret the scores and produce the results? Is it practical in terms of organization? Can quality feedback result from the instrument? Will the instrument indicate to the students the important elements within the course? Will the assessment have a beneficial effect in terms of student motivation, good study habits and positive career aspirations?
87
Practicality Number of students to be assessed
Time available for the assessment Number of staff available Resources/equipment available Special accommodations
88
Assessment Instruments
89
Assessments - Instruments
Be aware of the types of assessment instruments available as well as the advantages and disadvantages of each Use more than one assessment instrument and more than one assessor if possible when looking at skills and attitudes
90
Choosing appropriate assessment methods
When choosing the assessment instrument, the following should be answered: Is it valid Is it reliable Is it feasible
91
Assessments – Knowledge Instruments
Objective tests (short answer, true/false, matching, multiple choice) Objective Structured Clinical Evaluations (OSCEs) Constructed response questions Rating scales (used on clerkships)
92
Assessments – Skill Instruments
Objective tests (Simulation based) OSCEs Constructed response questions Critical reading papers (interpreting literature) Checklists Rating Scales Portfolios (self-evaluation, time management)
93
Weighted Checklists List of items to measure Set of weights of each
Summary score
94
Assessments – Attitude Instruments
Portfolios Essays / Modified essay questions OSCEs Checklists Rating scales Patient management problems Short/long case assessments
95
Assessment Metrics Procedural or Check List assessment
Global Rating assessment
96
Assessment Metrics Procedural or Check List assessment BCLS Y N
Open Airway Check Breathing BCLS Y N Open Airway (< 5 sec of LOC) Check Breathing (< 5 sec of Airway) BCLS Y N Open Airway Check Breathing A Rating Score +1 -1 *Assist
97
Assessment Metrics Global Rating assessment Code Blue P F CPR and ACLS
CPR <1(low) - 5(Hi)> points ACLS <1(low)- 5(Hi)> points Pts. Code Blue H M CPR ACLS L Rating Score +1 -1
98
Review Assessment drives learning
Clearly define the desired outcome, ensure that it can be measured Consider the effectiveness of the measurement Feedback to individual candidates Feedback to training programs
99
Questions and discussion
100
Good Luck
101
Competency Competency is noted when a learner is observed performing a task or function that has been established as a standard by the profession. The achievement of professional competency requires the articulation of learning objectives as observable, measurable outcomes for a specific level of learner performance. Such specific detailing of performance expectations defines educational competencies. They are verified on the basis of evidence documenting learner achievement, and must be clearly communicated to learners, faculty, and institutional leaders prior to assessment. ____________ Identified by members of work group on competency-based women’s health education at APGO Interdisciplinary Women’s Health Education Conference in September, 1996, Chantilly, VA.
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.