Capacity and Fidelity Assessments: Advancements in Tools for Schools


1 Capacity and Fidelity Assessments: Advancements in Tools for Schools
Christine Russell, Ed.D., Evaluation and Research Specialist for MiBLSi; Caryn Sabourin Ward, Ph.D., Senior Implementation Specialist/Scientist

2 Learning Objective
Understand innovative practices applied to the validation of:
- Capacity Assessments: Regional Capacity Assessment (RCA), District Capacity Assessment (DCA)
- Fidelity Assessment: Reading Tiered Fidelity Inventory (R-TFI)

3 Regional Capacity Assessment (RCA)
Used by Regional Education Agencies at least 2 times a year. An assessment of a regional education agency's systems, activities, and resources necessary for the regional education agency to successfully support district-level implementation of Effective Innovations.

District Capacity Assessment (DCA)
Used by District Implementation Teams at least 2 times a year. An assessment of the district's systems, activities, and resources necessary for schools to successfully adopt and sustain Effective Innovations.

Reading Tiered Fidelity Inventory (R-TFI), Elementary and Secondary Versions
A fidelity assessment used by School Implementation Teams at least annually. Assesses the implementation of a School-Wide Reading Model encompassing (1) evidence-based practices focused on the Big Ideas of Reading, (2) systems to address the continuum of reading needs across the student body, and (3) data use and analysis.

4 Regional Capacity Assessment (RCA) District Capacity Assessment (DCA)

5 Reading Tiered Fidelity Inventory (R-TFI)
Elementary Version

6 Administration Process
- Self-assessment completed by a team with a trained external administrator and local facilitator
- 1-2 hours in length
- Consensus scoring

7 Example Item and Rubric

8 Consider
How have you experienced enhanced action planning through the use of any of the following NIRN Capacity Measures?
- State Capacity Assessment (SCA)
- Regional Capacity Assessment (RCA)
- District Capacity Assessment (DCA)
- Drivers Best Practices Assessment
How have you experienced enhanced action planning through the use of fidelity assessments?

9 Focus on Content Validation
- Confidence that the assessment correlates with positive outcomes
- The degree to which the underlying construct is measured
- Accuracy and meaning of the assessment results
Without validation, a self-assessment is more of a checklist or support tool than a measure we can be confident in.

10 Validation Process
Classic Model: Content, Criterion, and Construct validity
Modern Approach: validity evidence based on Test Content, Response Process, Internal Structure, Relationship to Other Variables, and Consequence of Testing

11 Example Methodologies
Sources of validity, descriptions, and example methodologies:
- Test Content: instrument characteristics such as themes, wording, and format of items, tasks, questions, instructions, guidelines, and procedures for administration and scoring. Example methodologies: basis for items/literature review; qualifications of authors and reviewers; item writing process; review by panel of experts; vetting and editing process.
- Response Process: fit between the items and the process engaged in by those using the assessment. Example methodology: Think Aloud protocols.
- Internal Structure: considers the relationships among items and test components compared to test constructs. Example methodologies: factor and Rasch analysis.
- Relationship to Other Variables: relationship of test scores to variables external to the test; relationship between a test score and an outcome. Example methodologies: predictive, concurrent, convergent, and divergent evidence.
- Consequence of Testing: intended and unintended consequences of test use. Example methodology: purpose, use, and outcomes of test administration, including arguments for and against.
Rationale for focusing on test content: Test content represents the extent to which the items adequately sample the construct (Gable and Wolf, 1994). Gathering evidence of test content establishes the appropriateness of the conceptual framework and how well the items represent the construct (Sireci & Faulkner-Bond, 2014).

12 Construct Definition, Item Generation
Phase 1: Construct Definition, Item Generation
Phase 2: Test Content Validation (Survey Protocol)
Phase 3: Response Process Validity (Think Aloud Protocol)
Phase 4: Usability and Refinement

13 Construct Definition, Item Construction
Experts utilized:
- Previous iterations of similar assessments
- Feedback from administrators and practitioners who had experience with similar assessments
- RCA/DCA: advancements within implementation science and systems change
- R-TFI: advancements within the field of school-wide reading practices

14 Content Validity Survey Elements Suggested by Haynes et al., 1995
- Array of items selected (questions, codes, measures)
- Precision of wording or definition of individual items
- Item response format (e.g., scale)
- Sequence of items or stimuli
- Instructions to participants
- Temporal parameters of responses (interval of interest; timed vs. untimed)
- Situations sampled
- Behavior or events sampled
- Components of an aggregate, factor, response class
- Method and standardization of administration
- Scoring, data reduction, item weighting
- Definition of domain and construct
- Method-mode match
- Function-instrument match

15 4-Part Content Validation Survey Protocol
Section #1: Consent and Edits
- Consent form and opt in/out of listing as a contributor
- Downloadable Word version of the assessment
- Upload the assessment with edits, suggestions, and questions provided through track changes
Section #2: Item Analysis
- Attainability and importance of each item rated on a 3-point scale
- Opportunity to select the 5 most critical items
Section #3: Construct
- Comprehensiveness and clarity of each construct definition rated on a 3-point scale
- Open-ended comments on construct definitions
- Best fit for each item with an Implementation Driver or area of a School-Wide Reading Model
Section #4: Sequencing, Frequency and Format
- Suggestions for reordering items
- Suggestions for frequency of administration
- Comprehensiveness and clarity of each section rated on a 3-point scale
- Open-ended comments on sections of the assessment
If the participant had previous experience administering a similar assessment, they were asked:
- Whether the current version is an improvement over previous version(s)
- What benefits they have experienced using similar assessments in the past

16 Survey Participants
- RCA: 23 total (researchers/national technical assistance providers: 4; state/regional technical assistance providers: 15)
- DCA: 34 total (researchers/national technical assistance providers: 19; state/regional technical assistance providers: 11)
- R-TFI Elementary: 10 total (researchers/national technical assistance providers: 6)
The number of participants suggested for a content validation survey varies from 2-20 (Gable & Wolf, 1993; Grant & Davis, 1997; Lynn, 1986; Tilden, Nelson, & May, 1990; Waltz, Strickland, & Lenz, 1991).

17 Minutes Spent Completing Survey
Average minutes (range) per survey section:
- RCA: Survey #1 (Consent and Edits) 162 (70-300); Survey #2 (Item Analysis) 23; Survey #3 (Construct) 27 (6-50); Survey #4 (Sequencing, Frequency and Format) 18 (5-45); Total 230
- DCA: Survey #1 89 (23-200); Survey #2 26; Survey #3 range 5-75; Survey #4 20 (6-60); Total 157
- R-TFI Elementary: Survey #1 99 (5-180); Survey #2 12 (5-20); Survey #3 19; Survey #4 9 (5-25); Total 135

18 Content Validation
- Did we improve the assessment compared to comparable or previous assessments?
- Are the definitions of the constructs clear and useable?
- How frequently should the assessment be administered?
- Are the sections of the assessment comprehensive and clear?
Item Analysis
- Does the item fit the content domain?
- How relevant/important is the item for the domain?
- What edits are needed to the item and rubric?

19 DCA Content Validation Results – Improvements Compared to Other Measures
76.5% (n=16) of respondents had previously completed a similar assessment
Level of improvement = 8 (on a scale of 0-10)
Respondents described the DCA as:
- Streamlined
- Shorter, with more concise items
- Improved due to the use of a rubric

20 DCA Content Validation Results – Construct Definitions
Decision rule: An average rating below 2.5 for comprehensiveness or clarity means the definition will be revised based on comments.
Results for the Capacity, Competency, Leadership, and Organization construct definitions:
- Capacity met both thresholds; no revisions.
- Leadership did not meet the clarity threshold; the definition was rewritten for increased clarity.

21 DCA Content Validation Results – Frequency of Assessment
Decision rule: If more than 70% of respondents suggest one option for frequency, use that recommendation as the suggested frequency.
Results: The criterion was not met; the majority response was used, resulting in a suggested frequency of twice annually. Comments noted that at later stages of implementation, less frequent assessment may be appropriate.

22 DCA Content Validation Results – Comprehensive and Clear Sections
Decision rule: An average rating below 2.5 for comprehensiveness or clarity means the section will be revised based on comments.
Results: All sections met the threshold for both comprehensiveness and clarity. Sections were still revised based on feedback within the track-changes documents to increase ease of use and make needed corrections.

23 DCA Content Validation Results – Item Analysis
Decision rules:
- Importance: items with a Content Validity Index (CVI) below 2.5 are eliminated or substantially changed; for items at or above 2.5, the decision to accept an edit or address a comment/question is based on whether the suggestion enhances the clarity of the item.
- Number of times rated among the top 5 most important: used to further validate the CVI rating.
- Attainability: items with a CVI of less than 1.5 trigger an action plan to create resources that assist teams with action planning and attaining the item.
CVI (Content Validity Index) = the average of the ratings.
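To make these decision rules concrete, here is a minimal sketch in Python (not the authors' analysis code) that computes a CVI as the average of panel ratings and applies the importance and attainability thresholds described above; the function names, panel size, and ratings are invented for illustration.

```python
# Hypothetical sketch of the CVI decision rules described above. Ratings are on
# the survey's 3-point scale; the panel size and values are invented.

def cvi(ratings):
    """Content Validity Index: the average of the panel's ratings."""
    return sum(ratings) / len(ratings)

def importance_decision(importance_ratings):
    """Below a CVI of 2.5, eliminate or substantially change the item;
    at or above 2.5, retain it and weigh suggested edits for clarity."""
    score = cvi(importance_ratings)
    if score < 2.5:
        return score, "eliminate or substantially change the item"
    return score, "retain; accept edits that enhance clarity"

def attainability_decision(attainability_ratings):
    """Below a CVI of 1.5, plan resources to help teams attain the item."""
    score = cvi(attainability_ratings)
    if score < 1.5:
        return score, "develop action-planning resources"
    return score, "no additional resources flagged"

# Invented ratings for a single item from a 17-person panel.
importance = [3, 3, 2, 3, 2, 3, 3, 2, 3, 3, 3, 2, 3, 3, 2, 3, 3]
attainability = [1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1]

print(importance_decision(importance))        # CVI of about 2.71 -> retain
print(attainability_decision(attainability))  # CVI of about 1.29 -> develop resources
```

In practice the same calculation would be run for every item, with the top-5 "most important" counts used as a secondary check on the CVI ratings, as the decision rules above indicate.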

24 DCA Content Validation Results – Item Analysis
Importance: 1 item below the 2.5 threshold
Attainability: 11 items below the 2.5 threshold
Item revisions:
- Combined 2 items
- Deleted 1 item
- Edits were made within the rubric for each item based on suggestions

25 DCA Content Validation Results – Item Analysis

26 Consider
Reflect on the low scores on attainability for items related to the Competency Driver. How does this match with what you see in your work with teams and schools? How does this finding relate to the work that you are doing to develop supports and resources for teams to sustain their work?

27 DCA Content Validation Results – Item Match with Constructs
DCA Decision Rules and Results
Decision rules:
- If over 70% of respondents align an item with a construct, the item will be housed within that construct.
- If less than 70% of respondents align an item with one clear construct, the authors will use the results, comments, and personal knowledge of the constructs to map the item to a construct.
Results:
- Met the 70% criterion: 3 items
- 50%-70% aligned with one construct: 20 items
- Below 50%: 3 items
Decision: Use author knowledge along with comments to map items to constructs.
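As a rough illustration of the alignment rule, the hedged sketch below (Python; construct names reused from the DCA, vote counts invented) computes the share of respondents who agree on a construct for one item and applies the 70% threshold.

```python
# Hypothetical sketch of the item-to-construct alignment rule described above.
# Each respondent picks the construct an item fits best; the level of agreement
# determines whether the item is housed automatically or mapped by the authors.
# Construct names come from the DCA; the votes themselves are invented.
from collections import Counter

def alignment_decision(votes):
    """Apply the 70% rule: over 70% agreement houses the item in that
    construct; otherwise the authors map it using results, comments,
    and their knowledge of the constructs."""
    construct, n = Counter(votes).most_common(1)[0]
    share = n / len(votes)
    if share > 0.70:
        return construct, share, "house item within this construct"
    return construct, share, "authors map item to a construct"

# Invented votes from 21 respondents for one DCA item (illustrative only).
votes = ["Leadership"] * 12 + ["Organization"] * 6 + ["Competency"] * 3
construct, share, decision = alignment_decision(votes)
print(f"{construct}: {share:.0%} agreement -> {decision}")
# Leadership: 57% agreement -> authors map item to a construct
```

Items in the 50%-70% and below-50% bands would all route to the authors for mapping, matching the decision reported above.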

28 DCA Content Validation Results – Sequencing of Items
Decision rule: If more than 50% of respondents suggest moving an item, consider a revised location/order for the item.
Results: 77% offered no reordering suggestions. Minor reordering occurred based on comments and due to assessment edits.

29 Think Aloud Procedure
"We are going to ask you to read portions of the document aloud. The purpose of reading aloud is to ensure clarity and ease of reading the measure. This process will allow us to capture any areas where wording needs to be adjusted. As you read, please verbalize any thoughts, reactions, or questions that are running through your mind. Please act and talk as if you are talking to yourself and be completely natural and honest about your rating process and reactions. Also, feel free to take as long as needed to adequately verbalize."
Excerpt from the Think Aloud protocol. Participants: RCA 4, DCA 4, R-TFI 3.

30 Changes to R-TFI from Think Aloud
- Reworded the introduction and directions
- Added before-, during-, and after-administration sections
- Changed the wording from "Subscales" to "Tiers"
- Changed the sequence of items in the Tier 2 section
- Included additional words in the glossary
- Found some issues with consistency of terms
- Identified items that are not applicable to school decision making and instead should be asked at a district level

31 Usability Testing
A type of improvement cycle based on the PDSA Cycle (Deming):
- Plan: What did you intend to do?
- Do: Did you do it?
- Study: What happened?
- Act: What can be changed and improved?

32 Usability Testing
A planned series of tests to refine and improve the assessment and the administration processes. Used proactively to test the feasibility and impact of a new way of work prior to rolling out the assessment or administration processes more broadly. More is learned from 4 cycles with 5 participants each than from 1 pilot test with 20 participants.
The idea is to use the PDSA process with small groups of 4 or 5 administrations:
- Plan: Plan to use the Capacity Assessment
- Do: Engage in training and conducting the assessments
- Study: Debrief as a team and identify successes, changes, and improvements needed in the process of utilizing the tool (training, introducing it to respondents, conducting the assessments, using the results, etc.)
- Act: Apply those changes to the next set of users
Repeat the process (PDSA) with the next set of administrations 4 or 5 times.
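The following sketch is one hypothetical way to keep track of PDSA usability cycles in code; the PDSACycle structure, the area list, and the improvement counts are assumptions for illustration, not part of the RCA/DCA toolkit.

```python
# Hypothetical sketch of tracking PDSA usability-testing cycles. Each cycle
# records how many improvements were logged per area; cycles repeat until the
# goal (here: no remaining improvements needed) is met. All data are invented.
from dataclasses import dataclass, field

AREAS = [
    "Communication & Preparation",
    "Administration Protocol",
    "Items & Scoring Rubric",
    "Participant Response",
    "Training Implications",
]

@dataclass
class PDSACycle:
    cohort: str
    administrations: int
    improvements: dict = field(default_factory=dict)  # area -> changes to Act on

def goals_met(cycle: PDSACycle) -> bool:
    """Goal in this sketch: no further improvements identified in any area."""
    return all(cycle.improvements.get(area, 0) == 0 for area in AREAS)

# Invented cohorts in which the number of needed improvements shrinks each cycle.
cycles = [
    PDSACycle("Cohort 1", 4, dict(zip(AREAS, [5, 4, 6, 2, 3]))),
    PDSACycle("Cohort 2", 5, dict(zip(AREAS, [2, 2, 3, 1, 2]))),
    PDSACycle("Cohort 3", 6, dict(zip(AREAS, [1, 0, 1, 0, 1]))),
    PDSACycle("Cohort 4", 6, dict(zip(AREAS, [0, 0, 0, 0, 0]))),
]

for c in cycles:
    total = sum(c.improvements.values())
    print(f"{c.cohort}: {c.administrations} administrations, "
          f"{total} improvements identified, goals met: {goals_met(c)}")
```

A tracker like this makes it easy to see whether the number of improvements is trending toward zero across cohorts, which mirrors the pattern reported for the RCA later in this deck.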

33 Usability Testing Assessed 5 areas with respective goals:
- Communication & Preparation
- Administration Protocol
- Items & Scoring Rubric
- Participant Response
- Training Implications
Data were collected via survey from administrators and facilitators for each administration in each cohort.

34 Usability Testing: RCA
- Cohort 1: N = 4 administrations across 2 states
- Cohort 2: N = 5 administrations across 3 states
- Cohort 3: N = 6 administrations across 3 states
- Cohort 4: N = 6 administrations across 4 states

35 Results of Usability Testing for the RCA
The number of improvements in each of the five areas decreased over the cycles, and all goals were met.
- Communication & Preparation: more guidance developed around team composition and respondents
- Administration Protocol: 100% on the fidelity protocol and rating of importance (4 or higher)
- Items & Scoring Rubric: minor wording changes to items; sequencing of items was reviewed but not changed
- Training Implications: facilitation skills identified; prioritization of areas for action planning
- Participant Response: engaged and positive

36 Consider What barriers and facilitators have you encountered when engaging in usability testing? How has usability testing helped to refine your measurement development work? How has usability testing helped to identify areas of strength and gap areas in your assessments?

37 Improvements to Validity Process
- Use of track changes within the actual assessment tool as a method for providing feedback
- Clear decision rules for item revisions
- Lengthy survey broken down into manageable segments
- Use of the Response Process/Think Aloud Protocol to further refine the assessment
- Usability testing used to refine and improve the measurement tool and assessment processes

38 Lessons Learned
Content Validation Survey:
- Organize feedback into "Quick Edits", "Questions", and "Comments"
- Track and report positive comments
Think Aloud:
- Have participants begin at different sections so fatigue doesn't impact the quality of feedback on later questions/sections of the tool
PDSA Cycles (Usability Testing) require:
- Discipline to have a plan and stick to it
- Studying and acting (plan-do, plan-do, plan-do = my colleague saying …)
- Repetition until the goal is reached or the problem is solved (my other colleague says: often thwarted, never stymied)

39 Importance of Content Validation and PDSA in Assessment Development
- Content Validation Survey: a critical first step in the validation of a measure
- Think Aloud: creates a well-edited product prior to publishing
- PDSA Cycles (Usability Testing): start small and get better before extensive roll-out; a very efficient way to develop, test, and refine the measure and its use in practice

40 More Information
MiBLSi: http://miblsi.cenmi.org/
- Evaluation and Measurement page
- DCA Technical Manual
Active Implementation Hub: Open Access Learning

41 Citation and Copyright
This document is based on the work of the National Implementation Research Network (NIRN). © Allison Metz, Leah Bartley, Jonathan Green, Laura Louison, Sandy Naoom, Barbara Sims, and Caryn Ward.
This content is licensed under the Creative Commons license CC BY-NC-ND (Attribution-NonCommercial-NoDerivs). You are free to share, copy, distribute, and transmit the work under the following conditions:
- Attribution: You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
- Noncommercial: You may not use this work for commercial purposes.
- No Derivative Works: You may not alter, transform, or build upon this work.
Any of the above conditions can be waived if you get permission from the copyright holder.
web:
The mission of the National Implementation Research Network (NIRN) is to contribute to the best practices and science of implementation, organization change, and system reinvention to improve outcomes across the spectrum of human services.

