Presentation on theme: "Benchmark Screening: What, Why and How"— Presentation transcript:

1 Benchmark Screening: What, Why and How
A module for pre-service and in-service professional development
MN RTI Center (click on RTI Center)
Author: Lisa H. Stewart, PhD, Minnesota State University Moorhead

2 MN RTI Center Training Modules
This module was developed with funding from the MN legislature. It is part of a series of modules available from the MN RTI Center for use in pre-service and in-service training.

3 Overview This module is Part 1 of 2
Module 1: Benchmark Screening: What, Why and How
- What is screening?
- Why screen students?
- Criteria for screeners/what tools?
- Screening logistics
Module 2: Using Benchmark Screening Data
Notes: Split into 2 sections; could be used for 2 different class sessions.

4 Assessment: One of the Key Components in RTI
Curriculum and Instruction
Assessment
School-Wide Organization & Problem-Solving Systems (Teams, Process, etc.)
Adapted from Logan City School District, 2002

5 Assessment and Response to Intervention (RTI)
A core feature of RTI is identifying a measurement system:
- Screen large numbers of students to identify students in need of additional intervention
- Monitor students of concern more frequently (1 to 4x per month, typically weekly)
- Diagnostic testing used for instructional planning to help target interventions as needed
Notes: There are two primary measurement needs: screening and progress monitoring. This module focuses on screening; a separate module covers progress monitoring. Diagnostic testing expertise is also needed, but teachers are typically more familiar with the tools and process for diagnostic testing/assessment for instructional planning (as the term is used here) than with screening and progress monitoring.

6 Why Do Screening? Activity
What does it mean to "screen" students?
Why is screening so important in a Response to Intervention system? (e.g., what assumptions of RTI require a good screening system?)
What happens if you do NOT have an efficient, systematic screening system in place in the school?
Notes: In a large or small group, discuss these questions. Participants will probably come up with many of the items on the next slide, but here are some other possible responses.
What does it mean to "screen" students? All students get some sort of test, and the results of that test indicate the likelihood that they need special help. Screeners should maximize the likelihood of finding kids at risk who would benefit from intervention and minimize the number of kids identified as in need of help who really don't need it.
Why is screening so important in RTI? RTI assumes that intervening earlier is better, so screening data are needed to catch students earlier. It also assumes that all kids can learn, which makes a low score on a screener a potential red flag for any student; we expect all students to reach the targets. RTI is a proactive, data-based system (it assumes we will make better decisions with data and does not rely on teacher referrals), and screening is part of that. Good screening data allow the school to promote early intervention in a fair and equitable way. Screening also supports school-wide decision making in which student needs determine where resources go (an assumption in RTI is that resources need not be evenly distributed among teachers or even grade levels but should go where the needs are greatest), and it supports a continuous improvement model in which the school can see whether ALL students are benefitting (is there a rising tide that is raising all ships?).
What happens if you do not have an efficient, systematic screening system in place? You have a referral-driven, subjective system that is reactive and often biased (the opposite of the points above, and similar to the way many schools operate). You also have no easy way to monitor the effects of your curriculum or to check whether students who started the year on track are still on track at mid-year.

7 Screening is part of a problem-solving system
Helps identify students at-risk in a PROACTIVE way
Gives feedback to the system about how students progress throughout the year at a gross (3x per year) level
- If students are on track in the fall, are they still on track in the winter?
- What is happening with students who started the year below target; are they catching up?
Gives feedback to the system about changes from year to year
- Is our new reading curriculum having the impact we were expecting?

8 What Screening Looks Like in a Nutshell
School decides on brief tests to be given at each grade level and trains staff in the administration, scoring, and use of the data
Students are given the tests 3x per year (Fall, Winter, Spring)
A person or team is assigned in each building to organize data collection
All students are given the tests for their grade level within a short time frame (e.g., 1-2 weeks or less); some tests may be group administered, others are individually administered
Benchmark testing: about 5 minutes per student, desk to test (individually administered)
Administered by special ed, reading, or general ed teachers or paras
Entered into a computer/web-based reporting system by clerical staff
Reports show the spread of student skills and list student scores, etc., to use in instructional and resource planning
Notes: Not all schools do benchmarking exactly like this, but this is a pretty common scenario. The main point of this slide is to give some context to people who have not been exposed to this in schools much or at all.

9 Example Screening Data: Spring Gr 1 Oral Reading Fluency
10/51 (20%) high risk
22/51 (43%) some risk
19/51 (37%) low risk: on or above target
Class lists then identify specific students (and scores) in each category
Notes: Explain the graph: it is based on CBM Oral Reading Fluency (# of words correct per minute in a grade-level passage) data for all the 1st graders in a school in the spring. It shows the spread of scores (from 0-4 up to 75+) and the number of students at different points along that continuum. Scores are divided into three categories based on criterion cut points to give some idea of the level of concern we should have about the student's demonstrated skill. CBM ORF is a test of reading rate but an excellent indicator of overall reading ability, including comprehension, so if a student is at risk on this measure it warrants taking another look at what else we know about the student and what extra intervention may be needed. This slide can also be used later in the presentation to discuss specificity, sensitivity, and "do no harm."
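To make the cut-point idea concrete, here is a minimal sketch (in Python) of sorting ORF scores into the three risk categories and reporting percentages like those above. The cut scores of 20 and 40 words correct per minute are hypothetical placeholders; actual benchmark targets come from the measure's publisher or from local norms.

```python
# Minimal sketch: sort oral reading fluency (WCPM) scores into risk categories.
# The cut scores (20 and 40 WCPM) are hypothetical placeholders, not actual benchmarks.

def categorize(score, some_risk_cut=20, low_risk_cut=40):
    """Return the risk category for one oral reading fluency score."""
    if score < some_risk_cut:
        return "high risk"
    elif score < low_risk_cut:
        return "some risk"
    return "low risk"

def summarize(scores):
    """Print the count and percentage of students in each risk category."""
    counts = {"high risk": 0, "some risk": 0, "low risk": 0}
    for s in scores:
        counts[categorize(s)] += 1
    total = len(scores)
    for category, n in counts.items():
        print(f"{n}/{total} ({n / total:.0%}) {category}")

# Example with invented scores for a small group of students:
summarize([12, 18, 25, 35, 38, 47, 52, 63, 71, 80])
```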

10 Screening Data
Gives an idea of what the range of student skills is like in your building and how much growth over time students are making.
Notes: May need to explain what the graph means (e.g., how it shows the range of scores, that it spans K to 10th grade and an entire year of Fall, Winter, Spring testing; point out the legend that shows how to interpret the box-and-whiskers chart).

11 Screening Data can be linked to Progress Monitoring
The goal is to have a cohesive system. If possible, use the same measures for both screening and progress monitoring (e.g., CBM).
Screen ALL students 3x per year (F, W, S)
Strategic Support and Monitoring for Students at Some Risk
Intensive Support & Monitoring for Students at Extreme Risk
Notes: What do we mean by a cohesive system? Ideally the progress monitoring data should be linked to other types of data and data collection in the school. For example, if the school does universal screening (benchmarking), it is very nice if those same measures can be used for progress monitoring. Note that the graphic represents the 3 tiers of RTI: Tier 1 (green), Tier 2 (yellow), Tier 3 (red); see the RTI Overview and other modules for more explanation of the tiers if the audience doesn't easily see this connection. What do we mean by CBM General Outcome Measures? See the General Outcome Measures/CBM module. GOMs can be used to screen all students in a grade level 3x per year, and they can also be used for weekly progress monitoring with students who are at risk. Note: Other types of information (diagnostic information, high-stakes tests, etc.) are not good for progress monitoring but obviously have their own purpose and place in the school's decision-making process.

12 A Smart System Structure
School-Wide Systems for Student Success: Academic Systems and Behavioral Systems
Intensive, Individual Interventions (5-10% of students): individual students; assessment-based; high intensity and of longer duration (academic) / intense, durable procedures (behavioral)
Targeted Group Interventions (10-15% of students): some students (at-risk); high efficiency; rapid response
Universal Interventions (75-85% of students): all students; preventive, proactive; all settings
Notes: Screening provides a way to make data-based decisions about how many students are on target and how many need additional instruction. It can tell you whether your triangle is even close to the RTI "ideal" shown here, provide a list of students to consider for Tier 2 and Tier 3 services, and indicate that you have a lot of work to do in Tier 1 (a rough check is sketched below).
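As a rough illustration of the "check your triangle" note above, the sketch below compares a building's observed tier percentages against the ideal ranges from this slide (75-85% universal, 10-15% targeted, 5-10% intensive). The building counts in the example call are invented.

```python
# Hypothetical sketch: compare screening results to the RTI "ideal" triangle.
# Ideal ranges come from this slide; the example counts are invented.

IDEAL_RANGES = {
    "Tier 1 (universal)": (0.75, 0.85),
    "Tier 2 (targeted)": (0.10, 0.15),
    "Tier 3 (intensive)": (0.05, 0.10),
}

def check_triangle(counts):
    """counts: dict mapping tier name -> number of students placed in that tier."""
    total = sum(counts.values())
    for tier, (low, high) in IDEAL_RANGES.items():
        pct = counts[tier] / total
        status = "within" if low <= pct <= high else "outside"
        print(f"{tier}: {pct:.0%} ({status} the ideal {low:.0%}-{high:.0%} range)")

# Example: a building of 400 students after fall screening
check_triangle({"Tier 1 (universal)": 260,
                "Tier 2 (targeted)": 90,
                "Tier 3 (intensive)": 50})
```

In this invented example only 65% of students are at or above target, which is the "a lot of work to do in Tier 1" situation the notes describe.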

13 Terminology Check
Screening: Collecting data on all or a targeted group of students in a grade level or in the school
Universal Screening: Same as above, but implies that all students are screened
Benchmarking: Often used synonymously with the terms above, but typically implies universal screening done 3x per year, with data interpreted using criterion target or "benchmark" scores

14 “Benchmark” Screening
Schools typically use cut-off or criterion scores to decide whether a student is at risk. Those scores or targets are also referred to as "benchmarks," hence the term "benchmarking."
Some states or published curricula also use the term benchmarking in a different way (e.g., to refer to documentation of achieving a specific state standard) that has nothing to do with screening.
Information on where cut-off scores come from is provided later in the module.

15 What to Measure for Screening? Create a “Measurement Net”:
A measurement net simply lays out what tests are given to which grade levels at what time in the areas you wish to screen (e.g., reading, math, writing, behavior….) The measurement net allows you to make sure you have a plan in place and also reminds you to look at the logical and empirical “sense” behind the sequence and types of tests you are planning to give to students over time.

16 How do you decide what Measures to Use for Screening?
Lots of ways to measure reading in the schools:
Measure of Academic Progress (MAP)
Guided Reading (Leveled Reading)
Statewide Accountability Tests
Published Curriculum Tests
Teacher-Made Tests
General Outcome Measures (Curriculum-Based Measurement "family")
STAR Reading
Etc.
Not all of these are appropriate. Some are not reliable enough for screening; others are designed for another purpose and are not valid or practical for screening all students 3x per year.
Notes: Several of these are not appropriate: statewide tests; any test with reliability below r = .80, which includes guided reading as typically done, curriculum tests, teacher-made tests, and some CBM tests (e.g., DIBELS Word Use Fluency and Initial Sound Fluency as typically done). Some tests are impractical to use 3x per year (for example, the MAP can be given 3x per year, but most schools do not do that because of the student time and computer lab time needed) or are not valid for screening decisions. There are also tests appropriate for screening that are not mentioned here. The main point is to remind people that not all tests are appropriate for screening, even if the authors or marketers say so; they may be valid for other purposes, but not for screening. See the Measurement Overview module for an expanded explanation and activities related to the purposes of assessment and the characteristics of an effective measurement system for RTI in general and screening in particular.

17 Characteristics of An Effective Measurement System for RTI
valid
reliable
simple
quick
inexpensive
easily understood
can be given often
sensitive to growth over short periods of time
Notes: If students don't have a background in reliability or validity of measurement, discuss these briefly. The standard for screening reliability for individual students is a reliability estimate of .80 or higher, preferably .90+.
Credit: K. Gibbons, M. Shinn

18 Effective Screening Measures
Specific: students identified as at risk really are at risk (minimizes false positives)
Sensitive: students who "pass" really do go on to do well; few at-risk students are missed (minimizes false negatives)
Practical: brief and simple (cheap is nice too)
Do no harm: if a student is identified as at risk, will they get help, or is it just a label?
Reference: Hughes & Dexter, RTI Action Network

19 Buyer Beware! Many tools may make claims about being a good “screener”
Need to find evidence and make sure it really is reliable and valid for SCREENING, as well as practical.

20 Measurement and RTI: Screening
Reliability coefficients of at least r = .80; higher is better, especially for screening specificity.
Well-documented predictive validity.
Evidence that the criterion (cut score) being used is reasonable and does not create too many false positives (students identified as at risk who aren't) or false negatives (students who are at risk but aren't identified as such).
Brief, easy to use, affordable, and results/reports are accessible almost immediately.
Notes: See the modules on Benchmark Screening in RTI for more related information, including information on reliability for sensitivity vs. specificity.
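To make the false positive/false negative language concrete, here is a small worked sketch (with invented counts) of computing a screener's sensitivity and specificity by cross-tabulating screening decisions against a later outcome, such as the state accountability test.

```python
# Worked example with invented counts: sensitivity and specificity of a screener.
# In practice the counts come from comparing screening decisions to a later
# criterion measure (e.g., the state accountability test).

true_positives  = 40    # flagged at risk, and did go on to struggle
false_negatives = 10    # passed the screener, but did go on to struggle
true_negatives  = 130   # passed the screener, and did go on to do well
false_positives = 20    # flagged at risk, but did go on to do well

sensitivity = true_positives / (true_positives + false_negatives)   # share of truly at-risk students caught
specificity = true_negatives / (true_negatives + false_positives)   # share of students doing fine who were not flagged

print(f"Sensitivity: {sensitivity:.0%}")   # 80%
print(f"Specificity: {specificity:.0%}")   # 87%
```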

21 National Center for RTI Review of Screening Tools
Provides information about the center's review criteria and a link to the center.
Notes: The main point is that there is an easily available resource, the National Center on RTI, that has reviews of different screening tools and test providers and whether they meet these criteria. You can then click on the tests to get more information. Suggestion: If time allows and you have an internet connection, explore the website for a few minutes to show students what is there. Note: The center only reviews tests that are submitted; if a test is not on the list, it doesn't mean it is bad, just that it wasn't reviewed.

22 RTI, General Outcome Measures and Curriculum Based Measurement
Many schools use Curriculum-Based Measurement (CBM) general outcome measures for screening and progress monitoring. You don't "have to" use CBM, but many schools do.
The most common CBM tool in Grades 1-8 is Oral Reading Fluency (ORF): a measure of reading rate (# of words correct per minute on a grade-level passage) and a strong indicator of overall reading skill, including comprehension.
Early literacy measures are also available, such as Nonsense Word Fluency (NWF), Phoneme Segmentation Fluency (PSF), Letter Name Fluency (LNF), and Letter Sound Fluency (LSF).
See the module on CBM for a more in-depth explanation and examples.

23 Why GOMs/CBM? Typically meet the criteria needed for RTI screening and progress monitoring Reliable, valid, specific, sensitive, practical Also, some utility for instructional planning (e.g., grouping) They are INDICATORS of whether there might be a problem, not diagnostic! Like taking your temperature or sticking a toothpick into a cake Oral reading fluency is a great INDICATOR of reading decoding, fluency and reading comprehension Fluency based because automaticity helps discriminate between students at different points of learning a skill 23

24 GOM…CBM… DIBELS… AIMSweb…
General Outcome Measures (GOM) is the general term for the CBM "family" of measurement tools, which all share the characteristics of being reliable, valid indicators of overall or general academic progress while also being sensitive and practical (simple, brief, easy to understand, inexpensive, etc.). Originating at the University of Minnesota with research by Stan Deno and colleagues in the 1970s, there are CBM tasks in a variety of academic areas. See the CBM module for more in-depth information.
In the 1990s, Roland Good and Ruth Kaminski at the University of Oregon developed the DIBELS (Dynamic Indicators of Basic Early Literacy Skills), which have the same characteristics but were designed to measure students' early literacy skills.
AIMSweb is a web-based data management system that provides training and student materials and a way to enter data and get reports via the web, using CBM measurement tools.
It all gets a little confusing! But all of these tools share the same basic characteristics and the same philosophy about the importance of being able to use the data for both summative (e.g., screening) and formative (progress monitoring) purposes.

25 CBM Oral Reading Fluency
Give 3 grade-level passages using standardized administration and scoring; use the median (middle) score.
3-second rule: if the student hesitates for 3 seconds, tell the student the word and point to the next word.
Discontinue rule: discontinue if the student gets 0 correct in the first row; if the student gets <10 correct on the 1st passage, do not give the other passages.
Errors: hesitation for >3 seconds; incorrect pronunciation for the context; omitted words; words out of order.
Not errors: repeated sounds; self-corrects; skipped row; insertions; dialect/articulation differences.
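A minimal sketch (not part of the module's materials) of the "three passages, take the median" scoring step. It assumes each passage has already been scored for words correct per minute (WCPM) using the administration and error rules above.

```python
# Hypothetical sketch of the benchmark scoring step for CBM oral reading fluency:
# score words correct per minute (WCPM) on three grade-level passages,
# then report the median (middle) score.

from statistics import median

def orf_benchmark_score(wcpm_scores):
    """Return the median of the three passage scores used for benchmarking."""
    if len(wcpm_scores) != 3:
        raise ValueError("Benchmark screening uses exactly three passages.")
    return median(wcpm_scores)

# Example: a student reads 42, 55, and 48 words correct per minute
print(orf_benchmark_score([42, 55, 48]))   # -> 48
```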

26 Fluency and Comprehension
The purpose of reading is comprehension. A good measure of overall reading proficiency is reading fluency because of its strong correlation with measures of comprehension. It is understandable that CBMs and GOMs correlate so strongly with tests of reading comprehension if one understands the reading process.
Note that CBM oral reading fluency is actually a measure of oral reading RATE, since it does not include a way to factor reading with expression/prosodic features into the actual score.
For more detailed measurement information about CBM and why it is so often used in RTI systems, see the CBM module.

27 Screening Logistics What materials? When to collect? Who collects it?
How to enter and report the data?

28 What Materials?
Use a computer- or PDA-based testing system, OR download reading passages, early literacy probes, etc. from the internet.
Many sources of CBM materials are available free or at low cost: AIMSweb, DIBELS, Edcheckup, etc.
Often organized as "booklets" for ease of use.
Can use plastic covers and markers for scoring to save copy costs.

29 Screening Materials in K and Gr 1
Screening measures will change slightly from Fall to Winter to Spring: early literacy "subskill" measurement is dropped as reading develops.
Downloaded materials and booklets.
Notes: Screening measures shift as reading develops. For example, drop the alphabetic principle measure (e.g., LSF or NWF) and the phonemic awareness measure (e.g., PSF) once students are reading in text enough to get reliable and valid indicators of overall reading rather than these more "subskill"-oriented measures.

30 K and Gr 1 Measures AIMSweb Early Literacy and R-CBM(ORF)
Example of how the measures shift over time in K and Grade 1. Note that AIMSweb calls CBM Oral Reading Fluency "R-CBM."
Legend: General Literacy Risk Factor = black; Alphabetic Principle = green; Phonemic Awareness = purple; Vocabulary = blue; Fluency with Connected Text & Comprehension = red.

31 Gr 2 to 12: AIMSweb Early Literacy and CBM Measures

32 Screening Logistics: Timing
Typically 3x per year: Fall, Winter, Spring.
Have a district-wide testing window! (All grades and schools collect data within the same 2-week period.)
In Fall of kindergarten, schools sometimes either test right away and again a month later, or wait a little while before testing.
Benchmark testing: about 5 minutes per student (individually administered), in the classroom or at stations in a commons area, lunchroom, etc.
Notes: The variation in the Fall of kindergarten is due to the fact that some children do "take off" when they get to kindergarten and others don't, so the predictive validity of the measures is not as strong.

33 Screening Logistics: People
Administered by trained staff: paras, special ed teachers, reading teachers, general ed teachers, school psychologists, speech-language staff, etc. Good training is essential!
A measurement person is assigned in each building to organize data collection.
Data are either collected electronically or entered into a web-based data management tool by clerical staff.
Good training and fidelity checks really are important!

34 Screening Logistics Math Quiz 
If you have a classroom with 25 students, and administering the screening measures takes approximately 5 minutes per student (individual assessment time), how long would it take 5 people to "screen" the entire classroom?
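A worked answer, assuming the five testers screen students in parallel and ignoring transition time between students:

```latex
\[
  \frac{25 \text{ students} \times 5 \text{ min/student}}{5 \text{ testers}}
  = \frac{125 \text{ min}}{5}
  = 25 \text{ minutes of testing time per tester}
\]
```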

35 Remember: Garbage IN…. Garbage OUT….
Make sure your data are reliable and valid indicators or they won't be good for nuthin':
- Training
- Assessment integrity checks/refreshers
- Well-chosen tasks/indicators
Notes: If you choose unreliable or invalid screening tools, this will be a waste of time. If you don't do a good job of training and ensuring high-quality data collection and entry, teachers will not trust the data or use it.

36 Use Technology to Facilitate Screening

37 Using Technology to Capture Data
Collect the data using technology, such as a PDA.
Example: Students take the test on a computer (e.g., STAR Reading).

38 Using Technology to Organize and Report Data
Enter data into a web-based data management system.
Data get back into the hands of teachers and teams quickly, in meaningful reports for problem solving.
Examples
Notes: If time and internet access allow, go to a site and do a demo, OR give this as an assignment.

39 Screening is just one part of an overall assessment system for making decisions
Notes: This is just an example of a "decision tree" created by a school district as a guideline for streamlining their assessments of students and being proactive.
Possible activity: Give students the two-sided handout with this flowchart on one side and the example of 3 students' scores on the other, and have them decide what each student's screening data mean and what the next step would be.

40 Remember: Screening is part of a problem-solving system
Helps identify students at-risk in a PROACTIVE way
Gives feedback to the system about how students progress throughout the year at a gross (3x per year) level
- If students are on track in the fall, are they still on track in the winter?
- What is happening with students who started the year below target; are they catching up?
Gives feedback to the system about changes from year to year
- Is our new reading curriculum having the impact we were expecting?
Notes: Review from the beginning of the module: the purpose of screening in RTI.

41 Build in Time to USE the Data!
Schedule data "retreats" or grade-level meeting times immediately after screening so you can look at and USE the data for planning.
Notes: It is easy to collect data and not use it. Many schools and teachers build in specific times right after screening days to sit down, look at, and use the data.

42 Common Mistakes Not enough professional development and communication about why these measures were picked, what the scores do and don’t mean, the rationale for screening, etc Low or questionable quality of administration and scoring Too much reliance on a small group of people for data collection Teaching to the test Limited sample of students tested (e.g., only Title students! ) Slow turn around on reports Data are not used Also see more complete list of “common mistakes and how to avoid them” handout included with this module.

43 Using Screening Data: See Module 2!
See MN RTI Center Benchmark Module 2!

44 Articles available with this module
Stewart & Silberglitt (2008). Best practices in developing academic local norms. In A. Thomas & J. Grimes (Eds.), Best Practices in School Psychology V (pp. ). NASP Publications.
NRCLD RTI Manual (2006). Chapter 1: School-wide screening. Retrieved 6/26/09.
Jenkins & Johnson. Universal screening for reading problems: Why and how should we do this? Retrieved 6/23/09 from the RTI Action Network site.
Kovaleski & Pederson (2008). Best practices in data analysis teaming. In A. Thomas & J. Grimes (Eds.), Best Practices in School Psychology V. NASP Publications.
Ikeda, Neessen, & Witt (2008). Best practices in universal screening. In A. Thomas & J. Grimes (Eds.), Best Practices in School Psychology V (pp. ). NASP Publications.
Gibbons, K. (2008). Necessary Assessments in RTI. Retrieved 6/26/09.

45 RTI Related Resources
National Center on RTI
RTI Action Network (links for Assessment and Universal Screening)
MN RTI Center (click on link)
National Center on Student Progress Monitoring
Research Institute on Progress Monitoring

46 RTI Related Resources (Cont’d)
National Association of School Psychologists
National Association of State Directors of Special Education (NASDSE)
Council of Administrators of Special Education
Office of Special Education Programs (OSEP) toolkit and RTI materials

47 Key Sources for Reading Research, Assessment and Intervention…
University of Oregon IDEA (Institute for the Development of Educational Achievement) Big Ideas of Reading site
Florida Center for Reading Research
Texas Vaughn Gross Center for Reading and Language Arts
American Federation of Teachers reading resources (What Works 1999 publications)
National Reading Panel

48 Recommended Sites with Multiple Resources
Intervention Central, by Jim Wright (school psychologist from central NY)
Center on Instruction
St. Croix River Education District

49 Quiz 1.) A core feature of RTI is identifying a(n) _________ system.
2.) Collecting data on all or a targeted group of students in a grade level or in the school is called what?
A.) Curriculum  B.) Screening  C.) Intervention  D.) Review
Answers:
1.) A core feature of RTI is identifying a(n) _________ system. (measurement or assessment)
2.) B.) Screening

50 Quiz (Cont'd)
3.) What is a characteristic of an efficient measurement system for RTI?
A.) Valid  B.) Reliable  C.) Simple  D.) Quick  E.) All of the above
Answer: 3.) E.) All of the above

51 Quiz (Cont’d) 4) Why screen students?
5.) Why would general education teachers need to be trained on the measures used if they aren't part of the data collection?
Answers (Cont'd):
4.) So that all students get some sort of test whose results indicate the likelihood that they need special help. Screeners should maximize the likelihood of finding kids at risk who would benefit from intervention and minimize the number of kids identified as in need of help who really don't need it.
5.) To allow them to interpret the scores, use the scores when making educational decisions for their students, and understand the progress their students are making.

52 Quiz (Cont'd)
6.) True or False? If possible, the same tools should be used for screening and progress monitoring.
7.) List at least 3 common mistakes when doing screening and how they can be avoided.
Answers (Cont'd):
6.) True
7.) Not enough professional development and communication about why these measures were picked, what the scores do and don't mean, the rationale for screening, etc.; low or questionable quality of administration and scoring; too much reliance on a small group of people for data collection; teaching to the test; limited sample of students tested (e.g., only Title students!); slow turnaround on reports; data are not used.

53 The End 
Note: The MN RTI Center does not endorse any particular product. Examples used are for instructional purposes only.
Special thanks: Thank you to Dr. Ann Casey, director of the MN RTI Center, for her leadership. Thank you to Aimee Hochstein, Kristen Bouwman, and Nathan Rowe, Minnesota State University Moorhead graduate students, for editing work, writing quizzes, and enhancing the quality of these training materials.

