CHAPTER 1 Assessment in Social and Educational Contexts (Salvia, Ysseldyke & Bolt, 2012) Dr. Julie Esparza Brown SPED 512: Diagnostic Assessment Winter 2013.


CHAPTER 1 Assessment in Social and Educational Contexts (Salvia, Ysseldyke & Bolt, 2012) Dr. Julie Esparza Brown SPED 512: Diagnostic Assessment Winter 2013 Chapters 1, 11, 12, 13, and 14 are included in this presentation

AGENDA – Week 3 Questions for the Good of the Group Instruction and Lab Time: Continue WJ-III Break Group activity to process Chapters 1, 3, 11, 12, and 14 PowerPoint overview of Chapters 1, 3, 11, 12, and 14

Individualized Support Schools must provide support as a function of individual student need To what extent is the current level of instruction working? How much instruction is needed? What kind of instruction is needed? Are additional supports necessary?

Assessment Defined Assessment is the process of collecting information (data) for the purpose of making decisions about students E.g. what to teach, how to teach, whether the student is eligible for special services

How Are Assessment Data Collected? Assessment extends beyond testing and may include: Record review Observations Tests Professional judgments Recollections

Why Care About Assessment? A direct link exists between assessment and the decisions that we make. Sometimes these decisions are markedly important. Thus, the procedures for gathering data are of interest to many people – and rightfully so. Why might students, parents, and teachers care? The general public? Certification boards?

Common Themes Moving Forward Not all tests are created equal Differences in content, reliability, validity, and utility Assessment practices are dynamic Changes in the political, technological, and cultural landscape drive a continuous process of revision

Common Themes Moving Forward The importance of assessment in education Educators are faced with difficult decisions Effective decision-making will require knowledge of effective assessment Assessment can be intimidating, but significant improvements have happened and continue to happen More confidence in the technical adequacy of instruments Improvements in the utility and relevance of assessment practices MTSS framework

CHAPTER 11 Assessment of Academic Achievement with Multiple-Skill Devices

Achievement Tests Norm-referenced Allow for comparisons between students Criterion-referenced Allow for comparisons between individual students and a skill benchmark. Why do we use achievement tests? Assist teachers in determining skills students do and do not have Inform instruction Academic screening Progress evaluation
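The two reference frames can be contrasted with a short sketch. Everything here is illustrative and hypothetical: the norm-group scores, the student's score, and the benchmark of 24 are invented, not drawn from any published test.

```python
# Hypothetical sketch contrasting norm- and criterion-referenced interpretation.
# All names and numbers are invented for illustration.

def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring at or below this score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)

def meets_benchmark(score, criterion):
    """Criterion-referenced decision: compare to a fixed skill benchmark."""
    return score >= criterion

norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]  # peers' raw scores
student = 25

print(percentile_rank(student, norm_group))   # standing relative to peers -> 60.0
print(meets_benchmark(student, criterion=24)) # mastery of the benchmark -> True
```

The same raw score thus supports two different decisions: how the student compares with peers (norm-referenced) and whether a specific skill benchmark has been met (criterion-referenced).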

Classifying Achievement Tests Tests can be classified by type (diagnostic vs. achievement screening) and by the number of students who can be tested at once: Diagnostic tests – Low number of students; less efficient administration, but dense content and numerous items allow teachers to uncover specific strengths and weaknesses, yielding more qualitative information about the student. Achievement (screening) tests – High number of students; more efficient administration allows comparisons between students, but typically only quantitative data are available, with very little power in determining strengths and weaknesses.

Considerations for Selecting a Test Four Factors Content validity What the test actually measures should match its intended use Stimulus-response modes Students should not be hindered by the manner of test administration or required response Standards used in state Relevant norms Does the student population being assessed match the population from which the normative data were acquired?

Tests of Academic Achievement Peabody Individual Achievement Test (PIAT-R/NU) Wide Range Achievement Test 4 (WRAT4) Wechsler Individual Achievement Test 3 (WIAT-III)

Peabody Individual Achievement Test- Revised/Normative Update (PIAT-R/NU) In general… Individually administered; norm-referenced for K-12 students Norm population Most recent update was completed in 1998 Representative of each grade level No changes to test structure

PIAT-R/NU Subtests Mathematics: 100 multiple-choice items assess students’ knowledge and application of math concepts and facts Reading recognition: 100 multiple-choice items require students to match and name letters and words General information: 100 questions presented orally. Content areas include social studies, science, sports, and fine arts. Reading comprehension: 81 multiple-choice items require students to select an appropriate answer following a reading passage Spelling: 100 items ranging in difficulty from kindergarten (letter naming) to high school (multiple-choice following verbal presentation) Written expression: Split into two levels. Level I assesses pre-writing skills and Level II requires story writing following a picture prompt

PIAT-R/NU Scores For all but one subtest (written expression), response to each item is pass/fail Raw scores converted into: Standard scores Percentile ranks Normal curve equivalents Stanines 3 composite scores Total reading Total test Written language
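The derived scores named above (standard scores, percentile ranks, normal curve equivalents, stanines) are all transformations of the same underlying z-score under a normal-distribution assumption. A minimal sketch of the standard relationships, using the conventional mean-100/SD-15 standard-score scale:

```python
import math

def z_to_percentile(z):
    """Cumulative normal probability, expressed as a percentile (0-100)."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))

def standard_score(z, mean=100, sd=15):
    """Conventional standard-score scale (mean 100, SD 15)."""
    return mean + sd * z

def nce(z):
    """Normal curve equivalent: mean 50, SD 21.06, on a 1-99 scale."""
    return 50.0 + 21.06 * z

def stanine(z):
    """Stanines: nine bands half an SD wide, centered on 5."""
    return max(1, min(9, round(z / 0.5) + 5))

z = 1.0  # one standard deviation above the mean
print(standard_score(z))          # 115.0
print(round(z_to_percentile(z)))  # 84
print(nce(z))                     # 71.06
print(stanine(z))                 # 7
```

One z thus maps to every derived score: a student one SD above the mean earns a standard score of 115, roughly the 84th percentile, an NCE of about 71, and a stanine of 7.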

PIAT-R/NU Reliability and Validity Despite new norms, reliability and validity data are only available for the original PIAT-R (1989) Previous reliability and validity data are likely outdated Outdated tests may not be relevant in the current educational context

Wide Range Achievement Test 4 (WRAT4) In general… Individually administered; test length varies with age (5-94 age range) Norm-referenced, but covers a limited sample of behaviors in 4 content areas Norm population Stratified across age, gender, ethnicity, geographic region, and parental education

WRAT4 Subtests Word Reading: The student is required to name letters and read words Sentence Comprehension: The student is shown sentences and fills in missing words Spelling: The student writes down words as they are read aloud Math Computation: The student solves basic computation problems Scores Raw scores converted to: Standard scores, confidence intervals, percentiles, grade equivalents, and stanines Reading composite available Reliability Internal consistency and alternate-form data are sufficient for screening purposes Validity Performance increases with age. The WRAT4 is linked to other tests that have since been updated; additional evidence is necessary
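The confidence intervals listed among the derived scores are typically built from the standard error of measurement (SEM), which depends on the score scale's SD and the test's reliability. A sketch under an illustrative reliability of .90 (not a value reported for the WRAT4):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def confidence_interval(score, sd=15, reliability=0.90, z=1.96):
    """Approximate 95% confidence interval around an obtained standard score.
    The reliability of .90 here is illustrative, not a published figure."""
    e = sem(sd, reliability)
    return (score - z * e, score + z * e)

lo, hi = confidence_interval(100, sd=15, reliability=0.90)
print(round(lo, 1), round(hi, 1))  # 90.7 109.3
```

The interval makes explicit that an obtained score of 100 is an estimate: with reliability .90 on a 15-SD scale, the true score plausibly lies roughly between 91 and 109.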

Wechsler Individual Achievement Test- Third Edition (WIAT-III) General Diagnostic, norm-referenced achievement test Reading, mathematics, written expression, listening, and speaking Ages 4-19 Norm Population Stratified sampling was used to sample within several common demographic variables: Pre K – 12, age, race/ethnicity, sex, parent education, geographic region

WIAT-III Subtests and scores 16 subtests arranged into 7 domain composite scores and one total achievement score (structure provided on next slide) Raw scores converted to: Standard scores, percentile ranks, normal curve equivalents, stanines, age and grade equivalents, and growth scale value scores.

WIAT-III Subtests, by composite: Basic Reading – Word Reading; Pseudoword Decoding Reading Comprehension and Fluency – Reading Comprehension; Oral Reading Fluency; Early Reading Skills Mathematics – Math Problem Solving; Numerical Operations Math Fluency – Math Fluency (Addition, Subtraction, & Multiplication) Written Expression – Alphabet Writing Fluency; Spelling; Sentence Composition; Essay Composition Oral Language – Listening Comprehension; Oral Expression

WIAT-III Reliability Adequate reliability evidence Split-half Test-retest Interrater agreement Validity Adequate validity evidence Content Construct Criterion Clinical Utility Stronger reliability and validity evidence increase the relevance of information derived from the WIAT- III
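The split-half reliability named above can be sketched in a few lines: correlate examinees' scores on two halves of the test (here, odd vs. even items), then apply the Spearman-Brown formula to estimate full-length reliability. The item responses below are invented for illustration:

```python
# Sketch of split-half reliability with the Spearman-Brown correction.
# Item data are invented, not drawn from the WIAT-III.

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def split_half_reliability(item_matrix):
    """Correlate odd- and even-item half scores, then step up with
    Spearman-Brown to estimate reliability of the full-length test."""
    odd = [sum(row[0::2]) for row in item_matrix]
    even = [sum(row[1::2]) for row in item_matrix]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

# Each row: one examinee's item scores (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
print(round(split_half_reliability(responses), 3))  # 0.653
```

The Spearman-Brown step matters because each half is only half as long as the real test, and shorter tests are less reliable; the correction estimates what the full-length instrument would yield.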

Getting the Most Out of an Achievement Test Helpful but not sufficient – most tests allow teachers to find an appropriate starting point What is the nature of the behaviors being sampled by the test? Need to seek out additional information concerning student strengths and weaknesses Which items did the student excel on? Which did he or she struggle with? Were there patterns of responding?

CHAPTER 12 Using Diagnostic Reading Tests

Why Do We Assess Reading? Reading is fundamental to success in our society, and therefore reading skill development should be closely monitored Diagnostic tests can help to plan appropriate intervention Diagnostic tests can help determine a student’s continuing need for special services

The Ways in Which Reading is Taught The effectiveness of different approaches is heavily debated Whole-word vs. code-based approaches Over time, research has supported the importance of phonemic awareness and phonics

Skills Assessed by Diagnostic Approaches Oral Reading Rate of Reading Oral Reading Errors Teacher pronunciation/aid Hesitation Gross mispronunciation Partial mispronunciation Omission of a word Insertion Substitution Repetition Inversion
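Rate and accuracy from an oral reading sample are commonly summarized as words correct per minute (WCPM). A minimal sketch, assuming each error is tallied under one of the categories listed above; the passage length, error list, and timing are invented:

```python
# Sketch: scoring an oral reading sample for accuracy and rate (WCPM).
# Error categories follow the list above; all data are invented.

ERROR_TYPES = {
    "teacher_aid", "hesitation", "gross_mispronunciation",
    "partial_mispronunciation", "omission", "insertion",
    "substitution", "repetition", "inversion",
}

def score_oral_reading(words_attempted, errors, seconds):
    """Return accuracy (percent) and words correct per minute."""
    counted = [e for e in errors if e in ERROR_TYPES]
    correct = words_attempted - len(counted)
    accuracy = 100.0 * correct / words_attempted
    wcpm = correct / (seconds / 60.0)
    return accuracy, wcpm

acc, wcpm = score_oral_reading(
    words_attempted=120,
    errors=["substitution", "omission", "repetition", "hesitation"],
    seconds=60,
)
print(round(acc, 2), wcpm)  # 96.67 116.0
```

Whether a category such as insertion subtracts from the words-correct count varies across scoring systems; the subtraction here is a simplifying assumption, not a rule from any particular instrument.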

Skills Assessed by Diagnostic Approaches (cont.) Reading Comprehension Literal comprehension Inferential comprehension Critical comprehension Affective comprehension Lexical comprehension

Skills Assessed by Diagnostic Approaches (cont.) Word-Attack Skills (i.e., word analysis skills) – use of letter-sound correspondence and sound blending to identify words Word Recognition Skills – “sight vocabulary”

Diagnostic Reading Tests See Table 12.1 Group Reading Assessment and Diagnostic Evaluation (GRADE) DIBELS Next Test of Phonological Awareness – Second Edition: PLUS (TOPA-2+)

GRADE (Williams, 2001) Preschool to 12th grade 60 to 90 minutes Assesses pre-reading, reading readiness, vocabulary, comprehension, and oral language Missing some important demographic information for the norm group; high total reliabilities (lower subscale reliabilities); adequate information to support validity of the total score.

DIBELS Next (Good and Kaminski, 2010) Kindergarten to 6th grade Very brief administration (used for screening and monitoring) First Sound Fluency, Letter Naming Fluency, Phoneme Segmentation Fluency, Nonsense Word Fluency, Oral Reading Fluency, and DAZE (comprehension) Use of benchmark expectations or development of local norms Multiple administrations necessary for making important decisions
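The "development of local norms" mentioned above can be sketched as a simple percentile-cut computation over a school's own screening scores (nearest-rank method). The scores and the choice of the 25th percentile as a monitoring cut are invented for illustration:

```python
import math

# Sketch of developing local norms from screening scores, as an
# alternative to published benchmark expectations. Data are invented.

def local_percentile_cut(scores, pct):
    """Score at or below which `pct` percent of the local sample falls
    (nearest-rank method)."""
    ranked = sorted(scores)
    k = max(1, math.ceil(pct / 100.0 * len(ranked)))
    return ranked[k - 1]

fall_orf = [18, 22, 25, 31, 34, 40, 44, 47, 52, 60]  # local screening scores

# Flag students below the local 25th percentile for closer monitoring.
cut = local_percentile_cut(fall_orf, 25)
flagged = [s for s in fall_orf if s < cut]
print(cut, flagged)  # 25 [18, 22]
```

A local norm answers a different question than a published benchmark: it identifies the students most at risk relative to their own school, which is useful when the local population differs from the test's national norm group.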

TOPA 2+ (Torgesen & Bryant, 2004) Ages 5 to 8 Phonemic awareness and letter-sound correspondence Good norms description Reliability better for kindergarteners than for more advanced students Adequate overall validity

CHAPTER 13 Using Diagnostic Mathematics Measures

Why Do We Assess Mathematics? Multiple-skill assessments provide broad levels of information, but lack specificity when compared to diagnostic assessments More intensive assessment of mathematics helps educators: Assess the extent to which current instruction is working Plan individualized instruction Make informed eligibility decisions

Ways to Teach Mathematics < 1960: Emphasis on basic facts and algorithms, deductive reasoning, and proofs 1960s: New Math; movement away from traditional approaches to mathematics instruction 1980s: Constructivist approach – standards-based math. Students construct knowledge with little or no help from teachers > 2000: Evidence supports explicit and systematic instruction (most similar to “traditional” approaches).

Behaviors Sampled by Diagnostic Mathematics Tests National Council of Teachers of Mathematics (NCTM) Content Standards – Number and operations – Algebra – Geometry – Measurement – Data analysis and probability Process Standards – Problem solving – Reasoning and proof – Communication – Connections – Representation

Specific Diagnostic Math Tests Group Mathematics Assessment and Diagnostic Evaluation (G●MADE) KeyMath-3 Diagnostic Assessment (KeyMath-3 DA)

G ● MADE General Group administered, norm-referenced, standards-based test Used to identify specific math skill strengths and weaknesses Students K-12 9 levels of difficulty from which teachers may select

G ● MADE Subtests Concepts and communication Language, vocabulary, and representations of math Operations and computation Addition, subtraction, multiplication, and division Process and applications Applying appropriate operations and computations to solve word problems

G ● MADE Scores Raw scores converted to: Standard scores, grade scores, stanines, percentiles, normal curve equivalents, and growth scale values. Norm population 2002 and 2003; nearly 28,000 students Selected based on geographic region, community type, socioeconomic status, and students with disabilities

G ● MADE Reliability Acceptable levels of split-half and alternate-form reliability Validity Based on NCTM standards (content validity) Strong criterion-related evidence

KeyMath-3 Diagnostic Assessment (KeyMath-3 DA) General Comprehensive assessment of math skills and concepts Untimed, individually administered, norm-referenced test Ages 4 years 6 months through 21 years

KeyMath-3 DA Subtests Numeration Algebra Geometry Measurement Data analysis and probability Mental computation and estimation Addition and subtraction Multiplication and division Foundations of problem solving Applied problem solving

KeyMath-3 DA Scores Raw scores converted to: Standard scores, scaled scores, percentile ranks, grade and age equivalents, growth scale values Composite scores Operations, basic concepts, and application Norm population 3,630 individuals ages 4 years 6 months through 21 years – demographic distribution approximates data reported in the 2004 census

KeyMath-3 DA Reliability Internal consistency, alternate-form, and test-retest reliability Adequate for screening and diagnostic purposes Validity Adequate content and criterion-related validity evidence for all composite scores

CHAPTER 14 Using Measures of Oral and Written Language

Assessing Language Competence When assessing language skills, it is important to break language down into processes and measure each one – Language appears in written and verbal format Comprehension Expression – Normal levels of comprehension ≠ normal expression – Normal levels of expression ≠ normal comprehension

Terminology: Language as Code Phonology: Hearing and discriminating word sounds Semantics: Understanding vocabulary, meaning, and concepts Morphology and syntax: Understanding the grammatical structure of language Supralinguistics and pragmatics: Understanding a speaker’s or writer’s intentions

Assessing Oral and Written Language Why? Ability to converse and express thoughts is desirable Basic oral and written language skills underlie higher-order skills Considerations in assessing oral language Cultural diversity Differences in dialect are differences, not errors Disordered production of the primary language or dialect should be considered when evaluating oral language Are the norms and materials appropriate? Developmental considerations Be aware of developmental norms for language acquisition

Assessing Oral and Written Language Considerations in assessing written language Form and Content Penmanship Spelling Style May be best assessed by evaluating students’ written work and developing tests (vocabulary, spelling, etc.) that parallel the curriculum

Methods for Observing Language Behavior Spontaneous language – Record what child says while talking to an adult or playing with toys – Prompts may be used for older children – Analyze phonology, semantics, morphology, syntax, and pragmatics Imitation – Require children to repeat words, phrases, or sentences produced by the examiner – Valid predictor of spontaneous production – Standardized imitation tasks often used in oral language assessment instruments Elicited language – A picture stimulus is used to elicit language

Methods for Observing Language Behavior Advantages and disadvantages of each method Spontaneous – Advantages: Most natural indicator of everyday language performance; informal testing environment. Disadvantages: Not a standardized procedure (more variability); time-intensive. Imitation – Advantages: Comprehensive; structured and efficient administration. Disadvantages: Auditory memory may affect results; hard to draw conclusions from accurate imitations; boring for the child. Elicited language – Advantages: Interesting and efficient; comprehensive. Disadvantages: Difficult to create valid measurement tools.

Specific Oral and Written Language Tests Test of Written Language – Fourth Edition (TOWL-4) Test of Language Development: Primary – Fourth Edition (TOLD-P:4) Test of Language Development: Intermediate – Fourth Edition (TOLD-I:4) Oral and Written Language Scales (OWLS)

Test of Written Language – Fourth Edition (TOWL-4) General Norm-referenced Designed to assess written language competence of students between the ages of 9 and 17 Two formats Contrived Spontaneous

TOWL-4 Subtests Contrived – Vocabulary, Spelling, Punctuation, Logical sentences, Sentence combining Spontaneous – Contextual conventions, Story composition

TOWL-4 Scores Raw scores can be converted to percentile or standard scores Three composite scores and one overall score Contrived writing Logical sentences Spontaneous writing Overall writing

TOWL-4 Norms – Three age ranges: 9-11, 12-14, and – Distribution approximates nationwide school-age population for 2005; however, insufficient data are presented to confirm this Reliability – Variable data for internal consistency, stability, and inter-scorer agreement – 2 composites reliable for making educational decisions about students Validity – Content, construct, and predictive validity evidence is presented – Validity of inferences drawn from data is somewhat unclear

Test of Language Development: Primary – Fourth Edition (TOLD-P:4) General Norm-referenced, untimed, individually administered test 4-8 years of age Used to: Identify children significantly below their peers in oral language Determine specific strengths and weaknesses Document progress in remedial programs Measure oral language in research studies

TOLD-P:4 Subtests Picture vocabulary Relational vocabulary Oral vocabulary Syntactic understanding Sentence imitation Morphological completion Word discrimination Word analysis Word articulation Scores – Raw scores converted to: Age equivalents, percentile ranks, subtest scaled scores, and composite scores – Composite scores Listening Organizing Speaking Grammar Semantics Spoken language

TOLD-P:4 Norm population 1,108 individuals across 4 geographic regions Sample partitioned according to the 2007 census Reliability Adequate estimates of reliability Coefficient alpha Test-retest Scorer difference Validity Adequate content, construct, and criterion-related validity evidence

Test of Language Development: Intermediate – Fourth Edition (TOLD-I:4) General Norm-referenced, untimed, individually administered test 8-17 years of age Used to: Identify children significantly below their peers in oral language Determine specific strengths and weaknesses Document progress in remedial programs Measure oral language in research studies

TOLD-I:4 Subtests Sentence combining Picture vocabulary Word ordering Relational vocabulary Morphological comprehension Multiple meanings Norm population 1,097 students from 4 geographic regions Sample partitioned according to the 2007 census Scores – Raw scores converted to: Age equivalents, percentile ranks, subtest scaled scores, and composite scores – Composite scores Listening Organizing Speaking Grammar Semantics Spoken language

TOLD-I:4 Reliability Adequate estimates of reliability Coefficient alpha Test-retest Scorer difference Validity Adequate content, construct, and criterion- related validity evidence

Oral and Written Language Scales (OWLS) General Norm-referenced, individually administered assessment of receptive and expressive language 3-21 years of age Subtests Listening comprehension Oral expression Written expression

OWLS Norm population 1,985 students matched to 1991 census data Scores Raw scores converted to: Standard scores, age equivalents, normal-curve equivalents, percentiles, and stanines Scores generated for each subtest, an oral language composite, and for a written language composite

OWLS Reliability Sufficient internal and test-retest reliability for screening, but not for making important decisions about individual students Validity Adequate criterion-related validity