State Exemplar: Maryland’s Alternate Assessment Using Alternate Achievement Standards
The Alternate Maryland School Assessment

Presenters:
Sharon Hall, U.S. Department of Education
Martin Kehe, Maryland State Department of Education
William Schafer, University of Maryland

Session Summary
This session highlights the Alternate Assessment based on Alternate Academic Achievement Standards in Maryland – the Alternate Maryland School Assessment (Alt-MSA). Discussion will focus on:
- A description of the assessment and the systems-change process required to develop and implement the testing program
- Development of reading, mathematics, and science item banks
- The process to ensure alignment with grade-level content standards, and results of independent alignment studies
- Technical documentation and a research agenda to support validity and reliability

Agenda
- Developing Maryland’s AA-AAAS: A Systems Change Perspective
- Conceptual Framework
- Alt-MSA Design
- Developing the Mastery Objective Banks
- Evaluation of the Alt-MSA’s alignment with content standards
- Technical Documentation and Establishing a Research Agenda to Support Validity and Reliability
- Questions and Answers

A Systems Change Perspective: Process
Collaboration:
- Divisions of Special Education and Assessment
- Stakeholder Advisory
- Alt-MSA Facilitators
- Alt-MSA Facilitators and LACs
- MSDE and Vendor
Instruction and Assessment:
- Students assigned to age-appropriate grade (for purposes of Alt-MSA)
- Local School System Grants

A Systems Change Perspective: Content
- Reading and Mathematics mastery objectives and artifacts (evidence) linked with grade-level content standards
- No program evaluation criteria

Maryland’s Alternate Assessment Design (Alt-MSA)
Portfolio Assessment:
- 10 Reading and 10 Mathematics Mastery Objectives (MOs)
- Evidence of Baseline (50% or less attained)
- Evidence of Mastery (80%–100%): 1 artifact for each MO
- 2 Reading and 3 Mathematics MOs aligned with science: vocabulary and informational text; measurement and data analysis

What’s Assessed: Reading
Maryland Reading Content Standards:
1.0 General Reading Processes
- Phonemic awareness, phonics, fluency (2 MOs)
- Vocabulary (2 MOs; 1 aligned with science)
- General reading comprehension (2 MOs)
2.0 Comprehension of Informational Text (2 MOs; 1 aligned with science)
3.0 Comprehension of Literary Text (2 MOs)

What’s Assessed: Mathematics
- Algebra, Patterns, and Functions (2 MOs)
- Geometry (2 MOs)
- Measurement (2 MOs; 1 aligned with science)
- Statistics/Data Analysis (2 MOs aligned with science)
- Number Relationships and Computation (2 MOs)

What’s Assessed: Science (2008)
Grades 5, 8, and 10.
Grades 5 and 8 (select 1 MO each):
- Earth/Space Science
- Life Science
- Chemistry
- Physics
- Environmental Science
Grade 10:
- 5 Life Science MOs

Steps in the Alt-MSA Process
Step 1 (September):
- Principal meets with Test Examiner Teams (TETs)
- Review results or conduct pre-assessment

Steps in the Alt-MSA Process
Step 2 (September–November):
- TET selects or writes Mastery Objectives
- Principal reviews and submits
- Share with parents
- Revise (written) Mastery Objectives

Steps in the Alt-MSA Process
Step 3 (September–March):
- Collect baseline data for Mastery Objectives (50% or less accuracy)
- Teach Mastery Objectives
- Assess Mastery Objectives
- Construct portfolio

Standardized
- Number of mastery objectives assessed
- Format of mastery objectives
- Content standards/topics assessed
- All MOs must have baseline data and evidence of mastery at 80%–100%
- Types of artifacts permissible
- Components of artifacts
- Training and Handbook provided
- Scoring training and procedures

MO Format

Evidence (Artifacts)
Acceptable Artifacts (Primary Evidence):
- Videotapes (1 reading and 1 math mandatory)
- Audiotape
- Student work (original)
- Data collection charts (original)
Unacceptable Artifacts:
- Photographs, checklists, narrative descriptions

Artifact Requirements
- Aligned with Mastery Objective
- Must include baseline data demonstrating the student performs the MO with 50% or less accuracy
- Data chart must show 3–5 demonstrations of instruction prior to mastery
- The observable, measurable student response must be evident (not “trial 1”)
- Mastery is 80%–100% accuracy
- Name, date, accuracy score, prompts

Scores and Condition Codes
A – MO is not aligned
B – Artifact is missing or not acceptable
C – Artifact is incomplete
D – Artifact does not align with MO, or components of MO are missing
E – Data chart does not show 3–5 observations of instruction on different days prior to demonstration of mastery
F – Accuracy score is not reported

Reliability: Scorer Training
- Conducted by contractor scoring director; MSDE always present
- Must attain 80% accuracy on each qualifying set
- Every portfolio is scored twice, by 2 different teams
- Daily backreading by supervisors and scoring directors
- Daily inter-rater reliability data
- Twice-weekly validity checks
- Ongoing retraining

Maryland’s Alt-MSA Report

Development of the Mastery Objective Banks
- Initial three years of the program involved teachers writing individualized reading and mathematics Mastery Objectives (approximately 100,000 objectives each year)
- A necessary process to help staff learn the content standards
- Maryland and contractor staff reviewed 100% of MOs for alignment and technical quality

Mastery Objective Banks
- Prior to year 4, Maryland conducted an analysis of written MOs to create the MO Banks for reading and mathematics
- Banked items available in an online application, linked to and aligned with content standards
- Provided an additional degree of standardization
- Process still allows for writing of customized MOs, as needed

Mastery Objective Banks
- In year 4, baseline MO measurement was added
- Teachers take stock of where a student is, without prompts, at the beginning of the year on each proposed MO
- This helps ensure that students learn and are assessed on skills and knowledge they have not already mastered
- Year 5 added a Science MO Bank

Mastery Objective Banks

National Alternate Assessment Center (NAAC) Alignment Study of the Alt-MSA

NAAC Alt-MSA Alignment Study
- Conducted by staff from the University of North Carolina at Charlotte and Western Carolina University from March–August 2007
- The study investigated the alignment of Alt-MSA Mastery Objectives in reading and mathematics to grade-level content standards

NAAC Alt-MSA Alignment Study
- Eight (8) criteria were used to evaluate alignment
- Criteria were developed in collaboration by content experts, special educators, and measurement experts at the University of North Carolina at Charlotte (Browder, Wakeman, Flowers, Rickleman, Pugalee, & Karvonen, 2006)
- A stratified random sampling method (stratified on grade level) was used to select the portfolios: grades 3–8 and 10; 225 reading / 231 mathematics

Alignment Results by Criterion
Criterion 1: The content is academic and includes the major domains/strands of the content area as reflected in state and national standards (e.g., reading, math, science).
Outcome:
- Reading: 99% of MOs were rated academic
- Math: 94% of MOs were rated academic

Alignment Results by Criterion
Criterion 2: The content is referenced to the student’s assigned grade level (based on chronological age).
Outcome:
- Reading: 82% of the MOs reviewed were referenced to a grade-level standard. (2% were not referenced to any grade-level standard; 16% were referenced to off-grade (K–2) standards for phonics and phonemic awareness.)
- Math: 97% were referenced to a grade-level standard

Alignment Results by Criterion
Criterion 3: The focus of achievement maintains fidelity with the content of the original grade-level standards (content centrality) and, when possible, the specified performance.
Outcome:
- Reading: 99% of MOs rated far or near for content centrality; 92% of MOs rated partial or full performance centrality; 90% rated as being linked to the MO
- Math: 92% of MOs rated far in content centrality; 92% of MOs rated partial performance centrality; 92% rated as being linked to the MO

Alignment Results by Criterion
Criterion 4: The content differs from grade level in range, balance, and Depth of Knowledge (DOK), but matches high expectations set for students with significant cognitive disabilities.
Outcome:
- Reading: All reading standards had multiple MOs linked to the standard; although 73% were rated at the memorize/recall depth-of-knowledge level, there were MOs rated at higher depth-of-knowledge levels (i.e., comprehension, application, and analysis)
- Math: MOs were aligned to all grade-level standards and distributed across all depth-of-knowledge levels except the lowest (i.e., attention), with the largest percentage of MOs at the performance and analysis/synthesis/evaluation levels

Alignment Results by Criterion
Criterion 5: There is some differentiation in achievement across grade levels or grade bands.
Outcome:
- Reading: Overall, reading has good differentiation across grade levels
- Math: While there is some limited differentiation, some items were redundant from lower to upper grades
Criterion 6: The expected achievement is for students to show learning of grade-referenced academic content.
Outcome:
- The Alt-MSA score is not augmented with program factors. However, in cases where more intrusive prompting is used, the level of inference that can be made is limited.

Alignment Results by Criterion
Criterion 7: The potential barriers to demonstrating what students know and can do are minimized in the assessment.
Outcome:
- Alt-MSA minimizes barriers for the broadest range of heterogeneity within the population, because flexibility is built into the tasks teachers select. (92% of the MOs were accessible at an abstract level of symbolic communication, while the remaining MOs were accessible at a concrete level of symbolic communication.)
Criterion 8: The instructional program promotes learning in the general curriculum.
Outcome:
- The Alt-MSA Handbook is well developed and covers the grade-level domains included in the alternate assessment. Some LEAs in Maryland have exemplary professional development materials.

Study Summary
- Overall, the Alt-MSA demonstrated good access to the general curriculum
- The Alt-MSA was well developed and covered the grade-level standards
- The quality of professional development materials varied across counties

Technical Documentation of the Alt-MSA

Sources
- Alt-MSA Technical Manuals (2004, 2005, 2006)
- Schafer, W. D. (2005). Technical documentation for alternate assessments. Practical Assessment, Research and Evaluation, 10(10). Available at PAREonline.net.
- Marion, S. F., & Pellegrino, J. W. (2007). A validity framework for evaluating the technical adequacy of alternate assessments. Educational Measurement: Issues and Practice, 25(4).
- Report from the National Alternate Assessment Center from a panel review of the Alt-MSA
- Contracted technical studies on Alt-MSA

Validity of the Criterion Is Always Important
- To judge proficiency in any assessment, a student’s score is compared with a criterion score
- Regular assessment: standard setting generates a criterion score for all examinees
- Regular assessment: the criterion score is assumed appropriate for everyone
- It defines an expectation for minimally acceptable performance
- It is interpreted in behavioral terms through achievement level descriptions

Criterion in Alternate Assessment
- A primary question in alternate assessment: Should the same criterion score apply to everyone?
- Our answer was no, because behaviors that imply success for some students imply failure for others
- This implies that flexible criteria are needed to judge the success of a student or of a teacher – unlike the regular assessment

Criterion Validity
- The quality of criteria is documented for the regular assessment through a standard-setting study
- When criteria vary, each different criterion needs to be documented
- So we need to consider both score and criterion reliability and validity for the Alt-MSA

Technical Research Agenda
There are four sorts of technical research we should undertake:
- Reliability of Criteria
- Reliability of Scores
- Validity of Criteria
- Validity of Scores
We will describe some examples and possibilities for each.

Reliability of Criteria
- Could see whether the criteria (MOs) are internally consistent for a student in terms of difficulty, cognitive demand, and/or the levels of the content elements they represent
- Could do that for, say, 9 samples of students: low-medium-high (L-M-H) degrees of challenge crossed with L-M-H grade levels
- Degree of challenge might be assessed by age of identification of disability or by the location in the extended standards of last year’s MOs

Reliability of Scores
- A 2007 rescore of a 5% sample of 2006 portfolios (n=266) showed agreement rates of 82%–89% for reading and 83%–89% for math
- A NAAC review concluded the inter-rater evidence of scorer reliability is strong
- The amount of evidence could be evaluated using Smith’s (2003) approach of modeling error with the binomial distribution to get decision accuracy estimates:

Decision Accuracy Study
- Assume each student produces a sample of size 10 from a binomial population of MOs
- Use the binomial distribution to generate the probabilities of all outcomes (X = 0 to 10) for any π
- For convenience, use the midpoints of ten equally spaced intervals for π (.05, .15, …, .95)
- Using X = 0–5 (0–50%) for Basic, X = 6–8 (60–80%) for Proficient, and X = 9–10 (90–100%) for Advanced yields:

Classification Probabilities for Students with Various πs
[Table of classification probabilities: columns π, Basic, Proficient, Advanced; numeric entries not preserved in this transcript.]
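
That table is easy to reproduce. Below is a minimal Python sketch (not part of the original presentation) that models each student’s portfolio as a Binomial(10, π) sample of MOs and prints the classification probability for each π midpoint, using the cut points X ≤ 5 for Basic, X = 6–8 for Proficient, and X ≥ 9 for Advanced described above:

```python
from math import comb

N = 10  # each portfolio samples 10 MOs

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def classification_probs(pi, n=N):
    """Probability that a student with true mastery rate pi is classified
    Basic (X <= 5), Proficient (6 <= X <= 8), or Advanced (X >= 9)."""
    basic = sum(binom_pmf(x, n, pi) for x in range(6))
    proficient = sum(binom_pmf(x, n, pi) for x in range(6, 9))
    advanced = sum(binom_pmf(x, n, pi) for x in range(9, n + 1))
    return basic, proficient, advanced

# Midpoints of ten equally spaced intervals for pi: .05, .15, ..., .95
pis = [0.05 + 0.10 * k for k in range(10)]
for pi in pis:
    b, p, a = classification_probs(pi)
    print(f"pi = {pi:.2f}: Basic {b:.4f}, Proficient {p:.4f}, Advanced {a:.4f}")
```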

3x3 Decision Accuracy
Collapsing across π with True Basic = .05–.55, True Proficient = .65–.85, True Advanced = .95:
[3×3 table of true level (Basic, Proficient, Advanced) by classification, with row totals; numeric entries not preserved in this transcript.]
P(Accurate) = .7821
This assumes equally weighted πs.
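
The equal-weight collapse takes only a few more lines. This continues the sketch above (it reuses classification_probs); the true-level grouping below is a reconstruction, chosen because it reproduces the slide’s P(Accurate) = .7821:

```python
# Collapse across pi with equal weights. True-level grouping:
# Basic pi = .05-.55, Proficient pi = .65-.85, Advanced pi = .95.
true_level = {0.05: "B", 0.15: "B", 0.25: "B", 0.35: "B", 0.45: "B",
              0.55: "B", 0.65: "P", 0.75: "P", 0.85: "P", 0.95: "A"}

p_accurate = sum(
    dict(zip("BPA", classification_probs(pi)))[level]
    for pi, level in true_level.items()
) / len(true_level)
print(f"P(Accurate) = {p_accurate:.4f}")  # 0.7821, matching the slide
```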

Empirically Weighted πs
Mastery Objectives mastered in 2006 for reading and math (N = 4,851 students)
[Table: columns Percent Mastered, Reading Percent, Math Percent; numeric entries not preserved in this transcript.]

3x3 Decision Accuracy with Empirical Weights – Reading
[3×3 table of true level by observed achievement level, with row totals; numeric entries not preserved in this transcript.]
P(Accurate) = .7908

NCLB requires decisions in terms of Proficient/Advanced vs. Basic
Observed Level Group – Reading
[2×2 table of true level (Basic vs. Proficient or Advanced) by observed level; entries not preserved in this transcript.]
These are conditional probabilities – they sum to 1 by rows.
P[Type I Error (taking action)] = .0451
P[Type II Error (taking no action)] = .1284
These are less than Cohen’s guidelines of .05 and .20.
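
The Type I/Type II figures come from collapsing the 3×3 matrix to 2×2 and row-normalizing. A hedged sketch of that computation follows, reusing classification_probs and true_level from the earlier blocks; because this transcript does not preserve the empirical 2006 weights, the equal weights passed at the end are a placeholder and will not reproduce .0451/.1284 exactly:

```python
# Type I: classified Basic when truly Proficient/Advanced ("taking action").
# Type II: classified Proficient/Advanced when truly Basic ("no action").
def error_rates(weights):
    joint = {("B", "B"): 0.0, ("B", "PA"): 0.0,
             ("PA", "B"): 0.0, ("PA", "PA"): 0.0}
    for pi, w in weights.items():
        b, p, a = classification_probs(pi)
        truth = "B" if true_level[pi] == "B" else "PA"
        joint[(truth, "B")] += w * b
        joint[(truth, "PA")] += w * (p + a)
    # Row-normalize so each true level's probabilities sum to 1, as on the slide.
    type1 = joint[("PA", "B")] / (joint[("PA", "B")] + joint[("PA", "PA")])
    type2 = joint[("B", "PA")] / (joint[("B", "B")] + joint[("B", "PA")])
    return type1, type2

# Placeholder equal weights; the slide's figures use the empirical 2006
# MO-mastery weights, which are not reproduced in this transcript.
print(error_rates({pi: 0.1 for pi in true_level}))
```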

3x3 Decision Accuracy with Empirical Weights – Math
[3×3 table of true level by observed achievement level, with row totals; numeric entries not preserved in this transcript.]
P(Accurate) = .7942

NCLB requires decisions in terms of Proficient/Advanced vs. Basic
Observed Level Group – Math
[2×2 table of true level (Basic vs. Proficient or Advanced) by observed level; entries not preserved in this transcript.]
These are conditional probabilities – they sum to 1 by rows.
P[Type I Error (taking action)] = .0398
P[Type II Error (taking no action)] = .1365
These are also less than Cohen’s guidelines of .05 and .20.

Reliability of Scores: Conclusions
- Decision accuracy for Reading is 79.1%
- Decision accuracy for Math is 79.4%
- Misclassification probabilities:

  False            Reading   Math
  Proficient       12.8%     13.6%
  Not Proficient   4.5%      4.0%

- These are within Cohen’s guidelines

Validity of Criteria: Content Evidence
- Could study the MO development and review process for 9 samples of students: L-M-H degrees of challenge for L-M-H grade levels
- Could map student progress along content-standard strands over time
- Could evaluate and monitor the use of the bank
- Could survey parents: are MOs too modest, about right, or too idealistic?
- MSDE will conduct a new cut-score study

Validity of Criteria: Quantitative Evidence
For n=267 same-student portfolio pairs from 2006 and 2007: …% of 2007 reading MOs and 90% of 2007 math MOs were completely new or more demanding than the respective student’s 2006 MOs (suggesting growth).
Alternate standard-setting studies could generate evidence about the validity of the existing (or resulting) criteria:

Possible Alternate Standard Setting Study Approaches
- Develop percentage cut scores for groups with different degrees of disability (e.g., modified Angoff) and articulate vertically and horizontally
- Establish criterion groups using an external criterion and identify cut scores that minimize classification errors (contrasting groups; see the sketch below)
- Set cut points that match the percentages of students in the achievement levels in the general population (equipercentile)
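
As one illustration of the second bullet, the contrasting-groups approach picks the cut score that minimizes classification errors between two externally identified criterion groups. A minimal sketch follows; all score data are hypothetical, invented purely for illustration:

```python
# Contrasting-groups sketch: judges assign students to criterion groups
# using an external criterion; the cut score is chosen to minimize the
# classification errors between the groups' score distributions.
proficient_scores = [6, 7, 7, 8, 8, 9, 9, 10]     # judged proficient (hypothetical)
not_proficient_scores = [2, 3, 4, 4, 5, 5, 6, 7]  # judged not proficient (hypothetical)

def misclassified(cut):
    # Proficient students falling below the cut, plus not-proficient
    # students landing at or above it.
    return (sum(s < cut for s in proficient_scores)
            + sum(s >= cut for s in not_proficient_scores))

best_cut = min(range(0, 11), key=misclassified)
print(f"cut = {best_cut}, errors = {misclassified(best_cut)}")
```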

Validity of Criteria: Consequential Evidence
- Could study IEPs to see whether they have become more oriented toward academic goals over time
- Could study the ability of the Alt-MSA to drive instruction – e.g., do the enacted content standards move toward the assessed content standards?

Validity of Scores: Content Evidence
Could study how well raters can categorize samples of artifacts into the content-strand elements their MOs were designed to represent

Validity of Scores: Consequential Evidence
Could survey stakeholders:
- How have the scores been used?
- How have the scores been misused?

Two Philosophical Issues
- Justification is needed for implementing flexible performance expectations all the way down to the individual student
- Justification is needed for using standardized percentages for success categories across the flexible performance expectations

Contact Information
Sharon Hall –
Martin Kehe –
William Schafer –