
1 State Exemplar: Maryland's Alternate Assessment Using Alternate Achievement Standards
The Alternate Maryland School Assessment
Presenters:
Sharon Hall, U.S. Department of Education
Martin Kehe, Maryland State Department of Education
William Schafer, University of Maryland

2 Session Summary
This session highlights the Alternate Assessment based on Alternate Academic Achievement Standards in Maryland: the Alternate Maryland School Assessment (Alt-MSA). Discussion will focus on:
A description of the assessment and the systems-change process required to develop and implement the testing program
Development of the reading, mathematics, and science item banks
The process to ensure alignment with grade-level content standards, and results of independent alignment studies
Technical documentation and a research agenda to support validity and reliability

3 Agenda
Developing Maryland's AA-AAAS: A Systems Change Perspective
Conceptual Framework
Alt-MSA Design
Developing the Mastery Objective Banks
Evaluation of the Alt-MSA's Alignment with Content Standards
Technical Documentation and Establishing a Research Agenda to Support Validity and Reliability
Questions and Answers

4 A Systems Change Perspective: Process
Collaboration:
Divisions of Special Education and Assessment
Stakeholder Advisory
Alt-MSA Facilitators
Alt-MSA Facilitators and LACs
MSDE and Vendor
Instruction and Assessment
Students assigned to age-appropriate grade (for purposes of Alt-MSA)
Local School System Grants

5 A Systems Change Perspective: Content
Reading and mathematics mastery objectives and artifacts (evidence) linked with grade-level content standards
No program evaluation criteria

6 Maryland's Alternate Assessment Design (Alt-MSA)
Portfolio assessment
10 Reading and 10 Mathematics Mastery Objectives (MOs)
Evidence of baseline (50% or less attained)
Evidence of mastery (80%-100%): 1 artifact for each MO
2 Reading and 3 Mathematics MOs aligned with science (vocabulary and informational text; measurement and data analysis)

7 What's Assessed: Reading
Maryland Reading Content Standards:
1.0 General Reading Processes
  Phonemic awareness, phonics, fluency (2 MOs)
  Vocabulary (2 MOs; 1 aligned with science)
  General reading comprehension (2 MOs)
2.0 Comprehension of Informational Text (2 MOs; 1 aligned with science)
3.0 Comprehension of Literary Text (2 MOs)

8 What's Assessed: Mathematics
Algebra, Patterns, and Functions (2 MOs)
Geometry (2 MOs)
Measurement (2 MOs; 1 aligned with science)
Statistics-Data Analysis (2 MOs aligned with science)
Number Relationships and Computation (2 MOs)

9 What's Assessed: Science (2008)
Grades 5, 8, and 10
Grades 5 and 8: select 1 MO each from
  Earth/Space Science
  Life Science
  Chemistry
  Physics
  Environmental Science
Grade 10: 5 Life Science MOs

10 Steps in the Alt-MSA Process
Step 1 (September): Principal meets with Test Examiner Teams (TETs)
Review results or conduct pre-assessment

11 Steps in the Alt-MSA Process
Step 2 (September-November):
TET selects or writes Mastery Objectives
Principal reviews and submits
Share with parents
Revise (written) Mastery Objectives

12 Steps in the Alt-MSA Process
Step 3 (September-March):
Collect baseline data for Mastery Objectives (50% or less accuracy)
Teach Mastery Objectives
Assess Mastery Objectives
Construct portfolio

13 Standardized
Number of mastery objectives assessed
Format of mastery objectives
Content standards/topics assessed
All MOs must have baseline data and evidence of mastery at 80%-100%
Types of artifacts permissible
Components of artifacts
Training and Handbook provided
Scoring training and procedures

14 MO Format

15 Evidence (Artifacts)
Acceptable artifacts (primary evidence):
Videotapes (1 reading and 1 math mandatory)
Audiotape
Student work (original)
Data collection charts (original)
Unacceptable artifacts: photographs, checklists, narrative descriptions

16 Artifact Requirements
Aligned with the Mastery Objective
Must include baseline data demonstrating that the student performs the MO with 50% or less accuracy
Data chart must show 3-5 demonstrations of instruction prior to mastery
The observable, measurable student response must be evident (not merely "trial 1")
Mastery is 80%-100% accuracy
Must include name, date, accuracy score, and prompts used

17 Scores and Condition Codes
A: MO is not aligned
B: Artifact is missing or not acceptable
C: Artifact is incomplete
D: Artifact does not align with MO, or components of MO are missing
E: Data chart does not show 3-5 observations of instruction on different days prior to demonstration of mastery
F: Accuracy score is not reported
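The condition codes behave like an ordered checklist over the artifact requirements on the previous slide, which is easy to express in code. Below is a minimal sketch in Python of how such a decision rule might be encoded; the Artifact record and its field names are our own invention for illustration, not part of the actual Alt-MSA scoring system.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Artifact:
    # Hypothetical record; these field names are illustrative, not MSDE's schema.
    mo_aligned_with_standard: bool     # the MO itself is aligned (code A)
    present_and_acceptable: bool       # artifact exists and is an allowed type (code B)
    complete: bool                     # all required components present (code C)
    aligns_with_mo: bool               # artifact matches its MO (code D)
    instruction_observations: int      # dated data-chart entries before mastery (code E)
    mastery_accuracy: Optional[float]  # reported accuracy score, in percent (code F)

def condition_code(a: Artifact) -> Optional[str]:
    """Return the first applicable condition code, or None if the artifact is scorable."""
    if not a.mo_aligned_with_standard:
        return "A"
    if not a.present_and_acceptable:
        return "B"
    if not a.complete:
        return "C"
    if not a.aligns_with_mo:
        return "D"
    if not (3 <= a.instruction_observations <= 5):
        return "E"
    if a.mastery_accuracy is None:
        return "F"
    return None  # scorable; mastery still requires 80%-100% accuracy
```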

18 Reliability: Scorer Training
Conducted by the contractor's scoring director; MSDE always present
Scorers must attain 80% accuracy on each qualifying set
Every portfolio is scored twice, by 2 different teams
Daily backreading by supervisors and scoring directors
Daily inter-rater reliability data
Twice-weekly validity checks
Ongoing retraining

19 Maryland’s Alt-MSA Report

20 Development of the Mastery Objective Banks
The initial three years of the program involved teachers writing individualized reading and mathematics Mastery Objectives (approximately 100,000 objectives each year)
A necessary process to help staff learn the content standards
Maryland and contractor staff reviewed 100% of MOs for alignment and technical quality

21 Mastery Objective Banks
Prior to year 4, Maryland conducted an analysis of written MOs to create the MO Banks for reading and mathematics
Banked items are available in an online application, linked to and aligned with content standards
Provides an additional degree of standardization
The process still allows customized MOs to be written as needed

22 Mastery Objective Banks
In year 4, baseline MO measurement was added: teachers take stock of where a student is at the beginning of the year, without prompts, on each proposed MO
This helps ensure that students are taught and assessed on skills and knowledge they have not already mastered
Year 5 added the Science MO Bank

23 Mastery Objective Banks


28 National Alternate Assessment Center (NAAC) Alignment Study of the Alt-MSA

29 NAAC Alt-MSA Alignment Study
Conducted by staff from the University of North Carolina at Charlotte and Western Carolina University from March to August 2007
The study investigated the alignment of Alt-MSA Mastery Objectives in reading and mathematics to grade-level content standards

30 NAAC Alt-MSA Alignment Study
Eight (8) criteria were used to evaluate alignment
The criteria were developed in collaboration by content experts, special educators, and measurement experts at the University of North Carolina at Charlotte (Browder, Wakeman, Flowers, Rickleman, Pugalee, & Karvonen, 2006)
A stratified random sampling method (stratified on grade level) was used to select portfolios from grades 3-8 and 10: 225 reading, 231 mathematics

31 Alignment Results by Criterion
Criterion 1: The content is academic and includes the major domains/strands of the content area as reflected in state and national standards (e.g., reading, math, science)
Outcome:
Reading: 99% of MOs were rated academic
Math: 94% of MOs were rated academic

32 Alignment Results by Criterion
Criterion 2: The content is referenced to the student's assigned grade level (based on chronological age)
Outcome:
Reading: 82% of the MOs reviewed were referenced to a grade-level standard; 2% were not referenced to a grade-level standard; 16% were referenced to off-grade (K-2) phonics and phonemic awareness standards
Math: 97% were referenced to a grade-level standard

33 Alignment Results by Criterion
Criterion 3: The focus of achievement maintains fidelity with the content of the original grade-level standards (content centrality) and, when possible, the specified performance
Outcome:
Reading: 99% of MOs were rated far or near for content centrality, 92% were rated partial or full for performance centrality, and 90% were rated as linked to the MO
Math: 92% of MOs were rated far for content centrality, 92% were rated partial for performance centrality, and 92% were rated as linked to the MO

34 Alignment Results by Criterion
Criterion 4: The content differs from grade level in range, balance, and Depth of Knowledge (DOK), but matches high expectations set for students with significant cognitive disabilities
Outcome:
Reading: All reading standards had multiple MOs linked to them; although 73% of MOs were rated at the memorize/recall depth-of-knowledge level, there were MOs rated at higher levels (i.e., comprehension, application, and analysis)
Math: MOs were aligned to all grade-level standards and distributed across all depth-of-knowledge levels except the lowest (i.e., attention), with the largest percentage of MOs at the performance and analysis/synthesis/evaluation levels

35 Alignment Results by Criterion
Criterion 5: There is some differentiation in achievement across grade levels or grade bands
Outcome:
Reading: Overall, reading showed good differentiation across grade levels
Math: While there was some limited differentiation, some items were redundant from lower to upper grades
Criterion 6: The expected achievement is for students to show learning of grade-referenced academic content
Outcome: The Alt-MSA score is not augmented with program factors. However, in cases where more intrusive prompting is used, the level of inference that can be made is limited.

36 Alignment Results by Criterion
Criterion 7: Potential barriers to demonstrating what students know and can do are minimized in the assessment
Outcome: The Alt-MSA minimizes barriers for the broadest range of heterogeneity within the population because flexibility is built into the tasks teachers select (92% of the MOs were accessible at an abstract level of symbolic communication; the remaining MOs were accessible at a concrete level of symbolic communication)
Criterion 8: The instructional program promotes learning in the general curriculum
Outcome: The Alt-MSA Handbook is well developed and covers the grade-level domains included in the alternate assessment. Some LEAs in Maryland have exemplary professional development materials.

37 Study Summary
Overall, the Alt-MSA demonstrated good access to the general curriculum
The Alt-MSA was well developed and covered the grade-level standards
The quality of professional development materials varied across counties

38 Technical Documentation of the Alt-MSA

39 Sources
Alt-MSA Technical Manuals (2004, 2005, 2006)
Schafer, W. D. (2005). Technical documentation for alternate assessments. Practical Assessment, Research and Evaluation, 10(10). Available at PAREonline.net.
Marion, S. F., & Pellegrino, J. W. (2007). A validity framework for evaluating the technical adequacy of alternate assessments. Educational Measurement: Issues and Practice, 25(4), 47-57.
Report from the National Alternate Assessment Center from a panel review of the Alt-MSA
Contracted technical studies on the Alt-MSA

40 Validity of the Criterion Is Always Important
To judge proficiency in any assessment, a student's score is compared with a criterion score
Regular assessment: standard setting generates a criterion score for all examinees
Regular assessment: the criterion score is assumed appropriate for everyone
It defines an expectation for minimally acceptable performance
It is interpreted in behavioral terms through achievement level descriptions

41 Criterion in Alternate Assessment
A primary question in alternate assessment: should the same criterion score apply to everyone?
Our answer was no, because behaviors that imply success for some students imply failure for others
This implies that flexible criteria are needed to judge the success of a student or of a teacher, unlike the regular assessment

42 Criterion Validity
The quality of criteria is documented for the regular assessment through a standard-setting study
When criteria vary, each different criterion needs to be documented
So we need to consider both score and criterion reliability and validity for the Alt-MSA

43 Technical Research Agenda
There are four sorts of technical research we should undertake:
Reliability of Criteria
Reliability of Scores
Validity of Criteria
Validity of Scores
We will describe some examples and possibilities for each.

44 Reliability of Criteria
Could examine whether the criteria (MOs) are internally consistent for a student in terms of difficulty, cognitive demand, and/or the levels of the content elements they represent
Could do that for, say, 9 samples of students: low-medium-high (L-M-H) degrees of challenge crossed with L-M-H grade levels
Degree of challenge might be assessed by age of identification of disability or by where last year's MOs fall in the extended standards

45 Reliability of Scores
A 2007 rescore of a 5% sample of 2006 portfolios (n=266) showed agreement rates of 82%-89% for reading and 83%-89% for math
A NAAC review concluded the inter-rater evidence of scorer reliability is strong
The amount of evidence could be evaluated using Smith's (2003) approach of modeling error with the binomial distribution to obtain decision accuracy estimates:

46 Decision Accuracy Study
Assume each student produces a sample of size 10 from a binomial population of MOs
The binomial distribution then gives the probabilities of all outcomes (X = 0 to 10) for any true mastery rate π
For convenience, use the midpoints of ten equally spaced intervals for π (.05, .15, ..., .95)
Using X (as percent of MOs mastered) = 0-50 for Basic, 60-80 for Proficient, and 90-100 for Advanced yields:
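This binomial calculation is straightforward to carry out. The Python sketch below generates the classification probabilities from the stated cuts; its values match the published table on the next slide closely (small differences reflect rounding in the source).

```python
from math import comb

def class_probs(pi: float, n: int = 10) -> tuple:
    """P(Basic), P(Proficient), P(Advanced) for a student whose true mastery
    rate is pi, using the cuts above: X <= 5 Basic, 6-8 Proficient, 9-10 Advanced."""
    pmf = [comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)]
    return sum(pmf[0:6]), sum(pmf[6:9]), sum(pmf[9:])

for pi in [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]:
    b, p, a = class_probs(pi)
    print(f"{pi:.2f}  {b:.4f}  {p:.4f}  {a:.4f}")
```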

47 Classification Probabilities for Students with Various πs

π     Basic   Proficient  Advanced
.95   .0001   .0861       .9138
.85   .0098   .4458       .5443
.75   .0781   .6779       .2440
.65   .2485   .6656       .0860
.55   .4956   .4812       .0232
.45   .7384   .2571       .0045
.35   .9052   .0944       .0005
.25   .9803   .0207       .0000
.15   .9986   .0013       .0000
.05   1.000   .0000       .0000

48 3x3 Decision Accuracy
Collapsing across π with True Basic = .05-.55, True Proficient = .65-.85, True Advanced = .95:

Classification:
True Level   Basic   Proficient  Advanced  Total
Advanced     .0000   .0086       .0914     .1000
Proficient   .0336   .1789       .0874     .3000
Basic        .5118   .0855       .0028     .6000

P(Accurate) = .5118 + .1789 + .0914 = .7821
This assumes equally weighted πs
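Continuing the sketch above (reusing class_probs), the equal-weights collapse is a one-line sum over the diagonal of the joint table; it reproduces the slide's P(Accurate) = .7821.

```python
# Equal weight (1/10) on each pi midpoint; a student's true level follows the
# same grouping as above: pi <= .55 Basic, .65-.85 Proficient, .95 Advanced.
pis = [round(0.05 + 0.1 * k, 2) for k in range(10)]

def true_level(pi: float) -> int:
    return 0 if pi <= 0.55 else (1 if pi <= 0.85 else 2)

accuracy = sum(class_probs(pi)[true_level(pi)] for pi in pis) / len(pis)
print(f"P(Accurate) = {accuracy:.4f}")  # -> 0.7821
```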

49 Empirically Weighted πs
Mastery Objectives mastered in 2006 for reading and math (N = 4,851 students)

Percent Mastered  Reading %  Math %
100               21.8       26.4
90                16.1       16.7
80                11.6       10.3
70                 8.0        7.8
60                 6.7        6.1
50                 5.5        5.8
40                 4.9        4.6
30                 5.1        4.1
20                 4.7        4.1
10                 6.7        6.3
0                  6.9        7.7
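The empirical weighting can be reconstructed from this table. In the sketch below (again reusing class_probs, true_level, and pis from the earlier sketches), each observed mastery percentage is mapped onto the π midpoints, with boundary values (10, 20, ..., 90) split evenly between the two adjacent intervals. That mapping is our inference, not stated on the slides; we use it because it reproduces the "Total" column of the math table on slide 52 to within rounding.

```python
# Percent of students mastering each percentage of MOs (math column, slide 49).
math_pct = {100: 26.4, 90: 16.7, 80: 10.3, 70: 7.8, 60: 6.1,
            50: 5.8, 40: 4.6, 30: 4.1, 20: 4.1, 10: 6.3, 0: 7.7}

weights = {pi: 0.0 for pi in pis}
for m, pct in math_pct.items():
    p = pct / 100.0
    if m == 0:
        weights[0.05] += p            # bottom interval
    elif m == 100:
        weights[0.95] += p            # top interval
    else:                             # boundary value: split between adjacent bins
        weights[round(m / 100 - 0.05, 2)] += p / 2
        weights[round(m / 100 + 0.05, 2)] += p / 2

accuracy = sum(w * class_probs(pi)[true_level(pi)] for pi, w in weights.items())
print(f"P(Accurate, math) = {accuracy:.3f}")  # ~0.793; slide 52 reports .7942 from unrounded data
```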

50 3x3 Decision Accuracy with Empirical Weights - Reading

Observed Achievement Level:
True Level   Basic   Proficient  Advanced  Total
Advanced     .0000   .0258       .2726     .2984
Proficient   .0274   .1768       .1057     .3099
Basic        .3414   .0486       .0017     .3917

P(Accurate) = .3414 + .1768 + .2726 = .7908

51 NCLB requires decisions in terms of Proficient/Advanced vs. Basic

Observed Level Group - Reading:
True Level               Basic   Proficient or Advanced
Proficient or Advanced   .0451   .9549
Basic                    .8716   .1284

These are conditional probabilities; they sum to 1 by rows.
P[Type I Error (taking action)] = .0451
P[Type II Error (taking no action)] = .1284
These are less than Cohen's guidelines of .05 and .20.
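The dichotomous error rates follow mechanically from the 3x3 joint table: merge the Proficient and Advanced categories, then normalize each row. A self-contained sketch using the reading table from the previous slide:

```python
# joint[true][observed]; rows and columns ordered Basic, Proficient, Advanced (slide 50).
joint = [[0.3414, 0.0486, 0.0017],   # true Basic
         [0.0274, 0.1768, 0.1057],   # true Proficient
         [0.0000, 0.0258, 0.2726]]   # true Advanced

def dichotomize(row):
    """Conditional P(observed Basic), P(observed Proficient-or-Advanced), given the true row."""
    basic, prof_or_adv = row[0], row[1] + row[2]
    total = basic + prof_or_adv
    return basic / total, prof_or_adv / total

# Merge the true Proficient and true Advanced rows before conditioning.
prof_adv_row = [p + a for p, a in zip(joint[1], joint[2])]
print(dichotomize(prof_adv_row))  # ~(.0451, .9549): Type I error ~ .045
print(dichotomize(joint[0]))      # ~(.8716, .1284): Type II error ~ .128
```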

52 3x3 Decision Accuracy with Empirical Weights - Math

Observed Achievement Level:
True Level   Basic   Proficient  Advanced  Total
Advanced     .0000   .0299       .3174     .3474
Proficient   .0256   .1676       .1014     .2946
Basic        .3092   .0472       .0017     .3581

P(Accurate) = .3092 + .1676 + .3174 = .7942

53 NCLB requires decisions in terms of Proficient/Advanced vs. Basic

Observed Level Group - Math:
True Level               Basic   Proficient or Advanced
Proficient or Advanced   .0398   .9602
Basic                    .8635   .1365

These are conditional probabilities; they sum to 1 by rows.
P[Type I Error (taking action)] = .0398
P[Type II Error (taking no action)] = .1365
These are also less than Cohen's guidelines of .05 and .20.

54 Reliability of Scores: Conclusions
Decision accuracy for reading is 79.1%
Decision accuracy for math is 79.4%
Misclassification probabilities:

Falsely classified  Reading  Math
Proficient          12.8%    13.6%
Not Proficient       4.5%     4.0%

These are within Cohen's guidelines

55 Validity of Criteria: Content Evidence
Could study the MO development and review process for 9 samples of students: L-M-H degrees of challenge for L-M-H grade levels
Could map student progress along content standard strands over time
Could evaluate and monitor use of the bank
Could survey parents: are MOs too modest, about right, or too idealistic?
MSDE will conduct a new cut-score study

56 Validity of Criteria: Quantitative Evidence
For n=267 same-student portfolio pairs from 2006 and 2007:
95% of 2007 reading MOs and 90% of 2007 math MOs were completely new or more demanding than the respective student's 2006 MOs (suggesting growth)
Alternate standard-setting studies could generate evidence about the validity of the existing (or resulting) criteria:

57 Possible Alternate Standard-Setting Study Approaches
Develop percentage cut-scores for groups with different degrees of disability (e.g., modified Angoff) and articulate vertically and horizontally
Establish criterion groups using an external criterion and identify cut scores that minimize classification errors (contrasting groups)
Set cutpoints that match the percentages of students in the achievement levels in the general population (equipercentile); see the sketch below
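As an illustration of the equipercentile idea only: place cut scores on the alternate assessment's score distribution so that the proportions falling in each level match the general population's. Everything in this sketch (the score scale, the sample, and the target proportions) is invented for illustration.

```python
import numpy as np

# Fake score distribution: percent of MOs mastered for 500 hypothetical students.
alt_scores = np.random.default_rng(0).integers(0, 101, size=500)

# Invented general-population proportions for the three achievement levels.
target = {"Basic": 0.30, "Proficient": 0.45, "Advanced": 0.25}

# Cumulative proportions below each cut: Basic|Proficient, then Proficient|Advanced.
cum = [target["Basic"], target["Basic"] + target["Proficient"]]
cut_basic_prof, cut_prof_adv = np.quantile(alt_scores, cum)
print(f"Basic/Proficient cut: {cut_basic_prof:.0f}, Proficient/Advanced cut: {cut_prof_adv:.0f}")
```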

58 Validity of Criteria: Consequential Evidence
Could study IEPs to see if they have become more oriented toward academic goals over time
Could study the ability of the Alt-MSA to drive instruction; e.g., do the enacted content standards move toward the assessed content standards?

59 Validity of Scores: Content Evidence
Could study how well raters can categorize samples of artifacts into the content-strand elements their MOs were designed to represent

60 Validity of Scores: Consequential Evidence
Could survey stakeholders: How have the scores been used? How have they been misused?

61 Two Philosophical Issues
Justification is needed for implementing flexible performance expectations all the way down to the individual student
Justification is needed for using standardized percentages for success categories across the flexible performance expectations

62 Contact Information
Sharon Hall – Sharon.Hall@ed.gov
Martin Kehe – mkehe@msde.state.md.us
William Schafer – wschafer@umd.edu

