
Exploring Skills Diagnostic Opportunities at Measured Progress Lou DiBello William Stout Meetings with Measured Progress February 11-12, 2008.


1 Exploring Skills Diagnostic Opportunities at Measured Progress
Lou DiBello
William Stout
Meetings with Measured Progress, February 11-12, 2008

2 I. Overview, Goals, Purpose
Learning Sciences Research Institute--UIC--Informative Assessment Initiative
- Establish a clear conceptual framework and language for understanding and discussing diagnostic assessment
- Identify practical steps for developing diagnostic assessments
- Consider challenges
- Explore possibilities for collaborative work between IAI or AIARE and MP

3 Presentation Agenda
- I. Overview, goals, purpose
- II. Background
- III. Assessment as evidentiary system
- IV. Practical steps and challenges
- V. Possibilities for collaborative work
- VI. Wrap-up

4 II. Background

5 Who we are
- Our primary expertise is theoretical and applied psychometrics
- Our primary interest is broader: to develop the engineering science of diagnostic assessment
- In addition to science and theory, we are focused on practical issues: costs, production, sustainability, scalability, implementation, evaluation, dissemination

6 Who we are: Bill
- Professor Emeritus, Dept. of Statistics, University of Illinois at Urbana-Champaign
- Co-lead of the Informative Assessment Initiative in the Learning Sciences Research Institute, University of Illinois at Chicago
- Co-founder of Applied Informative Assessment Research Enterprises (AIARE), an LLC
- Past director of the ETS External Diagnostic Research Team (the X Team)

7 Who we are: Lou
- Co-lead of the Informative Assessment Initiative
- Research Professor and Associate Director of the Learning Sciences Research Institute, University of Illinois at Chicago
- Co-founder of Applied Informative Assessment Research Enterprises (AIARE), an LLC
- Former Director of the ETS Profile Scoring Initiative; Contract Manager for the X Team

8 Who we are: Bill and Lou
Bill:
- Distinguished psychometrician; past president of the Psychometric Society
- NCME scientific award winner for foundational work in skills diagnostic modeling, dimensionality, and item and test bias detection
Lou:
- Recently served as a research director within the testing industry
- Directed the effort to operationalize diagnostic assessment for a large-scale operational assessment

9 Our Affiliations
- Informative Assessment Initiative (IAI): one of three initiatives that make up the Learning Sciences Research Institute (LSRI) at UIC
- LSRI is directed by Jim Pellegrino & Susan Goldman; the other two initiatives are Cognitive Science and Math and Science Education
- Applied Informative Assessment Research Enterprises (AIARE): a new LLC that owns and licenses the Arpeggio software

10 Joint Work: Bill, Lou (& Louis)
- Pursuing research and development at the forefront of a new skills diagnostic psychometric research area
- Invited co-editors of an upcoming special issue of the Journal of Educational Measurement on skills diagnosis
- Invited co-authors of a foundational paper on psychometric approaches to cognitive diagnostic assessment, just published in the Handbook of Statistics (DiBello, Roussos & Stout, 2007)
- Other publications in refereed academic journals
- Directed numerous research and development projects, both within academia and the private sector

11 III. A View of Assessments as Evidentiary Systems

12 A View of Assessment Design
Assessment as evidentiary system. Assessment design is deciding:
"…how one wants to frame inferences about students, what data one needs to see, how one arranges situations to get the pertinent data, and how one justifies reasoning from the data to inferences about the student." (Junker)

13 Integrated Classroom or Learning Environment
[Diagram: Instruction, Curriculum, and Assessment as components of the integrated classroom or learning environment]

14 Integrated Classroom or Learning Environment
[Diagram: the assessment triangle (Cognition, Observation, Interpretation) situated with Instruction and Curriculum inside the integrated classroom or learning environment]

15 Assessment Triangle (Pellegrino et al.)
[Diagram: the assessment triangle, with Cognition, Observation, and Interpretation at its vertices]

16 Comprehensive View of Assessment
Assessment conceptually involves:
- Cognition
- Curriculum design
- Instruction
- Teaching practice
- Teacher preparation
- Psychometrics
- Assessment design
- Testing industry
- Marketing and implementation

17 Validity: Thinking about Assessment Quality and Value
- Level 1: test design was soundly based on cognitive principles, "inner" and "outer"
- Level 2: test meets quantitatively defined requirements for internal diagnostic quality
- Level 3: independent confirmation, outside the test, demonstrates that test-based diagnostic skills inferences are accurate; includes protocol studies and criterion validity
- Level 4: consequential validity: proper use of assessment and differential instruction leads to improved teaching and learning

18 Validity Studies
- Level 1: design: expert analyses
- Level 2: internal diagnostic quality: gather data and compute reliability and fit
- Level 3: independent confirmation: includes protocol studies and criterion validity
- Level 4: consequential validity: studies of learning outcomes, teacher practices, teacher preparation

19 Practical Assessment Validity
- Assessment validity provides a conceptual framework for thinking about diagnostics
- Validity studies are expensive, and it is not practical to address many aspects of validity at once
- A reasonable strategy is to identify specific validity targets to address as part of diagnostic development and stage them over time

20 IV. Practical Steps and Challenges in Developing Successful Skills Diagnostic Assessments

21 Implementation Paradigm
- Describe the assessment purpose
- Describe a model for the skills space
- Develop and analyze the assessment items
- Specify an appropriate psychometric model linking observable performance to latent skills
- Select statistical methods for model estimation and for evaluating the results
- Develop methods for reporting assessment results to examinees, teachers, and others

22 Walking through the Steps
The next few slides walk through the steps of the Implementation Paradigm:
- Purpose
- Skills space
- Tasks/items
- Formative reports
- Psychometric model: Fusion Model
- Model calibration: Arpeggio

23 Diagnostic Assessment Purposes
- Provide timely information about students' learning and understanding
- Support teachers, learners, parents
- Support teacher actions, decisions, planning:
  - track students' progress toward standards
  - diagnose deficiencies
  - group by skill profiles for instruction and practice
- Curriculum evaluation and planning

24 Skills Framework
- A cognitive diagnostic model (e.g., the Fusion Model) requires item-skills links as input
- The skills framework is the set of skills selected for measurement and reporting
- For K-12 classrooms, the skills must be:
  - aligned with standards and curriculum
  - aligned with teacher actions
  - supportable statistically

25 Q Matrix
The Q matrix encodes the skills required for each item:
- Items = rows
- Skills = columns
- Here, a 7x5 matrix
- For example: item 2 requires skills 1, 3, and 4
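As an illustration, a 7x5 Q matrix of the kind described above can be encoded directly. The entries below are hypothetical (the slide does not give the full matrix), except that item 2's row is written to require skills 1, 3, and 4 as stated:

```python
# Hypothetical 7x5 Q matrix: rows = items, columns = skills.
# A 1 in row i, column k means item i requires skill k.
# These entries are illustrative, not from an actual PTS3 analysis.
Q = [
    [1, 0, 0, 0, 1],  # item 1
    [1, 0, 1, 1, 0],  # item 2: requires skills 1, 3, and 4
    [0, 1, 0, 0, 1],  # item 3
    [0, 0, 1, 0, 1],  # item 4
    [1, 1, 0, 0, 0],  # item 5
    [0, 0, 0, 1, 1],  # item 6
    [0, 1, 1, 0, 0],  # item 7
]

def skills_for_item(q_matrix, item):
    """Return the 1-based skill numbers required by a 1-based item number."""
    row = q_matrix[item - 1]
    return [k + 1 for k, needed in enumerate(row) if needed]

print(skills_for_item(Q, 2))  # [1, 3, 4]
```

Reading down a column instead of across a row shows how many items measure a given skill, which matters for the "enough good items for each skill" design check discussed later.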

26 Skills Example: PTS3 Reading
A good starting point for PTS3 Reading skills is:
- Skill 1: Literary
- Skill 2: Informational
- Skill 3: Comprehension & Analysis
- Skill 4: Reading Process & Language Skills

27 PTS3 Math Initial Skills
A good starting point for PTS3 Math skills is:
- Skill 1: Numbers and Operations
- Skill 2: Algebra
- Skill 3: Geometry & Measurement
- Skill 4: Data Analysis & Probability

28 Skills: Practical Constraints
- Alternative skills representations may be supported within the substantive literature
- Theory may suggest that 100 skills influence performance within a particular mathematics test domain; a 50-minute assessment cannot accurately measure 100 skills, and teachers could not manage 100-skill diagnostic profiles for each student
- Skills must be simultaneously comprehensive, of "coarse" granularity, and aligned with standards, curriculum, and instruction

29 Skills Pragmatics: Focus
- Developing skills frameworks is usually a creative act
- A small number of foundational or core skills must be determined that are:
  - important and useful to measure
  - statistically supportable by the assessment
  so that other skills can be ignored with impunity
- Think of this as focusing the assessment design in light of the diagnostic purpose: assumptions about what to measure and what to "ignore"

30 Diagnostic "Score Reports"
- A key component of diagnostic assessment is the "score report," construed broadly as any and all information presented to users as a result of assessment performance
- A diagnostic assessment reports a profile of scores, such as mastery/nonmastery on each skill
- In addition, the score report can and should include information that promotes better teaching and learning:
  - possible action steps for the teacher or learner
  - suggestions to the student for improvement
  - interpretive information

31 Score Reporting Statistics
An Arpeggio analysis produces (as noted in Bill's Monday presentation):
1. Item/skill-level parameters
2. For each student, a posterior probability of mastery for each skill
3. For each student, a classification of master/non-master for each skill, based on the above posterior probability
4. The examinee probability distribution on the skill space
5. Estimates of skill classification accuracy
6. Fit statistics
The skills profiles are based on items 2 and 3 above.
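As a minimal sketch of how items 2 and 3 on that list relate, the posterior probabilities below are invented for illustration, and the 0.5 cut point is an assumption; an operational mastery threshold may be set differently:

```python
# Hypothetical posterior probabilities of mastery for one student, one value
# per skill. These numbers are illustrative, not Arpeggio output.
posteriors = {"Skill 1": 0.92, "Skill 2": 0.31, "Skill 3": 0.77, "Skill 4": 0.48}

THRESHOLD = 0.5  # assumed cut point; operational choices may differ

# Item 3 of the list: derive a master/non-master profile from item 2's posteriors.
profile = {skill: ("master" if p >= THRESHOLD else "non-master")
           for skill, p in posteriors.items()}
print(profile)
```

Reporting the posterior alongside the classification lets a teacher see that a "non-master" at 0.48 is a borderline case, while one at 0.31 is not.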

32 Skills Classification Accuracy
The Fusion Model and Arpeggio provide several estimated indices of skills classification accuracy or reliability:
- CCR: individual-skill correct classification rate
- TCR: test-retest consistency rate (analogous to classical reliability)
- Skill-pattern correctness and consistency rates
As with standard unidimensional IRT reliability, these measures are internal to the model and the data; no external criteria are involved.
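A toy illustration of how such agreement-based indices are computed for a single skill; the 0/1 mastery-status vectors and the simulation framing are assumptions for the example, not Arpeggio output:

```python
def agreement_rate(a, b):
    """Fraction of examinees whose 0/1 classifications match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# CCR: estimated vs. true mastery status for one skill (true status is
# knowable only in a simulation study).
true_status      = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1]
estimated_status = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
ccr = agreement_rate(true_status, estimated_status)

# TCR: classifications from two (simulated) administrations of the test.
retest_status = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
tcr = agreement_rate(estimated_status, retest_status)

print(ccr, tcr)  # 0.8 0.9
```

Pattern-level rates are the stricter analogue: an examinee counts as consistent only if the entire skill profile, not each skill separately, matches.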

33 Evaluating the Assessment
Once the model is calibrated, we estimate the skills classification accuracy and calculate measures of fit that are directly relevant to the diagnostic purpose of the assessment. Both reflect on:
- which skills are selected and how they are defined
- the skill codings in the Q matrix
- model suitability
- the statistical analysis procedures employed

34 Model-Data Fit
We evaluate model-data fit by computing fit indices directly relevant to the diagnostic purpose. Considering MCMC convergence, item parameter values, and fit, we examine:
- Are the items appropriate and of "good quality"?
- Are the skills framework and Q matrix appropriate?
- Is the test "well designed": enough good items for each skill, no fatal information-blocking in the Q matrix, good alignment between difficult items and difficult skills, and other aspects of good design?
- Are any aspects of the model suspect?

35 V. Possibilities for Collaborative Work

36 Status of Diagnostic Research
- DiBello and Stout have collaborated with other researchers, including Louis Roussos
- Their studies provide a scientific and applied foundation for cognitive diagnostic research
- The IRT-based skills-diagnostic Fusion Model has been developed, along with software called Arpeggio for calibrating the Fusion Model, which employs Markov chain Monte Carlo (MCMC) statistical methodology

37 "X Team" 3-Year R&D Output
Arpeggio R&D was directed within ETS by DiBello and externally by Stout:
- 46 research studies:
  - 18 studies on modeling issues
  - 4 studies on skills-level linking methods
  - 4 studies on skills-level reliability
  - 2 studies on techniques for data-model fit
  - 10 applied studies
  - 8 theoretical studies backing the algorithms
- 5 descriptions of algorithms and software code
- 12 sets of user documentation

38 Assets and Resources
- An estimated $12M of investment underlies the development of the Arpeggio software system and its underlying theory, research studies, and analyses
- Resources: the Informative Assessment Initiative (IAI) within LSRI-UIC; Applied Informative Assessment Research Enterprises (AIARE), an LLC
- Ownership of the Arpeggio software and broad rights to license the patent
- Louis Roussos of MP is a major researcher, inventor, collaborator, and developer of Arpeggio

39 Current Status of Arpeggio
- AIARE owns the copyright and trademark to all Arpeggio software and has unconstrained access to the patent rights, including the right to license them to others
- Practical reality: freedom to fashion any agreement that is mutually beneficial to MP and AIARE
- ETS is guaranteed a share of royalties

40 IAI Current Activities (as background)
- NSF project to do formative assessment using established math curricula ($3M, funded)
- IES proposal for classroom assessment ($2M, applied for)
- More grant applications likely concerning skills-level formative and embedded assessments (testing as an integral part of the curricular learning process)
- Upgrade and expand the capabilities of Arpeggio and the Fusion Model (technical grant proposals planned)
- Develop, upgrade, and disseminate the engineering science of diagnostic assessment in educational settings
- Work with testing companies, such as ETS and CTB

41 IAI Project Ideas (some possibly of interest to MP)
- Developing specific diagnostic assessments & pilot trials
- The practice of developing lists of skills for diagnostic measurement & reporting
- Assessment-curriculum-instruction linkages
- Diagnostic validity studies
- Foundational and applied psychometric diagnostic research

42 Diagnostic Assessment Design
- Develop diagnostic scoring capability for PTS3 and other existing tests
- Design new diagnostic tests
- Needs and capacity analyses:
  - What market needs exist?
  - How might diagnostic assessment help teachers and learners, directly in the classroom and indirectly through summative or accountability tests?
  - What capacity do teachers and curricula have to incorporate and use diagnostic assessment?

43 Planned Foundational and Applied Diagnostic Psychometric Research
- Diagnostic modeling
- Skills-level assessment accuracy
- Model-data fit
- Computational speed and performance
- Efficacy studies
- Group-level diagnostic survey testing a la NAEP
- Embedded assessments
- Growth modeling

44 Concrete Possibilities with MP
Our proposal is that Measured Progress and we explore cooperation that can help MP bring to fruition its strong interest in skills diagnosis. This seems like a superb opportunity to pursue:
- Turn PTS3, in stages, into a skills diagnostic test
- Grants/contracts
- Collaboration on research projects of joint interest
- Explore diagnostic applications to state tests
- AERA/NCME proposals

45 VI. Wrap-up
- We are mapping the dimensions of what we envision as a new engineering science of diagnostic assessment
- Focused on supporting teachers and learners, school districts, and state departments of education
- With due attention to sustainability and scalability to support commercial and operational success
- As a natural mode of dissemination, we are appealing especially to testing companies interested in assessment products and services that improve teaching and learning

46 Discussion
Next steps

