
Similar presentations
Psychometrics to Support RtI Assessment Design Michael C. Rodriguez University of Minnesota February 2010.
Implications and Extensions of Rasch Measurement.
1 Scaling of the Cognitive Data and Use of Student Performance Estimates Guide to the PISA Data Analysis ManualPISA Data Analysis Manual.
Part II Sigma Freud & Descriptive Statistics
Modified Achievement Tests for Students with Disabilities: Basic Psychometrics and Group Analyses Ryan J. Kettler Vanderbilt University CCSSO’s National.
IRT Equating Kolen & Brennan, IRT If data used fit the assumptions of the IRT model and good parameter estimates are obtained, we can estimate person.
AN OVERVIEW OF THE FAMILY OF RASCH MODELS Elena Kardanova
Developing Rubrics Presented by Frank H. Osborne, Ph. D. © 2015 EMSE 3123 Math and Science in Education 1.
Chapter 4 Validity.
Education 3504 Week 3 reliability & validity observation techniques checklists and rubrics.
VERTICAL SCALING H. Jane Rogers Neag School of Education University of Connecticut Presentation to the TNE Assessment Committee, October 30, 2006.
The item response theory in the development of children’s quality of life assessment instruments and data analysis The item response theory in the development.
MATH 310, FALL 2003 (Combinatorial Problem Solving) Lecture 16, Monday, October 6.
Item Response Theory. Shortcomings of Classical True Score Model Sample dependence Limitation to the specific test situation. Dependence on the parallel.
A Value-Based Approach for Quantifying Scientific Problem Solving Effectiveness Within and Across Educational Systems Ron Stevens, Ph.D. IMMEX Project.
© UCLES 2013 Assessing the Fit of IRT Models in Language Testing Muhammad Naveed Khalid Ardeshir Geranpayeh.
MULTIPLES OF 2 By Preston, Lincoln and Blake. 2 X 1 =2 XX 2X1=2 1+1=2.
Major Outcomes of Science Instruction
The results from international assessments of adult literacy and numeracy skills Juliette Mendelovits CEET 17th Annual National Conference Friday 1 November.
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013.
Australian Council for Educational Research PISA for Development Technical Strand 2: Enhancement of PISA Cognitive Instruments Ray Adams John Cresswell.
Intelligent System Lab. (iLab) Southern Taiwan University of Science and Technology 1 Estimation of Item Difficulty Index Based on Item Response Theory.
Measurement Problems within Assessment: Can Rasch Analysis help us? Mike Horton Bipin Bhakta Alan Tennant.
Item Response Theory for Survey Data Analysis EPSY 5245 Michael C. Rodriguez.
7th Grade Math Final Review (20 % of Semester Grade)
1 Item Analysis - Outline 1. Types of test items A. Selected response items B. Constructed response items 2. Parts of test items 3. Guidelines for writing.
Introduction to plausible values National Research Coordinators Meeting Madrid, February 2010.
Analyzing Reliability and Validity in Outcomes Assessment (Part 1) Robert W. Lingard and Deborah K. van Alphen California State University, Northridge.
Classroom Assessments Checklists, Rating Scales, and Rubrics
Measuring Mathematical Knowledge for Teaching: Measurement and Modeling Issues in Constructing and Using Teacher Assessments DeAnn Huinker, Daniel A. Sass,
Using empirical feedback to develop a learning progression in science Karen Draney University of California, Berkeley.
EveMark Ed Enter question here. Click the sound icon to listen to each solution. Who has the best solution?
Divide by 8 page – groups of 8 Division Sentence 0 ÷ 8 = 0.
Write an integer for each situation. 1. stock market down 56 points
Chapter 4: Variability. Variability Provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.
Katherine L. McEldoon & Bethany Rittle-Johnson. Project Goals Develop an assessment of elementary students’ functional thinking abilities, an early algebra.
Nurhayati, M.Pd Indraprasta University Jakarta.  Validity : Does it measure what it is supposed to measure?  Reliability: How the representative is.
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Heriot Watt University 12th February 2003.
Reliability a measure is reliable if it gives the same information every time it is used. reliability is assessed by a number – typically a correlation.
Aligning Assessments to Monitor Growth in Math Achievement: A Validity Study Jack B. Monpas-Huber, Ph.D. Director of Assessment & Student Information Washington.
Obtaining International Benchmarks for States Through Statistical Linking: Presentation at the Institute of Education Sciences (IES) National Center for.
2. Main Test Theories: The Classical Test Theory (CTT) Psychometrics. 2011/12. Group A (English)
Item Response Theory Dan Mungas, Ph.D. Department of Neurology University of California, Davis.
LECTURE 14 NORMS, SCORES, AND EQUATING EPSY 625. NORMS Norm: sample of population Intent: representative of population Reality: hope to mirror population.
TEST SCORES INTERPRETATION - is a process of assigning meaning and usefulness to the scores obtained from classroom test. - This is necessary because.
1 Collecting and Interpreting Quantitative Data Deborah K. van Alphen and Robert W. Lingard California State University, Northridge.
Lesson 2 Main Test Theories: The Classical Test Theory (CTT)
Evaluation Of and For Learning
Literacy Focus: Reading
Assessment Research Centre Online Testing System (ARCOTS)
Using Item Response Theory to Track Longitudinal Course Changes
Exponential Functions
Introduction to the Validation Phase
Item Analysis: Classical and Beyond
Booklet Design and Equating
Survey What? It's a way of asking group or community members what they see as the most important needs of that group or community is. The results of the.
Analyzing Reliability and Validity in Outcomes Assessment Part 1
Gamma Theta Upsilon Recent GTU Initiates
Instructions and Examples Version 1.5
Margaret Wu University of Melbourne
Analyzing Reliability and Validity in Outcomes Assessment
Collecting and Interpreting Quantitative Data
Presentation transcript:

Why Scale -- 1
– Summarising data: allows description of developing competence.
– Construct validation: checking how reasonable it is to summarise data (through sums, or weighted sums).
– Dealing with many items and rotated test forms.

What do we want to achieve in our measurement? Locate students on a line of developing proficiency that describes what they know and can do. So we need to make sure that:
– our measures are accurate (reliability);
– our measures are indeed tapping into the skills we set out to measure (validity);
– our measures are "invariant" even if different tests are used.

Properties of an Ideal Approach
– The scores we obtain are meaningful: for students placed on the scale (say, Ann, Bill and Cath), we can say what each of them can do.
– Scores are independent of the sample of items used: if a different set of items is used, we will get the same results.

Using Raw Scores? Can raw scores provide the properties of an ideal measurement? Differences between raw scores are not easily interpretable: the same score gap can represent different amounts of proficiency at different points of the scale. It is also difficult to link item scores to person scores.
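
As an illustration (ours, not from the slides), converting percent-correct scores to log-odds shows why equal raw-score gaps do not represent equal amounts of proficiency: a five-point gain near the top of the scale corresponds to a much larger change in log-odds than the same gain in the middle.

import math

def logit(p):
    # Log-odds of a proportion-correct score
    return math.log(p / (1 - p))

# Equal five-point raw-score gains, compared on the log-odds scale
for lo, hi in [(0.50, 0.55), (0.85, 0.90), (0.90, 0.95)]:
    print(f"{lo:.0%} -> {hi:.0%}: {logit(hi) - logit(lo):.2f} logits")
# 50% -> 55%: 0.20 logits
# 85% -> 90%: 0.46 logits
# 90% -> 95%: 0.75 logits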

Equating raw scores: [Figure: percent-correct scores on an easy test plotted against percent-correct scores on a hard test (both axes 0–100%), with three students, A, B and C, located on the equating curve.]

Link Raw Scores on Items and Persons: [Figure: task difficulties (single-digit addition, multi-step arithmetic word problems, arithmetic with vulgar fractions) and object (person) scores set against success rates of 25%, 50%, 70% and 90%, asking how the two sides can be linked on one scale.]
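
A minimal sketch of the link the figure asks for, assuming the Rasch model introduced below: a person's observed success rate on a task pins down the distance between the person and the task on the logit scale.

import math

# Under the Rasch model, logit(p) = ability - difficulty, so each
# success rate fixes the person's distance from the task's difficulty.
for p in [0.25, 0.50, 0.70, 0.90]:
    distance = math.log(p / (1 - p))
    print(f"success rate {p:.0%}: person sits {distance:+.2f} logits from the task")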

Item Response Theory (IRT)
Item response theory helps us address the shortcomings of raw scores.
– If item response data fit an IRT (Rasch) model, measurement is at its most powerful level: person abilities and item difficulties are calibrated on the same scale; meanings can be constructed to describe scores; and student scores are independent of the particular set of items in the test.
– IRT provides tools to assess the extent to which good measurement properties are achieved.

IRT
IRT models give the probability of success of a person on an item. They are not deterministic but probabilistic: given the item difficulty and the person's ability, one can compute the probability of success for each person on each item.
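
For the Rasch model, that probability is P(X = 1) = exp(theta - delta) / (1 + exp(theta - delta)), where theta is the person's ability and delta the item's difficulty. A minimal sketch (function and variable names are ours, not from the slides):

import math

def rasch_probability(theta, delta):
    # P(X = 1) = exp(theta - delta) / (1 + exp(theta - delta))
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# A person located exactly at the item's difficulty succeeds half the time...
print(rasch_probability(theta=0.0, delta=0.0))  # 0.50
# ...and a person one logit above it succeeds about 73% of the time.
print(rasch_probability(theta=1.0, delta=0.0))  # 0.73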

Building a Model: [Figure: empty axes — probability of success on the vertical axis, achievement from very low to very high on the horizontal axis.]

Imagine a middle-difficulty task: [Figure: on the same axes, a task located at the middle of the achievement range; students far below it almost always fail, students far above it almost always succeed.]

Item Characteristic Curve: [Figure: the S-shaped curve of probability of success against achievement, rising smoothly from near 0 at very low achievement to near 1 at very high achievement.]

Item Difficulty -- 1: [Figure: an item characteristic curve with the difficulty parameter δ marked at the point on the achievement scale where the probability of success is 0.5.]

Variation in item difficulty: [Figure: three item characteristic curves with increasing difficulties δ1 < δ2 < δ3; each harder item shifts the same S-shaped curve to the right.]
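
A short sketch of what the shifted curves mean numerically: under the Rasch model, growing difficulty slides the same S-shaped curve to the right, so a harder item gives every person a lower chance of success. The difficulty values below are illustrative.

import math

def p_success(theta, delta):
    # Rasch probability of success
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

difficulties = (-1.0, 0.0, 1.0)  # delta_1 < delta_2 < delta_3
print("theta  " + "  ".join(f"d={d:+.0f}" for d in difficulties))
for theta in range(-3, 4):
    row = "  ".join(f"{p_success(theta, d):.2f}" for d in difficulties)
    print(f"{theta:+d}     {row}")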

Estimating Student Ability
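
The slides do not show the estimation method, so here is one standard approach as a sketch: a maximum-likelihood estimate of ability by Newton-Raphson, assuming the item difficulties are already known (all names and values are ours).

import math

def p_success(theta, delta):
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def estimate_ability(responses, difficulties, iterations=20):
    # Maximum-likelihood ability estimate under the Rasch model.
    # responses: 0/1 item scores; difficulties: known item difficulties (logits).
    # Note: a perfect or zero raw score has no finite ML estimate.
    theta = 0.0
    for _ in range(iterations):
        probs = [p_success(theta, d) for d in difficulties]
        gradient = sum(x - q for x, q in zip(responses, probs))  # dlogL/dtheta
        information = sum(q * (1 - q) for q in probs)            # -d2logL/dtheta2
        theta += gradient / information
    return theta

# Hypothetical example: five items of known difficulty; the person
# answers the three easiest correctly.
print(round(estimate_ability([1, 1, 1, 0, 0], [-2.0, -1.0, 0.0, 1.0, 2.0]), 2))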

[Figure: item-person map on a logit scale from about -3 to +3. Each X on the left marks students at that ability level; each asterisk on the right marks an item's difficulty, with item numbers (e.g. 9, 22, 6, 16, 31, 2, 30, 13, 19, 5, 32, 1, 25) labelled alongside.]

[Figure: the same item-person map, with bands of items described at three levels.]
– Tasks at level 1 require mainly recall of knowledge, with little interpretation or reasoning.
– Tasks at level 3 require doing mathematics in a somewhat "passive" way, such as manipulating expressions, carrying out computations or verifying propositions, when the modelling has been done, the strategies given, the propositions stated, or the needed information made explicit.
– Tasks at level 5 require doing mathematics in an active way: finding suitable strategies, selecting information, posing problems, constructing explanations, and so on.

Why a Rasch Model? [Figure: the same item-person map.] The distance between the locations of items and students fully describes the students' chances of success on the items. This property permits the use of described scales.
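
A small sketch of that property: under the Rasch model, any person-item pair separated by the same distance shares the same probability of success, which is what lets a described scale attach a fixed meaning to each location.

import math

def p_success(theta, delta):
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

# Different persons and items, but the same distance theta - delta = 1 logit:
print(p_success(theta=2.0, delta=1.0))    # 0.73
print(p_success(theta=0.0, delta=-1.0))   # 0.73
print(p_success(theta=-1.5, delta=-2.5))  # 0.73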