Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach Mark D. Reckase Tianli Li Michigan State University.


1 Estimating Growth when Content Specifications Change: A Multidimensional IRT Approach Mark D. Reckase Tianli Li Michigan State University

2 The Problem State curriculum frameworks often change from one grade to the next, reflecting the addition of new instructional content. For example, algebra may be introduced as an instructional goal at grade 7, whereas at grade 6 algebra is not an important component of the curriculum. Tests at the two grades reflect the instructional content, so the 6th grade test does not include algebra and the 7th grade test does. How can the score scales of these tests be linked?

3 Research Questions What do changes on the linked score scale mean, when the scale is produced using the usual unidimensional IRT models? Can multidimensional IRT be used to form vertical scales? If so, how do the results compare to the unidimensional results?

4 The Approach State testing data were analyzed using multidimensional IRT to develop a realistic model for the test data at two grade levels. The results of the real data analyses were idealized to create the specifications for simulating the tests at two grade levels. Data with known structure were then simulated to determine how unidimensional and multidimensional procedures function.

5 The Simulated Data Design Grade 6: two major constructs (Arithmetic, Problem Solving). Grade 7: three major constructs (Arithmetic, Problem Solving, Algebra).

6 Simulated Test Structure
Test Level   Algebra   Arithmetic   Problem Solving   Total
Grade 6      0         17 (4)       23 (6)            40 (10)
Grade 7      11 (0)    11 (4)       18 (6)            40 (10)
Note: The numbers in parentheses are the common items between the two forms of the tests.

7 Mean Vectors at each Grade Level
Class Level   Algebra        Arithmetic   Problem Solving
Grade 6       -1.5 (-1.50)   .5 (.51)     -.2 (-.21)
Grade 7       0 (.03)        .7 (.73)     0 (.01)
Note: Values in parentheses are the observed means from the simulated data.

8 Covariance Matrices
Covariance Matrix for Grade 6
                  Algebra     Arithmetic   Problem Solving
Algebra           .25 (.25)   0 (.00)      0 (.00)
Arithmetic        0 (.00)     .8 (.84)     .7 (.76)
Problem Solving   0 (.00)     .7 (.76)     1.2 (1.29)
Covariance Matrix for Grade 7
                  Algebra     Arithmetic   Problem Solving
Algebra           1 (1.05)    .4 (.42)     .6 (.64)
Arithmetic        .4 (.42)    .6 (.60)     .3 (.32)
Problem Solving   .6 (.64)    .3 (.32)     1 (1.02)
Note: Values in parentheses are estimated from the simulated data.
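A minimal sketch of how ability vectors with these means and covariances could be drawn. It assumes multivariate normal abilities (the slides do not name the distribution), reads the mean table as Grade 6 = (-1.5, .5, -.2) and Grade 7 = (0, .7, 0) in the order algebra, arithmetic, problem solving, and uses 2000 simulees per grade as in the unidimensional comparison. The seed and all function names are illustrative, not the authors' code.

```python
import math
import random

# Generating parameters in the order (algebra, arithmetic, problem solving).
# ASSUMPTION: the mean table is read column by column from the slide.
MEAN_G6 = [-1.5, 0.5, -0.2]
MEAN_G7 = [0.0, 0.7, 0.0]
COV_G6 = [[0.25, 0.0, 0.0], [0.0, 0.8, 0.7], [0.0, 0.7, 1.2]]
COV_G7 = [[1.0, 0.4, 0.6], [0.4, 0.6, 0.3], [0.6, 0.3, 1.0]]


def cholesky(cov):
    """Return lower-triangular L with L * L^T == cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def sample_thetas(mean, cov, n, rng):
    """Draw n ability vectors from N(mean, cov) as mean + L z, with z standard normal."""
    low = cholesky(cov)
    dim = len(mean)
    draws = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        draws.append(
            [mean[i] + sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(dim)]
        )
    return draws


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
theta_g6 = sample_thetas(MEAN_G6, COV_G6, 2000, rng)
theta_g7 = sample_thetas(MEAN_G7, COV_G7, 2000, rng)
```

With NumPy available, `numpy.random.Generator.multivariate_normal` could replace the hand-written Cholesky step.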

9 Orientation of Items
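In MIRT, an item's orientation is commonly described by the direction of its discrimination vector in the ability space. A minimal sketch, assuming a compensatory multidimensional 2PL model (the slides do not state which item response function was used); the item parameters in the example are hypothetical.

```python
import math


def m2pl_probability(theta, a, d):
    """Compensatory M2PL: P(correct) = 1 / (1 + exp(-(a . theta + d)))."""
    z = sum(ak * tk for ak, tk in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def item_direction_degrees(a):
    """Angles in degrees between the item's a-vector and each coordinate axis."""
    norm = math.sqrt(sum(ak * ak for ak in a))
    return [math.degrees(math.acos(ak / norm)) for ak in a]


# HYPOTHETICAL item measuring mostly arithmetic with some problem solving;
# axis order is (algebra, arithmetic, problem solving).
a_item = [0.0, 1.2, 0.4]
angles = item_direction_degrees(a_item)
p_average = m2pl_probability([0.0, 0.0, 0.0], a_item, d=0.0)
```

An item with a small angle to one axis measures mainly that construct; an item pointing between two axes rewards a combination of them.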

10 Effect Size Built into Data
Algebra   Arithmetic   Problem Solving
1.9       .26          .21

11 Unidimensional Basis for Comparison Imagine that the full set of 70 items from both test levels is administered to the students at both grade levels. The matrix of 2000 + 2000 students from the two grades by 70 items can be analyzed with the unidimensional models to serve as a basis for comparison for the vertical scaling result. The matrix is analyzed using the 2PL and Rasch models.

12 2PL Solution

13 Rasch Model Solution

14 Vertical Scaling Analysis Common-item concurrent calibration. BILOG-MG: off-grade items coded as not reached; both the 2PL and Rasch models used for analysis. Determine the effect size of the difference in means between the two grade levels.
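The data layout for a concurrent calibration can be sketched as follows. The column assignment (30 grade 6 items, 10 common items, 30 grade 7 items) follows the item counts in the test structure table, but the column order, the `None` placeholder for "not reached", and the toy responses are assumptions for illustration; this is not BILOG-MG's input format.

```python
# HYPOTHETICAL column layout for the 70 distinct items:
# columns 0-29 are unique to grade 6, 30-39 are common, 40-69 are unique to grade 7.
N_ITEMS = 70
GRADE6_COLUMNS = list(range(0, 40))
GRADE7_COLUMNS = list(range(30, 70))
NOT_REACHED = None  # placeholder code for items a student was never given


def to_concurrent_rows(responses, columns, n_items=N_ITEMS):
    """Spread each student's on-grade responses into a full-width row."""
    rows = []
    for resp in responses:
        if len(resp) != len(columns):
            raise ValueError("response length must match the number of columns")
        row = [NOT_REACHED] * n_items
        for col, score in zip(columns, resp):
            row[col] = score
        rows.append(row)
    return rows


# Toy 0/1 responses (not data from the study): two grade 6 students, one grade 7 student.
grade6_responses = [[1] * 40, [0] * 40]
grade7_responses = [[1, 0] * 20]
combined = to_concurrent_rows(grade6_responses, GRADE6_COLUMNS) + to_concurrent_rows(
    grade7_responses, GRADE7_COLUMNS
)
```

Only the ten common columns are filled for every student, which is what ties the two grade levels to one scale in a single calibration run.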

15 Vertically Scaled Effect Sizes
                    2PL Model     Rasch Model   2PL Model     Rasch Model
                    70 Items      70 Items      Concurrent    Concurrent
Mean (SD) Grade 6   -.54 (.78)    -.42 (.93)    -.22 (1.16)   -.14 (1.06)
Mean (SD) Grade 7   .56 (1.13)    .45 (1.15)    .26 (1.20)    .21 (1.38)
Effect Size         1.13          .83           .41           .28
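The slides do not state how the effect size was computed. A minimal sketch, assuming a standardized mean difference whose denominator is the square root of the average of the two grade-level variances; under that assumption the tabled means and SDs reproduce the four reported values to two decimals.

```python
import math


def effect_size(mean_low, sd_low, mean_high, sd_high):
    """Standardized mean difference with SD pooled as sqrt((sd_low^2 + sd_high^2) / 2)."""
    pooled_sd = math.sqrt((sd_low ** 2 + sd_high ** 2) / 2.0)
    return (mean_high - mean_low) / pooled_sd


# (grade 6 mean, grade 6 SD, grade 7 mean, grade 7 SD) as reported on the slide.
SCALINGS = {
    "2PL, 70 items": (-0.54, 0.78, 0.56, 1.13),
    "Rasch, 70 items": (-0.42, 0.93, 0.45, 1.15),
    "2PL, concurrent": (-0.22, 1.16, 0.26, 1.20),
    "Rasch, concurrent": (-0.14, 1.06, 0.21, 1.38),
}

effect_sizes = {label: effect_size(*stats) for label, stats in SCALINGS.items()}
for label, value in effect_sizes.items():
    print(f"{label}: {value:.2f}")
```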

16 Vertically Scaled Effect Sizes Linked effect size is smaller than full data effect size. Rasch effect size is less than 2PL effect size. Full data set effect size is less than modeled effect size.

17 Alternative Linking Method Common-item, separate calibration Common item parameter relationship was poor

18 MIRT Analysis Full data analysis with TESTFACT: three-dimensional analysis; determine effect size for each dimension; correlate each estimated θ with the generating θs to determine meaning of the results.

19 MIRT Effect Sizes
                  θ1           θ2           θ3
Mean (SD) Total   .01 (.95)    -.01 (.90)   .05 (.72)
Mean (SD) 6       -.57 (.54)   .16 (.99)    .03 (.74)
Mean (SD) 7       .60 (.90)    -.19 (.77)   .06 (.69)
Effect Size       1.56         -.40         .05

20 Correlation between True and Estimated θ Values
          Est θ1   Est θ2   Est θ3
True θ1   .92      -.08     .02
True θ2   .47      .50      -.18
True θ3   .46      .80      -.03
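A table of this kind can be produced by correlating every generating dimension with every estimated dimension across simulees. A minimal sketch using Pearson correlations; the function names and the toy numbers are illustrative only and are not values from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_correlations(true_thetas, est_thetas):
    """Rows are true dimensions, columns are estimated dimensions.

    Both inputs are lists with one ability vector per person.
    """
    true_cols = list(zip(*true_thetas))
    est_cols = list(zip(*est_thetas))
    return [[pearson(t, e) for e in est_cols] for t in true_cols]


# Toy data: the first estimate is a rescaled copy of true dimension 1,
# the second is true dimension 1 with its sign reversed.
toy_true = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0], [3.0, 5.0]]
toy_est = [[2.0 * t[0] + 1.0, -t[0]] for t in toy_true]
toy_table = cross_correlations(toy_true, toy_est)
```

A correlation near -1 in such a table indicates an estimated dimension whose scale is reflected relative to the generating dimension, not a different construct.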

21 Interpretation of MIRT Solution Results are difficult to interpret because of the default procedures in TESTFACT. Solution needs to be rotated to have axes align with content dimensions. Current solution shows that θ1 is related to algebra and shows the big algebra effect. θ2 is a combination of arithmetic and problem solving with the emphasis on problem solving. Most likely it has the sign of the a-parameters reversed.

22 Concurrent MIRT Analysis Use concurrent calibration of data from the two grade levels: three-dimensional solution, no rotation. Determine effect sizes and correlations with true θ values.

23 Concurrent MIRT Calibration
                  θ1          θ2           θ3
Mean (SD) Total   .06 (.75)   -.09 (.57)   -.38 (1.01)
Mean (SD) 6       -.02 (.87)  -.29 (.56)   .18 (.64)
Mean (SD) 7       .14 (.59)   .10 (.50)    -.94 (.99)
Effect Size       .22         .74          -1.34

24 Concurrent MIRT Calibration
          Est θ1   Est θ2   Est θ3
True θ1   .16      .57      -.87
True θ2   .54      .02      -.40
True θ3   .77      -.05     -.43

25 Concurrent MIRT Calibration Scale on Dimension 3 is reversed and it has a large effect size (algebra). Dimension 1 is most related to arithmetic and problem solving with a moderate effect size. Dimension 2 is moderately related to algebra and has a large effect size. The overall result gives a reasonable estimate of effects, but the dimensions need to be rotated to match the constructs.

26 Conclusions Unidimensional linking of the two-level tests underestimates the effect size. Rasch model gives a smaller effect size than the two-parameter logistic model. MIRT solution shows promise. Need to determine how to rotate solution to match constructs. TESTFACT has problems converging on estimates because of mismatch between assumptions and reality.
