Statistical Innovations for Health and Educational Research


1 Statistical Innovations for Health and Educational Research
Steven Andrew Culpepper
Department of Statistics
University of Illinois at Urbana-Champaign
Educational Testing Service
May 6, 2016

2 Methodological Areas for Innovation
Statistical inference for large-scale surveys involving latent variables. Collaboration with Trevor Park. AERA Grant #
Fine-grained measurement of cognitive skills and attributes. Collaboration with Jeff Douglas, Yuguo Chen, and Yinghan Chen.

3 NAEP Overview
NAEP provides data on the status of what America’s students know and can do in various subject areas. To reduce test time, large-scale assessments administer a random sample of items in a given content area. As a result, individual score estimates are more variable, so NAEP measures achievement for groups rather than individuals. A Multiple Imputation (MI) procedure (e.g., Rubin, 1987; Thomas, 1997) is employed to predict group proficiency with student, teacher, and school administrator survey responses. Researchers should instead use statistical methods that are designed for high-dimensional inference.
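The multiple imputation step mentioned above ends with a pooling stage, where estimates computed on each imputed data set (each set of plausible values) are combined. A minimal sketch of the standard combining rules from Rubin (1987) follows. The function name and the numeric inputs are hypothetical and chosen only for illustration; they are not NAEP results.

```python
from statistics import mean, variance


def pool_rubin(estimates, sampling_vars):
    """Pool M per-imputation estimates with Rubin's combining rules.

    estimates     -- point estimate from each imputed data set
    sampling_vars -- squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_vars):
        raise ValueError("need at least two imputations of equal length")
    pooled = mean(estimates)        # average of the M estimates
    within = mean(sampling_vars)    # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divides by M - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical inputs: five imputations of a group difference.
est = [-0.45, -0.43, -0.44, -0.46, -0.42]
sampling_vars = [0.0009] * 5

pooled, total = pool_rubin(est, sampling_vars)
print(f"pooled estimate = {pooled:.3f}, SE = {total ** 0.5:.4f}")
```

The between-imputation term is what carries the extra uncertainty that comes from each student answering only a sample of the items.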

4 Why Model Selection in Large-Scale Testing?
Large-scale surveys include hundreds of variables, which makes identifying the most relevant predictors of achievement more challenging. Applied researchers need high-dimensional regression procedures for testing hypotheses. The developed model could be used to support the redesign of background questionnaires.

5 Bivariate Normal and Generalized Laplace Priors

6 NAEP Application
The GAL, MVN, and AM software were applied to the NAEP mathematics data. NAEP administered J = 155 items to N = 175,200 8th-grade students to assess mathematics achievement in K = 5 subject areas: algebra (J1 = 49); data analysis, statistics, and probability (J2 = 23); geometry (J3 = 30); measurement (J4 = 26); and number properties and operations (J5 = 27). The model included G = 148 groups with a total of V = 262 variables.

7 Comparison of Methods for 2011 NAEP Mathematics

8 Comparison of GAL and AM Software
Table: Race-Based Achievement Gaps using the GAL model and Plausible Values (PV)

Race         Content Area   GAL EST   GAL SE      PV EST   PV SE
African Am.  1              -0.498    0.027       -0.440   0.031
             2              -0.515    0.034       -0.383   0.030
             3              -0.559    [missing]   -0.471   0.025
             4              -0.595    [missing]   -0.478   0.035
             5              -0.573    0.029       -0.489   [missing]

Note. Results are unweighted. 1 = algebra; 2 = data analysis, statistics, and probability; 3 = geometry; 4 = measurement; and 5 = number properties and operations.

9 Implications for Test Developers
83 out of 148 groups of variables were statistically related to achievement. The GAL prior could be used to decide which background questions to retain, modify, or delete for subsequent data collections. Such efforts could optimize the time students, teachers, and school administrators dedicate to completing surveys.

10 Implications for Researchers
The GAL prior and AM software produced different estimates of the achievement gap. The GAL also yielded a more parsimonious model in Monte Carlo studies and the application. We would like to disseminate the methodology as an R package.

11 School in the Year 2000 - Postcard from the 1900 World Exhibition in Paris

12 Latent Variable Models
Latent variable models assume that a collection of unobserved traits or attributes underlie observed test or survey responses. Most studies consider broadly defined, continuous latent variables. Broadly defined continuous latent variables are useful for correlational research and ranking individuals on traits. Cognitive Diagnosis Models (CDMs) instead consider a set of discrete binary attributes/skills.

13 Cognitive Diagnosis Models
CDMs provide more detailed diagnostic information regarding student skills/attributes than is available with more broadly defined continuous traits in item response models. CDMs have been applied in several areas: education, pathological gambling, and anxiety disorders. The application of CDMs is dependent upon the availability of cognitive theory that specifies the skills and/or attributes necessary for success on a collection of tasks.

14 Purdue Spatial Visualization Test – Rotation (PSVT-R): Item #1

15 PSVT-R: Item #2

16 Continuous vs. Discrete Latent Variables
A continuous IRT model would assume test-takers’ broadly defined spatial abilities can be mapped to a random variable θi.
A cognitive diagnosis model (CDM) would classify students into attribute classes, αi = (αi1, ..., αiK), where
αik = 0 if individual i does not have attribute k,
αik = 1 if individual i has attribute k.
We could classify individuals based upon four rotation attributes: αi1 = 90° x-axis, αi2 = 90° y-axis, αi3 = 180° x-axis, and αi4 = 180° y-axis.
Skills could form a hierarchy such that αi1 must be mastered before αi3, and αi2 before αi4.
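To make the attribute classes concrete, the sketch below enumerates every binary profile for K = 4 attributes and keeps only those consistent with the hierarchy described on this slide. The variable and function names are hypothetical, and attributes are indexed from 0 in the code.

```python
from itertools import product

K = 4  # four rotation attributes from the slide
LABELS = ("90 deg x-axis", "90 deg y-axis", "180 deg x-axis", "180 deg y-axis")

# (prerequisite, dependent) pairs, 0-based:
# attribute 1 before attribute 3, attribute 2 before attribute 4.
PREREQS = [(0, 2), (1, 3)]


def respects_hierarchy(alpha, prereqs):
    """True if no dependent attribute is held without its prerequisite."""
    return all(alpha[pre] >= alpha[dep] for pre, dep in prereqs)


all_profiles = list(product((0, 1), repeat=K))
permissible = [a for a in all_profiles if respects_hierarchy(a, PREREQS)]

print(len(all_profiles), "profiles without a hierarchy")
print(len(permissible), "profiles under the hierarchy")
for alpha in permissible:
    held = [label for label, a in zip(LABELS, alpha) if a == 1]
    print(alpha, held)
```

Without a hierarchy there are 2^4 = 16 attribute classes. Each prerequisite pair rules out the combination "dependent held, prerequisite not held", which leaves 3 x 3 = 9 permissible classes.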

17 Challenges
CDMs require robust cognitive theory that clearly specifies the underlying attributes/skills. Cognitive theory is catalogued in the Q matrix. The unavailability of cognitive theory to specify Q limits widespread application of CDMs.
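As an illustration of how a Q matrix encodes cognitive theory, the sketch below uses a small made-up Q with one row per item and one column per attribute, where an entry of 1 means the item requires that attribute. It also assumes a conjunctive rule (as in DINA-type models): a student is expected to answer an item correctly only when holding every required attribute. The slides do not say which CDM is used, so both the rule and the Q entries here are assumptions, not the actual PSVT-R specification.

```python
def ideal_response(alpha, q_row):
    """Conjunctive rule: 1 only if every attribute required by the item is held."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


# Hypothetical Q matrix: 4 items (rows) by 4 attributes (columns).
Q = [
    [1, 0, 0, 0],  # item 1 requires attribute 1 only
    [0, 1, 0, 0],  # item 2 requires attribute 2 only
    [1, 0, 1, 0],  # item 3 requires attributes 1 and 3
    [1, 1, 1, 1],  # item 4 requires all four attributes
]

alpha = (1, 0, 1, 0)  # a student holding attributes 1 and 3
pattern = [ideal_response(alpha, row) for row in Q]
print(pattern)  # [1, 0, 1, 0]
```

Changing a single entry of Q changes which response patterns are expected for a given attribute class, which is why a misspecified Q leads to misclassified students.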

18 Statistical Advances
Developed statistical methodology to estimate Q. The new procedure can be employed to assess existing cognitive theory or develop new theory. CDMs can be accurately applied to understand learning progressions.

