The Poisson-Gamma model for speed tests

Norman Verhelst, Frans Kamphuis
National Institute for Educational Measurement, Arnhem, The Netherlands

The student monitoring system
Measurement of individual development
Common scale
Estimation of distribution (norms)
Twice per grade (M3, E3, …, M8)
Several subjects: arithmetic, reading comprehension, technical reading

Two types of speed tests
Basic observation is the time to complete a task: AVI cards
Basic observation is the number of completed subtasks within the time limit: Tempotests (TT), Three Minute Test (TMT)

Example tempotest (E4)
Op de politieschool spelen ze [ook / rook / koor] een soort toneel. Het lijkt wel wat op ‘politie en boefje [spelen / stelpen / slepen]’. Net zoals op de basisschool. Wat [poe / doe / boe] je bij een gevecht? Je pistool trekken? Nee, dat mag [zomen / zomaar / zomer] niet.

Example TMT
Easy version: as fee oom uur zee oor … poot (=150)
Hard version: banden geluid tante beker kuiken koffer … brandweerwagen (=150)

Models
Measurement model: Poisson. What is the relation between the (latent) ability and the test performance?
Structural model: Gamma. What is the distribution of the latent ability in one or more populations (M3, E3, M4, …, M8)?
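
As a rough illustration of how the two models fit together, the sketch below simulates data under one common formulation: ability θ is drawn from a Gamma distribution and the count on task i is drawn from a Poisson distribution with mean θ·σi. That product form, the function names, and all parameter values are assumptions made for illustration; the slides' own formulas are not part of this transcript.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Intended for moderate rates only, where exp(-rate) does not underflow.
    """
    threshold = math.exp(-rate)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_counts(n_students, alpha, beta, sigmas, seed=0):
    """Simulate task counts for n_students.

    alpha, beta: shape and rate of the Gamma ability distribution (assumed).
    sigmas: one easiness parameter per task (assumed multiplicative).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_students):
        # random.gammavariate takes shape and scale, so scale = 1 / rate.
        theta = rng.gammavariate(alpha, 1.0 / beta)
        data.append([sample_poisson(theta * sigma, rng) for sigma in sigmas])
    return data
```

With shape 4 and rate 0.1 the mean ability is 40, so the average simulated count on a task with σ = 1 should come out close to 40 words.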

Measurement model: Poisson (1)

Measurement model: Poisson (2)

Parameter estimation: incomplete design (JML)

Person parameters

Design TMT
3 difficulty levels (1, 2, 3)
For each level: three parallel versions (a, b, c)
Each student participates twice: middle and end of the same grade
At each administration: 3 cards of levels 1, 2 and 3 (in that sequence)
M3: only cards 1 and 2

Two-step procedure
1. Estimate the task parameters σi (JML = CML)
2. Estimate the latent distribution while fixing the task parameters at their CML estimates
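
The first step can be sketched as an alternating update of the joint likelihood equations. This is a minimal sketch, assuming counts follow Poisson(θv·σi) for every task i that student v actually took (design indicator 1), that any time-limit factor is absorbed into σi, and that the scale is fixed by forcing the σi to average 1. The function name and the normalization are illustrative choices, not taken from the slides.

```python
def jml_poisson(counts, design, n_iter=500):
    """Alternating JML updates for a multiplicative Poisson model.

    counts[v][i]: observed count of student v on task i.
    design[v][i]: 1 if student v took task i, else 0 (incomplete design).
    Assumes every task was taken by at least one student with a
    positive total score; otherwise a division by zero occurs.
    """
    n_students, n_tasks = len(counts), len(counts[0])
    person_totals = [
        sum(counts[v][i] * design[v][i] for i in range(n_tasks))
        for v in range(n_students)
    ]
    task_totals = [
        sum(counts[v][i] * design[v][i] for v in range(n_students))
        for i in range(n_tasks)
    ]

    def abilities(sigma):
        # theta_v = s_v / delta_v, with delta_v summed over the tasks taken
        return [
            person_totals[v]
            / sum(design[v][i] * sigma[i] for i in range(n_tasks))
            for v in range(n_students)
        ]

    sigma = [1.0] * n_tasks
    for _ in range(n_iter):
        theta = abilities(sigma)
        sigma = [
            task_totals[i]
            / sum(design[v][i] * theta[v] for v in range(n_students))
            for i in range(n_tasks)
        ]
        mean_sigma = sum(sigma) / n_tasks
        sigma = [s / mean_sigma for s in sigma]  # fix the scale
    return sigma, abilities(sigma)
```

The second step would then treat the returned σi as known constants when fitting the ability distribution.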

Advantage

Structural model: distribution of reading speed (θ)

Marginal distribution of the sum score s

Negative Binomial (Gamma-Poisson)
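
The formulas on this slide are not part of the transcript. As a sketch of the standard result the title refers to, assume θ follows a Gamma distribution with shape α and rate β, and that the sum score s given θ is Poisson with mean θδ, where δ is the sum of the task parameters of the administered tasks (this parametrization is an assumption). Integrating θ out gives a negative binomial distribution:

```latex
\begin{align*}
P(s) &= \int_0^\infty \frac{(\theta\delta)^s e^{-\theta\delta}}{s!}
        \cdot \frac{\beta^\alpha}{\Gamma(\alpha)}\,
        \theta^{\alpha-1} e^{-\beta\theta}\, d\theta \\
     &= \frac{\delta^s \beta^\alpha}{s!\,\Gamma(\alpha)}
        \int_0^\infty \theta^{s+\alpha-1} e^{-(\beta+\delta)\theta}\, d\theta \\
     &= \frac{\Gamma(s+\alpha)}{\Gamma(\alpha)\, s!}
        \left(\frac{\beta}{\beta+\delta}\right)^{\alpha}
        \left(\frac{\delta}{\beta+\delta}\right)^{s}
\end{align*}
```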

Negative binomial

EAP
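
A minimal sketch of an expected a posteriori (EAP) estimate, assuming a Gamma prior on θ with shape α and rate β and a sum score that is Poisson with mean θδ. Under those assumptions the posterior of θ is again a Gamma distribution, with shape α + s and rate β + δ. The function name and the parametrization are illustrative, not taken from the slides.

```python
import math


def eap_ability(score, delta, alpha, beta):
    """Posterior mean and SD of theta given the sum score.

    Assumes theta ~ Gamma(shape=alpha, rate=beta) and
    score | theta ~ Poisson(theta * delta).
    """
    shape = alpha + score
    rate = beta + delta
    return shape / rate, math.sqrt(shape) / rate
```

For example, with α = 4, β = 0.1 and δ = 1, a score of 30 gives an EAP of 34 / 1.1, which lies between the score-based estimate s / δ = 30 and the prior mean α / β = 40.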

Reliability

Validation (tempo test)

Validation (tempo test)

Validation (TMT)

Latent class model
Population consists of two latent classes of size π and 1 - π respectively
The latent variable is gamma distributed in each class
Parameters: π; α1 and β1; α2 and β2
EM algorithm
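
Part of one EM iteration for such a two-class model can be sketched as follows. The sketch assumes that, within each class, the sum score is negative binomial (a Gamma ability with shape α and rate β combined with a Poisson count of mean θδ). It shows only the E-step and the update of π; updating the Gamma parameters has no closed form and is left out. All names and numbers are hypothetical.

```python
import math


def nb_logpmf(score, alpha, beta, delta):
    """Log-probability of a sum score under the Gamma-Poisson mixture."""
    return (
        math.lgamma(score + alpha)
        - math.lgamma(alpha)
        - math.lgamma(score + 1)
        + alpha * math.log(beta / (beta + delta))
        + score * math.log(delta / (beta + delta))
    )


def class1_posterior(score, pi, class1, class2, delta):
    """E-step: probability that a student with this score is in class 1.

    class1 and class2 are (alpha, beta) pairs; pi is the size of class 1.
    """
    alpha1, beta1 = class1
    alpha2, beta2 = class2
    w1 = pi * math.exp(nb_logpmf(score, alpha1, beta1, delta))
    w2 = (1.0 - pi) * math.exp(nb_logpmf(score, alpha2, beta2, delta))
    return w1 / (w1 + w2)


def update_pi(scores, pi, class1, class2, delta):
    """M-step for pi only: the average posterior membership in class 1."""
    posteriors = [class1_posterior(s, pi, class1, class2, delta) for s in scores]
    return sum(posteriors) / len(posteriors)
```

With class 1 as the slower class, low scores receive a posterior close to 1 and high scores a posterior close to 0.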

Validation (TMT)

Validation (TMT)

Norms (TMT)

Thank you

Example: student v
Task i dvi
1 8 0.93 -
2 1.11 8.88
3 6 0.85
4 1.05 6.30
5 1.09
δv : 15.18

Problems
SE(π) large
Local maxima?
Thick right tail of observations
>2 classes?
Initial estimates
Homogeneity of test material
Local independence

Averages (1000 replications)
        Class 1   Class 2   Overall
Mean    28.15     44.07     35.99
SD       2.71      3.22      0.43

Standard deviations (1000 replications)
        Class 1   Class 2   Overall
Mean    13.31     17.44     17.66
SD       2.21      1.68      0.47