Comparison of Reliability Measures under Factor Analysis and Item Response Theory, by Ying Cheng, Ke-Hai Yuan, and Cheng Liu. Presented by Zhu Jinxin.


Outline of the Presentation: Introduction of four reliability coefficients: α, ω, ρ, and π. The relationship among them. Conclusion and discussion.

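Cronbach’s alpha: One of the definitions is

$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^2}{\sigma_X^2}\right),$$

where $K$ is the number of components (items or testlets), $\sigma_X^2$ is the variance of the observed total test scores, and $\sigma_{Y_i}^2$ is the variance of component $i$ for the current sample of persons.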
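As a minimal sketch of the definition above, the function below computes Cronbach's alpha from a persons-by-components score table. The function name `cronbach_alpha`, the use of the sample variance (n - 1 denominator), and the small data set are assumptions made for illustration, not part of the presentation.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-person rows of K component scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two components")
    # Variance of each component across persons (sample variance, n - 1).
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the observed total (raw sum) score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 persons answering 3 items.
data = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4], [5, 4, 5]]
alpha = cronbach_alpha(data)
```

Because the total is the raw sum score, every component gets the same weight regardless of how strongly it relates to the underlying trait.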

Features of Cronbach’s alpha: It is the most widely used reliability coefficient. It is based on the raw sum score. α may underestimate reliability at the population level when the assumption of essential tau-equivalence is violated.

About tau-equivalence

In this case, α underestimates the reliability: it is only a lower-bound estimate of the true reliability of the scale when the measures are congeneric.

ω in congeneric measures in the single-factor model

Suppose we have m items.

ω in congeneric measures in the single-factor model: ω is the ratio of the variance of the true score to the variance of the unweighted composite score.

Features of ω: 1. It ignores the fact that people with the same sum score can have completely different response patterns. 2. ω ≥ α when the measures are congeneric.
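The comparison between ω and α can be checked numerically at the population level. The sketch below assumes a single-factor model with factor variance 1 and uncorrelated errors, so that item j has variance λ_j² + ψ_j. The names `omega`, `population_alpha`, and the loading values are hypothetical and chosen only for illustration.

```python
def omega(loadings, uniquenesses):
    """Omega: true-score variance over variance of the unweighted sum."""
    true_var = sum(loadings) ** 2  # variance of the common part of the sum
    return true_var / (true_var + sum(uniquenesses))


def population_alpha(loadings, uniquenesses):
    """Cronbach's alpha implied by the same single-factor model."""
    m = len(loadings)
    item_vars = [l * l + p for l, p in zip(loadings, uniquenesses)]
    total_var = sum(loadings) ** 2 + sum(uniquenesses)
    return m / (m - 1) * (1 - sum(item_vars) / total_var)


# Congeneric items: unequal loadings, standardized so each item variance is 1.
lam = [0.9, 0.6, 0.3]
psi = [1 - l * l for l in lam]
omega_congeneric = omega(lam, psi)
alpha_congeneric = population_alpha(lam, psi)

# Tau-equivalent items: equal loadings, so the two coefficients coincide.
lam_eq = [0.7, 0.7, 0.7]
psi_eq = [1 - l * l for l in lam_eq]
omega_tau = omega(lam_eq, psi_eq)
alpha_tau = population_alpha(lam_eq, psi_eq)
```

With unequal loadings α falls below ω; with equal loadings the gap disappears.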

ρ in congeneric measures in the single-factor model: ρ ≥ ω. When is ρ equal to ω?

Reliability in IRT: The variance of the MLE is (approximately) given by the inverse of the information. The variance of the latent factor is 1. The study uses information in a broader sense, equating it with the inverse of a variance even when the parameter estimate is not an MLE.
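One way to make the link between information and reliability explicit, as a sketch: if the latent factor has variance 1 and the error variance of its estimate is approximated by 1/I, then reliability is the share of latent variance in the total variance. The symbols `rel` and `I` are notation assumed here, and I is treated as a single number rather than a function of the latent factor.

```latex
\mathrm{rel}
  = \frac{\operatorname{Var}(\text{latent factor})}
         {\operatorname{Var}(\text{latent factor}) + \operatorname{Var}(\text{error})}
  \approx \frac{1}{1 + 1/I}
  = \frac{I}{1 + I}
```

Under this reading, reliability rises toward 1 as the information grows and drops toward 0 as the information vanishes.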

Reliability from the information perspective

Reliability in IRT: With a single parameter θ, the information $I$ is defined as the negative expected value of the second derivative of the log-likelihood function, $I(\theta) = -E\left[\partial^2 \log L / \partial \theta^2\right]$. IRT models directly relate the discrete responses to an underlying latent factor. When θ is normally distributed, the normal ogive IRT models are equivalent to the item factor analysis model.

Reliability in IRT, for binary responses: ... where ... is the response and, approximately, ...

Reliability in IRT, for binary responses: The information is defined as the negative expected value of the second derivative of the log-likelihood function. It is obtained first for each item and then, by summing over items, for the test.
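For a binary item, the negative expected second derivative of the log likelihood reduces to the squared slope of the response curve divided by P(1 - P). The sketch below applies this to a normal ogive item. The parametrization P(θ) = Φ(aθ - b) and the names `item_information` and `total_information` are assumptions for illustration; the presentation's own notation is not recoverable from the transcript.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def item_information(theta, a, b):
    """Fisher information of one binary normal ogive item at theta."""
    z = a * theta - b
    p = normal_cdf(z)           # probability of a positive response
    slope = a * normal_pdf(z)   # derivative of p with respect to theta
    return slope * slope / (p * (1.0 - p))


def total_information(theta, items):
    """Test information: sum of item informations (local independence)."""
    return sum(item_information(theta, a, b) for a, b in items)


# Hypothetical three-item test, evaluated at theta = 0.
items = [(1.0, 0.0), (1.5, 0.5), (0.8, -0.5)]
info_at_zero = total_information(0.0, items)
```

The information depends on θ, so a single reliability figure requires some way of summarizing it over the latent distribution.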

Reliability in IRT, for binary responses: the reliability is ... and ... (the derivation is given in the appendix).

Reliability in IRT, for responses in ordered categories: Suppose the continuous response to item j is discretized by g thresholds. The information of the jth item is then obtained from the category response probabilities.
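For an item with ordered categories, the Fisher information is the sum over categories of the squared derivative of each category probability divided by that probability. The sketch below assumes a graded normal ogive item in which the probability of responding in category k or higher is Φ(aθ - τ_k). The function name and the threshold values are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def graded_item_information(theta, a, thresholds):
    """Fisher information of a graded item with increasing thresholds."""
    # Cumulative probabilities of reaching each category, padded with the
    # certain event (1) and the impossible event (0).
    cum = [1.0] + [normal_cdf(a * theta - t) for t in thresholds] + [0.0]
    # Their derivatives with respect to theta.
    dcum = [0.0] + [a * normal_pdf(a * theta - t) for t in thresholds] + [0.0]
    info = 0.0
    for k in range(len(thresholds) + 1):
        p = cum[k] - cum[k + 1]      # probability of category k
        dp = dcum[k] - dcum[k + 1]   # its derivative
        info += dp * dp / p          # assumes p has not underflowed to 0
    return info


# One threshold reproduces the binary case; three thresholds give four categories.
binary_info = graded_item_information(0.0, 1.0, [0.0])
four_category_info = graded_item_information(0.0, 1.0, [-1.0, 0.0, 1.0])
```

In this example the four-category item carries more information at θ = 0 than the binary item, yet stays below a² = 1, the value the undiscretized response would give under this parametrization.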

The relationship: (1) ρ ≥ ω ≥ α. It is expected that ρ ≥ π. There is no dominant relationship between ω and π. (2) Simulation demonstrated that, as the number of response categories increases, π can exceed ω in practice.
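The ordering of the three factor-analytic coefficients can be illustrated with population values. The sketch below assumes a single-factor model with factor variance 1 and uncorrelated errors, and takes ρ to be the maximal reliability I/(1 + I) with I equal to the sum of λ_j²/ψ_j. The function names and loading values are hypothetical, and π is not computed here.

```python
def rho(loadings, uniquenesses):
    """Maximal reliability: I / (1 + I), with I the sum of lambda^2 / psi."""
    info = sum(l * l / p for l, p in zip(loadings, uniquenesses))
    return info / (1.0 + info)


def omega(loadings, uniquenesses):
    """Reliability of the unweighted sum score."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(uniquenesses))


def population_alpha(loadings, uniquenesses):
    """Cronbach's alpha implied by the model."""
    m = len(loadings)
    item_vars = [l * l + p for l, p in zip(loadings, uniquenesses)]
    total_var = sum(loadings) ** 2 + sum(uniquenesses)
    return m / (m - 1) * (1 - sum(item_vars) / total_var)


# Congeneric items with unequal loadings and unit item variances.
lam = [0.9, 0.6, 0.3]
psi = [1 - l * l for l in lam]
r, w, a = rho(lam, psi), omega(lam, psi), population_alpha(lam, psi)

# Equal loadings and equal uniquenesses: all three coefficients coincide.
lam_eq = [0.7, 0.7, 0.7]
psi_eq = [0.51, 0.51, 0.51]
r_eq, w_eq = rho(lam_eq, psi_eq), omega(lam_eq, psi_eq)
```

The gap between ρ and ω reflects how much is lost by weighting all items equally instead of weighting each by λ_j/ψ_j.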

Conclusion: Keep as many response categories as possible and use the ML factor score. However, beyond a certain number of response options, adding more may not be worthwhile.

Discussion: Only graded response (ordered categories) models are studied, not other types of polytomous IRT models. Only unidimensional models are studied.

Thank you!