Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire.

Slides:



Advertisements
Similar presentations
Estimating Reliability RCMAR/EXPORT Methods Seminar Series Drew (Cobb Room 131)/ UCLA (2 nd Floor at Broxton) December 12, 2011.
Advertisements

1 Regression as Moment Structure. 2 Regression Equation Y =  X + v Observable Variables Y z = X Moment matrix  YY  YX  =  YX  XX Moment structure.
Structural Equation Modeling
Reliability Definition: The stability or consistency of a test. Assumption: True score = obtained score +/- error Domain Sampling Model Item Domain Test.
© McGraw-Hill Higher Education. All rights reserved. Chapter 3 Reliability and Objectivity.
Measurement Reliability and Validity
Methods for Assessing Safety Culture: A View from the Outside October 2, 2014 (9:30 – 10:30 EDT) Safety Culture Conference (AHRQ Watts Branch Conference.
1 Evaluating Multi-Item Scales Using Item-Scale Correlations and Confirmatory Factor Analysis Ron D. Hays, Ph.D. RCMAR.
Evaluating Multi-Item Scales Health Services Research Design (HS 225A) November 13, 2012, 2-4pm (Public Health)
Developing the Research Question
GRA 6020 Multivariate Statistics The Structural Equation Model Ulf H. Olsson Professor of Statistics.
Session 3 Normal Distribution Scores Reliability.
LECTURE 16 STRUCTURAL EQUATION MODELING.
Measurement Joseph Stevens, Ph.D. ©  Measurement Process of assigning quantitative or qualitative descriptions to some attribute Operational Definitions.
Classical Test Theory By ____________________. What is CCT?
LECTURE 6 RELIABILITY. Reliability is a proportion of variance measure (squared variable) Defined as the proportion of observed score (x) variance due.
Primer on Evaluating Reliability and Validity of Multi-Item Scales Questionnaire Design and Testing Workshop October 25, 2013, 3:30-5:00pm Wilshire.
MEASUREMENT CHARACTERISTICS Error & Confidence Reliability, Validity, & Usability.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 14 Measurement and Data Quality.
Psychometrics William P. Wattles, Ph.D. Francis Marion University.
The ABC’s of Pattern Scoring Dr. Cornelia Orr. Slide 2 Vocabulary Measurement – Psychometrics is a type of measurement Classical test theory Item Response.
Tests and Measurements Intersession 2006.
Evaluating Health-Related Quality of Life Measures June 12, 2014 (1:00 – 2:00 PDT) Kaiser Methods Webinar Series Ron D.Hays, Ph.D.
Reliability & Agreement DeShon Internal Consistency Reliability Parallel forms reliability Parallel forms reliability Split-Half reliability Split-Half.
CJT 765: Structural Equation Modeling Class 8: Confirmatory Factory Analysis.
1 Finding and Verifying Item Clusters in Outcomes Research Using Confirmatory and Exploratory Analyses Ron D. Hays, PhD May 13, 2003; 1-4 pm; Factor
Measurement Models: Exploratory and Confirmatory Factor Analysis James G. Anderson, Ph.D. Purdue University.
Measurement Issues by Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research (June 30, 2008, 4:55-5:35pm) Diabetes & Obesity.
Designs and Reliability Assessing Student Learning Section 4.2.
Copyright © 2008 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 17 Assessing Measurement Quality in Quantitative Studies.
Validity and Item Analysis Chapter 4.  Concerns what instrument measures and how well it does so  Not something instrument “has” or “does not have”
Multitrait Scaling and IRT: Part II Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Evaluating Self-Report Data Using Psychometric Methods Ron D. Hays, PhD February 6, 2008 (3:30-6:30pm) HS 249F.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 15 Developing and Testing Self-Report Scales.
Measurement Theory in Marketing Research. Measurement What is measurement?  Assignment of numerals to objects to represent quantities of attributes Don’t.
Multi-item Scale Evaluation Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
CJT 765: Structural Equation Modeling Class 8: Confirmatory Factory Analysis.
Experimental Research Methods in Language Learning Chapter 12 Reliability and Reliability Analysis.
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Psychometric Evaluation of Questionnaire Design and Testing Workshop December , 10:00-11:30 am Wilshire Suite 710 DATA.
Evaluating Self-Report Data Using Psychometric Methods Ron D. Hays, PhD February 8, 2006 (3:00-6:00pm) HS 249F.
Reliability: Introduction. Reliability Session 1.Definitions & Basic Concepts of Reliability 2.Theoretical Approaches 3.Empirical Assessments of Reliability.
Reliability: Introduction. Reliability Session Definitions & Basic Concepts of Reliability Theoretical Approaches Empirical Assessments of Reliability.
Psychometric Analyses of Survey to Assess Parental Perceptions of Their Children’s Dental Care in the California Healthy Families Program Ron D. Hays October.
Factor Analysis Ron D. Hays, Ph.D. N208, Factor (3-6pm) February 23, 2005.
Discussion of Issues and Approaches Presented by Templin and Teresi
Overview of Item Response Theory Ron D. Hays November 14, 2012 (8:10-8:30am) Geriatrics Society of America (GSA) Pre-Conference Workshop on Patient- Reported.
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Evaluating Multi-Item Scales Health Services Research Design (HS 225B) January 26, 2015, 1:00-3:00pm CHS.
Lesson 2 Main Test Theories: The Classical Test Theory (CTT)
Chapter 2 Norms and Reliability. The essential objective of test standardization is to determine the distribution of raw scores in the norm group so that.
Evaluating Patient-Reports about Health
UCLA Department of Medicine
Evaluating Patient-Reports about Health
UCLA Department of Medicine
Evaluating Multi-Item Scales
assessing scale reliability
Evaluating IRT Assumptions
Final Session of Summer: Psychometric Evaluation
Reliability, validity, and scaling
By ____________________
Evaluating the Psychometric Properties of Multi-Item Scales
Evaluating Multi-item Scales
Multitrait Scaling and IRT: Part I
Evaluating Multi-item Scales
Evaluating Multi-Item Scales
Psychometric testing and validation (Multi-trait scaling and IRT)
Introduction to Psychometric Analysis of Survey Data
UCLA Department of Medicine
Presentation transcript:

Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing Workshop 2 nd Floor Conference Room October 8, 2010 (3-5pm)

Multitrait Scaling Analysis Internal consistency reliability – Item convergence

Hypothetical Item-Scale Correlations

Measurement Error observed = true score + systematic error + random error (bias)

Cronbach’s Alpha Respondents (BMS) Items (JMS) Resp. x Items (EMS) Total Source df SSMS Alpha = = 1.8 =

Alpha for Different Numbers of Items and Homogeneity Number of Items (k) Average Inter-item Correlation ( r ) Alpha st = k * r 1 + (k -1) * r

Spearman-Brown Prophecy Formula alpha y = N alpha x 1 + (N - 1) * alpha x N = how much longer scale y is than scale x ) (

Example Spearman-Brown Calculations MHI-18 18/32 (0.98) (1+(18/32 –1)*0.98 = / = 0.96

Number of Items and Reliability for Three Versions of the Mental Health Inventory (MHI)

Reliability Minimum Standards 0.70 or above (for group comparisons) 0.90 or higher (for individual assessment)  SEM = SD (1- reliability) 1/2

Intraclass Correlation and Reliability ModelReliability Intraclass Correlation One-Way MS - MS MS - MS MS MS + (K-1)MS Two-Way MS - MS MS - MS Fixed MS MS + (K-1)MS Two-Way N (MS - MS ) MS - MS Random NMS +MS - MS MS + (K-1)MS + K (MS - MS )/N BMS JMS EMS BMS WMS BMS EMS BMS WMS BMS EMS BMS EMS BMS EMS JMS EMS WMS BMS EMS

Multitrait Scaling Analysis Internal consistency reliability – Item convergence Item discrimination

Hypothetical Multitrait/Multi-Item Correlation Matrix

Multitrait/Multi-Item Correlation Matrix for Patient Satisfaction Ratings Technical Interpersonal Communication Financial Technical 10.66*0.63†0.67† *0.54†0.50† * † * † *0.60†0.56† *0.58†0.57†0.23 Interpersonal *0.63† †0.58*0.61† †0.65*0.67† †0.57*0.60† *0.58† †0.48*0.46†0.24 Note – Standard error of correlation is Technical = satisfaction with technical quality. Interpersonal = satisfaction with the interpersonal aspects. Communication = satisfaction with communication. Financial = satisfaction with financial arrangements. *Item-scale correlations for hypothesized scales (corrected for item overlap). †Correlation within two standard errors of the correlation of the item with its hypothesized scale.

Confirmatory Factor Analysis Compares observed covariances with covariances generated by hypothesized model Statistical and practical tests of fit Factor loadings Correlations between factors Regression coefficients

Fit Indices Normed fit index: Non-normed fit index: Comparative fit index:  -  2 null model 2  2 null  2 model 2 - df nullmodel 2 null  df -1  - df 2 model  - 2 null df null 1 -

Latent Trait and Item Responses Latent Trait Item 1 Response P(X 1 =1) P(X 1 =0) 1 0 Item 2 Response P(X 2 =1) P(X 2 =0) 1 0 Item 3 Response P(X 3 =0) 0 P(X 3 =2) 2 P(X 3 =1) 1

Item Responses and Trait Levels Item 1 Item 2 Item 3 Person 1Person 2Person 3 Trait Continuum

Item Response Theory (IRT) IRT models the relationship between a person’s response Y i to the question (i) and his or her level of the latent construct  being measured by positing b ik estimates how difficult it is for the item (i) to have a score of k or more and the discrimination parameter a i estimates the discriminatory power of the item. If for one group versus another at the same level  we observe systematically different probabilities of scoring k or above then we will say that item i displays DIF

Item Characteristic Curves (2-Parameter Model)