UCLA Department of Medicine

Slides:



Advertisements
Similar presentations
Introduction to IRT/Rasch Measurement with Winsteps Ken Conrad, University of Illinois at Chicago Barth Riley and Michael Dennis, Chestnut Health Systems.
Advertisements

Social And Behavioral Determinants of Health Ron D. Hays, Ph.D. (UCLA) February 6, 2014 (8:40-9:35 am session) Institute of Medicine Committee on Recommended.
Item Response Theory in a Multi-level Framework Saralyn Miller Meg Oliphint EDU 7309.
Reliability Definition: The stability or consistency of a test. Assumption: True score = obtained score +/- error Domain Sampling Model Item Domain Test.
Part II Sigma Freud & Descriptive Statistics
Part II Sigma Freud & Descriptive Statistics
Item Response Theory in Health Measurement
How does PROMIS compare to other HRQOL measures? Ron D. Hays, Ph.D. UCLA, Los Angeles, CA Symposium 1077: Measurement Tools to Enhance Health- Related.
PROMIS DEVELOPMENT METHODS, ANALYSES AND APPLICATIONS Presented at the Patient-Reported Outcomes Measurement Information System (PROMIS): A Resource for.
Remaining Challenges and What to Do Next: Undiscovered Areas Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research
Why Patient-Reported Outcomes Are Important: Growing Implications and Applications for Rheumatologists Ron D. Hays, Ph.D. UCLA Department of Medicine RAND.
Methods for Assessing Safety Culture: A View from the Outside October 2, 2014 (9:30 – 10:30 EDT) Safety Culture Conference (AHRQ Watts Branch Conference.
1 Evaluating Multi-Item Scales Using Item-Scale Correlations and Confirmatory Factor Analysis Ron D. Hays, Ph.D. RCMAR.
Item Response Theory in Patient-Reported Outcomes Research Ron D. Hays, Ph.D., UCLA Society of General Internal Medicine California-Hawaii Regional Meeting.
There is a need to validate existing patient reported outcome measures (PROMS) to provide evidence on their validity. Subsequently, short measures can.
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Modified for EPE/EDP 711 by Kelly Bradley on January 8, 2013.
Primer on Evaluating Reliability and Validity of Multi-Item Scales Questionnaire Design and Testing Workshop October 25, 2013, 3:30-5:00pm Wilshire.
Use of Health-Related Quality of Life Measures to Assess Individual Patients July 24, 2014 (1:00 – 2:00 PDT) Kaiser Permanente Methods Webinar Series Ron.
SAS PROC IRT July 20, 2015 RCMAR/EXPORT Methods Seminar 3-4pm Acknowledgements: - Karen L. Spritzer - NCI (1U2-CCA )
Tests and Measurements Intersession 2006.
Measurement Issues by Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research (June 30, 2008, 4:55-5:35pm) Diabetes & Obesity.
1 Should We Care about What Patients Say About Coordination of Care? Ron D. Hays, Ph.D. UCLA Department of Medicine RAND.
Item Response Theory (IRT) Models for Questionnaire Evaluation: Response to Reeve Ron D. Hays October 22, 2009, ~3:45-4:05pm
Evaluating Self-Report Data Using Psychometric Methods Ron D. Hays, PhD February 6, 2008 (3:30-6:30pm) HS 249F.
Multi-item Scale Evaluation Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Psychometric Evaluation of Questionnaire Design and Testing Workshop December , 10:00-11:30 am Wilshire Suite 710 DATA.
Item Response Theory in Health Measurement
Ron D. Hays, Ph.D. Joshua S. Malett, Sarah Gaillot, and Marc N. Elliott Physical Functioning Among Medicare Beneficiaries International Society for Quality.
Item Analysis: Classical and Beyond SCROLLA Symposium Measurement Theory and Item Analysis Heriot Watt University 12th February 2003.
Item Response Theory Dan Mungas, Ph.D. Department of Neurology
Considerations in Comparing Groups of People with PROs Ron D. Hays, Ph.D. UCLA Department of Medicine May 6, 2008, 3:45-5:00pm ISPOR, Toronto, Canada.
Item Response Theory Dan Mungas, Ph.D. Department of Neurology University of California, Davis.
Patient-Reported Physical Functioning Ron D. Hays November 27, 2012 (11:15-11:30) UCLA Department of Medicine MCID for Orthopaedic Devices Silver Springs,
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire.
Overview of Item Response Theory Ron D. Hays November 14, 2012 (8:10-8:30am) Geriatrics Society of America (GSA) Pre-Conference Workshop on Patient- Reported.
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Health-Related Quality of Life in Outcome Studies Ron D. Hays, Ph.D UCLA Division of General Internal Medicine & Health Services Research GCRC Summer Session.
Test-Retest Reliability of the Work Disability Functional Assessment Battery (WD-FAB) Dr. Leighton Chan, MD, MPH Chief, Rehabilitation Medicine Department.
Copyright © 2014 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 25 Critiquing Assessments Sherrilene Classen, Craig A. Velozo.
Evaluating Patient-Reports about Health
Ron D. Hays September 12, 2006 Network Testing ~ 5 or 6pm
Psychometric Evaluation of Items Ron D. Hays
Further Validation of the Personal Growth Initiative Scale – II: Gender Measurement Invariance Harmon, K. A., Shigemoto, Y., Borowa, D., Robitschek, C.,
QDET2, Miami, FL, Hibiscus A
NIH: Patient-Reported Outcomes Measurement Information System (PROMIS®) Ron D. Hays Functional Vision and Visual Function November 10, 2016, 8:55-9:15am.
Background and Context Research Question and Proposition
Evaluating Patient-Reports about Health
UCLA Department of Medicine
Evaluating Multi-Item Scales
PROMIS-29 V2.0 Physical and Mental Health Summary Scores Ron D. Hays
Ron D. Hays GIM-HSR Friday Noon Seminar Series November 4, 2016
Classical Test Theory Margaret Wu.
Item Analysis: Classical and Beyond
Evaluating IRT Assumptions
Final Session of Summer: Psychometric Evaluation
First study published in JOGS.
Psychology Statistics
Item Analysis: Classical and Beyond
Evaluating the Psychometric Properties of Multi-Item Scales
Evaluating Multi-item Scales
Multitrait Scaling and IRT: Part I
Evaluating Multi-item Scales
Item Analysis: Classical and Beyond
Item Response Theory Applications in Health Ron D. Hays, Discussant
Psychometric testing and validation (Multi-trait scaling and IRT)
Introduction to Psychometric Analysis of Survey Data
Evaluating the Significance of Individual Change
UCLA Department of Medicine
Patient-reported Outcome Measures
Presentation transcript:

UCLA Department of Medicine What can do? Ron D. Hays, Ph.D. UCLA Department of Medicine http://gim.med.ucla.edu/FacultyPages/Hays/ March 1, 2016 12 noon – 1pm Center for Healthful Behavior Change Translation Research Building 619/NYU School of Medicine Department of Population Health 227 East 30th St. 6th Floor, NY, NY 10016

People and Items on Same z-score metric Person 1 Person 2 Person 3 -3 3

-3 -3 3

Item Response Theory (IRT) IRT graded response model estimates relationship between a person’s response Yi to the question (i) and his or her level on the latent construct (): bik estimates how difficult it is to have a score of k or more on item (i). ai estimates item discrimination.

IRT is mainstreaming BIGSTEPS and WINSTEPS PARSCALE and MULTILOG IRTPRO and FLEXMIRT SAS and STATA

Computer Adaptive Testing (CAT) 2004 www.nihpromis.org

Reliability Target for Use of Measures with Individuals z-score (mean = 0, SD = 1) Reliability ranges from 0-1 0.90 or above is goal SE = SD (1- reliability)1/2 Reliability = 1 – SE2 Reliability = 0.90 when SE = 0.32 95% CI = true score +/- 1.96 x SE (CI = -0.63  0.63 z-score when reliability = 0.90) T = z*10 + 50

PROMIS Physical Functioning vs. “Legacy” Measures 10 20 30 40 50 60 70

DIF (2-parameter model) Men Women White Slope DIF Location DIF AA I cry when upset I get sad for no reason Higher Score = More Depressive Symptoms 9

Person Fit Large negative ZL values indicate misfit. One person in PROMIS project had ZL = -3.13 This person reported that they could do 13 physical functioning activities (including running 5 miles) without any difficulty, but This person reported a little difficulty being out of bed for most of the day.

My exposure to IRT (1990’s) . Hays, R. D., & Reise, S. P. (1998, November). Item response theory. Invited Workshop, International Society for Quality of Life Research, Baltimore, MD Hays, R. D., Morales, L. S., & Reise, S. P. (2000). Item Response Theory and Health Outcomes Measurement in the 21st Century. Medical Care, 38 (Suppl.), II-28-II-42.

Physical Functioning Ability to conduct a variety of activities ranging from self-care to running 6 physical functioning items included in 2010 Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Medicare Survey Hays, R. D., Mallett, J. S., Gaillot, S., & Elliott, M. N. (2015 ). Performance of the Medicare Consumer Assessment of Healthcare Providers and Systems (CAHPS®) Physical Functioning Items. Medical Care, 54, 205-209

Because of a health or physical problem are you unable to do or have any difficulty doing the following activities? Walking? Getting in or out of chairs? Bathing? Dressing? Using the toilet? Eating? I am unable to do this activity (0) Yes, I have difficulty (1) No, I do not have difficulty (2)

Medicare beneficiary sample (n = 366,701) 58% female 57% high school education or less 14% 18-64; 48% 65-74, 29% 75-84, 9% 85+

Possible 6-item scale range: 0-12 (2% floor, 65% ceiling) (Some difficulty) (1/3) (1/5) (1/7) (1/9) (1/10) (1/16) Possible 6-item scale range: 0-12 (2% floor, 65% ceiling)

Item-Scale Correlations Walking (0, 1, 2) 0.71 Chairs (0, 1, 2) 0.80 Bathing (0, 1, 2) 0.83 Dressing (0, 1, 2) 0.86 Toileting (0, 1, 2) 0.84 Eating (0, 1, 2) 0.75 Possible 6-item scale range: 0-12 (2% floor, 65% ceiling)

Reliability Formulas Model Reliability Intraclass Correlation Two-way random Two-way mixed One-way BMS = Between Ratee Mean Square N = n of ratees WMS = Within Mean Square k = n of items or raters JMS = Item or Rater Mean Square EMS = Ratee x Item (Rater) Mean Square 18

Internal Consistency Reliability (Coefficient Alpha) (MSbms – MSems)/MSbms Ordinal alpha = 0.98 http://support.sas.com/resources/papers/proceedings14/2042-2014.pdf http://gim.med.ucla.edu/FacultyPages/Hays/utils/

Confirmatory Factor Analysis (Polychoric* Correlations) Dressing Eating Bathing Walking Chairs *Estimated correlation between two underlying normally distributed continuous variables Toileting Residual correlations <= 0.04

Item Characteristic Curves Unable to do I do not have difficulty Unable to do Have difficulty Have difficulty I do not have difficulty Unable to do I do not have difficulty Unable to do I do not have difficulty Have difficulty Have difficulty Unable to do I do not have difficulty I do not have difficulty Unable to do Have difficulty Have difficulty

Unable to do Have difficulty

Reliability = (Info – 1) / Info

Item Characteristic Curve for Emotional Health Scale Very little Very little Very little Very much Very much Very much No No No Very little Very little Very much Very much No No

IRT Distortions “Longer tests are more reliable than shorter tests vs. Shorter tests can be more reliable than longer tests.” The new rules of measurement, Psychological Assessment, 1996, Susan E. Embretson “Parameter values are identical in separate subgroups or across different measurement conditions.” It is the often misunderstood feature of parameter invariance that is frequently cited in introductory or advanced texts” (Rupp & Zumbo, 2006).

Ben Wright or Been Wrong? “Modern day psychometric analyses such as Rasch analysis convert ordinal data to an interval scale so that response scores meet the criteria for measurement” “Application of the Rasch model to the data set estimates a measure that can be considered valid.” The “Rasch model is the only valid approach to measurement” Bergan, 2013, Rasch versus Birnbaum: New arguments in an old debate (p. 3)

Questions? 28