MEASUREMENT: RELIABILITY Lu Ann Aday, Ph.D. The University of Texas School of Public Health.

RELIABILITY: Definition Extent of random variation in answers to questions as a function of when they are asked (test-retest), who asks them (inter-rater), and the fact that a given question is one of a number of questions that could have been asked to measure the concept of interest (internal consistency).

RELIABILITY: Types
- Test-retest reliability
- Inter-rater reliability
- Internal consistency reliability

RELIABILITY: Computation Requires repeated measures to estimate stability over time (test-retest), equivalence across data gatherers (inter-rater), or equivalence across questions/items intended to measure the same underlying concept (internal consistency).

RELIABILITY: Test-retest Definition: correlation between answers to same question by same respondent at two different points in time

RELIABILITY: Test-retest Factors affecting:
- Vague question wording
- Transient personal states, e.g., physical or mental
- Situational factors, e.g., presence of other people

RELIABILITY: Test-retest Computation: Compute the correlation coefficient between answers to the same question by the same respondent at two different points in time:

Respondent | Q1, Time 1 | Q1, Time 2
1          | Agree      | Agree
2          | Agree      | Agree
3          | Agree      | Agree
4          | Agree      | Disagree
5          | Agree      | Agree

RELIABILITY: Test-retest Correlation coefficients:
- Interval: Pearson r
- Ordinal: Spearman rho
- Nominal: Chi-square-based measures of association
Correlation desired: .70+
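
The test-retest computation can be sketched in plain Python. The coding (Agree = 1, Disagree = 0) and the responses below are hypothetical, chosen so that both time points vary; the slide's illustrative table has no variance at Time 1, so Pearson r would be undefined on it:

```python
# A minimal sketch of test-retest reliability as a Pearson correlation.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical responses coded Agree = 1, Disagree = 0.
time1 = [1, 1, 0, 1, 0]  # same question, Time 1
time2 = [1, 1, 0, 0, 0]  # same question, same respondents, Time 2

r = pearson_r(time1, time2)
print(round(r, 2))  # 0.67 -- just below the .70 criterion
```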

RELIABILITY: Test-retest Comparisons of means: Interval: paired t-test, repeated measures analysis of variance. Advantages:
- more accurately take into account that the first and second measurements are not independent
- more directly compare the actual answers at the two points in time

RELIABILITY: Inter-rater Definition: correlation between answers to same question by same respondent obtained by different data gatherers at (approximately) the same point in time

RELIABILITY: Inter-rater Factors affecting:
- Lack of adequate interviewer training
- Lack of standardization of data collection protocols and procedures

RELIABILITY: Inter-rater Computation: Compute the correlation coefficient between answers to the same question by the same respondent obtained by different data gatherers:

Respondent | Q1, Int. A | Q1, Int. B
1          | BP=140/90  | BP=140/90
2          | BP=150/80  | BP=150/80
3          | BP=145/95  | BP=145/95
4          | BP=145/95  | BP=120/80
5          | BP=140/90  | BP=140/90

RELIABILITY: Inter-rater Correlation coefficients (coefficients for 3+ data gatherers noted in parentheses):
- Interval: Pearson r (eta)
- Ordinal: Spearman rho (chi-square)
- Nominal: Kappa (chi-square)
Correlation desired: .80+
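
The Kappa coefficient for nominal inter-rater data can be sketched directly. The category labels and ratings below are hypothetical (blood-pressure readings binned as normal/high, with the two interviewers disagreeing on respondent 4, echoing the table above):

```python
# A minimal sketch of Cohen's kappa: observed agreement between two
# raters, corrected for the agreement expected by chance alone.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over all categories either rater used.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    p_expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical ratings: two interviewers classify five respondents'
# blood pressure; they disagree only on respondent 4.
rater_a = ['normal', 'normal', 'high', 'high', 'normal']
rater_b = ['normal', 'normal', 'high', 'normal', 'normal']
print(round(cohens_kappa(rater_a, rater_b), 2))  # 0.55 -- below the .80 criterion
```

Note how kappa (.55) is far lower than the raw 80% agreement: chance agreement is high when one category dominates, which is why kappa rather than percent agreement is listed for nominal data.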

RELIABILITY: Internal Consistency Definition: correlation between answers by same respondent to different questions about the same underlying concept (usually summarized in scales)

RELIABILITY: Internal Consistency Factors affecting:
- Number of different questions asked to capture the underlying concept
- Level of association (correlation) between answers the same respondents give to different questions about the concept

RELIABILITY: Internal Consistency Computation: Compute internal consistency (underlying correlation) coefficients between answers by the same respondent to different questions about the same concept:

Respondent | Q1    | Q2       | Q3
1          | Agree | Disagree | Agree
2          | Agree | Disagree | Agree
3          | Agree | Disagree | Agree
4          | Agree | Agree    | Agree
5          | Agree | Disagree | Agree

RELIABILITY: Internal Consistency Internal consistency coefficients:
- Corrected item-total correlation
- Split-half reliability coefficient
- Cronbach alpha coefficient
Coefficient desired: .70+ (group); .90+ (individual); .40+ (corrected item-total)

RELIABILITY: Internal Consistency Computation: Corrected item-total correlation
- Add up the scores for answers to different questions about the same concept to create a total score
- Subtract the score for the answer to a given question from the total score to create item-specific "corrected" total scores
- Compute Pearson correlation coefficients between the score for each item and the corresponding "corrected" total score
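
The three steps above can be sketched in plain Python. The items and responses are hypothetical (Agree = 1, Disagree = 0), and `pearson_r` is an ordinary Pearson correlation:

```python
# A minimal sketch of corrected item-total correlations.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical items about one concept, coded Agree = 1, Disagree = 0
# (one list per item, one position per respondent).
items = {
    'Q1': [1, 1, 0, 1, 0],
    'Q2': [1, 0, 0, 1, 0],
    'Q3': [1, 1, 1, 1, 0],
}

# Step 1: total score per respondent across all items.
totals = [sum(col) for col in zip(*items.values())]

item_total = {}
for name, scores in items.items():
    # Step 2: the "corrected" total omits the item being evaluated.
    corrected = [t - s for t, s in zip(totals, scores)]
    # Step 3: correlate each item with its corrected total.
    item_total[name] = round(pearson_r(scores, corrected), 2)

print(item_total)  # {'Q1': 0.76, 'Q2': 0.61, 'Q3': 0.56} -- all meet the .40 criterion
```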

RELIABILITY: Internal Consistency Computation: Split-half reliability coefficient
- Randomly divide a series of questions about the same concept into halves and add up the scores for answers to the questions in the respective halves
- Compute the Spearman-Brown prophecy coefficient for the correlation between the scores for each half, adjusting for the fact that the respective scores are based on only half the original number of items
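
A minimal sketch of the split-half computation, using hypothetical Agree = 1/Disagree = 0 data; for brevity the four items are split into first and second halves rather than randomly:

```python
# A minimal sketch of split-half reliability with the Spearman-Brown
# adjustment.
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical four-item scale, coded Agree = 1, Disagree = 0 (rows are
# items, columns are the five respondents).
items = [
    [1, 1, 0, 1, 0],  # Q1
    [1, 0, 0, 1, 0],  # Q2
    [1, 1, 1, 1, 0],  # Q3
    [1, 1, 0, 1, 0],  # Q4
]

# Sum each half into a per-respondent score, then correlate the halves.
half1 = [sum(col) for col in zip(*items[:2])]  # Q1 + Q2
half2 = [sum(col) for col in zip(*items[2:])]  # Q3 + Q4
r_halves = pearson_r(half1, half2)

# Spearman-Brown adjustment: each half contains only half the items, so
# the raw half-to-half r understates the full scale's reliability.
r_full = 2 * r_halves / (1 + r_halves)
print(round(r_halves, 2), round(r_full, 2))  # 0.84 0.91
```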

RELIABILITY: Spearman-Brown prophecy adjustments
[Table: adjusted reliability coefficients by original alpha and the factor by which scale length is changed (x2, x3, x4); values not recoverable from the transcript]

RELIABILITY: Spearman-Brown prophecy formula Computation:
(k * r_o) / [1 + (k - 1) * r_o]
where k = factor by which the scale is increased or decreased, r_o = alpha based on the original length
Example: (2 * .70) / [1 + (2 - 1) * .70] = .82

RELIABILITY: Cronbach alpha coefficient Computation:
(k * r_a) / [1 + (k - 1) * r_a]
where k = number of items in the scale, r_a = average Pearson r between items
Example: (10 * .32) / [1 + (10 - 1) * .32] = .82
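
The slide's formula can be checked directly in Python. This is the standardized alpha (k items, average inter-item Pearson r), not the variance-based form of Cronbach's alpha; algebraically it is the Spearman-Brown formula with k as the lengthening factor:

```python
# Standardized Cronbach alpha from k items and the average inter-item
# Pearson correlation r_avg.
def standardized_alpha(k, r_avg):
    return (k * r_avg) / (1 + (k - 1) * r_avg)

# The slide's worked example: 10 items, average inter-item r of .32.
print(round(standardized_alpha(10, 0.32), 2))  # 0.82
```

This makes the design tradeoff on the slide concrete: even modestly intercorrelated items (r = .32) yield an acceptable group-level alpha once enough of them are combined.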

WHEN TO UNDERTAKE RELIABILITY ANALYSIS

Dimension | Test-retest | Inter-rater | Internal consistency
QUESTIONS | Concerned about stability of wording | Concerned about equivalence of data gatherers | Constructing summary scales of attitudes or other abstract concepts
STUDIES   | Esp. important in longitudinal or experimental designs | Monitored, but not usually measured directly in surveys | Esp. used in attitudinal surveys
STAGES    | Pilot test or pretest | Pretest plus monitor in final study | Pretest or final study

REFERENCES
DeVellis, R. F. (2003). Scale Development: Theory and Applications (2nd ed.). Thousand Oaks, CA: Sage.
Ware, J. E., Jr., & Gandek, B., for the IQOLA Project (1998). Methods for testing data quality, scaling assumptions, and reliability: The IQOLA Project approach. Journal of Clinical Epidemiology, 51(11).