Methods for Estimating Reliability

Presentation transcript:

Methods for Estimating Reliability Dr. Shahram Yazdani

Types of Reliability

- Inter-Rater or Inter-Observer Reliability: used to assess the degree to which different raters or observers give consistent estimates of the same phenomenon.
- Test-Retest Reliability: used to assess the consistency of a measure from one time to another.
- Parallel-Forms Reliability: used to assess the consistency of the results of two tests constructed in the same way from the same content domain.
- Internal Consistency Reliability: used to assess the consistency of results across items within a test.

Inter-Rater or Inter-Observer Reliability

[Diagram: two observers rate the same object or phenomenon; reliability asks whether observer 1's ratings match observer 2's.]

Inter-Rater Reliability: Statistics Used

- Nominal/categorical data: the kappa statistic.
- Ordinal data: Kendall's tau, used to see whether the pairs of ranks given to each of several individuals are related (e.g., two judges rate 20 elementary school children on an index of hyperactivity and rank-order them).
- Interval or ratio data: Pearson r, using the raw scores obtained from the hyperactivity index.
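
A minimal sketch of the three inter-rater statistics named above. The choice of scipy and scikit-learn, and all of the rating data, are illustrative assumptions, not part of the original slides.

```python
from scipy.stats import kendalltau, pearsonr
from sklearn.metrics import cohen_kappa_score

# Nominal data: two raters assign categories to the same 8 cases
rater1 = ["pass", "fail", "pass", "pass", "fail", "pass", "fail", "pass"]
rater2 = ["pass", "fail", "pass", "fail", "fail", "pass", "fail", "pass"]
print("Cohen's kappa:", cohen_kappa_score(rater1, rater2))

# Ordinal data: two judges rank-order the same children on hyperactivity
ranks1 = [1, 2, 3, 4, 5, 6, 7, 8]
ranks2 = [2, 1, 3, 5, 4, 6, 8, 7]
tau, _ = kendalltau(ranks1, ranks2)
print("Kendall's tau:", tau)

# Interval/ratio data: raw hyperactivity index scores from both judges
scores1 = [12, 18, 22, 25, 30, 33, 40, 44]
scores2 = [14, 17, 20, 27, 29, 35, 38, 45]
r, _ = pearsonr(scores1, scores2)
print("Pearson r:", r)
```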

Test-Retest Reliability

[Diagram: the same test is administered at time 1 and time 2; reliability is stability over time.]

Test-Retest Reliability: Statistics Used

- Pearson r or Spearman rho.
- Important caveat: the correlation decreases as the interval between administrations grows, because error variance increases (and may change in nature). The closer in time the two scores were obtained, the more the factors contributing to error variance are the same.
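
A brief sketch of the computation: the test-retest coefficient is simply the correlation between the same examinees' scores at the two administrations. The scores below are invented for illustration.

```python
from scipy.stats import pearsonr

time1 = [55, 62, 70, 48, 81, 66, 73, 59]   # scores at first administration
time2 = [58, 60, 72, 50, 79, 68, 70, 61]   # same examinees, retested later
r_tt, _ = pearsonr(time1, time2)
print(f"Test-retest reliability: {r_tt:.2f}")
```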

Parallel-Forms Reliability

[Diagram: Form A is administered at time 1 and Form B at time 2; reliability is stability across forms.]

Parallel-Forms Reliability: Statistics Used

- Pearson r or Spearman rho.
- Important caveat: even when items are randomly assigned, the two forms may not be truly parallel.
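
The computation mirrors the test-retest case, here shown with Spearman rho; the form scores are invented for illustration.

```python
from scipy.stats import spearmanr

form_a = [21, 34, 28, 40, 25, 37, 30, 19]   # scores on Form A
form_b = [23, 33, 27, 41, 24, 35, 32, 20]   # same examinees on Form B
rho, _ = spearmanr(form_a, form_b)
print(f"Parallel-forms reliability (Spearman rho): {rho:.2f}")
```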

Internal Consistency

- Average inter-item correlation
- Average item-total correlation
- Split-half reliability

Average Inter-Item Correlation

Definition: calculate the correlation (Pearson r) of each item with every other item, then average these correlations.

Internal Consistency Reliability: Average Inter-Item Correlation

Example: inter-item correlation matrix for a six-item test.

       I1    I2    I3    I4    I5    I6
I1   1.00
I2    .89  1.00
I3    .91   .92  1.00
I4    .88   .93   .95  1.00
I5    .84   .86   .92   .85  1.00
I6    .88   .91   .95   .87   .85  1.00

Average inter-item correlation = .90
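
A minimal sketch of the same calculation, assuming a respondents-by-items score matrix (the data below are invented, not the matrix from the slide): correlate every pair of items and average the off-diagonal values.

```python
import numpy as np

# rows = respondents, columns = the six items
scores = np.array([
    [4, 5, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
    [5, 5, 5, 4, 4, 5],
    [3, 3, 3, 3, 3, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 5, 4, 4],
])

corr = np.corrcoef(scores, rowvar=False)          # 6 x 6 inter-item matrix
off_diag = corr[np.triu_indices_from(corr, k=1)]  # unique item pairs only
print(f"Average inter-item correlation: {off_diag.mean():.2f}")
```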

Average Item-Total Correlation

Definition: calculate the correlation of each item's score with the total test score, then average these correlations.

Internal Consistency Reliability: Average Item-Total Correlation

Example: the same six-item matrix with a row for the total score added.

         I1    I2    I3    I4    I5    I6   Total
I1     1.00
I2      .89  1.00
I3      .91   .92  1.00
I4      .88   .93   .95  1.00
I5      .84   .86   .92   .85  1.00
I6      .88   .91   .95   .87   .85  1.00
Total   .84   .88   .86   .87   .83   .82  1.00

Average item-total correlation = .85
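
A sketch of the item-total version, reusing the invented score matrix from the previous sketch: correlate each item column with the row totals and average the results.

```python
import numpy as np

scores = np.array([
    [4, 5, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
    [5, 5, 5, 4, 4, 5],
    [3, 3, 3, 3, 3, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 5, 4, 4],
])

total = scores.sum(axis=1)                      # each respondent's total score
item_total = [np.corrcoef(scores[:, j], total)[0, 1]
              for j in range(scores.shape[1])]
print(f"Average item-total correlation: {np.mean(item_total):.2f}")
```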

Split-Half Reliability

Definition: randomly divide the test items into two half-forms; calculate a score for Form A and a score for Form B; then calculate the Pearson r between the two half-scores as the index of reliability.

Internal Consistency Reliability: Split-Half Correlation

[Diagram: the six test items are split into two halves (items 1, 3, 4 vs. items 2, 5, 6); the correlation between the two half-scores is .87.]
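
A hedged sketch of the split-half procedure described above, again using the invented score matrix: randomly assign items to two halves, score each half, and correlate the half-scores.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

scores = np.array([
    [4, 5, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
    [5, 5, 5, 4, 4, 5],
    [3, 3, 3, 3, 3, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 5, 4, 4],
])

items = rng.permutation(scores.shape[1])        # random assignment of items
half_a = scores[:, items[:3]].sum(axis=1)       # Form A score per respondent
half_b = scores[:, items[3:]].sum(axis=1)       # Form B score per respondent
r_half, _ = pearsonr(half_a, half_b)
print(f"Split-half correlation: {r_half:.2f}")
# Note: in practice this half-test correlation is often stepped up with the
# Spearman-Brown formula; the slide stops at the Pearson r itself.
```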

Cronbach’s alpha & Kuder-Richardson-20 Measures the extent to which items on a test are homogeneous; mean of all possible split-half combinations Kuder-Richardson-20 (KR-20): for dichotomous data Cronbach’s alpha: for non-dichotomous data Dr. Shahram Yazdani

Internal Consistency Reliability: Cronbach's Alpha (α)

[Diagram: the test is repeatedly split into different halves; each split yields its own split-half correlation.]

SH1 = .87, SH2 = .85, SH3 = .91, SH4 = .83, SH5 = .86, ..., SHn = .85

α = .85 (the mean across all possible split-half correlations)
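
A minimal sketch of Cronbach's alpha from its usual variance formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total score); applied to dichotomous (0/1) items, the same computation gives KR-20. The score matrix is the invented one used in the earlier sketches.

```python
import numpy as np

scores = np.array([
    [4, 5, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
    [5, 5, 5, 4, 4, 5],
    [3, 3, 3, 3, 3, 3],
    [1, 2, 1, 2, 1, 1],
    [4, 4, 5, 5, 4, 4],
])

k = scores.shape[1]
item_vars = scores.var(axis=0, ddof=1)          # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)      # variance of total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```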

Reducing Measurement Error

- Pilot test your instruments and get feedback from respondents.
- Train your interviewers or observers.
- Make observation/measurement as unobtrusive as possible.
- Double-check your data.
- Triangulate across several measures that might have different biases.

Validity vs. Reliability

Thank you! Any questions? Dr. Shahram Yazdani