Presentation transcript:

Reliability: a measurement procedure is reliable when it yields consistent scores while the phenomenon being measured is not changing. It is the degree to which scores are free of "measurement error"; in short, consistency of measurement.

In a broad sense, test reliability indicates the extent to which individual differences in test scores are attributable to "true" differences in the characteristics under consideration. Reliability is concerned with the degree of consistency or agreement between two independently derived sets of scores. Technically, reliability is the ratio of true-score variance to observed-score variance.

TYPES OF RELIABILITY (OR) METHODS OF RELIABILITY MEASUREMENT
1. Test–Retest Reliability
2. Alternate-Form Reliability (Parallel Forms Method)
3. Split-Half Method (Odd–Even Split-Half Method)
4. Kuder–Richardson Reliability

Test–Retest Stability: measure the same thing repeatedly to see whether it always gives the same result. This does not work as well with paper-and-pencil surveys.
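As a minimal sketch of how test–retest reliability is computed, the following example correlates two hypothetical administrations of the same test (all scores are invented for illustration):

```python
# Test-retest reliability: Pearson correlation between two administrations
# of the same test. All scores below are invented for illustration.
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

time1 = [12, 15, 11, 18, 14, 16]  # hypothetical first administration
time2 = [13, 14, 11, 19, 13, 17]  # same examinees, retested later
print(round(pearson_r(time1, time2), 2))  # 0.95
```

A coefficient near 1 indicates that the rank ordering of examinees is stable across the two occasions.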

Types of Reliability
Test-retest
Inter-rater reliability
Intra-rater reliability
Statistical measures

Inter- and Intra-Rater Reliability
Inter-rater: two different raters rate the same thing, to see whether they obtain similar results.
Intra-rater: the same survey is given to the same person a week apart, to see whether the results agree.
The reliability coefficient (r) is the correlation between the scores obtained by the same persons on two administrations of the test.

Different Aspects of Reliability: Inter-Observer, Inter-Item
Inter-observer reliability: repeated measures by different observers on the same subject; especially important in coding open-ended questions.
Inter-item reliability: do the items in a composite measure correlate highly? Assessed with Cronbach's alpha.

Types of Reliability
Internal consistency: correlations amongst multiple items in a factor, e.g., Cronbach's alpha (α).
Test-retest reliability: correlation between time 1 and time 2, e.g., product-moment correlation (r).
Internal consistency (single administration) and test-retest reliability (multiple administrations) are based on classical test theory. A more advanced/refined reliability technique is based on Item Response Theory, which involves examining the extent to which each item discriminates between individuals with high/low overall scores. For more info see: http://en.wikipedia.org/wiki/Reliability_(psychometric)
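As an illustrative sketch (the data and function are not from the slides), internal consistency via Cronbach's alpha can be computed from raw item scores like this:

```python
# Cronbach's alpha from raw item scores (one list per item; data invented).
def cronbach_alpha(items):
    """alpha = K/(K-1) * (1 - sum of item variances / variance of totals).
    Uses population (N-denominator) variances, matching the worked examples
    later in this deck."""
    def pvar(xs):
        m = sum(xs) / len(xs)
        return sum((v - m) ** 2 for v in xs) / len(xs)
    k = len(items)
    n = len(items[0])
    totals = [sum(col[i] for col in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(pvar(col) for col in items) / pvar(totals))

items = [[2, 4, 3, 5, 1],   # item 1 scores for 5 respondents (hypothetical)
         [3, 4, 2, 5, 2],   # item 2
         [2, 5, 3, 4, 1]]   # item 3
print(round(cronbach_alpha(items), 2))  # 0.93
```

High alpha here reflects that the three hypothetical items rank respondents very similarly.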

Reliability Interpretation
< .6 = not reliable
.6 = OK
.7 = reasonably reliable
.8 = good, strong reliability
.9 = excellent, very reliable
> .9 = potentially overly reliable or redundant measurement (this judgment is subjective, and whether a scale is overly reliable also depends on the nature of what is being measured)
Rule of thumb: reliability coefficients should be over .70, up to approximately .90.
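The bands above can be expressed as a small helper; the function name and the exact boundary handling are my own choices, not from the slides:

```python
def interpret_reliability(r):
    """Map a reliability coefficient to the verbal labels listed above.
    Boundary handling (e.g. exactly .9 counting as 'excellent') is a choice."""
    if r > 0.9:
        return "potentially overly reliable or redundant measurement"
    if r >= 0.9:
        return "excellent, very reliable"
    if r >= 0.8:
        return "good, strong reliability"
    if r >= 0.7:
        return "reasonably reliable"
    if r >= 0.6:
        return "OK"
    return "not reliable"

print(interpret_reliability(0.85))  # good, strong reliability
```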

Alternate-Form Reliability (Parallel Forms Method): the same set of persons is tested with one form on the first occasion and with another, equivalent form on the second. The correlation between the scores obtained on the two forms represents the reliability coefficient.

Split-Half Method (Odd–Even Split-Half Method): the test is divided into two equal parts, and a total score is calculated for each half. This provides a measure of consistency with regard to content sampling.

Kuder–Richardson Method
KR20 = r = [N / (N − 1)] × [(S² − Σpq) / S²]
where
KR20 = the reliability estimate (r)
N = the number of items on the test
S² = the variance of the total test scores
p = the proportion of people getting each item correct (found separately for each item)
q = the proportion of people getting each item incorrect; for each item, q = 1 − p
Σpq = the sum of the products p × q for each item on the test
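A sketch of the KR-20 formula in code, assuming a respondents-by-items matrix of 0/1 scores and the population variance S² used above (the data are invented):

```python
def kr20(scores):
    """KR-20 for dichotomous (0/1) item scores; one row per respondent.
    Uses the population variance S2 of total scores, as in the formula above."""
    n_items = len(scores[0])
    n_resp = len(scores)
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n_resp
    s2 = sum((t - mean_t) ** 2 for t in totals) / n_resp
    # p = proportion answering each item correctly, q = 1 - p
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in scores) / n_resp
        sum_pq += p * (1 - p)
    return n_items / (n_items - 1) * (s2 - sum_pq) / s2

# Hypothetical 3-item test taken by 4 respondents
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(kr20(data), 2))  # 0.75
```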

Problem 1: Compute the internal-consistency reliability for the 6-item test, for which the following data have been collected on 10 respondents. Each right answer is scored 1 and each wrong answer is scored 0.
[Table: item scores (items 1–6) and total score for each of the 10 respondents]

From the respondent total scores (x):
Σx = 44, so x̄ = Σx / n = 44 / 10 = 4.4
Σ(x − x̄)² = 12.4, so variance S² = Σ(x − x̄)² / n = 12.4 / 10 = 1.24
[Table: per-respondent totals with deviations, and the number of respondents passing each item]

Number of respondents failing each item: 2, 4, 3, 1, …
p: 0.8, 0.6, 0.7, 0.9, …  q: 0.2, 0.4, 0.3, 0.1, …  pq: 0.16, 0.24, 0.21, 0.09, …
Σpq = 1.1
rtt = [N / (N − 1)] × [(S² − Σpq) / S²]
    = (6 / 5) × [(1.24 − 1.1) / 1.24]
    = 1.2 × 0.113
∴ rtt = 0.14
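The arithmetic of Problem 1 can be checked directly from the summary values on the slide (N = 6, S² = 1.24, Σpq = 1.1); a minimal sketch:

```python
# Checking Problem 1's KR-20 from the slide's summary statistics.
N = 6          # number of items
s2 = 1.24      # variance of total scores
sum_pq = 1.1   # sum of p*q over the items
r_tt = N / (N - 1) * (s2 - sum_pq) / s2
print(round(r_tt, 2))  # 0.14
```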

Problem 2:
[Table: item scores (items 1–6) and total score for each of the 10 respondents]

From the respondent total scores (x):
Σx = 44, so x̄ = Σx / n = 44 / 10 = 4.4
Σ(x − x̄)² = 18.4, so variance S² = Σ(x − x̄)² / n = 18.4 / 10 = 1.84
[Table: per-respondent totals with deviations, and the number of respondents passing each item]

Number of respondents failing each item: 3, 2, 1, 4, …
p: 0.7, 0.8, 0.9, 0.6, …  q: 0.3, 0.2, 0.1, 0.4, …  pq: 0.21, 0.16, 0.09, 0.24, …
Σpq = 1.12
rtt = [N / (N − 1)] × [(S² − Σpq) / S²]
    = (6 / 5) × [(1.84 − 1.12) / 1.84]
    = 1.2 × 0.3913
∴ rtt = 0.47

Problem 3:
[Table: item scores (items 1–6) and total score for each of the 10 respondents]
rtt = 0.89

ODD–EVEN SPLIT-HALF RELIABILITY
Problem 4: Compute the odd–even split-half reliability for the 6-item test, for which the following data have been collected on 10 respondents.
[Table: item scores (items 1–6) for each of the 10 respondents]

Solution: for each respondent, compute the total score (T), the odd-item total (O), and the even-item total (E).
[Table: per-respondent T, T², O, and E values]
ΣT = 34, ΣT² = 132, ΣO = 16, ΣO² = 38, ΣE = 18, ΣE² = 38

Variance
Var(Total) = [ΣT² − (ΣT)² / N] / N = [132 − (34)² / 10] / 10 = 1.64
Var(Odd) = [ΣO² − (ΣO)² / N] / N = [38 − (16)² / 10] / 10 = 1.24
Var(Even) = [ΣE² − (ΣE)² / N] / N = [38 − (18)² / 10] / 10 = 0.56

Cronbach's Alpha Coefficient
α = 2 × [Var(Total) − (Var(Odd) + Var(Even))] / Var(Total)
  = 2 × [1.64 − (1.24 + 0.56)] / 1.64
α = −0.2
∴ there is low negative reliability.
Spearman–Brown formula:
rtt = 2r12 / (1 + r12) = 2 × (−0.2) / (1 + (−0.2)) = −0.4 / 0.8 = −0.50
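Problem 4's split-half steps can be reproduced from the three variances; rounding alpha to −0.2 before the Spearman–Brown step follows the slide's arithmetic (a sketch, not a recommended practice):

```python
# Problem 4's split-half alpha and Spearman-Brown step, from the variances above.
var_total, var_odd, var_even = 1.64, 1.24, 0.56
alpha = 2 * (var_total - (var_odd + var_even)) / var_total
r12 = round(alpha, 1)        # the slide rounds to -0.2 before the next step
r_tt = 2 * r12 / (1 + r12)   # Spearman-Brown correction
print(round(alpha, 2), round(r_tt, 2))  # -0.2 -0.5
```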

Problem 5:
[Table: item scores (items 1–6) for each of the 10 respondents]

Solution: compute each respondent's total score (T), odd-item total (O), and even-item total (E).
[Table: per-respondent T, T², O, and E values]
ΣT = 56, ΣT² = 364, ΣO = 29, ΣO² = 97, ΣE = 27, ΣE² = 89

Variance
Var(Total) = [364 − (56)² / 10] / 10 = 5.04
Var(Odd) = [97 − (29)² / 10] / 10 = 1.29
Var(Even) = [89 − (27)² / 10] / 10 = 1.61

Cronbach's Alpha Coefficient
α = 2 × [Var(Total) − (Var(Odd) + Var(Even))] / Var(Total)
  = 2 × [5.04 − (1.29 + 1.61)] / 5.04
α = 0.85
∴ high positive reliability.
Spearman–Brown formula:
rtt = 2r12 / (1 + r12) = (2 × 0.85) / (1 + 0.85) = 1.7 / 1.85 = 0.9189 ≈ 0.92

KR21 (used when KR20 and the odd–even method cannot be applied, i.e., when item scores are not in 0/1 form)
Problem 6:
[Table: item scores (items 1–6) for each of the 10 respondents]

Solution: compute row totals x (one per respondent) and column totals y (one per item).
[Table: respondent totals x with deviations (x − x̄); item totals y with deviations (y − ȳ)]
Respondent totals (x): 18, 19, 17, 14, …, 24, 21, …; Σx = 187, Σ(x − x̄)² = 60.1
Item totals (y): 36, 27, 33, 30, 28, …; Σy = 187, Σ(y − ȳ)² = 58.83

x̄ = Σx / n = 187 / 10 = 18.7
ȳ = Σy / N = 187 / 6 = 31.16
Var(x) = Σ(x − x̄)² / n = 60.1 / 10 = 6.01
Var(y) = Σ(y − ȳ)² / N = 58.83 / 6 = 9.81
KR21 = [N / (N − 1)] × [(Var x − Var y) / Var x]
     = (6 / 5) × [(6.01 − 9.81) / 6.01]
     = −0.76
∴ a large negative coefficient, i.e., very poor reliability.
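Problem 6's KR21-style value can be checked from the two variances on the slide; a minimal sketch of that arithmetic:

```python
# Checking Problem 6's KR21 value from the slide's variances.
N = 6          # number of items
var_x = 6.01   # variance of respondent total scores
var_y = 9.81   # variance of item totals
kr21 = N / (N - 1) * (var_x - var_y) / var_x
print(round(kr21, 2))  # -0.76
```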

Since the reliability is poor, one has to expand the test, i.e., add new items. The existing test contains K = 6 items with r = −0.76. To increase internal consistency to a desired value rd = 0.5, the required number of items Kd is found from the Spearman–Brown relation:
Kd / K = [rd / (1 − rd)] / [r / (1 − r)]
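The lengthening relation can be sketched in code; note that it is only meaningful for 0 < r < 1, so for Problem 6's negative r it yields a negative (nonsensical) length, and lengthening alone cannot fix that test. The positive-r example below is hypothetical:

```python
def required_length(k, r_current, r_desired):
    """Spearman-Brown prophecy: items needed to reach r_desired from a
    k-item test with reliability r_current (meaningful only for 0 < r < 1)."""
    factor = (r_desired / (1 - r_desired)) / (r_current / (1 - r_current))
    return k * factor

# Hypothetical: a 6-item test with r = 0.4, targeting r = 0.7
print(round(required_length(6, 0.4, 0.7)))  # about 21 items
```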

Problem 7:
[Table: item scores (items 1–6) for each of the 10 respondents]