Topics: Quality of Measurements


Topics: Quality of Measurements
- Reliability
- Validity

The Quality of Measuring Instruments: Definitions
- Reliability: consistency, the extent to which the data are consistent
- Validity: accuracy, the extent to which the instrument measures what it purports to measure

Hitting the Bull’s Eye

The Questions of Reliability
- To what degree does a subject’s measured performance remain consistent across repeated testings?
- How consistently will results be reproduced if we measure the same individuals again?
- What is the equivalence of results of two measurement occasions using “parallel” tests?
- To what extent do the individual items that go together to make up a test or inventory consistently measure the same underlying characteristic?
- How much consistency exists among the ratings provided by a group of raters?
- When we have obtained a score, how precise is it?

True and Error Score Parallel Tests

Sources of Error: Conditions of Test Administration and Construction
- Changes in time limits
- Changes in directions
- Different scoring procedures
- Interrupted testing session
- Qualities of test administrator
- Time test is taken
- Sampling of items
- Ambiguity in wording of items/questions
- Ambiguous directions
- Climate of test situation (heating, light, ventilation, etc.)
- Differences in observers

Sources of Error: Conditions of the Person Taking the Test
- Reaction to specific items
- Health
- Motivation
- Mood
- Fatigue
- Luck
- Memory and/or attention fluctuations
- Attitudes
- Test-taking skills (test-wiseness)
- Ability to understand instructions
- Anxiety

Reliability
- Reliability: ratio of true variance to observed variance
- Reliability coefficient: a numerical index which assumes a value between 0 and +1.00
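The ratio definition above can be sketched in a few lines of Python. This is only an illustration: the function name and the variance figures are hypothetical, and it assumes the usual classical-test-theory condition that error is uncorrelated with true scores, so observed variance is the sum of true and error variance.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Return true-score variance divided by observed-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances must be non-negative")
    # Assumption: error is uncorrelated with true scores, so variances add.
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


# Hypothetical variance components, chosen only for illustration.
mostly_true_score = reliability_coefficient(true_variance=9.0, error_variance=1.0)
mostly_error = reliability_coefficient(true_variance=1.0, error_variance=9.0)
print(mostly_true_score, mostly_error)
```

With no error variance the coefficient reaches 1, and with no true-score variance it drops to 0, which matches the 0 to +1.00 range stated on the slide.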

Relation between Reliability and Error
[Figure: observed-score variability divided into true-score variability and error, shown for a reliable measure (A) and an unreliable measure (B)]

Methods of Estimating Reliability
- Test-Retest: repeated measures with the same test (coefficient of stability)
- Parallel Forms: repeated measures with equivalent forms of a test (coefficient of equivalence)
- Internal Consistency: repeated measures using items on a single test
- Inter-Rater: judgments by more than one rater

Reliability Is the Consistency of a Measurement
Reliable: repeated measurements/observations
Person    X1   X2   X3   ...   Xk (k --> infinity)
Charlie   20   19   21   ...   20
Harry     15   17   16   ...   16
Unreliable: repeated measurements/observations
Person    X1   X2   X3   ...   Xk (k --> infinity)
Charlie   20   10    8   ...   23
Harry      2   11    4   ...   15

Test-Retest Reliability
- Situation: the same people taking two administrations of the same test
- Procedure: correlate scores on the two tests, which yields the coefficient of stability
- Meaning: the extent to which scores on a test can be generalized over different occasions (temporal stability)
- Appropriate use: information about the stability of the trait over time
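The correlation step in this procedure can be sketched as follows. The helper name pearson_r and the two score lists are hypothetical; the slide does not say which correlation is used, so the Pearson product-moment correlation is an assumption here. The same calculation applies to two parallel forms, where the result is read as a coefficient of equivalence.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five people on two occasions.
first_administration = [20, 15, 18, 12, 25]
second_administration = [19, 17, 18, 11, 24]
stability = pearson_r(first_administration, second_administration)
print(f"coefficient of stability: {stability:.2f}")
```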

Parallel (Alternate) Forms Reliability
- Situation: testing of the same people on different but comparable forms of the test
- Procedure: correlate the scores from the two tests, which yields a coefficient of equivalence
- Meaning: the consistency of response to different item samples (where testing is immediate) and across occasions (where testing is delayed)
- Appropriate use: to provide information about the equivalence of forms

Internal Consistency Reliability
- Situation: a single administration of one test form
- Procedure: divide the test into comparable halves and correlate scores from both halves
  - Split half with Spearman-Brown adjustment
  - Kuder-Richardson #20 and #21
  - Cronbach’s alpha
- Meaning: consistency across the parts of a measuring instrument (“parts” = individual items or subgroups of items)
- Appropriate use: where the focus is on the degree to which the same characteristic is being measured; a measure of test homogeneity
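Of the indices listed, Cronbach's alpha is easy to compute directly from an item-score matrix. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the small data set are hypothetical, and sample variances (n - 1 denominator) are assumed throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per person, one column per item)."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least 2 people and 2 items")
    k = len(scores[0])
    # Transpose rows (people) into columns (items).
    items = list(zip(*scores))
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 people answering 3 items on a 1-5 scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
]
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```

For items scored 0/1, the same formula is generally taken to coincide with Kuder-Richardson #20 when population variances are used.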

Inter-rater Reliability
- Situation: having a sample of test papers (essays) scored independently by two examiners
- Procedure: correlate the two sets of scores
  - Kendall’s coefficient of concordance
  - Cohen’s kappa
  - Intraclass correlation
  - Pearson product moment
- Meaning: measure of scorer (rater) reliability (consistency, agreement), which yields the coefficient of concordance
- Appropriate use: for ensuring consistency between raters
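When the two examiners assign categories rather than numeric scores, Cohen's kappa is one of the listed options. A minimal sketch, assuming two raters who each rate every paper once; the function name and the pass/fail ratings are hypothetical. Kappa compares observed agreement with the agreement expected by chance from each rater's category proportions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per paper."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Proportion of papers on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category counts.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical pass/fail judgments on ten essays.
examiner_1 = ["P", "P", "P", "P", "P", "F", "F", "F", "F", "F"]
examiner_2 = ["P", "P", "P", "P", "F", "F", "F", "F", "F", "P"]
print(f"kappa: {cohens_kappa(examiner_1, examiner_2):.2f}")
```

Here the examiners agree on 8 of 10 essays, but half of that agreement would be expected by chance, so kappa is lower than the raw agreement rate.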

When is a reliability satisfactory?
- Depends on the type of instrument
- Depends on the purpose of the study
- Depends on who is affected by results

Factors Affecting Reliability Estimates
- Test length
- Range of scores
- Item similarity
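The effect of test length can be illustrated with the Spearman-Brown prophecy formula, which is not named on this slide and is brought in here as an assumption about what "test length" refers to. It predicts the reliability of a test lengthened by a factor n with comparable items: r_new = n * r / (1 + (n - 1) * r). The function name and the figures are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor."""
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    denominator = 1 + (length_factor - 1) * reliability
    if denominator == 0:
        raise ValueError("prediction is undefined for these inputs")
    return length_factor * reliability / denominator


# Hypothetical test with reliability 0.60, doubled and then halved in length.
print(f"doubled: {spearman_brown(0.60, 2):.2f}")
print(f"halved:  {spearman_brown(0.60, 0.5):.2f}")
```

With a length factor of 2, this is the same adjustment applied to a split-half correlation to estimate the reliability of the full-length test.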

Standard Error of Measurement
- All test scores contain some error
- For any test, the higher the reliability estimate, the lower the error
- The standard error of measurement is the standard deviation of the error scores, that is, of the differences between observed and true scores across the people in the sample
- It can be used to estimate a range within which a true score would likely fall

Use of Standard Error of Measurement
- We never know the true score
- By knowing the s.e.m. and by understanding the normal curve, we can assess the likelihood of the true score being within certain limits
- The higher the reliability, the lower the standard error of measurement, and hence the more confidence we can place in the accuracy of a person’s test score
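A worked sketch of this use, assuming the common formula s.e.m. = SD * sqrt(1 - reliability), which the slides do not state explicitly. The function names, the score of 75, the standard deviation of 10 and the reliability of 0.91 are all hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """s.e.m. from the test's score standard deviation and reliability estimate."""
    if score_sd < 0 or not 0 <= reliability <= 1:
        raise ValueError("need score_sd >= 0 and reliability between 0 and 1")
    return score_sd * sqrt(1 - reliability)


def true_score_band(score: float, score_sd: float, reliability: float, z: float = 1.0):
    """Range of score +/- z s.e.m.; z=1 covers about 68%, z=2 about 95%."""
    sem = standard_error_of_measurement(score_sd, reliability)
    return score - z * sem, score + z * sem


# Hypothetical test: SD of 10, reliability estimate of 0.91, observed score 75.
sem = standard_error_of_measurement(10.0, 0.91)
low, high = true_score_band(75.0, 10.0, 0.91, z=2.0)
print(f"s.e.m. = {sem:.1f}; roughly 95% band: {low:.1f} to {high:.1f}")
```

Raising the reliability estimate shrinks the s.e.m. and narrows the band, which is the point made in the last bullet above.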

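Normal Curve: Areas Under the Curve
[Figure: normal curve centered on X (the test score), with the horizontal axis marked from -3se to +3se. Area on each side of X: .3413 between X and 1se, .1359 between 1se and 2se, .0214 between 2se and 3se, .0013 beyond 3se. About 68% of the area lies within 1se of X, 95% within 2se, and over 99% within 3se.]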

Warnings about Reliability
- There is no such thing as “the” reliability; different methods assess consistency from different perspectives
- Reliability coefficients apply to the data, NOT to the instrument
- Any reliability is only an estimate of consistency