Reliability and Validity of Research Instruments

An overview

Measurement error
Error variance: the extent of variability in test scores that is attributable to error rather than to a true measure of behavior.
Observed score = true score + error variance
(actual score obtained) = (stable score) + (chance/random error and systematic error)
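A minimal numeric sketch of this decomposition, with all values simulated rather than taken from any real instrument:

```python
# Classical test theory sketch: observed = true + error (simulated data).
import numpy as np

rng = np.random.default_rng(0)
true_scores = rng.normal(loc=50, scale=10, size=1000)  # stable scores
random_error = rng.normal(loc=0, scale=5, size=1000)   # chance/random error
systematic_error = 2.0                                 # constant bias

observed = true_scores + random_error + systematic_error

# Random error inflates observed-score variance; systematic error shifts the mean.
print(np.var(true_scores), np.var(observed))  # roughly 100 vs. 125
print(true_scores.mean(), observed.mean())    # mean shifted by about +2
```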

Validity The accuracy of the measure in reflecting the concept it is supposed to measure.

Reliability Stability and consistency of the measuring instrument. A measure can be reliable without being valid, but it cannot be valid without being reliable.

Validity The extent to which, and how well, a measure measures a concept. Types: face, content, construct, concurrent, predictive, and criterion-related.

Face validity Just on its face, the instrument appears to be a good measure of the concept: "intuitive, arrived at through inspection." E.g., concept = pain level; measure = a verbal rating scale ("rate your pain from 1 to 10"). Face validity is sometimes considered a subtype of content validity. Question: is there any time when face validity is not desirable?

Content validity The content of the measure is justified by other evidence, e.g. the literature. The entire range, or universe, of the construct is measured. Usually evaluated and scored by experts in the content area. A CVI (content validity index) of .80 or more is desirable.
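One common way to compute a CVI is the proportion of experts rating each item as relevant (e.g., 3 or 4 on a 4-point relevance scale); the ratings below are invented for illustration:

```python
# Hypothetical CVI computation from expert relevance ratings (made-up data).
import numpy as np

# Rows = items, columns = expert ratings on a 1-4 relevance scale.
ratings = np.array([
    [4, 3, 4, 4, 3],
    [4, 4, 3, 4, 4],
    [2, 3, 4, 2, 3],
])

item_cvi = (ratings >= 3).mean(axis=1)  # proportion of experts rating item 3 or 4
scale_cvi = item_cvi.mean()             # average across items
print(item_cvi)   # e.g., [1.0, 1.0, 0.6] -- the third item falls short
print(scale_cvi)  # compare against the .80 guideline
```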

Construct validity Sensitivity of the instrument to pick up minor variations in the concept being measured. Can an instrument to measure anxiety pick up different levels of anxiety, or just its presence or absence? Measure two groups known to differ on the construct. Ways of arriving at construct validity: the hypothesis-testing method, convergent and divergent validity, the multitrait-multimethod matrix, the contrasted-groups approach, and the factor analysis approach.
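A sketch of the contrasted-groups approach on simulated data: an anxiety instrument should separate a group known to be highly anxious from one known not to be. The group names and score distributions here are hypothetical:

```python
# Contrasted-groups sketch: compare two groups known to differ on the construct.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pre_surgery = rng.normal(loc=70, scale=8, size=40)  # assumed high-anxiety group
community = rng.normal(loc=45, scale=8, size=40)    # assumed low-anxiety group

t, p = stats.ttest_ind(pre_surgery, community)
print(f"t = {t:.2f}, p = {p:.4f}")  # a clear difference supports construct validity
```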

Concurrent validity Correspondence of one measure of a phenomenon with another measure of the same construct (administered at the same time). Two tools are used to measure the same concept and then a correlational analysis is performed. The tool that has already been demonstrated to be valid is the "gold standard" with which the other measure must correlate.
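A minimal sketch of that correlational analysis, assuming simulated scores for a new tool and an established "gold standard" administered at the same time:

```python
# Concurrent validity sketch: correlate a new tool with a gold standard.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
gold_standard = rng.normal(size=100)
new_tool = gold_standard + rng.normal(scale=0.5, size=100)  # related measure

r, p = stats.pearsonr(gold_standard, new_tool)
print(f"r = {r:.2f}")  # a high r supports concurrent validity
```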

Predictive validity The ability of one measure to predict another, future measure of the same concept. If IQ predicts SAT, and SAT predicts QPA, then shouldn't IQ predict QPA? (We could skip the SAT for admission decisions.) If scores on a parenthood readiness scale indicate levels of integrity, trust, intimacy, and identity, couldn't this test be used to predict successful achievement of the developmental tasks of adulthood? The researcher is usually looking for a more efficient way to measure a concept.
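The analysis looks much like the concurrent case, except the criterion is collected later. A sketch with invented admission scores and a later outcome:

```python
# Predictive validity sketch: does a score now predict a criterion later?
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
admission_score = rng.normal(size=200)                              # measured now
later_outcome = 0.6 * admission_score + rng.normal(scale=0.8, size=200)  # measured later

r, p = stats.pearsonr(admission_score, later_outcome)
print(f"predictive r = {r:.2f}")  # a high r suggests the cheaper measure may suffice
```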

Criterion-related validity The ability of a measure to measure a criterion (usually set by the researcher). If the criterion set for professionalism in nursing is belonging to nursing organizations and reading nursing journals, then couldn't we just count memberships and subscriptions to come up with a professionalism score? Can you think of a simple criterion to measure leadership? Concurrent and predictive validity are often listed as forms of criterion-related validity.

Reliability Homogeneity, equivalence, and stability of a measure over time and subjects: the instrument yields the same results over repeated measures and subjects. Expressed as a correlation coefficient (degree of agreement between times and subjects) from 0 to +1. The reliability coefficient expresses the relationship between error variance, true variance, and the observed score. The higher the reliability coefficient, the lower the error variance; hence, the higher the coefficient, the more reliable the tool. A coefficient of .70 or higher is generally acceptable.
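In classical test theory terms, the reliability coefficient can be read as the share of observed-score variance that is true variance; a tiny worked example with invented variances:

```python
# Reliability as the share of observed variance that is true variance:
# r_xx = var_true / var_observed = 1 - var_error / var_observed.
var_true, var_error = 100.0, 25.0   # invented illustrative values
var_observed = var_true + var_error
r_xx = var_true / var_observed
print(round(r_xx, 2))  # 0.8, above the common .70 threshold
```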

Stability The same results are obtained over repeated administrations of the instrument. Assessed via test-retest reliability and parallel, equivalent, or alternate forms.

Test-retest reliability The administration of the same instrument to the same subjects two or more times (under similar conditions, not before and after a treatment). Scores are correlated and expressed as a Pearson r (usually .70 is acceptable).
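A sketch of that correlation on simulated scores from two administrations of the same instrument:

```python
# Test-retest sketch: correlate the same subjects' scores at two time points.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
time1 = rng.normal(loc=30, scale=6, size=50)
time2 = time1 + rng.normal(scale=2, size=50)  # mostly stable scores plus noise

r, _ = stats.pearsonr(time1, time2)
print(f"test-retest r = {r:.2f}")  # .70 or higher is usually acceptable
```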

Parallel or alternate forms reliability Parallel or alternate forms of a test are administered to the same individuals and the scores are correlated. This is desirable when the researcher believes that repeated administration of the same form will result in "test-wiseness." Sample parallel items: "I am able to tell my partner how I feel" / "My partner tries to understand my feelings."

Homogeneity Internal consistency (unidimensionality). Assessed via item-total correlations, split-half reliability, the Kuder-Richardson coefficient, and Cronbach's alpha.

Item-to-total correlations Each item on an instrument is correlated with the total score; an item with a low correlation may be deleted. The highest and lowest correlations are usually reported. Only important if you desire homogeneity of items.
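A sketch of (corrected) item-total correlations on simulated data, where each item is correlated with the total of the remaining items; the deliberately unrelated fifth item should show a low correlation:

```python
# Corrected item-total correlations: correlate each item with the total
# of the remaining items; low values flag candidates for deletion.
import numpy as np

rng = np.random.default_rng(5)
common = rng.normal(size=(100, 1))                     # shared underlying trait
items = common + rng.normal(scale=0.8, size=(100, 5))  # 100 respondents x 5 items
items[:, 4] = rng.normal(size=100)                     # item 5 is pure noise

total = items.sum(axis=1)
for i in range(items.shape[1]):
    rest = total - items[:, i]                    # total score without this item
    r = np.corrcoef(items[:, i], rest)[0, 1]
    print(f"item {i + 1}: r = {r:.2f}")           # item 5 should stand out as low
```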

Split-half reliability Items are divided into two halves and the half scores are compared. Odd versus even items, or items 1-50 versus 51-100, are two ways to split items. Only important when homogeneity and internal consistency are desirable.
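A split-half sketch on simulated items. The half-test correlation is commonly stepped up to full test length with the Spearman-Brown formula, an adjustment the slide does not mention but that usually accompanies this method:

```python
# Split-half sketch: correlate odd- and even-item half scores, then apply
# the Spearman-Brown step-up to estimate full-length reliability.
import numpy as np

rng = np.random.default_rng(6)
common = rng.normal(size=(100, 1))
items = common + rng.normal(scale=0.8, size=(100, 10))  # simulated 10-item test

odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd_half, even_half)[0, 1]
r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
print(f"half r = {r_half:.2f}, full-length estimate = {r_full:.2f}")
```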

Kuder-Richardson coefficient (KR-20) An estimate of homogeneity when items have a dichotomous response, e.g. "yes/no" items. Should be computed for a test on initial reliability testing and recomputed for the actual sample. Based on the consistency of responses to all of the items of a single form of a test.
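A direct implementation of the KR-20 formula, KR-20 = (k / (k - 1)) * (1 - sum(p * q) / total-score variance), on invented 1/0 response data:

```python
# KR-20 for dichotomous items coded 1/0 (made-up responses).
import numpy as np

responses = np.array([  # rows = respondents, columns = items
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1],
])

k = responses.shape[1]
p = responses.mean(axis=0)                      # proportion answering "yes" per item
q = 1 - p
var_total = responses.sum(axis=1).var(ddof=1)   # variance of total scores
kr20 = (k / (k - 1)) * (1 - (p * q).sum() / var_total)
print(round(kr20, 2))
```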

Cronbach's alpha Used with Likert-scale or linear graphic response formats. Compares the consistency of responses across all items on the scale. May need to be computed for each sample.
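Alpha can be computed directly from the item variances and the total-score variance; a sketch on simulated Likert-type responses:

```python
# Cronbach's alpha: alpha = (k / (k - 1)) * (1 - sum(item variances) / total variance).
import numpy as np

rng = np.random.default_rng(7)
common = rng.normal(size=(100, 1))  # shared underlying trait
likert = np.clip(np.rint(3 + common + rng.normal(scale=0.8, size=(100, 6))), 1, 5)

k = likert.shape[1]
item_vars = likert.var(axis=0, ddof=1)
total_var = likert.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(round(alpha, 2))  # compare against the .70 guideline
```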

Equivalence Consistency of agreement among observers using the same measure, or among alternate forms of a tool. Assessed via parallel or alternate forms (described under stability) and interrater reliability.

Interrater reliability Used with observational data. Concordance between two or more observers' scores of the same event or phenomenon.
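One common statistic for this concordance is Cohen's kappa, which corrects raw agreement for chance; the slide itself does not name a specific statistic, so this is one reasonable choice. The raters and category labels below are hypothetical:

```python
# Interrater agreement sketch: Cohen's kappa for two observers coding the
# same events into categories (invented codes).
from sklearn.metrics import cohen_kappa_score

rater_a = ["calm", "agitated", "calm", "calm", "agitated", "calm"]
rater_b = ["calm", "agitated", "calm", "agitated", "agitated", "calm"]

print(round(cohen_kappa_score(rater_a, rater_b), 2))  # 1.0 = perfect agreement
```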

Critiquing Were reliability and validity data presented, and are they adequate? Was the appropriate method used? Was reliability recalculated for the sample? Are the limitations of the tool discussed?