Validity and Reliability: Chapter Eight
© 2006 The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill

Validity
Validity has been defined as referring to the appropriateness, correctness, meaningfulness, and usefulness of the specific inferences researchers make based on the data they collect. It is the most important idea to consider when preparing or selecting an instrument. Validation is the process of collecting and analyzing evidence to support such inferences.

Evidence of Validity
There are three types of evidence a researcher might collect:
- Content-related evidence of validity: the content and format of the instrument.
- Criterion-related evidence of validity: the relationship between scores obtained using the instrument and scores obtained using a criterion measure.
- Construct-related evidence of validity: the psychological construct being measured by the instrument.

Illustration of Types of Evidence of Validity (Figure 8.1)

Content-related Evidence
A key element is the adequacy of the sampling of the domain the instrument is supposed to represent. The other aspect of content validation is the format of the instrument. Attempts to obtain evidence that the items measure what they are supposed to measure typify the process of gathering content-related evidence.

Criterion-related Evidence
A criterion is a second test or other assessment presumed to measure the same variable. There are two forms of criterion-related validity:
1) Predictive validity: a time interval elapses between administering the instrument and obtaining the criterion scores.
2) Concurrent validity: instrument data and criterion data are gathered and compared at the same time.
A correlation coefficient (r) indicates the degree of relationship between the scores individuals obtain on the two instruments.
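The slide refers to r without giving its formula; the coefficient in question is conventionally the Pearson product-moment correlation. For paired scores \((x_i, y_i)\) with means \(\bar{x}\) and \(\bar{y}\):

\[
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}
\]

Values near +1 or -1 indicate a strong relationship between instrument and criterion scores; values near 0 indicate little relationship.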

Construct-related Evidence
Construct-related evidence is considered the broadest of the three categories: there is no single piece of evidence that satisfies construct-related validity. Researchers attempt to collect a variety of types of evidence, including both content-related and criterion-related evidence. The more evidence researchers have from different sources, the more confident they become about the interpretation of the instrument.

Reliability
Reliability refers to the consistency of the scores or answers provided by an instrument. Scores can be reliable but not valid: consistent, yet not measuring what they are intended to measure. An instrument should be both reliable and valid (Figure 8.2); how high these qualities need to be depends on the context in which the instrument is used.

Reliability and Validity (Figure 8.2)

Reliability of Measurement (Figure 8.3)

Errors of Measurement
Because errors of measurement are always present to some degree, variation in test scores is common. This variation can be due to:
- Differences in motivation
- Differences in energy
- Differences in anxiety
- A different testing situation

Reliability Coefficient
A reliability coefficient expresses the relationship between scores obtained from the same instrument at two different times, or from two parts of the instrument. The three best-known methods are:
- Test-retest method
- Equivalent-forms method
- Internal-consistency methods

Test-Retest Method
Involves administering the same test twice to the same group after a certain time interval has elapsed. A reliability coefficient is then calculated to indicate the relationship between the two sets of scores. Reliability coefficients are affected by the length of time that elapses between the two administrations of the test, so an appropriate time interval should be selected. In educational research, stability of scores over a two-month period is usually considered sufficient evidence of test-retest reliability.

Equivalent-Forms Method
Two different but equivalent (alternate or parallel) forms of an instrument are administered to the same group during the same time period, and a reliability coefficient is then calculated between the two sets of scores. It is possible to combine the test-retest and equivalent-forms methods by administering two different forms of the test with a time interval between the two administrations. In both methods, the coefficient itself is simply the correlation between the two sets of scores, as sketched below.
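A minimal sketch of that computation, assuming the plain Pearson correlation is used; the scores below are hypothetical, and the same calculation serves for test-retest (two administrations) and equivalent forms (two forms):

```python
import numpy as np

# Hypothetical scores for the same ten students on two administrations of a
# test (test-retest) or on two equivalent forms given in the same period.
form_a = np.array([78, 85, 62, 90, 71, 88, 95, 67, 74, 81])
form_b = np.array([75, 88, 65, 87, 70, 90, 93, 70, 72, 84])

# The reliability coefficient is the Pearson correlation between the two
# sets of scores; np.corrcoef returns the full 2x2 correlation matrix.
r = np.corrcoef(form_a, form_b)[0, 1]
print(f"Reliability coefficient: r = {r:.2f}")
```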

Internal-Consistency Methods
There are several internal-consistency methods that require only one administration of an instrument:
- Split-half procedure: involves scoring two halves of a test separately for each subject and calculating a correlation coefficient between the two sets of scores.
- Kuder-Richardson approaches (KR20 and KR21): considered the most frequently used methods for determining internal consistency. The KR21 formula requires only three pieces of information: the number of items on the test, the mean, and the standard deviation.
- Alpha coefficient: a general form of KR20 used to calculate the reliability of items that are not scored right versus wrong.
A sketch of these calculations follows.
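A minimal sketch of two of these estimates; the KR21 and coefficient-alpha formulas are the standard ones, and the test statistics and ratings below are hypothetical:

```python
import numpy as np

def kr21(n_items, mean, sd):
    # KR21 needs only the number of items, the test mean, and the
    # standard deviation of the total scores.
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * sd ** 2)
    )

def cronbach_alpha(scores):
    # Coefficient alpha from a (respondents x items) score matrix:
    # compares the sum of the item variances with the total-score variance.
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 50-item test with mean 40 and standard deviation 6.
print(f"KR21  = {kr21(50, 40.0, 6.0):.2f}")   # about .79

# Hypothetical ratings (six respondents, five Likert items, not right/wrong).
ratings = np.array([
    [4, 5, 4, 4, 5],
    [2, 3, 2, 3, 2],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 2, 3],
    [4, 4, 5, 4, 4],
    [1, 2, 1, 2, 2],
])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```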

Standard Error of Measurement
An index that shows the extent to which a measurement would vary under changed circumstances; there are many possible standard errors for a given set of scores. Also known as measurement error, it defines a range of scores showing the amount of error that can be expected (see Appendix D).
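The slide leaves the formula implicit; under the usual classical-test-theory definition, the standard error of measurement is estimated from the standard deviation \(s\) of the scores and the reliability coefficient \(r\):

\[
\mathrm{SEM} = s\sqrt{1 - r}
\]

For example, with \(s = 6\) and \(r = .79\) (the hypothetical KR21 value above), \(\mathrm{SEM} = 6\sqrt{.21} \approx 2.7\), so an obtained score of 40 would suggest a true score roughly within the band \(40 \pm 2.7\).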

Scoring Agreement
Scoring agreement requires a demonstration that independent scorers can achieve satisfactory agreement in their scoring. Instruments that use direct observations are highly vulnerable to observer differences. A correlation of at least .90 among scorers is desired as an acceptable level of agreement.
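A brief sketch of checking that benchmark, with hypothetical ratings from two independent scorers:

```python
import numpy as np

# Hypothetical ratings given independently by two scorers to the same
# eight essays, each on a 1-10 scale.
scorer_1 = np.array([7, 5, 9, 4, 8, 6, 9, 3])
scorer_2 = np.array([8, 5, 9, 4, 7, 6, 10, 3])

r = np.corrcoef(scorer_1, scorer_2)[0, 1]
print(f"Inter-scorer correlation: r = {r:.2f}")
print("Meets the .90 benchmark" if r >= 0.90 else "Below the .90 benchmark")
```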