Questionnaire Development

1 Measurement Error
All systematic effects acting to bias recorded results:
-- Unclear Questions
-- Ambiguous Questions
-- Unclear Instructions
-- Socially-acceptable Answers
-- Respondents do not know the answers
-- Deliberate lies

2 VALIDITY  ADDRESSES SYSTEMATIC ERROR IN MEASUREMENT  A VALID INSTRUMENT IS ONE THAT MEASURES WHAT IT PURPORTS TO MEASURE, ALL IT PURPORTS TO MEASURE AND ONLY WHAT IT PURPORTS TO MEASURE

3 FACE VALIDITY  “ITEMS LOOK AS THOUGH THEY MEASURE WHAT IS IMPORTANT”  LOOK VALID TO THE DEVELOPER  OR  LOOK VALID TO THOSE WHO WILL COMPLETE THE INSTRUMENT  APPEAL AND APPEARANCE OF THE INSTRUMENT  A “FIELD TEST” IS USED TO VERIFY

4 CONTENT VALIDITY  DOES THE TEST GIVE A FAIR MEASURE ON SOME IMPORTANT SET OF TASKS? DOES IT REPRESENT THE CONTENT OF THE DOMAIN?  PROCEDURE: PANEL OF EXPERTS USED TO COMPARE THE ITEMS LOGICALLY TO THE DOMAIN TO BE MEASURED TO PRODUCE A “JURIED” INSTRUMENT  NOT EXPRESSED AS A NUMBER

5 PREDICTIVE VALIDITY (CRITERION RELATED VALIDITY)  DO TEST SCORES PREDICT A CERTAIN FUTURE PERFORMANCE?  GIVE TEST AND USE RESULTS TO PREDICT THE OUTCOME SOME TIME LATER  CORRELATION OFTEN USED  COULD ALSO BE USED FOR “KNOWN SOURCE”: WILL THIS LEADERSHIP MEASURE DIFFERENTIATE BETWEEN PEOPLE WHO WILL BECOME GOOD LEADERS AND THOSE WHO WILL NOT?

6 CONCURRENT VALIDITY (CRITERION RELATED VALIDITY)  COMPARE RESULTS WITH SOME CURRENT PERFORMANCE  CORRELATION OFTEN USED  “KNOWN SOURCE” COULD ALSO APPLY  OFTEN USED TO MAKE A TEST TO SUBSTITUTE FOR A LESS CONVENIENT PROCEDURE

7 CONSTRUCT VALIDITY  WANT TO KNOW WHAT PSYCHOLOGICAL OR OTHER PROPERTY OR PROPERTIES CAN “EXPLAIN” THE VARIANCE OF THE TEST.  “HOW CAN SCORES ON THIS TEST BE EXPLAINED PSYCHOLOGICALLY?”  “WHAT PSYCHOLOGICAL CONSTRUCTS UNDERLIE THIS TEST?”  PROCEDURE: FACTOR ANALYSIS OR HYPOTHESIS TESTING

8 RELIABILITY  DOES AN INSTRUMENT CONSISTENTLY MEASURE WHATEVER IT IS MEASURING?  SYNONYMS: –DEPENDABILITY –STABILITY –CONSISTENCY –PREDICTABILITY –ACCURACY  RELIABILITY IS DEFINED THROUGH ERROR: THE MORE ERROR, THE GREATER THE UNRELIABILITY; THE LESS ERROR, THE GREATER THE RELIABILITY

9 TEST-RETEST (2 Adm)  A “COEFFICIENT OF STABILITY” IS PRODUCED  TEST ADMINISTERED TO SAME GROUP ON TWO OCCASIONS; CORRELATE THE TWO SETS OF SCORES OR CALCULATE % AGREEMENT ON ITEMS  COEFFICIENT CAN CHANGE WITH TIME; THUS, ALWAYS REPORT TIME:  r (one-week) = .77  THE LONGER THE TIME BETWEEN TESTS, THE LOWER THE COEFFICIENT
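A minimal sketch of this procedure in Python (not from the slides; the score values are invented): the coefficient of stability is simply the Pearson correlation between the two administrations.

```python
# Hypothetical test-retest data: the same group tested twice, one week apart.
import numpy as np

time1 = np.array([12, 15, 9, 20, 17, 11, 14, 18])   # first administration
time2 = np.array([13, 14, 10, 19, 18, 10, 15, 17])  # second administration, one week later

# Coefficient of stability = Pearson r between the two sets of scores
r = np.corrcoef(time1, time2)[0, 1]
print(f"r(one-week) = {r:.2f}")
```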

10 ONE ADMINISTRATION  1. COEFFICIENT OF EQUIVALENCY, OR  2. COEFFICIENT OF INTERNAL CONSISTENCY  1. EQUIVALENCY: TELLS HOW WELL A TEST AGREES WITH ANOTHER EQUIVALENT MEASURE MADE AT THE SAME TIME: PARALLEL FORM PROCEDURE  EXAMPLE: NEED TWO SIMILAR TESTS FOR A PRETEST – POSTTEST STUDY -- CALLED PARALLEL FORM OR EQUIVALENT FORM TESTS  PROCEDURE: ADMINISTER TEST, SCORE WITH ITEM ANALYSIS, MAKE TWO EQUALLY DIFFICULT AND DISCRIMINATING TESTS

11 INTERNAL CONSISTENCY  A. SPLIT-HALF METHOD  B. ALPHA OR K-R METHODS  (ONE ADMINISTRATION)  (1) GIVE TEST  (2) CALCULATE RELIABILITY  FOR A: SPLIT TEST IN HALF (ODD/EVEN ITEMS, OR FIRST HALF WITH SECOND HALF) AND CORRELATE THE HALVES (r). SINCE r IS A FUNCTION OF TEST LENGTH, CORRECT THE SHORTENED (HALF-LENGTH) TEST UPWARD WITH THE SPEARMAN-BROWN CORRECTION
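A hedged sketch of the split-half procedure, using an invented item-response matrix (rows = examinees, columns = items scored 0/1): correlate the odd and even halves, then apply the Spearman-Brown correction r_full = 2·r_half / (1 + r_half) to undo the halving of test length.

```python
import numpy as np

# Fabricated right/wrong item scores: rows = examinees, columns = items
scores = np.array([
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0],
])

odd_half  = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5
even_half = scores[:, 1::2].sum(axis=1)   # items 2, 4, 6

r_half = np.corrcoef(odd_half, even_half)[0, 1]   # half-test correlation
r_full = 2 * r_half / (1 + r_half)                # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, corrected r = {r_full:.2f}")
```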

12 ALPHA OR K-R  K-R 20, 21 & CRONBACH’S ALPHA  CONCEPTUALLY, SPLIT THE TEST IN ALL POSSIBLE WAYS AND INTERCORRELATE  CANNOT USE WITH SPEED TESTS  USE K-R 20 (OR K-R 21, WHICH ASSUMES ITEMS OF EQUAL DIFFICULTY) FOR RIGHT/WRONG ITEMS  USE ALPHA FOR MULTIPLE RESPONSE CATEGORIES; e.g., Likert-type scales
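As a sketch (with assumed Likert-type responses, not data from the slides), Cronbach's alpha can be computed from its standard formula, alpha = k/(k-1) · (1 - Σ item variances / variance of total scores); applied to items scored 0/1 the same formula gives K-R 20.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array with rows = respondents, columns = items."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented Likert-type responses (1-5), purely for illustration
responses = np.array([
    [4, 5, 4, 3],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 4, 3, 3],
    [1, 2, 2, 1],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```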

13 FACTORS INFLUENCING RELIABILITY
1. > ITEMS = > RELIABILITY
2. > TIME = > RELIABILITY
3. R
4. > OBJECTIVE SCORING = > R
5. > PROBABILITY OF SUCCESS BY CHANCE = < RELIABILITY
6. > INACCURACY IN SCORING = < R
7. > HOMOGENEOUS MATERIAL = > R
8. > COMMON EXPERIENCE OF STUDENTS = > R
9. > TRICK QUESTIONS = < R
10. > MISINTERPRETATION OF ITEMS = < R

14 IMPROVING r  WRITE UNAMBIGUOUS ITEMS  ADD MORE ITEMS OF EQUAL KIND AND DIFFICULTY (see page 29)  USE CLEAR AND STANDARD INSTRUCTIONS  r DEPENDS ON SPREAD OF SCORES; THUS, LOW r COULD BE BECAUSE EVERYONE SCORES ABOUT THE SAME  TEST CAN BE RELIABLE FOR ONE LEVEL OF ABILITY AND NOT ANOTHER  VALIDITY COEFFICIENT CANNOT EXCEED THE SQUARE ROOT OF r
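Written out, the last point is the usual classical test theory bound: the validity of a test against any criterion cannot exceed the square root of the test's reliability,

$$ r_{xy} \le \sqrt{r_{xx}} $$

where $r_{xy}$ is the validity coefficient and $r_{xx}$ is the reliability coefficient of the test.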

15 ACCEPTABLE r?  NUNNALLY: … DEPENDS ON USE. “IN EARLY STAGES OF RESEARCH ON HYPOTHESIZED MEASURES OF A CONSTRUCT, ONE SAVES TIME AND MONEY BY WORKING WITH INSTRUMENTS THAT HAVE ONLY MODEST r; AN r OF .50 TO .60 WILL SUFFICE.”

16 SUITABILITY  REALLY A PART OF VALIDITY  “IS INSTRUMENT SUITABLE FOR THE AUDIENCE?” -- FIELD TEST  READABILITY: FOG INDEX (many others are available)
1. TAKE A RANDOM SAMPLE OF 100 WORDS AND COUNT THE NUMBER OF SENTENCES. DIVIDE # WORDS / # SENTENCES = AVERAGE SENTENCE LENGTH (ASL)
2. COUNT THE WORDS, IN THOSE 100, WITH 3 OR MORE SYLLABLES, OMITTING COMBINATIONS OF EASY WORDS (BUTTER-FLY), WORDS MADE 3 SYLLABLES BY ADDING “ED” OR “ES” (CREATED), AND CAPITALIZED WORDS = % HARD WORDS (% HW)
3. # YEARS OF EDUCATION = 0.4 × (ASL + % HW)
[MOST PEOPLE PREFER TO READ 2 GRADE LEVELS BELOW THEIR LEVEL OF EDUCATION]
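A rough Python sketch of the three FOG-index steps above. The syllable counter and the "easy compound / -ed / -es / capitalized word" exclusions are simplified assumptions rather than an exact implementation of the rules on the slide.

```python
import re

def fog_grade(text: str) -> float:
    """Approximate Gunning FOG grade level of a passage."""
    words = re.findall(r"[A-Za-z'-]+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    asl = len(words) / sentences                       # average sentence length

    def syllables(word: str) -> int:
        # crude syllable estimate: count groups of vowels
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    hard = [w for w in words
            if syllables(w) >= 3
            and not w[0].isupper()                     # omit capitalized words
            and not w.lower().endswith(("ed", "es"))]  # omit simple -ed / -es forms
    pct_hard = 100 * len(hard) / len(words)            # percent "hard" words

    return 0.4 * (asl + pct_hard)                      # approx. years of education needed

print(fog_grade("Reliability and validity are fundamental measurement "
                "properties. Researchers evaluate them before use."))
```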