1 Measurement Error All systematic effects acting to bias recorded results: -- Unclear Questions -- Ambiguous Questions -- Unclear Instructions -- Socially-acceptable.


1 Measurement Error
All systematic effects acting to bias recorded results:
-- Unclear Questions
-- Ambiguous Questions
-- Unclear Instructions
-- Socially-acceptable Answers
-- Respondents do not know answers
-- Deliberate lies

2 VALIDITY
-- ADDRESSES SYSTEMATIC ERROR IN MEASUREMENT
-- A VALID INSTRUMENT IS ONE THAT MEASURES WHAT IT PURPORTS TO MEASURE, ALL IT PURPORTS TO MEASURE, AND ONLY WHAT IT PURPORTS TO MEASURE

3 FACE VALIDITY
-- “ITEMS LOOK AS THOUGH THEY MEASURE WHAT IS IMPORTANT”
-- LOOK VALID TO THE DEVELOPER, OR
-- LOOK VALID TO THOSE WHO WILL COMPLETE THE INSTRUMENT
-- APPEAL AND APPEARANCE OF THE INSTRUMENT
-- A “FIELD TEST” IS USED TO VERIFY

4 CONTENT VALIDITY
-- DOES THE TEST GIVE A FAIR MEASURE ON SOME IMPORTANT SET OF TASKS? DOES IT REPRESENT THE CONTENT OF THE DOMAIN?
-- PROCEDURE: A PANEL OF EXPERTS COMPARES THE ITEMS LOGICALLY TO THE DOMAIN TO BE MEASURED, PRODUCING A “JURIED” INSTRUMENT
-- NOT EXPRESSED AS A NUMBER

5 PREDICTIVE VALIDITY (CRITERION-RELATED VALIDITY)
-- DO TEST SCORES PREDICT A CERTAIN FUTURE PERFORMANCE?
-- GIVE TEST AND USE RESULTS TO PREDICT THE OUTCOME SOME TIME LATER
-- CORRELATION OFTEN USED
-- COULD ALSO BE USED FOR “KNOWN SOURCE”: WILL THIS LEADERSHIP MEASURE DIFFERENTIATE BETWEEN PEOPLE WHO WILL BECOME GOOD LEADERS AND THOSE WHO WILL NOT?

6 CONCURRENT VALIDITY (CRITERION-RELATED VALIDITY)
-- COMPARE RESULTS WITH SOME CURRENT PERFORMANCE
-- CORRELATION OFTEN USED
-- “KNOWN SOURCE” COULD ALSO APPLY
-- OFTEN USED TO MAKE A TEST TO SUBSTITUTE FOR A LESS CONVENIENT PROCEDURE

7 CONSTRUCT VALIDITY
-- WANT TO KNOW WHAT PSYCHOLOGICAL OR OTHER PROPERTY OR PROPERTIES CAN “EXPLAIN” THE VARIANCE OF THE TEST
-- “HOW CAN SCORES ON THIS TEST BE EXPLAINED PSYCHOLOGICALLY?”
-- “WHAT PSYCHOLOGICAL CONSTRUCTS UNDERLIE THIS TEST?”
-- PROCEDURE: FACTOR ANALYSIS OR HYPOTHESIS TESTING

8 RELIABILITY
-- DOES AN INSTRUMENT CONSISTENTLY MEASURE WHATEVER IT IS MEASURING?
-- SYNONYMS: DEPENDABILITY, STABILITY, CONSISTENCY, PREDICTABILITY, ACCURACY
-- RELIABILITY IS DEFINED THROUGH ERROR: THE MORE ERROR, THE GREATER THE UNRELIABILITY; THE LESS ERROR, THE GREATER THE RELIABILITY

9 TEST-RETEST (TWO ADMINISTRATIONS)
-- A “COEFFICIENT OF STABILITY” IS PRODUCED
-- TEST IS ADMINISTERED TO THE SAME GROUP ON TWO OCCASIONS; CORRELATE THE TWO SETS OF SCORES OR CALCULATE % AGREEMENT ON ITEMS
-- COEFFICIENT CAN CHANGE WITH TIME; THUS, ALWAYS REPORT THE INTERVAL: r (one-week) = .77
-- THE LONGER THE TIME BETWEEN TESTS, THE LOWER THE COEFFICIENT
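
The test-retest procedure on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function name `pearson_r` and the two score lists are made-up assumptions, and the scores are hypothetical totals for the same six respondents on two occasions one week apart.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical total scores for the same respondents, one week apart.
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 10]

stability = pearson_r(first_administration, second_administration)
# Report the interval together with the coefficient, as the slide advises.
print(f"r (one-week) = {stability:.2f}")
```

The coefficient of stability here is simply the correlation between the two administrations; repeating the calculation with a longer interval would typically give a lower value.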

10 ONE ADMINISTRATION
-- 1. COEFFICIENT OF EQUIVALENCY, OR
-- 2. COEFFICIENT OF INTERNAL CONSISTENCY
-- 1. EQUIVALENCY: TELLS HOW WELL A TEST AGREES WITH ANOTHER EQUIVALENT MEASURE MADE AT THE SAME TIME: PARALLEL FORM PROCEDURE
-- EXAMPLE: NEED TWO SIMILAR TESTS FOR A PRETEST-POSTTEST STUDY; CALLED PARALLEL FORM OR EQUIVALENT FORM TESTS
-- PROCEDURE: ADMINISTER TEST, SCORE WITH ITEM ANALYSIS, MAKE TWO EQUALLY DIFFICULT AND DISCRIMINATING TESTS

11 INTERNAL CONSISTENCY (ONE ADMINISTRATION)
-- A. SPLIT-HALF METHOD
-- B. ALPHA OR K-R METHODS
-- (1) GIVE TEST
-- (2) CALCULATE RELIABILITY
-- FOR A: SPLIT TEST IN HALF (ODD VS. EVEN ITEMS, OR FIRST HALF VS. SECOND HALF) AND CORRELATE THE HALVES (r). SINCE r IS A FUNCTION OF TEST LENGTH, CORRECT FOR THE SHORTENED HALVES: SPEARMAN-BROWN CORRECTION
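
As a rough sketch of the split-half method with the Spearman-Brown correction, the Python below splits items odd versus even, correlates the half totals, and steps the result up to full test length with 2r / (1 + r). The function names and the 0/1 score matrix are illustrative assumptions, not data from the deck.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(scores):
    """scores: one row per respondent, one numeric entry per item."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    return r_half, spearman_brown(r_half)

# Hypothetical right/wrong (1/0) answers: 6 respondents x 6 items.
answers = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]

r_half, r_full = split_half_reliability(answers)
print(f"half-test r = {r_half:.2f}, Spearman-Brown corrected = {r_full:.2f}")
```

For any positive half-test correlation below 1, the corrected value is higher, which reflects the slide's point that r depends on test length.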

12 ALPHA OR K-R
-- K-R 20, K-R 21 & CRONBACH’S ALPHA
-- EQUIVALENT TO SPLITTING THE TEST IN ALL POSSIBLE WAYS AND INTERCORRELATING
-- CANNOT USE WITH SPEED TESTS
-- USE K-R 20 FOR RIGHT/WRONG ITEMS (K-R 21 IS A SHORTCUT THAT ALSO ASSUMES ITEMS OF EQUAL DIFFICULTY)
-- USE ALPHA FOR MULTIPLE RESPONSE CATEGORIES; e.g., Likert-type Scales
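
The two coefficients named on this slide can be sketched with their standard textbook formulas: alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), and K-R 20 = k/(k-1) × (1 − Σ pq / variance of total scores). The function names and both data sets below are hypothetical, and population variances are used throughout as an assumption of this sketch.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one row per respondent, one numeric entry per item."""
    k = len(scores[0])
    item_variances = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

def kr20(scores):
    """K-R 20 for right/wrong items coded 1/0."""
    n = len(scores)
    k = len(scores[0])
    # p = proportion answering each item correctly, q = 1 - p
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_variance)

# Hypothetical right/wrong answers: 6 respondents x 6 items.
right_wrong = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0, 1],
]

# Hypothetical 5-point Likert-type responses: 5 respondents x 4 items.
likert = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]

print(f"K-R 20 (right/wrong items) = {kr20(right_wrong):.2f}")
print(f"alpha (Likert-type items)  = {cronbach_alpha(likert):.2f}")
```

With 1/0 items each item's population variance equals pq, so the two functions return the same value on right/wrong data; alpha is the general form that also handles multiple response categories.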

13 FACTORS INFLUENCING RELIABILITY
1. > ITEMS = > RELIABILITY
2. > TIME = > RELIABILITY
3. R
4. > OBJECTIVE SCORING = > R
5. > PROBABILITY OF SUCCESS BY CHANCE = < RELIABILITY
6. > INACCURACY IN SCORING = < R
7. > HOMOGENEOUS MATERIAL = > R
8. > COMMON EXPERIENCE OF STUDENTS = > R
9. > TRICK QUESTIONS = < R
10. > MISINTERPRETATION OF ITEMS = < R

14 IMPROVING r
-- WRITE UNAMBIGUOUS ITEMS
-- ADD MORE ITEMS OF EQUAL KIND AND DIFFICULTY (SEE PAGE 29)
-- USE CLEAR AND STANDARD INSTRUCTIONS
-- r DEPENDS ON SPREAD OF SCORES; THUS, LOW r COULD BE BECAUSE EVERYONE SCORES ABOUT THE SAME
-- TEST CAN BE RELIABLE FOR ONE LEVEL OF ABILITY AND NOT ANOTHER
-- VALIDITY COEFFICIENT CANNOT EXCEED THE SQUARE ROOT OF r
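
Two of these points lend themselves to a quick calculation. The sketch below assumes the general Spearman-Brown prophecy formula, r_new = n·r / (1 + (n − 1)·r), for a test lengthened by a factor n with comparable items, and takes the validity ceiling as the square root of r as stated on the slide. The function names and the starting reliability of .60 are hypothetical.

```python
from math import sqrt

def lengthened_reliability(r, factor):
    """Predicted reliability when a test is made `factor` times as long
    with items of equal kind and difficulty (Spearman-Brown prophecy)."""
    return factor * r / (1 + (factor - 1) * r)

def max_validity(r):
    """Upper bound on a validity coefficient for a test with reliability r."""
    return sqrt(r)

current_r = 0.60  # hypothetical reliability of the existing test
for factor in (1, 2, 3):
    new_r = lengthened_reliability(current_r, factor)
    print(f"length x{factor}: r = {new_r:.2f}, "
          f"validity ceiling = {max_validity(new_r):.2f}")
```

Doubling or tripling the number of comparable items raises the predicted r, and the ceiling on validity rises with it.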

15 ACCEPTABLE r?
-- NUNNALLY: … DEPENDS ON USE. “IN EARLY STAGES OF RESEARCH ON HYPOTHESIZED MEASURES OF A CONSTRUCT, ONE SAVES TIME AND MONEY BY WORKING WITH INSTRUMENTS THAT HAVE ONLY MODEST r; AN r OF .50 TO .60 WILL SUFFICE.”

16 SUITABILITY
-- REALLY A PART OF VALIDITY
-- “IS INSTRUMENT SUITABLE FOR THE AUDIENCE?” -- FIELD TEST
-- READABILITY: FOG INDEX (MANY OTHERS AVAILABLE)
-- 1. TAKE A RANDOM SAMPLE OF 100 WORDS AND COUNT THE NUMBER OF SENTENCES. # WORDS / # SENTENCES = AVERAGE SENTENCE LENGTH (ASL)
-- 2. COUNT WORDS, IN THOSE 100, WITH 3 OR MORE SYLLABLES, OMITTING COMBINATIONS OF EASY WORDS (BUTTER-FLY), WORDS MADE 3 SYLLABLES BY ADDING “ED” OR “ES” (CREATED), AND CAPITALIZED WORDS = % HARD WORDS (%HW)
-- 3. # YEARS EDUC = (ASL + %HW) × .4
-- [MOST PEOPLE PREFER TO READ 2 GRADE LEVELS BELOW THEIR LEVEL OF EDUCATION]
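
The three Fog index steps can be sketched as a small function. Counting syllables and applying the omission rules is left to the person doing the tally, so the function takes the three counts as inputs; its name, its parameters, and the sample counts are illustrative assumptions.

```python
def fog_index(num_words, num_sentences, num_hard_words):
    """Years of education needed to read a passage: (ASL + %HW) x .4

    num_hard_words is the hand count of words with 3 or more syllables,
    after applying the omissions listed in step 2.
    """
    if num_words <= 0 or num_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    average_sentence_length = num_words / num_sentences      # step 1: ASL
    percent_hard_words = 100 * num_hard_words / num_words    # step 2: %HW
    return (average_sentence_length + percent_hard_words) * 0.4  # step 3

# Hypothetical 100-word sample: 5 sentences, 10 hard words.
years = fog_index(num_words=100, num_sentences=5, num_hard_words=10)
print(f"Fog index: about {years:.1f} years of education")
```

With a 100-word sample the count of hard words is already a percentage, which is why the slide can treat the tally from step 2 as %HW directly.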

