Locating and Assessing the Usefulness of Health Measures for Health Disparities Research
Anita L. Stewart, Ph.D., University of California, San Francisco


1 Locating and Assessing the Usefulness of Health Measures for Health Disparities Research
Anita L. Stewart, Ph.D.
University of California, San Francisco
Clinical Research with Diverse Communities, EPI 222, Spring
April 26, 2005

2 Outline
• Locating measures
• Basic psychometric properties
• Rationale for multi-item measures
• Additional measurement considerations in health disparities research
• Steps in selecting measures for your study


4 Need Measures with Good Psychometric Properties
• Measure assesses concept of interest
• Low levels of missing data
• Good variability
• Evidence of reliability
• Evidence of validity
• Responsive to change (for interventions)

5 Inappropriate Measures Can Result In:
• Conceptual inadequacy
  – Measuring wrong concept for your study
• Poor data quality (e.g., missing data)
• Poor variability
• Poor reliability and validity
• Inability to detect true associations among variables
  – e.g., no measured change in outcome when change occurred

6 Good Variability
• All (or nearly all) scale levels are represented
• Distribution approximates bell-shaped normal

7 Indicators of Variability
• Range of scores (possible, observed)
• Mean, median, mode
• Standard deviation (standard error)
• Skewness, kurtosis
• % at floor (lowest score)
• % at ceiling (highest score)
• Inter-quartile range
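
The indicators above can be computed with a few lines of code. The following is a minimal sketch using only the Python standard library; the function name, the example scores, and the 2 to 10 scale range are assumptions made for illustration, and the skewness formula is the simple moment-based version (statistical packages often apply small-sample corrections).

```python
from statistics import mean, median, stdev, quantiles

def variability_summary(scores, possible_min, possible_max):
    """Summarize common variability indicators for one scale score."""
    n = len(scores)
    m = mean(scores)
    # Population SD is used only for the moment-based skewness below.
    pop_sd = (sum((x - m) ** 2 for x in scores) / n) ** 0.5
    skew = sum((x - m) ** 3 for x in scores) / n / pop_sd ** 3
    q1, _, q3 = quantiles(scores, n=4)  # quartile cut points
    return {
        "possible_range": (possible_min, possible_max),
        "observed_range": (min(scores), max(scores)),
        "mean": m,
        "median": median(scores),
        "sd": stdev(scores),  # sample SD (n - 1 denominator)
        "skewness": skew,
        "pct_floor": 100 * sum(x == possible_min for x in scores) / n,
        "pct_ceiling": 100 * sum(x == possible_max for x in scores) / n,
        "iqr": q3 - q1,
    }

# Hypothetical scores on a scale with a possible range of 2 to 10
scores = [2, 4, 5, 6, 6, 7, 8, 9, 10, 10]
summary = variability_summary(scores, possible_min=2, possible_max=10)
for name, value in summary.items():
    print(name, value)
```

A large percentage at the floor or ceiling suggests the measure has little room to register decline or improvement in that sample.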

8 Reliability
• Extent to which an observed score is free of random error
• Population-specific; reliability increases with:
  – sample size
  – variability in scores (dispersion)
  – a person’s level on the scale

9 Reliability Coefficient
• Typically ranges from .00 to 1.00
• Higher scores indicate better reliability
• Types of reliability tests
  – Internal-consistency
  – Test-retest
  – Inter-rater
  – Intra-rater

10 Internal Consistency Reliability: Cronbach’s Alpha
• Calculating it requires multiple items that are supposed to measure the same construct
• Extent to which all items measure the same construct (same latent variable)
• Internal consistency reliability is a function of:
  – Number of items
  – Average correlation among items
  – Variability of items in your sample
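
As a rough illustration of how the coefficient is obtained, the sketch below uses the standard variance-based formula for Cronbach's alpha: k / (k - 1) × (1 - sum of item variances / variance of the total score). The function name and the response data are hypothetical, and items are assumed to be already scored in the same direction.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, each with one entry per respondent."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score for each respondent across all items
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses from five people to three items (1-5 response scale)
item_scores = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
alpha = cronbach_alpha(item_scores)
print(round(alpha, 2))
```

Because alpha depends on the variability of the items in the sample at hand, the same items can yield different values in different populations.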

11 Minimum Standards for Internal Consistency Reliability
• For group comparisons (e.g., regression, correlational analyses)
  – .70 or above is minimum (Nunnally, 1978)
  – .80 is optimal
  – above .90 is unnecessary
• For individual assessment (e.g., treatment decisions)
  – .90 or above (.95) is preferred (Nunnally, 1978)

12 Reliable Scale?
• NO!
• There is no such thing as a “reliable” scale
• We only have accumulated “evidence” of reliability in a variety of populations in which it has been tested

13 Validity
• Does a measure (or instrument) measure what it is supposed to measure?
• And… does a measure NOT measure what it is NOT supposed to measure?

14 Validation of Measures Is an Iterative, Lengthy Process
• Validity is not a property of the measure
  – validity is a property of a measure for a particular purpose and sample
  – validation studies for one purpose and sample may not serve another purpose or sample
• Accumulation of evidence:
  – Different samples
  – Longitudinal designs

15 Three Major Forms of Measurement Validity
• Content
• Criterion
• Construct

16 Construct Validity Basics
A process of answering the following questions:
• What is the hypothesis?
• What are the results?
• Do the results support (confirm) the hypothesis?

17 Construct Validity: NOTE
• Sometimes the hypothesis is that the measure will NOT be correlated with certain other measures, or will be less correlated with some than with others
• THUS, observing a low or non-significant correlation can provide evidence of construct validity
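
The hypothesis-testing logic above can be sketched as a comparison of two correlations: one with a measure expected to be related, and one with a measure expected to be unrelated. All variable names and data below are hypothetical, and the Pearson correlation is written out by hand so the example is self-contained.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical scores for six respondents
new_scale = [2, 3, 5, 6, 8, 9]
similar_measure = [1, 3, 4, 6, 7, 9]    # hypothesized to correlate highly
unrelated_measure = [5, 1, 4, 2, 5, 3]  # hypothesized to correlate weakly

r_convergent = pearson_r(new_scale, similar_measure)
r_discriminant = pearson_r(new_scale, unrelated_measure)

# Hypothesis: the new scale relates more strongly to the similar measure
hypothesis_supported = abs(r_convergent) > abs(r_discriminant)
print(round(r_convergent, 2), round(r_discriminant, 2), hypothesis_supported)
```

Here the low correlation with the unrelated measure is exactly what was predicted, so it counts as supporting evidence rather than a failure.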

18 Outline
• Locating measures
• Basic psychometric properties
• Rationale for multi-item measures
• Additional measurement considerations in health disparities research
• Steps in selecting measures for your study

19 Single- and Multi-Item Measures
• A single-item measure consists of only one item
• Response choices are interpretable
• Example: How would you rate your health?
  1 - Excellent
  2 - Very good
  3 - Good
  4 - Fair
  5 - Poor

20 Multi-Item Measures or Scales
• Multi-item measures are created by combining two or more items into an overall measure or scale score
• Summated score, scale score
  – A score in which multiple items are “summed” or combined

21 Example of a 2-item Measure or Scale
How much of the time … tired?
  1 - All of the time
  2 - Most of the time
  3 - Some of the time
  4 - A little of the time
  5 - None of the time
How much of the time … full of energy?
  1 - All of the time
  2 - Most of the time
  3 - Some of the time
  4 - A little of the time
  5 - None of the time

22 Step 1: Reverse One Item So They Are All in the Same Direction
How much of the time … tired?
  1 - All of the time
  2 - Most of the time
  3 - Some of the time
  4 - A little of the time
  5 - None of the time
How much of the time … full of energy?
  1=5 All of the time
  2=4 Most of the time
  3=3 Some of the time
  4=2 A little of the time
  5=1 None of the time
Reverse “energy” item so high score = more energy


24 Step 2: Sum the Two Items
How much of the time … tired?
  1 - All of the time
  2 - Most of the time
  3 - Some of the time
  4 - A little of the time
  5 - None of the time
How much of the time … full of energy?
  1=5 All of the time
  2=4 Most of the time
  3=3 Some of the time
  4=2 A little of the time
  5=1 None of the time
Highest score = 10 (tired none of the time, full of energy all of the time)

25 Step 2: Sum the Two Items (cont.)
Same two items and recoding as on the previous slide.
Lowest score = 2 (tired all of the time, full of energy none of the time)
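
The two scoring steps can be sketched in code as follows. This is only an illustration of the recoding shown on the slides; the function names are invented here, and responses are assumed to be coded 1 to 5 as printed on the questionnaire.

```python
def reverse_item(response, low=1, high=5):
    """Step 1: flip a response so that 1 becomes 5, 2 becomes 4, and so on."""
    return low + high - response

def energy_fatigue_score(tired, full_of_energy):
    """Step 2: sum the 'tired' item and the reversed 'energy' item.

    Both arguments are raw responses (1 = all of the time,
    5 = none of the time). A higher total means more energy.
    """
    return tired + reverse_item(full_of_energy)

# Tired none of the time (5), full of energy all of the time (1)
print(energy_fatigue_score(tired=5, full_of_energy=1))  # 10, the highest score
# Tired all of the time (1), full of energy none of the time (5)
print(energy_fatigue_score(tired=1, full_of_energy=5))  # 2, the lowest score
```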

26 Advantages of Multi-item Measures
• More scale values (enhances sensitivity)
  – Moved from 2 items with 5 levels each (1 to 5) to 1 scale with 9 levels (2 to 10)
• Improves score distribution (more normal)
• Reduces number of variables needed to measure one concept
• Improves reliability (reduces random error)
• Can estimate a score if some items are missing
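
One common way to estimate a scale score when some items are missing is to prorate from the items that were answered. The sketch below is an assumption about how this might be done, not a rule stated in the slides: the function name is hypothetical, and the requirement that at least half of the items be answered is a frequently used convention that should be checked against each instrument's scoring manual.

```python
def prorated_score(responses, min_answered_fraction=0.5):
    """Estimate a summated score when some items are missing (None).

    Items are assumed to be already reverse-coded where needed.
    Returns None when too few items were answered.
    """
    answered = [r for r in responses if r is not None]
    if not answered or len(answered) < min_answered_fraction * len(responses):
        return None
    # Mean of the answered items, scaled up to the full number of items
    return sum(answered) / len(answered) * len(responses)

print(prorated_score([4, None, 5, 3]))        # one of four items missing
print(prorated_score([None, None, None, 2]))  # too few items answered
```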

27 Outline
• Locating measures
• Basic psychometric properties
• Rationale for multi-item measures
• Additional measurement considerations in health disparities research
• Steps in selecting measures for your study

28 Additional Measurement Issues: Health Disparities Research
• Measurement adequacy and equivalence in diverse groups

29 Group Comparisons Are Even More Problematic
• Health disparities studies involve comparing mean levels of health
• Requires conceptual equivalence
• Also, if psychometric properties are not comparable across groups…
  – potential true differences may be obscured
  – observed group differences may be inaccurate

30 Why Not Use Culture-Specific Measures?
• Measurement goal is to identify measures that can be used across all groups, yet maintain sensitivity to diversity and have minimal bias
• Most health disparities studies require comparing mean scores across diverse groups
  – need comparable measures

31 Issues Concerning Group Comparisons
• Disparities in observed scores can be due to
  – culturally- or group-mediated differences in true score (true differences)
  OR
  – bias: systematic differences between group observed scores not attributable to true scores
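
The distinction between true differences and bias can be written compactly. The notation below is an assumption borrowed from classical test theory rather than something shown on the slides: X is the observed score of person i in group g, T the true score, B a group-specific systematic bias, and E random error that averages to zero within each group.

```latex
X_{ig} = T_{ig} + B_{g} + E_{ig}
\qquad\Longrightarrow\qquad
\bar{X}_{1} - \bar{X}_{2}
  = \underbrace{(\bar{T}_{1} - \bar{T}_{2})}_{\text{true difference}}
  + \underbrace{(B_{1} - B_{2})}_{\text{bias}}
```

Under these assumptions, an observed disparity reflects the true difference only when the bias term is the same in both groups.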

32 Bias: A Special Concern
• Measurement bias may make group comparisons invalid
• Bias can be due to group differences in:
  – the meaning of concepts or items
  – the extent to which measures represent concepts
  – cognitive processes of responding
  – appropriateness of methods

33 Psychometric Adequacy in One Group
• Conceptual
  – Adequacy in 1 group: concept meaningful within one group
  – Equivalence across groups: concept equivalent across groups
• Psychometric
  – Adequacy in 1 group: psychometric properties meet minimal standards within one group
  – Equivalence across groups: psychometric properties invariant (equivalent) across groups

34 Psychometric Adequacy in a Diverse Group
• Psychometric properties meet minimal standards
  – Adequate reliability/reproducibility
  – Confirmation of theoretically-based factor structure
  – Construct validity evidence
  – Responsiveness to change evidence

35 Psychometric Adequacy in a Diverse Group (cont.)
• Measures have measurement properties in a diverse group similar to those in the original mainstream groups on which the measures were developed, i.e., similar
  – reliability
  – factor structure
  – construct validity
  – responsiveness to change

36 Psychometric Equivalence
• Conceptual
  – Adequacy in 1 group: concept meaningful within one group
  – Equivalence across groups: concept equivalent across groups
• Psychometric
  – Adequacy in 1 group: psychometric properties meet minimal standards within one group
  – Equivalence across groups: psychometric properties invariant (equivalent) across groups

37 Equivalence of Factor Structure: Psychometric Invariance
• Psychometric invariance (equivalence)
• Important properties of theoretically-based factor structure of measurement model do not vary across groups

38 Methods for Assessing Equivalence of Factor Structure
• Exploratory factor analysis
  – Two or more groups
  – Subjective comparison of factor structure
• Confirmatory factor analysis
  – Two or more groups
  – Test for equivalence of factor structure
    » test fit of theoretical model to data
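
When factor loadings from separate exploratory analyses are compared across groups, a descriptive index can supplement the subjective comparison. The sketch below computes Tucker's congruence coefficient, a technique that is not named in the slides and is offered here only as one possible aid; the loadings are hypothetical, and the index is not a substitute for a multi-group confirmatory factor analysis.

```python
def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two sets of factor loadings.

    Each argument lists the loadings of the same items, in the same
    order, on one factor in one group. Values near 1 indicate very
    similar loading patterns.
    """
    numerator = sum(a * b for a, b in zip(loadings_a, loadings_b))
    denominator = (
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    ) ** 0.5
    return numerator / denominator

# Hypothetical loadings of four items on one factor in two groups
group_1 = [0.72, 0.68, 0.75, 0.60]
group_2 = [0.70, 0.55, 0.78, 0.41]
phi = tucker_congruence(group_1, group_2)
print(round(phi, 3))
```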

39 Outline
• Locating measures
• Basic psychometric properties
• Rationale for multi-item measures
• Additional measurement considerations in health disparities research
• Steps in selecting measures for your study

40 The Problem
• You are beginning a study
• You know the concepts (variables) of interest
• Question: Which measure of ________ should I use?
    » A popular measure
    » One that a colleague used successfully
    » Create your own

41 Basic Steps in Selecting Appropriate Measures
1. Specify context (research question, target group)
2. Define concept for your study
3. Review potential measures for:
   a) conceptual match to your definition
   b) adequate psychometric properties in your target group
5. Pretest potential measures in your target group
6. Choose best ones based on pretest results
   OR
7. Adapt if necessary to address problems

42 1. Specify Context
A. Research question and how concept fits research
B. Nature of target population
C. Practical constraints

43 1A. Context: How Concept Fits Research Question
• State problem or question being addressed
• Describe purpose of measure
  – Evaluate intervention (outcome)
  – Describe population
  – Covariate
  – Independent variable

44 Outcome Measures of Interventions: Entire Study Depends on These
• Requires special attention to selecting the best measure that …
  – taps content areas that the intervention is likely to change
  – has good variability at baseline, room to improve
  – has excellent reliability and validity
  – is appropriate and acceptable to target population
  – is sensitive to change

45 Main Dependent Variable of Non-intervention Studies
• Pay special attention to selecting the best measure that …
  – taps full content of concept
  – has good variability (variance to predict)
  – has evidence of reliability and validity
  – is appropriate and acceptable to target population

46 1B. Context: Nature of Population
• Describe known characteristics of your target population
  – Age (range, mean)
  – Range of health states
    » chronic conditions, frailty
  – SES (e.g., educational level)
  – % with literacy problems
  – Racial/ethnic and language diversity

47 1C. Context: Practical Constraints
• Time frame for completing study
• Personnel available
  – Research assistants, interviewers
• Other costs
  – Data entry, mailings, phone, coding
• Preferred method of administration
• Acceptable respondent burden

48 Step 2: Define Each Concept For Your Study
• Define each concept from your perspective, taking into account
  – Your study questions
  – Your target population
• For outcome concepts:
  – Describe how the intervention or independent variables might affect it
  – Describe specific types of changes you expect

49 Define Each Concept (cont.)
• Include response dimension in definition (what is it about the concept you are interested in?)
  – Frequency
  – Intensity
  – Proportion of time
  – Whether they have condition/symptom

50 Example: Defining Pain in Your Study
• Context: clinical intervention to minimize stomach pain
• Define exactly how you expect to reduce pain:
  – eliminate pain completely?
  – reduce severity of pain when it occurs?
  – reduce frequency of pain?
  – change quality of pain?
• Concept you aim to improve varies across these

51 Step 3. Review Potential Measures
• Identify candidate measures for all domains or concepts in your framework
• For health outcomes:
  – Generic or condition-specific profiles of multiple domains OR measures of single domains
• Redundancy OK for now
• Do NOT develop your own questions unless it is absolutely necessary

52 Locating Specific Measures
• Reference databases
  – MEDLINE, PubMed, PsycINFO, others
• Compendia of measures
  – Books that compile and review various measures
• Web is fast becoming the best resource
  – Specific measures
  – Web resources from measurement core
• Identify researchers doing work in a field and contact them for their measures

53 Review Potential Measures for:
• Conceptual appropriateness & relevance
  – in your study
  – in target group
• Clear scoring rules
• Psychometric adequacy in target group(s)
• Practicality
• Acceptability
  – To respondents and interviewers

54 Conceptual Relevance
• Example: you are interested in reports of perceived discrimination in the health care setting
• In reviewing measures of discrimination, you find that most are about
  – Discrimination over the lifecourse
  – Discrimination in various life settings (work, school)
• Not relevant for your purpose

55 Psychometric Adequacy for Your Study
• In samples similar to yours:
  – good variability (e.g., no floor or ceiling effects)
  – low percent of missing data
  – good reliability
  – good validity
• As an outcome for your planned intervention
  – responsiveness, sensitivity to change in similar population
  – able to detect expected magnitude of change
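
Responsiveness is often summarized with an effect-size-type index. The sketch below computes the standardized response mean (mean change divided by the standard deviation of change), which is one of several such indices and is not named in the slides; the function name and the before/after scores are hypothetical.

```python
from statistics import mean, stdev

def standardized_response_mean(baseline, followup):
    """Mean change divided by the SD of change, for paired scores."""
    changes = [after - before for before, after in zip(baseline, followup)]
    return mean(changes) / stdev(changes)

# Hypothetical scores for four participants before and after an intervention
baseline = [4, 5, 6, 5]
followup = [6, 6, 9, 7]
srm = standardized_response_mean(baseline, followup)
print(round(srm, 2))
```

Published values of such an index from a population similar to yours can help judge whether a measure is likely to detect the magnitude of change you expect.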

56 Limited Data on Measurement Properties of Many Measures
• Not easy to find this information
• Many studies do not report any psychometric properties
  – Assume the properties from original study carry over

57 Limited Data on Measurement Properties of Many Measures (cont.)
• Especially in diverse populations:
  – Few studies test measures across diverse groups
  – Even when diverse groups are included in research
    » sample sizes usually too small to conduct measurement studies by subgroups

58 Review Measures for Practicality
• Method of administration appropriate for your study
• Scoring rules clearly documented, or computer scoring algorithm available
• Measure available at cost you can afford
• You are allowed to adapt it if necessary
• Costs of administration within study resources

59 Practical Considerations
Once you have decided on the measures, you must think about:
• Obtaining permission
• Method of administration
• Data collection
• Scoring
• Availability of translations if needed

60 Practical: Scoring
• Know ahead of time how you plan to score the items
  – Count of “correct” answers?
  – Sum Likert items into a summated scale?
• Are scoring instructions or computer scoring programs available?
• Can scoring programs be purchased from developers?
• Do you have a scoring codebook?
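
A scoring codebook can be kept in a machine-readable form so that scoring rules are documented and applied consistently. The sketch below shows both scoring approaches mentioned above; the codebook layout, item names, and answer key are all hypothetical and do not come from any published instrument.

```python
# Hypothetical codebook for a two-item summated scale
CODEBOOK = {
    "scale": "energy_fatigue",
    "response_range": (1, 5),
    "items": {
        "tired": {"reverse": False},
        "full_of_energy": {"reverse": True},
    },
}

def score_summated_scale(answers, codebook):
    """Sum Likert-type items, reversing those flagged in the codebook."""
    low, high = codebook["response_range"]
    total = 0
    for name, rule in codebook["items"].items():
        value = answers[name]
        if not low <= value <= high:
            raise ValueError(f"{name}: response {value} is out of range")
        total += (low + high - value) if rule["reverse"] else value
    return total

def count_correct(answers, answer_key):
    """Count how many answers match the key (e.g., for a knowledge test)."""
    return sum(answers.get(item) == correct for item, correct in answer_key.items())

print(score_summated_scale({"tired": 5, "full_of_energy": 1}, CODEBOOK))
print(count_correct({"q1": "a", "q2": "c"}, {"q1": "a", "q2": "b"}))
```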

61 Review Measures for Availability of Translations if Needed
• If you need the questionnaire in another language, are there translations available?
  – Official (published and tested)
  – Unofficial (by some other researcher)

62 Translation Availability
Is the measure available in the language of your target populations?
• Yes
  – Know the method of translation
  – Assess adequacy or quality of translation
• No
  – Perform double translation
  – Use bilingual, bicultural translators

63 Review Measures for Acceptability
• Acceptability is the ease with which a measure can be used in your setting and population
• Acceptability to target population
  – respondent burden (length, time needed), distress
  – burden for sickest, oldest, least educated
  – culturally sensitive
• Acceptability to interviewers
  – interviewer burden
  – do they like administering the questionnaire?
  – amount of training needed

64 Respondent Burden
• Diverse populations may have more difficulty with instruments and take longer to complete them
• Perceived burden
  – is a function of item difficulty, distress due to content, perceived value of survey, expectations of length
  – is as important as actual burden

65 5. Choose Best Measures to Pretest in Your Target Population
• Select best measures for all concepts in your conceptual framework
  – existing instrument in its entirety
  – subscales of relevant domains (e.g., only those that meet your needs)

66 Pretest
• Pretesting is essential for priority measures (e.g., outcomes)
• Pretest is to identify:
  – problems with method of administration
  – unacceptable respondent burden
  – problems with questions or response choices
    » Hard to understand, complex, vague
  – words and phrases that do not mean to the target population what you intended

67 Types of Pretests
• General pretest, small (N=10)
• Cognitive interviewing (N=5-10 each group)
• Large pretest (N=100)
  – test measurement properties prior to major study

68 General Pretest (Small): Debriefing Pretest
• Goal
  – Find out how well subjects do with the procedures
  – Estimate time needed to complete instrument
  – Identify serious problems
• Procedures
  – Subjects answer entire questionnaire
  – At end, debrief
  – Close to true task

69 Debriefing Questions
After administration of the survey, ask respondents:
• Were any questions confusing?
• Which words were hard to understand?
• Which questions were difficult to answer or caused distress?
• Was the questionnaire too long?
• Were the instructions confusing?

70 Problems with General Pretests
Respondents…
• often don’t understand the task
• don’t want to appear as if they didn’t understand
• have a hard time telling you anything was wrong
• find it easier to say everything was fine

71 Pretest Several Measures of Same Concept?
• If you are unsure about which of several measures will be appropriate for your study
  – pilot test all you are considering
  – can use pilot test results to select best one
• Saves time
  – if you test only one measure and it has many problems, you have to repeat the entire process for the next candidate measure

72 Conduct Pretests in All Diverse Groups Being Included in Your Study
• Important to recruit people from each of your target populations
  – Won’t learn anything if you just recruit friends, persons easy to recruit

73 Cognitive Interviewing
• In-depth interviews with individuals, using open-ended probes to assess
  – how items are interpreted
  – adequacy of response choices
• Typically a 1.5-hour interview

74 Cognitive Interviewing Helps You Learn About the 4 Steps in Answering Questions
• Interpret and understand the question
  – as intended by the researchers
• Retrieve the information
  – various schemas used to access memory
• Judgment formation: formulate an answer
  – calculate or judge the correct information
• Edit response: decide what to report
  – is answer embarrassing, socially undesirable?

75 Summary
• Selecting best measures is critical to validity of research
• Very little published information on measurement properties in diverse groups
  – New area of focus and policy attention
  – Raises issues of conceptual and psychometric adequacy and equivalence
• Pretesting is the most important thing you can do

76 Conclusions
• Methods described here are “ideal”
  – Impractical for most researchers
• Apply these methods to your most important measures
  – e.g., outcomes, key independent variables
• Keep learning
  – Good, appropriate measures remain the foundation of excellent research

