Measurement & Data Collection Polit, D.F. & Beck, C.T. (2012) Nursing Research: Generating and Assessing Evidence for Nursing Practice (9th ed.) Philadelphia:


1 Measurement & Data Collection Polit, D.F. & Beck, C.T. (2012) Nursing Research: Generating and Assessing Evidence for Nursing Practice (9th ed.) Philadelphia: Wolters Kluwer, Chapter 14, Measurement & Data Quality. Houser, J. (2015) Nursing Research: Reading, Using, and Creating Evidence (3rd ed.) Sudbury, MA: Jones & Bartlett Learning, Chapter 8, Measurement Strategies.

2 Measurement Determination of the quantity of a characteristic that is present; involves assignment of numbers or some other classification to an individual attribute.

3 Measurement is based on rules about assigning numbers in an unbiased way. Numbers are key for several reasons: –Objective –Standardized, or consistent –Universal language –Statistical tests can be applied Numbers are only useful if they truly represent the characteristic.

4 The Measurement Strategy Determine the most relevant attributes Define attributes in operational terms Select an instrument to reliably capture the attribute Document the reliability and validity of the instrument Develop protocols for data gathering Put quality checks in place

5 Measurement Concepts Conceptual definitions Operational definitions Primary data Secondary data

6 Measurement Concepts Define the research variables: describing the characteristics in a way that is measurable. –Conceptual definitions: describe the concept using other concepts. Spirituality: A multidimensional, pervasive quality of the inner person that is uniquely interpreted by each individual. Spirituality may be characterized by an individual’s beliefs; search for meaning and purpose of life; sense of transcendence; and relationships with self, others, and/or a Supreme Being (Burkhardt, 1989; Dyson et al., 1997; Emblen, 1992; Martsolf & Mickley, 1998). –Operational definitions: the operations that must be performed to represent the concept Spirituality: Score on the Jarel SWB scale

7 Validity of Secondary Data Data that are needed are present in the record The data are in a form that meets operational definitions Data were accurately recorded Multiple entries are consistent Data are interpreted in a common way

8 Levels of Measurement There are four levels (classes) of measurement: –Nominal (assigning numbers to classify characteristics into categories) –Ordinal (ranking objects based on their relative standing on an attribute) –Interval (objects ordered on a scale that has equal distances between points on the scale) –Ratio (equal distances between score units; there is a rational, meaningful zero) A variable’s level of measurement determines what mathematical operations can be performed in a statistical analysis.

9 Nominal Level of Measurement –Lowest level of measurement –Assigning numbers to categories –Number assignment “labels” a category but does not indicate order or magnitude –Examples: Gender has 2 categories –0 = Male, 1 = Female Reason for ER Visit –1 = Cold/flu, 2 = Cardiac, 3 = Trauma –One category is not higher or lower than another –The number assigned is for labeling purposes only

10 Ordinal Level of Measurement –Second level of measurement –Assigns numbers to categories in rank order; HOWEVER, the exact differences between the categories cannot be determined. –Examples: Anxiety – scored from Low to High, where a higher score means more anxiety Level of Education – coded 1–4, where a higher number means more education

11 Interval Level of Measurement –Rank order with a specified distance between measures –“Real” numbers on a scale –Provides more depth to data analysis –Can calculate a mean score –Limitation: Does not have an absolute zero point. Differences between points make sense, but ratios do not. –Example: Temperature. “0” does not mean an absence of temperature. 80 degrees − 70 degrees = 10 degrees makes sense, but 80 degrees is not necessarily twice as hot as 40 degrees.

12 Ratio Level of Measurement –Highest level of measurement –Data can be: Categorized Ranked Distance between points is specified A zero point can be identified (e.g., age, weight, volume) For purposes of data analysis, interval & ratio are treated the same.

13 Types of Data Categorical Data: Nominal (e.g., Blue, Green, Red) and Ordinal (e.g., Low, Med, High) Continuous Data: Interval (e.g., 1 2 3 4 5) and Ratio (e.g., 0 1 2 3 4)

14 Errors of Measurement Obtained Score = True score ± Error –Obtained score: An actual data value for a participant (e.g., anxiety scale score) –True score: The score that would be obtained with an infallible measure –Error: The error of measurement, caused by factors that distort measurement

15 Measurement Error Systematic error and random error may be due to: –Human factors –Instrumentation –Bias –Researcher –Procedural variations

16 Measurement Error (Cont.) The difference between the true score and the measured score –A threat to internal validity –Present in every instrument to some degree Random error: –Does not affect the average score in the data set; not reproducible (random) Systematic error: –More serious because the results look accurate but are not –The incorrect measurement is made consistently, over and over, on all subjects
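The distinction above can be illustrated with a small simulation (made-up numbers, not any instrument from the readings): random error washes out when many obtained scores are averaged, while systematic error shifts every score the same way.

```python
import random

random.seed(42)  # fixed seed so this illustration is reproducible

TRUE_SCORE = 50        # a hypothetical participant's true score
RANDOM_SD = 3.0        # spread of random measurement error
SYSTEMATIC_BIAS = 5.0  # a consistent offset, e.g., a miscalibrated instrument

def obtained(true_score, bias=0.0, sd=RANDOM_SD):
    """Obtained score = true score + systematic error + random error."""
    return true_score + bias + random.gauss(0, sd)

# Random error averages out across many measurements...
random_only = [obtained(TRUE_SCORE) for _ in range(10_000)]
print(round(sum(random_only) / len(random_only), 1))  # close to the true 50

# ...but systematic error does not: the average stays shifted.
biased = [obtained(TRUE_SCORE, bias=SYSTEMATIC_BIAS) for _ in range(10_000)]
print(round(sum(biased) / len(biased), 1))            # close to 55, not 50
```

This is why systematic error is the more serious threat: no amount of repetition reveals it, because every obtained score is biased in the same direction.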

17 Factors That Contribute to Errors of Measurement Situational contaminants –Time of day, temperature Transitory personal factors –Fatigue, hunger, mood Response-set biases –Characteristics of subjects (e.g., social desirability) Administration variations –Changes in how the data are collected (e.g., one person collects before meals, another collects after meals) Item sampling –Which items are included influences outcomes (e.g., how hard are the test items?)

18 Minimizing Error For physical instruments: calibration Create measures closely linked to concepts Use simple, clear procedures Use more than one measurement tool – if two instruments give the same reading, confidence in the measurement increases

19 Psychometric Assessments A psychometric assessment is an evaluation of the quality of a measuring instrument. Key criteria in a psychometric assessment: –Reliability –Validity –Note: Not all instruments (questionnaires, surveys, etc.) used in nursing research have had a psychometric assessment

20 Reliability Internal reliability: Stability within an instrument Item-total correlation: Consistency of each item with the overall scale Inter-rater reliability: Stability between raters Test-retest: Stability over time

21 Three Aspects of Reliability Can Be Evaluated Stability Internal consistency Equivalence

22 Stability The extent to which scores are similar on two separate administrations of an instrument Evaluated by test–retest reliability –Requires participants to complete the same instrument on two occasions –Appropriate for relatively enduring attributes –Often summarized with a coefficient (e.g., a Cohen’s kappa of 0.85)
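As a rough illustration (hypothetical scores, not from the readings), test–retest stability is often summarized by correlating the two administrations; a high correlation suggests the attribute, and the instrument, are stable over time.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two sets of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical anxiety-scale scores for five participants, two weeks apart
time1 = [42, 30, 55, 38, 47]
time2 = [40, 33, 53, 36, 49]
print(round(pearson_r(time1, time2), 2))  # high r -> stable instrument
```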

23 Internal Consistency The extent to which all the items on an instrument are measuring the same unitary attribute Evaluated by administering the instrument on one occasion Appropriate for most multi-item instruments The most widely used approach to assessing reliability Assessed by computing coefficient alpha (Cronbach’s alpha) Alphas ≥ .80 are highly desirable.
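Cronbach’s alpha can be computed directly from item scores via the standard formula, alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores). A minimal sketch with made-up responses (not data from the readings):

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents' item-score rows."""
    k = len(item_scores[0])               # number of items
    items = list(zip(*item_scores))       # one column per item
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var / total_var)

# Five respondents answering a hypothetical 4-item scale (1-5 options)
scores = [
    [4, 4, 5, 4],
    [2, 2, 2, 3],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 5, 4, 4],
]
print(round(cronbach_alpha(scores), 2))  # high alpha -> internally consistent
```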

24 Equivalence The degree of similarity between alternative forms of an instrument or between multiple raters/observers using an instrument Most relevant for structured observations Assessed by comparing agreement between observations or ratings of two or more observers (interobserver/interrater reliability)
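For categorical ratings, interrater equivalence is commonly summarized with Cohen’s kappa, which corrects raw agreement for the agreement expected by chance. A sketch with two hypothetical raters (the wound-assessment scenario is invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)

# Two nurses independently rating ten wounds as healing ("y") or not ("n")
rater_a = ["y", "y", "y", "y", "n", "n", "n", "y", "n", "y"]
rater_b = ["y", "y", "n", "y", "n", "n", "y", "y", "n", "y"]
print(round(cohens_kappa(rater_a, rater_b), 2))  # agreement beyond chance
```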

25 Reliability Principles Low reliability can undermine adequate testing of hypotheses. Reliability estimates vary depending on the procedure used to obtain them. Reliability is lower in homogeneous than heterogeneous samples. Reliability is lower in shorter than longer multi-item scales.

26 Reliability Findings That Should Be Reported At least one test of reliability should be reported Gold standard: –Cronbach’s alpha –Inter-rater –Test-retest An instrument is only as good as its reliability; if it is not reliable, the obtained score may not be the true score See Houser Table 8.4 pg. 202

27 Reliability The consistency and accuracy with which an instrument measures the target attribute Reliability assessments involve computing a reliability coefficient. –Reliability coefficients can range from.00 to 1.00. –Coefficients below.70 are considered unsatisfactory. –Coefficients of.80 or higher are desirable.

28 Validity The degree to which an instrument measures what it is supposed to measure Measurement of the “right” thing Without validity, we could be “reliably”/consistently measuring the wrong thing! Often reliability is reported, but not validity Validity is harder to test! It needs to be established across multiple populations, settings, and situations Correlation coefficients are used to report validity: 0.5 or higher is strong, but 0.2–0.4 may be acceptable

29 Validity (Cont.) Content validity: The content of the instrument reflects the attribute Construct validity: The instrument represents the conceptual issues Criterion-related validity: –Concurrent –Predictive –Discriminant –See Houser Table 8.6 pg. 205

30 Face Validity Refers to whether the instrument looks as though it is an appropriate measure of the construct Based on judgment; no objective criteria for assessment

31 Content Validity The degree to which an instrument has an adequate sample of items for the construct being measured Evaluated by expert evaluation, often via a quantitative measure—the content validity index (CVI)

32 Criterion-Related Validity The degree to which the instrument is related to an external criterion A validity coefficient is calculated by analyzing the relationship between scores on the instrument and the criterion. Types: –Predictive validity: the instrument’s ability to distinguish people whose performance differs on a future criterion Predicts the future (e.g., ATI predicts passage of the NCLEX)

33 Concurrent & Discriminant Validity –Concurrent validity: the instrument’s ability to distinguish individuals who differ on a present criterion Concurrent: Two scales measure the same thing. –Old depression scale/new depression scale –Loneliness/depression –Discriminant validity: able to distinguish between two characteristics (e.g., between two diseases) –Depression scale/happiness scale

34 Construct Validity Concerned with these questions: –What is this instrument really measuring? –Does it adequately measure the construct of interest?

35 Some Methods of Assessing Construct Validity Known-groups technique Testing relationships based on theoretical predictions Factor analysis –Factor 1: Acceptance 10 items “loaded” –I am forgiving of myself for past mistakes –I feel forgiving of those who have harmed me –Factor 2: Inner haven 8 items “loaded” –I am aware of an inner source of comfort –I experience peace of mind

36 Screening/Diagnostic Instruments Sensitivity: the instrument’s ability to correctly identify a “case,” i.e., to diagnose a condition (e.g., stress tests have high sensitivity and produce a large number of false positives) Specificity: the instrument’s ability to correctly identify noncases, that is, to screen out those without the condition (e.g., cardiac catheterization has high specificity and a low number of false positives) Note: Usually one is traded off against the other; a perfect balance is difficult to find.
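Both quantities fall straight out of a 2×2 table of instrument results against a gold-standard diagnosis. A sketch with hypothetical counts (invented for illustration):

```python
def sensitivity(true_pos, false_neg):
    """Proportion of actual cases the instrument correctly identifies."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of noncases the instrument correctly screens out."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening results versus a gold-standard diagnosis
tp, fn = 90, 10  # cases: 90 caught, 10 missed
tn, fp = 70, 30  # noncases: 70 screened out, 30 false alarms
print(sensitivity(tp, fn))  # 0.9 -- few missed cases
print(specificity(tn, fp))  # 0.7 -- but many false positives
```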

37 Qualitative Research Not concerned with a Measurement Strategy The researcher is the measurement instrument Findings are in words not numbers Interested in truth Concerned with sound data collection

38 Effective Data Collection must be… Clear Unbiased Reliable Valid A design that answers the question

39 Major Types of Data Collection Methods Self-reports – Questionnaires, surveys, focus groups Observation – Assessment, observation checklists, competency checklist, field notes Biophysiologic measures – vital signs, lab values, etc. Note: Psychometric instrumentation means the tool (questionnaire/survey) has undergone psychometric testing which is a rigorous series of tests on reliability & validity. Not all nursing tools have undergone such testing

40 Types of data collection Primary data collection –Prospective –Specific to the research study Secondary data collection –Retrospective –Usually collected for a different purpose, ie vital statistics –See Table 8.7 pgs 211- 212

41 Examples of Records, Documents, and Available Data Hospital records (nurses’ shift reports) School records (student absenteeism) Corporate records (health insurance choices) Letters, diaries, minutes of meetings, etc. Photographs

42 Dimensions of Data Collection Approaches Structure –Data collected from all subjects structured the same way Quantifiability –Collected in a way that can be converted to numbers and analyzed statistically (quantitative) Researcher obtrusiveness –How aware are subjects of the researcher? Objectivity –Desirable in quantitative research

43 Steps to Developing a Sound Data Collection Approach 1. Define the purpose for collecting data 2. Select a feasible data collection approach 3. Select a delivery method appropriate for the design and subjects 4. Write protocols for collecting data (realistic, reliable, and thorough)

44 Steps to Developing a Sound Data Collection Approach (Cont.) 5. Design forms and instruments 6. Provide clear instructions and training for staff in data collection methods 7. Develop a plan to manage and analyze data

45 Data Collection Plan Basic decision is use of: Primary: –New data, collected specifically for research purposes, or Secondary –Existing data Records (patient charts) Historical data Existing data set (secondary analysis )

46 Structured Self-Reports (Surveys) Data are collected with a formal instrument. –Interview schedule Questions are prespecified but asked orally. Either face-to-face or by telephone –Questionnaire Questions prespecified in written form, to be self-administered by respondents

47 Advantages of Interviews Compared with Questionnaires Higher response rates with interviews Usually lower costs with questionnaires Interviews are appropriate for more diverse audiences Interviews allow more opportunities to clarify questions or to determine comprehension Interviews allow more opportunity to collect supplementary data through observation, e.g., body language Questionnaires allow for more privacy or anonymity Questionnaires lack interviewer bias

48 Kinds of Questions Knowledge questions Opinion questions Application questions Analysis questions Synthesis questions

49 Survey Considerations Clarity of questions Reading level of subjects Length of survey Analysis method planned Start with the least threatening questions Limit questions to a single concept Provide a well-written cover letter/instructions See Houser Table 8.1 pg. 194

50 Types of Questions in a Structured Instrument Closed-ended (fixed alternative) questions –“Within the past 6 months, were you ever a member of a fitness center or gym?” (yes/no) Open-ended questions –“Why did you decide to join a fitness center or gym?” See Table 8.2 pg. 196 & Table 8.3 pg 198

51 Specific Types of Closed-Ended Questions Dichotomous questions –Yes/no; male/female Multiple-choice questions Forced-choice questions –Which statement most closely represents your view? What happens to me in research class is my own doing What happens to me in research class is Dr. Creel’s fault!

52 Composite Psychosocial Scales Scales—used to make fine quantitative discriminations among people with different attitudes, perceptions, traits Likert scales—summated rating scales Semantic differential scales Guttman scale Visual analog scale (VAS)

53 Likert Scales Consist of several declarative statements (items) expressing viewpoints Responses are on an agree/disagree continuum (usually 5 or 7 response options). Responses to items are summed to compute a total scale score. See Houser Table 8.3 pg 198
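Summing responses into a total scale score can be sketched in a few lines. Reverse-coding negatively worded items first is standard practice for summated rating scales, though it is not mentioned on the slide; the items and responses below are invented for illustration.

```python
def likert_total(responses, reverse_items=(), n_options=5):
    """Sum item responses into a total scale score.

    Items listed in reverse_items (0-based indices) are reverse-coded,
    so that agreement always moves the total in the same direction.
    """
    total = 0
    for i, response in enumerate(responses):
        total += (n_options + 1 - response) if i in reverse_items else response
    return total

# Four 5-point items; the third item is negatively worded, so reverse it
print(likert_total([5, 4, 2, 5], reverse_items={2}))  # 5 + 4 + (6 - 2) + 5 = 18
```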

54 Semantic Differential Scales Require ratings of various concepts Rating scales involve bipolar adjective pairs, with 7-point ratings. Ratings for each dimension are summed to compute a total score for each concept.

55 Example of a Semantic Differential

56 Guttman Scale A set of items or statements on a continuum, ranging from one extreme to another Responses are progressive and cumulative See Houser Table 8.3 pg 198 for an example

57 Visual Analog Scale (VAS) Used to measure subjective experiences (e.g., pain, nausea) Measurements are on a straight line measuring 100 mm End points labeled as extreme limits of sensation

58 Example of Visual Analog Scale

59 Response Set Biases Biases reflecting the tendency of some people to respond to items in characteristic ways, independently of item content Examples: –Social desirability response set – answering in a way that is socially acceptable –Extreme response set – answering at the extremes (e.g., to shock the researcher) –Acquiescence response set (yea-sayers) – answering to please the researcher (agreeing) –Nay-sayers response set – answering to disagree with or antagonize the researcher

60 Evaluation of Self-Reports Strong on directness Allows access to information otherwise not available to researchers But can we be sure participants actually feel or act the way they say they do?

61 Phenomena Amenable to Research Observation Activities and behavior Characteristics and conditions of individuals Skill attainment and performance Verbal and nonverbal communication Environmental characteristics

62 Structured Observations Category systems & checklists –Formal systems for systematically recording the incidence or frequency of prespecified behaviors or events –Systems vary in their exhaustiveness Exhaustive system: All behaviors of a specific type recorded, and each behavior is assigned to one mutually exclusive category Nonexhaustive system: Specific behaviors, but not all behaviors, recorded

63 Observational Rating Scales Ratings are on a descriptive continuum, typically bipolar Ratings can occur: –at specific intervals –upon the occurrence of certain events –after an observational session (global ratings)

64 Observational Sampling Time-sampling—sampling of time intervals for observation Examples: Random sampling of intervals of a given length Systematic sampling of intervals of a given length Event sampling—observation of integral events Example: Events during a cardiac arrest or delivery

65 Evaluation of Observational Methods Excellent method for capturing many clinical phenomena and behaviors Potential problem of reactivity when people are aware that they are being observed Risk of observational biases—factors that can interfere with objective observation

66 Biophysiologic Measures In vivo measurements –Performed directly within or on living organisms (blood pressure measures) In vitro measurements –Performed outside the organism’s body (urinalysis)

67 Biological Measures Considerations The procedure for obtaining the measures should be clear (e.g., obtaining a blood pressure step by step to ensure uniformity) Equipment should be named The same equipment should be used for the same measurement Mention of calibration represents reliability

68 Evaluation of Biophysiologic Measures Strong on accuracy, objectivity, validity, and precision May be cost-effective for nurse researchers But caution may be required for their use, and advanced skills may be needed for interpretation.

69 Critiquing Measurement & Data Collection Labeled: Methods, Measurement, Instruments Report reliability/validity from when the instrument was used in the past and on the population of this study Remember: instruments should be re-evaluated if used in different populations, for a different problem, or in a different setting If a new instrument is used, a pilot study should have been done to test reliability/validity It is usually better to use a proven tool than to try to develop a new instrument

70 Critiquing Measurement & Data Collection (Cont.) Methods: Procedures are specific enough for replication The researcher should identify whether primary/secondary data were used What was collected, how, and by whom – training? Psychometric properties are identified for the instruments used (reliability & validity) If psychometric properties are not identified, the method of instrument development & testing is described

