Slide 1: Measurement and Scaling
Farzin Madjidi, Ed.D.
Pepperdine University Graduate School of Education and Psychology

Slide 2: Variables
- Independent: precedes, influences, or predicts results
- Dependent: affected by or predicted by the independent variable
- Extraneous: affects the dependent variable, but is not controlled or measured; causes error

Slide 3: Variables
- Confounding: an extraneous variable that varies systematically (has a relationship) with the independent variable
- Intervening: an unobservable trait that influences behavior (e.g., the effect of a new intervention on self-esteem may be affected by the motivation level of the subjects)

Slide 4: Variables
- Control: used to eliminate the effect of extraneous variables
- Organismic (also called measured or assigned): characteristics of the subjects that cannot be manipulated

Slide 5: Levels of Measurement
Four levels of measurement:
- Nominal: measures categories
- Ordinal: categories plus rank order
- Interval: equal distance between any two consecutive measures
- Ratio: intervals plus a meaningful zero

Slide 6: Categories of Scales
- Categorical (rating): score without comparison (e.g., 1-to-5 scales)
- Comparative (ranking): score by comparing (e.g., "smartest")
- Preference: subjective ("Which do you prefer?")
- Non-preference: objective ("Which solution is less costly?")

Slide 7: Categories of Scales
- Unidimensional: involves only one aspect of the measurement; measures a single construct
- Multidimensional: involves several aspects of a measurement; uses several dimensions to measure a single construct

Slide 8: Types of Scales
- Likert/summated rating scales
- Semantic differential scales
- Magnitude scaling
- Thurstone scales
- Guttman scales

Slide 9: Likert Scales
- A very popular rating scale
- Measures the feelings/degree of agreement of the respondents
- Ideally 4 to 7 points
- Examples of 5-point response sets:
  – Agreement: SD / D / N (D/NA) / A / SA
  – Satisfaction: SD / D / N (D/NS) / S / SS
  – Quality: VP / P / Average / G / VG

Slide 10: Summative Ratings
- A number of items collectively measure one construct (e.g., job satisfaction)
- A number of items collectively measure one dimension of a construct, and the collection of dimensions measures the construct (e.g., self-esteem)

Slide 11: Summated Likert Scales
- Must contain multiple items
- Each individual item must measure something that has an underlying, quantitative measurement continuum
- There can be no right/wrong answers, as opposed to multiple-choice questions
- Items must be statements to which the respondent assigns a rating
- Cannot be used to measure knowledge or ability, though they can measure familiarity
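The scoring described above can be sketched in a few lines. This is a hypothetical illustration, not taken from the slides: the item names, ratings, and function names are invented, and the sketch assumes a 5-point scale with one negatively phrased item that must be reverse-scored before summing.

```python
# Hypothetical sketch: scoring a 5-point summated Likert scale.
# Item names and responses below are invented for illustration.

def reverse_score(value, points=5):
    """Reverse-score a negatively phrased item (1 -> 5, 5 -> 1)."""
    return points + 1 - value

def summated_score(responses, reversed_items, points=5):
    """Sum item ratings into one construct score, reversing where needed."""
    total = 0
    for item, value in responses.items():
        total += reverse_score(value, points) if item in reversed_items else value
    return total

# One respondent's ratings on a four-item job-satisfaction scale (1 = SD ... 5 = SA)
responses = {"q1": 4, "q2": 5, "q3": 2, "q4": 4}
# q3 is negatively phrased, so its 2 becomes 6 - 2 = 4 before summing
print(summated_score(responses, reversed_items={"q3"}))  # 4 + 5 + 4 + 4 = 17
```

Reverse-scoring keeps every item pointing in the same direction on the underlying continuum, which is what makes the simple sum meaningful.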

Slide 12: Semantic Differential Scales
- Uses a set of scales anchored at their extremes by words of opposite meaning
- Example:
  Dark  ___ ___ ___ ___ ___ Light
  Short ___ ___ ___ ___ ___ Tall
  Evil  ___ ___ ___ ___ ___ Good
- Four to seven categories are ideal

Slide 13: Magnitude Scaling
- Attempts to measure constructs along a numerical, ratio-level scale
  – The respondent is given an item with a pre-assigned numerical value attached to it to establish a "norm"
  – The respondent is asked to rate other items with numerical values as a proportion of the "norm"
  – Very powerful if reliability is established
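The norm-and-proportion idea above can be sketched as follows. This is an invented example, not from the slides: the norm value of 100, the item names, and the ratings are all assumptions for illustration.

```python
# Hypothetical sketch: magnitude scaling against a pre-assigned "norm".
# The reference stimulus is assigned the value 100; respondents rate every
# other item as a number proportional to it (e.g., "2.5 times the norm" = 250).

NORM_VALUE = 100  # assumed pre-assigned value for the reference item

def as_ratio_of_norm(rating, norm=NORM_VALUE):
    """Express a magnitude rating as a proportion of the norm stimulus."""
    return rating / norm

# Invented respondent ratings for three items
ratings = {"reference": 100, "item_a": 250, "item_b": 50}
for item, rating in ratings.items():
    print(item, as_ratio_of_norm(rating))
```

Because the responses are proportions of a common anchor, the resulting values behave like a ratio-level scale: item_a is rated five times as large as item_b.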

Slide 14: Thurstone Scales
- Items are written
- A panel of experts assigns a value from 1 to 11 to each item
- Mean or median scores are calculated for each item
- Statements evenly spread across the scale are selected

Slide 15: Thurstone Scales
- Example: Please check the item that best describes your level of willingness to try new tasks
  – I seldom feel willing to take on new tasks (1.7)
  – I will occasionally try new tasks (3.6)
  – I look forward to new tasks (6.9)
  – I am excited to try new tasks (9.8)
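The construction steps on the previous slide (judges rate each statement 1-11, take the median as the item's scale value, then pick statements spread across the range) can be sketched like this. The judge ratings below are invented for illustration and are not the data behind the values on the slide.

```python
# Hypothetical sketch of Thurstone scale construction: each expert judge
# rates every candidate statement from 1 to 11, and the median rating
# becomes the statement's scale value.
from statistics import median

# Invented ratings from five judges for four candidate statements
judge_ratings = {
    "seldom willing":   [1, 2, 2, 2, 1],
    "occasionally try": [3, 4, 4, 3, 4],
    "look forward":     [7, 6, 7, 7, 8],
    "excited to try":   [10, 9, 10, 10, 11],
}

scale_values = {item: median(ratings) for item, ratings in judge_ratings.items()}

# Sorting by scale value shows whether the chosen items cover the 1-11 range evenly
for item, value in sorted(scale_values.items(), key=lambda kv: kv[1]):
    print(f"{value:5.1f}  {item}")
```

In practice one would also prefer items whose judge ratings have low spread, since high disagreement among judges signals an ambiguous statement.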

Slide 16: Guttman Scales
- Also known as scalograms
- Both the respondents and the items are ranked
- Cutting points are determined (Goodenough-Edwards technique)
- Coefficient of Reproducibility (CR): a measure of goodness of fit between the observed and predicted ideal response patterns
- Keep scales with a CR of 0.90 or higher
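The coefficient of reproducibility can be sketched as one minus the proportion of responses that deviate from the ideal cumulative pattern. This is a simplified illustration with invented data; it counts errors against the ideal pattern implied by each respondent's total score, one common variant of the error-counting rule.

```python
# Hypothetical sketch: coefficient of reproducibility for a Guttman scale.
# Items are ordered easiest -> hardest; in a perfect cumulative pattern a
# respondent with total score k endorses exactly the first k items.

def reproducibility(response_matrix):
    """CR = 1 - errors / total responses, where errors are deviations from
    the ideal pattern implied by each respondent's total score."""
    errors = 0
    total = 0
    for row in response_matrix:
        k = sum(row)                              # respondent's total score
        ideal = [1] * k + [0] * (len(row) - k)    # ideal cumulative pattern
        errors += sum(a != b for a, b in zip(row, ideal))
        total += len(row)
    return 1 - errors / total

# Invented data: rows = respondents, columns = items (1 = endorse)
data = [
    [1, 1, 1, 0],   # perfect cumulative pattern
    [1, 1, 0, 0],   # perfect cumulative pattern
    [1, 0, 1, 0],   # deviates from its ideal pattern [1, 1, 0, 0]
    [1, 0, 0, 0],   # perfect cumulative pattern
]
print(reproducibility(data))  # 1 - 2/16 = 0.875, below the 0.90 cutoff
```

Here the one non-cumulative respondent produces two cell-level errors out of sixteen responses, so this scale would fall just short of the 0.90 criterion.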

Slide 17: Scale Construction
- Define constructs
  – Conceptual/theoretical basis from the literature
  – Are there sub-scales (dimensions) to the scale?
  – Use multiple-item sub-scales
  – Principle of parsimony: the simplest explanation among a number of equally valid explanations must be used

Slide 18: Item Construction
- Agreement items: write declarative statements
  – "The death penalty should be abolished"
  – "I like to listen to classical music"
- Frequency items (how often)
  – "I like to read"
- Evaluation items
  – "How well did your team play?"
  – "How well do the police serve your community?"

Slide 19: Item Writing
- Make items mutually exclusive and collectively exhaustive
- Use both positively and negatively phrased questions
- Avoid colloquialisms, expressions, and jargon
- Avoid using negatives to reverse the wording of an item
  – Don't use: "I am not satisfied with my job"
  – Use: "I hate my job!"
- Be brief, focused, and clear
- Use simple, unbiased questions

Slide 20: Sources of Error
- Social desirability: giving politically correct answers
- Response sets: all-yes or all-no responses
- Acquiescence: telling you what you want to hear
- Personal bias: the respondent wants to send a message

Slide 21: Sources of Error
- Response order
  – Recency: the respondent stops reading once he or she reaches a response he or she likes
  – Primacy: the initial choices are remembered better
  – Fatigue
- Item order
  – Answers to later items may be affected by earlier items (put simple, factual items first)
  – The respondent may not know how to answer earlier questions

Slide 22: Assessing Instruments
Three issues to consider:
- Validity: does the instrument measure what it is supposed to measure?
- Reliability: does it consistently repeat the same measurement?
- Practicality: is the instrument practical to use?

Slide 23: Types of Validity
- Face validity
  – Does the instrument, on its face, appear to measure what it is supposed to measure?
- Content validity
  – The degree to which the content of the items adequately represents the universe of all relevant items under study
  – Generally arrived at through a panel of experts

Slide 24: Types of Validity
- Criterion-related validity
  – The degree to which the predictor adequately captures the relevant aspects of the criterion
  – Uses correlation analysis
  – Concurrent validity: criterion data are available at the same time as the predictor score; requires a high correlation between the two
  – Predictive validity: the criterion is measured after the passage of time; a retrospective look at the validity of the measurement
  – Known-groups

Slide 25: Types of Validity
- Construct validity
  – Measures what accounts for the variance
  – Attempts to identify the underlying constructs
  – Techniques used:
    - Correlation of the proposed test with other existing tests
    - Factor analysis
    - Multitrait-multimethod analysis
  – Convergent validity: calls for high correlation between different measures of the same construct
  – Discriminant validity: calls for low correlation between sub-scales within a construct

Slide 26: Types of Reliability
- Stability
  – Test-retest: the same test is administered twice to the same subjects over a short interval (3 weeks to 6 months)
  – Look for a high correlation between the test and the retest
  – Situational factors must be minimized

Slide 27: Types of Reliability
- Equivalence
  – The degree to which alternative forms of the same measure produce the same or similar results
  – Give parallel forms of the same test to the same group with a short delay to avoid fatigue
  – Look for a high correlation between the scores on the two forms of the test
  – Inter-rater reliability

Slide 28: Types of Reliability
- Internal consistency
  – The degree to which instrument items are homogeneous and reflect the same underlying construct
  – Split-half testing, where the test is split into two halves that contain the same types of questions
  – Cronbach's alpha is used to determine internal consistency; only one administration of the test is required
  – Kuder-Richardson (KR-20) for items with right and wrong answers
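Cronbach's alpha can be computed directly from one administration using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item data below are invented for illustration.

```python
# Illustrative sketch: Cronbach's alpha from a single test administration.
# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, aligned by respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]          # per-respondent totals
    item_variance = sum(pvariance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / pvariance(totals))

# Invented data: 3 items rated by 5 respondents on a 5-point scale
items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 4, 2, 4, 3],
]
print(round(cronbach_alpha(items), 3))  # 0.864 for this invented data
```

KR-20 is the special case of the same formula for dichotomous right/wrong items, where each item's variance reduces to p(1 - p), the proportion correct times the proportion incorrect.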

Slide 29: Practicality
- Economy: is the survey economical?
  – Cost of producing and administering the survey
  – Time requirement
  – Common sense!
- Convenience
  – Adequacy of instructions
  – Ease of administration
- Interpretability: can the measurement be interpreted by others?
  – Scoring keys
  – Evidence of validity and reliability
  – Established norms
