# Measurement and Scaling

## Presentation on theme: "Measurement and Scaling"— Presentation transcript:

Measurement Measurement is the process of observing and recording the observations that are collected as part of a research effort. More formally, it is the assignment of numbers to aspects of objects or events according to one or another rule or convention.

[Diagram: THEORIES concern relationships between CONCEPTS; DATA concern relationships between VARIABLES.]

[Diagram: a concept is specified by its CONCEPTUAL DEFINITION; a variable is specified by its OPERATIONAL DEFINITION.]

CONCEPTUAL DEFINITION Conceptual definitions help us to be explicit about what our theories are talking about. That way, we know we are measuring the right thing.

Conceptual definition: The concept of ………………. is defined as the extent to which …………………. exhibit the characteristic of ………… .

Example 1: What would be a good definition for the following concepts?
- Being politically informed (unit: ……….)
- Being economically developed (unit: ………..)

OPERATIONAL DEFINITION
Example: Ecological Fallacy (Berkeley Gender Bias Case). Evidence of (the concept of) gender bias?

| | Applicants | Admitted |
|---|---|---|
| Men | 8442 | 44% |
| Women | 4321 | 33% |

Operational definitions describe how we convert the concept into the numerical codes of a variable. What could be a simple example of an operational definition for a sex variable?

1. Systematic measurement error occurs when the operational definition fails to match the conceptual definition in a systematic manner. Lack of systematic error is called validity.
2. Random measurement error occurs when temporary or haphazard factors affect the measurement. Lack of this random error is called reliability.
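The two kinds of error can be illustrated with a small simulation. This is only a sketch under invented assumptions: the true weight, the size of the bias, and the noise level are hypothetical numbers chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_WEIGHT = 70.0  # hypothetical true value, in kg
BIAS = 2.0          # systematic error: this scale always reads 2 kg too high
NOISE_SD = 0.5      # random error: haphazard fluctuation on every reading

# Readings from a scale with both systematic and random error.
biased = [TRUE_WEIGHT + BIAS + random.gauss(0, NOISE_SD) for _ in range(10_000)]
# Readings from a scale with random error only.
unbiased = [TRUE_WEIGHT + random.gauss(0, NOISE_SD) for _ in range(10_000)]

# Averaging many readings cancels random error but not systematic error.
mean_error_biased = statistics.mean(biased) - TRUE_WEIGHT
mean_error_unbiased = statistics.mean(unbiased) - TRUE_WEIGHT
spread = statistics.stdev(unbiased)

print(f"mean error with bias:    {mean_error_biased:.2f} kg")
print(f"mean error without bias: {mean_error_unbiased:.2f} kg")
print(f"spread of readings:      {spread:.2f} kg")
```

The mean error of the biased scale stays close to the bias no matter how many readings are averaged (a validity problem), while the spread of individual readings reflects the random component (a reliability problem).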

Isomorphism The sine qua non of measurement is that the numbers assigned to objects reflect the relations among the objects with respect to the aspect being measured. This idea, referred to as isomorphism, means a one-to-one correspondence between elements of two classes. A prime example of isomorphism is that between a map and the geographic region it depicts. There is a one-to-one correspondence between, say, towns and the points used to represent them on the map, such that relations among the points on the map (e.g., distances) reflect relations among the geographic locations they represent. Hence, the great usefulness and convenience of maps.

Measurement consists of mapping a set of objects onto a set of numbers, such that there is isomorphism between the objects measured and the numbers assigned to them. An obvious example is measurement of weight, where each of a set of objects is assigned a number such that the relations among the numbers reflect the relations among the objects with respect to weight (e.g., one object being twice as heavy as another).

An appreciation of the benefits of measurement may be attained when it is contrasted with alternative approaches to the description of or the differentiation among a set of objects with respect to a given aspect. Contrast the limitations, ambiguities, and potential inconsistencies when attempting to describe verbally the weight of a set of objects (e.g., heavy, very heavy, very very heavy, not so heavy, light, lighter, much lighter) with statements about the numerical weights of the same objects.

A great advantage in using measurement is that one may apply the powerful tools of mathematics to the study of phenomena. Operating on sets of numbers that are isomorphic with aspects of sets of objects enables one to arrive at concise and precise statements of regularities, or laws, regarding phenomena to a degree unattainable without the benefits of measurement.

Suppose, for example, that one wants to study and describe the relation between mental ability and achievement. Relying on observations and verbal descriptions, one is limited to unwieldy and potentially ambiguous statements (e.g., people high on mental ability generally manifest greater achievement than those low on mental ability). In contrast, measuring mental ability and achievement of a sample of people, one can calculate an index of the relation between the two variables (e.g., the correlation coefficient), and state the direction and strength of the relation with clarity and conciseness unattainable through verbal descriptions.
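As a rough illustration of such an index, the Pearson correlation coefficient can be computed directly from two lists of scores. The ability and achievement scores below are made up for the example, and `pearson_r` is a helper name introduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for six people.
ability = [95, 100, 105, 110, 120, 130]
achievement = [60, 62, 70, 68, 80, 85]

r = pearson_r(ability, achievement)
print(f"r = {r:.3f}")  # sign gives the direction, magnitude gives the strength
```

A single number between -1 and +1 replaces the verbal statement "people high on mental ability generally manifest greater achievement".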

Moreover, the index of the relation can in turn be used for various purposes (e.g., to determine whether and by how much the relation between mental ability and achievement differs across various racial groups), or, along with other statistics, it may be used to develop an equation to predict achievement from mental ability.

Scales of Measurement Stevens (1951) proposed the following four types of measurement (also referred to as levels of measurement) in ascending order, from the crudest to the most elaborate: nominal, ordinal, interval, and ratio.

Nominal Scale (1) A nominal scale entails the assignment of numbers as labels to objects or classes of objects. Examples… Referring to the nominal level of measurement, Coombs (1953) stated: "This level of measurement is so primitive that it is not always recognized as measurement, but it is a necessary condition for all higher levels of measurement"

Nominal Scale (2) To satisfy the requirements of nominal scaling, subjects have to be classified into a set of mutually exclusive and exhaustive categories. What this means is that each subject is assigned to one category only and that all subjects are classifiable into the categories used. For example, classifying people according to their political party affiliation, each person is classified as a member of one party only, and each person must fit into one of the categories used.
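The two requirements can be checked mechanically. The sketch below assumes a hypothetical coding scheme for party affiliation; the category labels and the function name are invented for illustration.

```python
# Hypothetical nominal codes: the numbers are labels only, with no order implied.
CATEGORIES = {1: "Party A", 2: "Party B", 3: "Independent", 4: "Other or none"}

def is_valid_nominal(assignments, categories):
    """Check that every person falls into exactly one known category.

    assignments maps each person to the list of codes assigned to them.
    Exactly one code per person -> mutually exclusive.
    Every code is a known category -> exhaustive for these people.
    """
    return all(
        len(codes) == 1 and codes[0] in categories
        for codes in assignments.values()
    )

good = {"ann": [1], "bob": [3], "cem": [4]}
overlapping = {"ann": [1, 2], "bob": [3]}   # ann is in two categories
unclassified = {"ann": [1], "bob": []}      # bob fits no category

print(is_valid_nominal(good, CATEGORIES))          # True
print(is_valid_nominal(overlapping, CATEGORIES))   # False
print(is_valid_nominal(unclassified, CATEGORIES))  # False
```

A catch-all category such as "Other or none" is one common way to make a scheme exhaustive.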

Ordinal Scale (1) An ordinal scale entails the assignment of numbers to persons or objects so that they reflect their rank ordering on an attribute in question. If person A, say, is viewed as kinder (or smarter, better looking) than person B, then he or she may be assigned a "2," whereas B may be assigned a "1." The numbers thus assigned do not reflect by how much A exceeds B on the attribute in question but rather the relation "greater than," or "more than," symbolized by >.

Ordinal Scale (2) On an ordinal scale, it must be true that for any pair of objects, A and B, if A is greater than B, then B is not greater than A. This is referred to as an asymmetric, or nonsymmetric, relation. It is, of course, possible for A to be equal to B, reflecting a symmetric relation. Under such circumstances, A and B would be assigned the same number, referred to as a tied rank.

Ordinal Scale (3) For any three objects, A, B, and C on an ordinal scale, it must be true that if A > B, and B > C, then A > C. This is referred to as transitivity. An asymmetric relation is not necessarily transitive. For example, person A may beat person B in a game of chess, and B may beat C. From this, it does not follow that A will beat C.
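Transitivity can be tested directly on a set of pairwise relations. In this sketch the relation is stored as a set of (greater, lesser) pairs; the two example relations are invented, with the chess results forming a cycle.

```python
from itertools import permutations

def is_transitive(relation):
    """Return True if (a, b) and (b, c) in the relation always imply (a, c)."""
    items = {x for pair in relation for x in pair}
    return all(
        (a, c) in relation
        for a, b, c in permutations(items, 3)
        if (a, b) in relation and (b, c) in relation
    )

# "Taller than": A > B, B > C, and therefore A > C.
taller_than = {("A", "B"), ("B", "C"), ("A", "C")}

# "Beat at chess": A beat B, B beat C, but C beat A.
beat_at_chess = {("A", "B"), ("B", "C"), ("C", "A")}

print(is_transitive(taller_than))    # True: can be placed on an ordinal scale
print(is_transitive(beat_at_chess))  # False: no consistent rank order exists
```

Both relations are asymmetric, but only the first supports a rank ordering.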

Because the numbers assigned to objects on an ordinal scale reflect only the relation "greater than," invariance will be maintained under any monotonic transformation of the scale values. A monotonic transformation is one in which the rank ordering of the numbers does not change. Following are examples of monotonic transformations: adding a constant to all the numbers, multiplying the numbers by a positive constant, and, when the numbers are all positive, raising them to a positive power or taking their square root.
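A quick check of this invariance, assuming a small set of positive scores invented for the example (the `ranks` helper is also introduced here and ignores ties for simplicity):

```python
import math

def ranks(values):
    """Rank of each value, 1 = smallest. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

scores = [4.0, 9.0, 1.0, 16.0]  # hypothetical positive scale values

transforms = {
    "add a constant": lambda x: x + 10,
    "multiply by a positive constant": lambda x: 3 * x,
    "square": lambda x: x ** 2,
    "square root": math.sqrt,
}

original = ranks(scores)
preserved = {
    name: ranks([f(x) for x in scores]) == original
    for name, f in transforms.items()
}

# Counterexample: multiplying by -1 reverses the order, so it is not admissible.
negation_preserves = ranks([-x for x in scores]) == original

print(original)            # [2, 3, 1, 4]
print(preserved)           # every entry is True
print(negation_preserves)  # False
```

Squaring preserves the order here only because all the scores are positive; with a mix of negative and positive values it would not.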

Ordinal Scale (4) Limitations of an ordinal scale, as well as the potential for misinterpretation of the scale values, are illustrated through two examples.

#1 Assume that two groups, each consisting of eight people, were rank ordered with respect to height. The results are depicted in the figure, where the letters above the line refer to the people, and the numbers below the line refer to their rank ordering on height; (a) and (b) refer to the two groups. Notice that the people are not evenly distributed on the height continuum. What is the problem here?

#2 It is obvious that no meaningful comparisons can be made between ranks assigned in separate groups. The fact that two people have the same rank in distinct groups obviously does not mean that they are of the same height. It is possible, for example, for the person ranked as the shortest in group (a) to be taller than the person ranked as the tallest in group (b).
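Both limitations show up in a toy version of the height example. The heights below are invented, and the groups are cut to three people each to keep the sketch short.

```python
def rank_by_height(heights):
    """Map each person to a rank, 1 = shortest. Assumes no tied heights."""
    ordered = sorted(heights, key=heights.get)
    return {person: i + 1 for i, person in enumerate(ordered)}

# Hypothetical heights in cm for two separate groups.
group_a = {"A1": 180, "A2": 185, "A3": 199}
group_b = {"B1": 150, "B2": 152, "B3": 175}

ranks_a = rank_by_height(group_a)
ranks_b = rank_by_height(group_b)

# Limitation 1: equal rank differences hide unequal height differences.
gap_1_to_2 = group_a["A2"] - group_a["A1"]  # 5 cm between ranks 1 and 2
gap_2_to_3 = group_a["A3"] - group_a["A2"]  # 14 cm between ranks 2 and 3

# Limitation 2: ranks are not comparable across groups.
shortest_in_a = min(group_a, key=group_a.get)
tallest_in_b = max(group_b, key=group_b.get)

print(ranks_a, ranks_b)
print(gap_1_to_2, gap_2_to_3)
print(group_a[shortest_in_a] > group_b[tallest_in_b])  # True
```

The person ranked 1 in group (a) is taller than the person ranked 3 in group (b), even though the ranks suggest the opposite.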

Interval Scale (1) An interval level of measurement is achieved when numbers are assigned to objects so that, in addition to satisfying the requirements of the ordinal level, differences between the numbers may be meaningfully interpreted with respect to the attribute being measured. In other words, on an interval scale, constant units of measurement are used, affording meaningful expressions of differences between objects, comparisons of such differences, as well as the conversion of differences into ratios.

Interval Scale (2) The example of an interval scale most often given is that of a measure of temperature. On a Celsius scale, for example, 60° is not merely more than 50°, but it is 10° more. Because the units on the scale are constant, it is also true that the difference between 60° and 50° is equal to the difference between, say, 90° and 80°, or the difference between 60° and 50° is twice that between 37° and 32°. An interval scale is invariant under positive linear (affine) transformations: $X' = a + bX$, with $b > 0$. Consider converting Celsius to Fahrenheit, where $a = 32$ and $b = 1.8$.

Interval Scale (3) Note carefully that although it is meaningful to express differences in scores on an interval scale as ratios, it is not meaningful to do so for the scores themselves. The reason is that the zero point on an interval scale is arbitrary, hence the admissibility of adding a constant to scores on such a scale.
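The Celsius to Fahrenheit conversion makes the point concrete. In this sketch, the ratio of two differences survives the transformation, while the ratio of two scores does not; the temperatures are taken from the example above, plus an arbitrary pair (60° and 30°) for the score ratio.

```python
import math

def c_to_f(c):
    """Affine transformation X' = a + bX with a = 32 and b = 1.8."""
    return 32 + 1.8 * c

# Ratio of differences: (60 - 50) vs (37 - 32) in both scales.
diff_ratio_c = (60 - 50) / (37 - 32)
diff_ratio_f = (c_to_f(60) - c_to_f(50)) / (c_to_f(37) - c_to_f(32))

# Ratio of scores: 60 vs 30 in both scales.
score_ratio_c = 60 / 30
score_ratio_f = c_to_f(60) / c_to_f(30)

print(diff_ratio_c, round(diff_ratio_f, 6))    # 2.0 in both scales
print(score_ratio_c, round(score_ratio_f, 3))  # 2.0 in Celsius, about 1.628 in Fahrenheit
print(math.isclose(diff_ratio_c, diff_ratio_f))
```

"Twice as hot" therefore depends on where the arbitrary zero sits, whereas "twice as large a difference" does not.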

Interval Scale (4) Turning to examples of sociobehavioral measures, consider the following: (a) On an interval scale of intelligence, individual A has a score of 120 and individual B has a score of 60. The zero point on the intelligence scale is necessarily arbitrary (how would one define zero intelligence, in an absolute sense? As being dead?); thus, it is erroneous to conclude that person A is twice as intelligent as B.

Interval Scale (5) (b) On an interval scale of achievement in social studies, person A answered correctly 60 multiple-choice items and person B answered correctly 15 such items. Although it is true that person A answered correctly four times as many items as B, this does not mean that A knows four times as much in social studies as does B.

Ratio Scale A ratio level of measurement is achieved when, in addition to the requirements of the interval level, a true, or absolute, zero point can be determined. That is, zero means no amount of the attribute measured. The term ratio refers to the fact that, on such a scale, the ratio of any two scores is independent of the units of the scale. Ratio scales are not often encountered in sociobehavioral sciences, although they are not unheard of. The measurement of reaction time (e.g., on perceptual-motor tasks) is an example of a ratio scale used in psychological research.
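Unit independence is easy to verify for reaction times. The two reaction times below are hypothetical.

```python
import math

# Hypothetical reaction times in milliseconds.
rt_ms = {"A": 450.0, "B": 300.0}

# The same measurements expressed in seconds (multiplication by a positive constant).
rt_s = {person: ms / 1000 for person, ms in rt_ms.items()}

ratio_ms = rt_ms["A"] / rt_ms["B"]
ratio_s = rt_s["A"] / rt_s["B"]

print(ratio_ms, ratio_s)  # A takes 1.5 times as long as B in either unit
print(math.isclose(ratio_ms, ratio_s))
```

Because zero milliseconds and zero seconds both mean "no time elapsed", the statement "A took 1.5 times as long as B" holds in any unit.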

Levels of Measurement and Method of Analysis
The literature on the relation between levels of measurement and statistics is extensive, with some authors strongly defending and expounding Stevens's position, and others rejecting it. Stevens argued that means and standard deviations should not be calculated for measures that are on an ordinal level. (What do you think?)

The major source of the controversy regarding measurement and statistics in sociobehavioral research is whether most of the measures used are on an ordinal or an interval level. The pragmatists (e.g., Borgatta, 1968; Borgatta & Bohrnstedt, 1981; Gardner, 1975; Labovitz, 1967, 1972; Nunnally, 1978) argued cogently that, although most measures used in sociobehavioral research are not clearly on an interval level, they are not strictly on an ordinal level either. In other words, most of the measures used are not limited to signifying "more than," or "less than," as an ordinal scale is, but also signify degrees of differences, although these may not be expressible in equal interval units. Prime examples are summated measures of achievement, mental ability, attitudes, and the like. Such measures occupy an intermediate, "grey" (Gardner, 1975, p. 53) region between an interval and an ordinal level, and to treat them as if they were on an ordinal level may lead to a serious loss of information.

From Ordinal to Interval: Unidimensional Scaling
Likert Scale Rensis Likert was an American psychologist. What became known as the Likert method of attitude measurement was formulated in his doctoral thesis, and an abridged version appeared in a 1932 article in the Archives of Psychology. At the time, many psychologists believed that their work should be confined to the study of observable behaviour, and rejected the notion that unobservable (or ‘latent’) phenomena like attitudes could be measured. Like his contemporary, Louis Thurstone, Likert disagreed.

They argued that attitudes vary along a dimension from negative to positive, just as heights vary along a dimension from short to tall, or wealth varies from poor to rich. For Likert, the key to successful attitude measurement was to convey this underlying dimension to survey respondents, so that they could then choose the response option that best reflects their position on that dimension.

Research suggests that data from Likert items (and those with similar rating scales) become significantly less accurate when the number of scale points falls below five or rises above seven. The standard practice, again following Likert's original example, is to include a neutral midpoint. While Likert labelled this point as 'Undecided', the more common version is now 'Neither agree nor disagree'. The purpose of this option is evidently to avoid forcing respondents into expressing agreement or disagreement when they may lack such a clear opinion. Forcing a choice might not only annoy respondents but also harm data quality.

Generate Likert Items (lots of them)
Have a group of judges rate the items. Notice that, as in other scaling methods, the judges are not telling you what they believe; they are judging how favorable each item is with respect to the construct of interest. Ex: If the focus is to measure attitudes that people might have towards persons with AIDS, then you want judges to rate the "favorableness" of each statement in terms of an attitude towards people with AIDS, where 1 = "extremely unfavorable attitude towards people with AIDS" and 5 = "extremely favorable attitude towards people with AIDS." Like:
- People with AIDS deserve what they got (1)
- If you have AIDS you can still lead a normal life (4)
- People with AIDS should be treated just like everybody else (5)
and so on…

Select the items for the scale
Throw out any items that have a low correlation with the total (summed) score across all items. In most statistics packages it is relatively easy to compute this type of Item-Total correlation. First, you create a new variable which is the sum of all of the individual items for each respondent. Then, you include this variable in the correlation matrix computation (if you include it as the last variable in the list, the resulting Item-Total correlations will all be in the last line of the correlation matrix and will be easy to spot). How low should the correlation be for you to throw out the item? There is no fixed rule here; you might eliminate all items with a correlation with the total score less than .6, for example. (This procedure targets internal consistency, which is commonly summarized by Cronbach's alpha.)
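The Item-Total procedure can be sketched in a few lines. The response matrix is invented (six respondents, four items rated 1 to 5), the .6 cutoff is the example value from the text rather than a fixed rule, and the total includes the item being evaluated, as the text describes.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical data: rows are respondents, columns are items.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 3],
    [2, 1, 2, 4],
    [1, 2, 1, 3],
    [3, 3, 3, 2],
    [5, 5, 4, 3],
]

# Step 1: the new variable, each respondent's summed score across all items.
totals = [sum(row) for row in responses]

# Step 2: correlate every item with the total.
n_items = len(responses[0])
item_total = [
    pearson_r([row[j] for row in responses], totals) for j in range(n_items)
]

# Step 3: keep items at or above the (arbitrary) cutoff.
THRESHOLD = 0.6
kept = [j for j, r in enumerate(item_total) if r >= THRESHOLD]

for j, r in enumerate(item_total):
    print(f"item {j}: r = {r:.2f}")
print("kept items:", kept)
```

In this made-up data the last item correlates negatively with the total and would be dropped, while the first three are retained.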

Select the items for the scale
For each item, get the average rating for the top quarter of judges and the bottom quarter. Then, do a t-test of the differences between the mean value for the item for the top and bottom quarter judges. Higher t-values mean that there is a greater difference between the highest and lowest judges. In more practical terms, items with higher t-values are better discriminators, so you want to keep these items. In the end, you will have to use your judgment about which items are most sensibly retained. You want a relatively small number of items on your final scale (e.g., 10-15) and you want them to have high Item-Total correlations and high discrimination (e.g., high t-values).
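One way to sketch the discrimination check is shown below. Several choices here are assumptions rather than part of the described method: the top and bottom quarters are formed by ranking on the total score, the t statistic is Welch's unequal-variance version, and the ratings matrix is invented.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's t statistic for the difference between two group means."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / se

def discrimination_t(ratings, item):
    """t statistic comparing one item's ratings in the top and bottom quarters."""
    ranked = sorted(ratings, key=sum)  # lowest total score first
    q = len(ranked) // 4
    bottom = [row[item] for row in ranked[:q]]
    top = [row[item] for row in ranked[-q:]]
    return welch_t(top, bottom)

# Hypothetical data: 12 rows of ratings on 3 items.
ratings = [
    [5, 5, 3], [5, 4, 2], [4, 5, 3], [4, 4, 2],
    [4, 3, 2], [3, 3, 3], [3, 3, 2], [2, 3, 3],
    [2, 2, 3], [2, 1, 3], [1, 2, 2], [1, 1, 3],
]

t_values = [discrimination_t(ratings, j) for j in range(3)]
for j, t in enumerate(t_values):
    print(f"item {j}: t = {t:.2f}")
```

Items 0 and 1 separate the high and low groups sharply and would be kept; item 2 has the same mean in both groups (t = 0) and does not discriminate. With real data, a statistics package would also report the p-value, which this sketch omits.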

DO NOT FORGET! Likert scales are ‘summated’ scales, so called because a respondent’s answers on each question are summed to give their overall score on the attitude or value.