Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Measurement of Pollution Fred Roberts, DIMACS, Rutgers University.

Similar presentations


Presentation on theme: "1 Measurement of Pollution Fred Roberts, DIMACS, Rutgers University."— Presentation transcript:

1 1 Measurement of Pollution Fred Roberts, DIMACS, Rutgers University

2 2 Thank you, Bill Lucas, for All You Have Done for So Many of Us

3 3 Some Questions We Will Ask There are many pollutants in the air. Is it possible to find one combined index of air pollution that takes into account all of them?

4 4 Some Questions We Will Ask Is an airplane louder than a motorcycle? Is it noiser? What is the difference?

5 5 Some Questions We Will Ask Given two devices to measure changes in water pollution level, which one does a better job?

6 6 MEASUREMENT We will observe that all of these questions have something to do with measurement. We will provide an introduction to the theory of measurement, and in particular the theory of scale types. We will introduce the theory of meaningful statements. We will apply the theory to measurement of air, water, and noise pollution.

7 7 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

8 8 MEASUREMENT Measurement has something to do with numbers. Our approach: “Representational theory of measurement” Assign numbers to “objects” being measured in such a way that certain empirical relations are “preserved.” In measurement of temperature, we preserve a relation “warmer than.” In measurement of mass, we preserve a relation “heavier than.”

9 9 MEASUREMENT A: Set of Objects R: Binary relation on A aRb  a is “warmer than” b aRb  a is “heavier than” b f: A   aRb  f(a) > f(b) R could be preference. Then f is a utility function (ordinal utility function). R could be “louder than.” Then f is a measure of loudness.

10 10 MEASUREMENT A: Set of Objects R: Binary relation on A aRb  a is “warmer than” b aRb  a is “heavier than” b f: A   aRb  f(a) > f(b) With mass, there is more going on. There is an operation of combination of objects and mass is additive. a  b means a combined with b. f(a  b) = f(a) + f(b).

11 11 Homomorphisms Relational System: = (A, R 1,R 2,…,R p,  1,  2,…,  q ) A = set R i = relation on A (not necessarily binary)  j = operation on A Homomorphism from into = (B, R 1 *,R 2 *,…,R p *,  1 *,  2 *,…,  q *) is a function f:A  B such that (a 1,a 2,…,a )  R i  (f(a 1 ),f(a 2 ),…,f(a ))  R i * f(a  j b) = f(a)  j *f(b). m i m i ABA

12 12 Homomorphisms (This only makes sense if R i and R i * are both m i -ary.) Examples of Homomorphisms Example 1: Let a mod 3 be the number b in {0,1,2} such that a  b (mod 3). Let A = {0,1,2,…,26} and aRb  a mod 3 > b mod 3. Then f(a) = a mod 3 is a homomorphism from (A,R) into ( ,>). f(11) = 2, f(6) = 0,...

13 13 Homomorphisms Example 2: A = {a,b,c,d} R = {(a,b),(b,c),(a,c),(a,d),(b,d),(c,d)} f(a) = 4, f(b) = 3, f(c) = 2, f(d) = 1. f is a homomorphism from (A, R) into ( ,>) Second homomorphism: g(a) = 10, g(b) = 4, g(c) = 2, g(d) = 0.

14 14 Homomorphisms Example 3: A = {a,b,c}, R = {(a,b),(b,c),(c,a)} There is no homomorphism f:(A,R)  ( ,>) aRb  f(a) > f(b) bRc  f(b) > f(c) Therefore, f(a) > f(c). But cRa  f(c) > f(a).

15 15 Homomorphisms Example 4: A = {0,1,2,…}, B = {0,2,4,…} A homomorphism f:(A,>,+)  (B,>,+): f(a) = 2a a > b  f(a) > f(b) f(a + b) = f(a) + f(b). Second homomorphism: g(a) = 4a: A homomorphism does not have to be onto.

16 16 Homomorphisms Example 5: A homomorphism f:( ,>,+)  (  +,>,  ) f(a) = e a a > b  f(a) > f(b) f(a+b) = e a+b = e a  e b = f(a)  f(b)

17 17 Homomorphisms Temperature Measurement: homomorphism (A,R)  ( ,>). Mass Measurement: homomorphism (A,R,  )  ( ,>,+). Since negative numbers do not arise in the case of mass, we can think of the homomorphism as being into (  +,>,+). The Two Problems of Representational Measurement Theory Representation Problem: Find conditions on (necessary) and sufficient for the existence of a homomorphism from into. Subproblem: Find an algorithm for obtaining the homomorphism AAB

18 18 Homomorphisms If f is a homomorphism from into, we call (,,f) a scale. A representation theorem is a solution to the representation problem. Uniqueness Problem: Given a homomorphism from into, how unique is it? A uniqueness theorem is a solution to the uniqueness problem. AABBAB

19 19 Homomorphisms A representation theorem gives conditions under which measurement can take place. A uniqueness theorem is used to classify scales and to determine what assertions arising from scales are meaningful in a sense we shall make precise. We will concentrate on uniqueness and meaningfulness. We will return to representation later (time permitting).

20 20 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

21 21 The Theory of Uniqueness Admissible Transformations An admissible transformation sends one acceptable scale into another. Centigrade  Fahrenheit Kilograms  Pounds In most cases one can think of an admissible transformation as defined on the range of a homomorphism. Suppose f is a homomorphism from into.  :f(A)  B is called an admissible transformation of f if  f is again a homomorphism from into. BBAA

22 22 The Theory of Uniqueness Centigrade  Fahrenheit:  (x) = (9/5)x + 32 Kilograms  Pounds:  (x) = 2.2x Example: ( ,>)  ( ,>) by f(x) = 2x. It is a homomorphism: a > b  f(a) > f(b) Consider  :f(  )   by  (x) = 2x+10. j is admissible: j  f(x) = 4f(x) + 10. a > b   f(a) >  f(b)

23 23 The Theory of Uniqueness Can every homomorphism g be obtained from a homomorphism f by an admissible transformation of scale? No. (Semiorders) A = {0,1/2,10}, aRb  a > b + 1 B = {0,1/2,10}, aSb  a > b + 1 Two homomorphisms (A,R) into (B,S): f(0) = f(1/2) = 0, f(10) = 10 g(0) = 0, g(1/2) = 1/2, g(10) = 10. There is no  :f(A)  B such that g =  f

24 24 The Theory of Uniqueness A homomorphism f is called irregular if there is a homomorphism g not attainable from f by an admissible transformation. Irregular scales are unusual and annoying. We shall assume that all scales are regular. Most common scales of measurement are. In the case where all scales are regular, we can develop the theory of uniqueness by studying admissible transformations. A classification of scales is obtained by studying the class of admissible transformations associated with the scale. This defines the scale type. (S.S. Stevens)

25 25 Some Common Scale Types Class of Adm. Transfs.Scale TypeExample  (x) =  x,  > 0ratioMass Temp. (Kelvin) Time (intervals) Loudness (sones)? Brightness (brils)? ______________________________________________  (x) =  x+ ,  > 0intervalTemp(F,C) Time (calendar) IQ tests (standard scores)?

26 26 Some Common Scale Types Class of Adm. Transfs.Scale TypeExample x  y   (x)   (y)  strictly increasingordinalPreference? Hardness Grades of leather, wool, etc. IQ tests (raw scores)? _________________________________________  (x) = xabsoluteCounting

27 27 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

28 28 Meaningful Statements In measurement theory, we speak of a statement as being meaningful if its truth or falsity is not an artifact of the particular scale values used. Assuming all scales involved are regular, one can use the following definition (due to Suppes 1959 and Suppes and Zinnes 1963). Definition: A statement involving numerical scales is meaningful if its truth or falsity is unchanged after any (or all) of the scales is transformed (independently?) by an admissible transformation.

29 29 Meaningful Statements If some scales are irregular, we have to modify the definition: Alternate Definition: A statement involving numerical scales is meaningful if its truth or falsity is unchanged after any (or all) of the scales is (independently?) replaced by another acceptable scale. We shall usually disregard the complication. (But it does arise in practical examples, for example those involving preference judgments or judgments “louder than” under the semiorder model.) Even this alternative definition is not always applicable in some situations. However, it is a useful definition for a lot of situations and we shall adopt it..

30 30 Meaningful Statements “This talk will be three times as long as the next talk.” Is this meaningful?

31 31 Meaningful Statements “This talk will be three times as long as the next talk.” Is this meaningful? I hope not!

32 32 Meaningful Statements “This talk will be three times as long as the next talk.” Is this meaningful? Me too

33 33 Meaningful Statements “This talk will be three times as long as the next talk.” Is this meaningful? We have a ratio scale (time intervals). (1)f(a) = 3f(b). This is meaningful if f is a ratio scale. For, an admissible transformation is  (x) =  x,  > 0. We want (1) to hold iff (2) (  f)(a) = 3(  f)(b) But (2) becomes (3)  f(a) = 3  f(b) (1)  (3) since  > 0.

34 34 Meaningful Statements “The high temperature today was three times the high temperature yesterday.” Is this meaningful?

35 35 Meaningful Statements “The high temperature today was three times the high temperature yesterday.” Meaningless. It could be true with Fahrenheit and false with Centigrade, or vice versa.

36 36 Meaningful Statements In general: For ratio scales, it is meaningful to compare ratios: f(a)/f(b) > f(c)/f(d) For interval scales, it is meaningful to compare intervals: f(a) - f(b) > f(c) - f(d) For ordinal scales, it is meaningful to compare size: f(a) > f(b)

37 37 Meaningful Statements “I weigh 1000 times what the Statue of Liberty weighs.” Is this meaningful?

38 38 Meaningful Statements “I weigh 1000 times what the Statue of Liberty weighs.” Meaningful. It involves ratio scales. It is false no matter what the unit. Meaningfulness is different from truth. It has to do with what kinds of assertions it makes sense to make, which assertions are not accidents of the particular choice of scale (units, zero points) in use.

39 39 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

40 40 Average Loudness Study two groups of machines. f(a) is the loudness of machine a. Data suggests that the average loudness of machines in the first group is better than the average loudness of machines in the second group. a 1, a 2, …, a n machines in first group b 1, b 2, …, b m machines in second group. (1) We are comparing arithmetic means. 1 n n X i = 1 f ( a i ) > 1 m m X i = 1 f ( b i )

41 41 Average Loudness Statement (1) is meaningful iff for all admissible transformations of scale , (1) holds iff (2) Some argue that loudness (sones) define a ratio scale. (More on this later.) Thus,  (x) =  x,  > 0, so (2) becomes (3) Then  > 0 implies (1)  (3). Hence, (1) is meaningful. 1 n n X i = 1 ' ± f ( a i ) > 1 m m X i = 1 ' ± f ( b i ) 1 n n X i = 1 ® f ( a i ) > 1 m m X i = 1 ® f ( b i )

42 42 Average Loudness Note: (1) is still meaningful if f is an interval scale. For example, we could be comparing temperatures f(a). Here,  (x) =  x + ,  > 0. Then (2) becomes (4) This readily reduces to (1). However, (1) is meaningless if f is just an ordinal scale. 1 n n X i = 1 [ ® f ( a i ) + ¯ ] > 1 m m X i = 1 [ ® f ( b i ) + ¯ ]

43 43 Average Loudness To show that comparison of arithmetic means can be meaningless for ordinal scales, suppose we are asking experts for a subjective judgment of loudness. Suppose f(a) is measured on an ordinal scale, e.g., 5- point scale: 5=extremely loud, 4=very loud, 3=loud, 2=slightly loud, 1=quiet. In such a scale, the numbers may not mean anything; only their order matters. Group 1: 5, 3, 1 average 3 Group 2: 4, 4, 2 average 3.33 Conclude: average loudness of group 2 machines is higher.

44 44 Average Loudness Suppose f(a) is measured on an ordinal scale, e.g., 5- point scale: 5=extremely loud, 4=very loud, 3=loud, 2=slightly loud, 1=quiet. In such a scale, the numbers may not mean anything; only their order matters. Group 1: 5, 3, 1 average 3 Group 2: 4, 4, 2 average 3.33 (greater) Admissible transformation: 5  100, 4  75, 3  65, 2  40, 1  30 New scale conveys the same information. New scores: Group 1: 100, 65, 30 average 65 Group 2: 75, 75, 40 average 63.33 Conclude: average loudness of group 1 machines is higher..

45 45 Average Loudness Thus, comparison of arithmetic means can be meaningless for ordinal data. Of course, you may argue that in the 5-point scale, at least equal spacing between scale values is an inherent property of the scale. In that case, the scale is not ordinal and this example does not apply. Note: Comparing medians is meaningful with ordinal scales: To say that one group has a higher median than another group is preserved under admissible transformations..

46 46 Average Loudness Suppose each of n individuals is asked to rate each of a collection of alternative machines as to their relative loudness. Or we rate alternatives on different criteria or against different benchmarks. (Similar results with performance ratings, importance ratings, etc.) Let f i (a) be the rating of alternative a by individual i (under criterion i). Is it meaningful to assert that the average rating of alternative a is higher than the average rating of alternative b?

47 47 Average Loudness Let f i (a) be the rating of alternative a by individual i (under criterion i). Is it meaningful to assert that the average rating of alternative a is higher than the average rating of alternative b? A similar question arises in performance ratings, brightness ratings, confidence ratings, etc. (1) 1 n n X i = 1 f i ( a ) > 1 n n X i = 1 f i ( b )

48 48 Average Loudness If each f i is a ratio scale, then we consider for  > 0, (2) Clearly, (1)  (2), so (1) is meaningful. Problem: f 1, f 2, …, f n might have independent units. In this case, we want to allow independent admissible transformations of the f i. Thus, we must consider (3) It is easy to see that there are  i so that (1) holds and (3) fails. Thus, (1) is meaningless.. 1 n n X i = 1 ® f i ( a ) > 1 n n X i = 1 ® f i ( b ) 1 n n X i = 1 ® i f i ( a ) > 1 n n X i = 1 ® i f i ( b )

49 49 Average Loudness Motivation for considering different  i : n = 2, f 1 (a) = weight of a, f 2 (a) = height of a. Then (1) says that the average of a's weight and height is greater than the average of b's weight and height. This could be true with one combination of weight and height scales and false with another.

50 50 Average Loudness Motivation for considering different  i : n = 2, f 1 (a) = weight of a, f 2 (a) = height of a. Then (1) says that the average of a's weight and height is greater than the average of b's weight and height. This could be true with one combination of weight and height scales and false with another. Conclusion: Be careful when comparing arithmetic mean ratings.

51 51 Average Loudness In this context, it is safer to compare geometric means (Dalkey). all  i > 0. Thus, if each f i is a ratio scale, if individuals can change loudness rating scales (performance rating scales, importance rating scales) independently, then comparison of geometric means is meaningful while comparison of arithmetic means is not.. n p ¦ n i = 1 f i ( a ) > n p ¦ n i = 1 f i ( b ) $ n p ¦ n i = 1 ® i f i ( a ) > n p ¦ n i = 1 ® i f i ( b )

52 52 Application of this Idea In a study of air pollution and related energy use in San Diego, a panel of experts each estimated the relative importance of variables relevant to energy demand using the magnitude estimation procedure. Roberts (1972, 1973). If magnitude estimation leads to a ratio scale -- Stevens presumes this -- then comparison of geometric mean importance ratings is meaningful. However, comparison of arithmetic means may not be. Geometric means were used..

53 53 Magnitude Estimation by One Expert of Relative Importance for Energy Demand of Variables Related to Commuter Bus Transportation in a Given Region VariableRel. Import. Rating 1. No. bus passenger mi. annually 80 2. No. trips annually100 3. No. miles of bus routes 50 4. No. miles special bus lanes 50 5. Average time home to office 70 6. Average distance home to office 65 7. Average speed 10 8. Average no. passengers per bus 20 9. Distance to bus stop from home 50 10. No. buses in the region 20 11. No. stops, home to office 20.

54 54 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

55 55 MEASUREMENT OF AIR POLLUTION

56 56 MEASUREMENT OF AIR POLLUTION Various pollutants are present in the air: Carbon monoxide (CO), hydrocarbons (HC), nitrogen oxides (NOX), sulfur oxides (SOX), particulate matter (PM). Also damaging: Products of chemical reactions among pollutants. E.g.: Oxidants such as ozone produced by HC and NOX reacting in presence of sunlight. Some pollutants are more serious in presence of others, e.g., SOX are more harmful in presence of PM. Can we measure pollution with one overall measure?

57 57 To compare pollution control policies, need to compare effects of different pollutants. We might allow increase of some pollutants to achieve decrease of others. One single measure could give indication of how bad pollution level is and might help us determine if we have made progress. Combining Weight of Pollutants: Measure total weight of emissions of pollutant i over fixed period of time and sum over i. e(i,t,k) = total weight of emissions of pollutant i (per cubic meter) over tth time period and due to kth source or measured in kth location. MEASUREMENT OF AIR POLLUTION A ( t ; k ) = X i e ( i ; t ; k )

58 58 Early uses of this simple index A in the early 1970s led to the conclusions: (A) Transportation is the largest source of air pollution, with stationary fuel combustion (especially by electric power plants) second largest. (B) Transportation accounts for over 50% of all air pollution. (C) CO accounts for over half of all emitted air pollution. Are these meaningful conclusions? MEASUREMENT OF AIR POLLUTION

59 59 Early uses of this simple index A in the early 1970s led to the conclusions: (A) Transportation is the largest source of air pollution, with stationary fuel combustion (especially by electric power plants) second largest. Are these meaningful conclusions? MEASUREMENT OF AIR POLLUTION A ( t ; k ) > A ( t ; k 0 )

60 60 Early uses of this simple index A in the early 1970s led to the conclusions: (B) Transportation accounts for over 50% of all air pollution. Are these meaningful conclusions? MEASUREMENT OF AIR POLLUTION A ( t ; k r ) > X k6 = k r A ( t ; k )

61 61 Early uses of this simple index A in the early 1970s led to the conclusions: (C) CO accounts for over half of all emitted air pollution. Are these meaningful conclusions? MEASUREMENT OF AIR POLLUTION X t ; k e ( i ; t ; k ) > X t ; k X j 6 = i e ( j ; t ; k )

62 62 All these conclusions are meaningful if we measure all e(i,t,k) in same units of mass (e.g., milligrams per cubic meter) and so admissible transformation means multiply e(i,t,k) by same constant. MEASUREMENT OF AIR POLLUTION A ( t ; k ) > A ( t ; k 0 ) A ( t ; k r ) > X k6 = k r A ( t ; k ) X t ; k e ( i ; t ; k ) > X t ; k X j 6 = i e ( j ; t ; k )

63 63 These comparisons are meaningful in the technical sense. But: Are they meaningful comparisons of pollution level in a practical sense? A unit of mass of CO is far less harmful than a unit of mass of NOX. EPA standards based on health effects for 24 hour period allow 7800 units of CO to 330 units of NOX. These are Minimum acute toxicity effluent tolerance factors (MATE criteria). Tolerance factor is level at which adverse effects are known. Let  (i) be tolerance factor for ith pollutant. Severity factor:  (CO)/  (i) or 1/  (i) MEASUREMENT OF AIR POLLUTION

64 64 One idea (Babcock and Nagda, Walther, Caretto and Sawyer): Weight the emission levels (in mass) by severity factor and get a weighted sum. This amounts to using the indices Degree of hazard: and the combined index Pindex: Under pindex, transportation is still the largest source of pollutants, but now accounting for less than 50%. Stationary sources fall to fourth place. CO drops to bottom of list of pollutants, accounting for just over 2% of the total. MEASUREMENT OF AIR POLLUTION 1 ¿ ( i ) e ( i ; t ; k ) B ( t ; k ) = P i 1 ¿ ( i ) e ( i ; t ; k )

65 65 These conclusions are again meaningful if all emission weights are measured in the same units. For an admissible transformation multiplies  and e by the same constant and thus leaves the degree of hazard unchanged and pindex unchanged. Pindex was introduced in the San Francisco Bay Area in the 1960s. But, are comparisons using pindex meaningful in the practical sense? MEASUREMENT OF AIR POLLUTION

66 66 Pindex amounts to: For a given pollutant, take the percentage of a given harmful level of emissions that is reached in a given period of time, and add up these percentages over all pollutants. (Sum can be greater than 100% as a result.) If 100% of the CO tolerance level is reached, this is known to have some damaging effects. Pindex implies that the effects are equally severe if levels of five major pollutants are relatively low, say 20% of their known harmful levels. MEASUREMENT OF AIR POLLUTION

67 67 Severity tonnage of pollutant i due to a given source is actual tonnage times the severity factor 1/  (i). In early air pollution measurement literature, severity tonnage was considered a measure of how severe pollution due to a source was. Data from Walther 1972 suggests the following. Interesting exercise to decide which of these conclusions are meaningful. MEASUREMENT OF AIR POLLUTION

68 68 1. HC emissions are more severe (have greater severity tonnage) than NOX emissions. 2. Effects of HC emissions from transportation are more severe than those of HC emissions from industry. (Same for NOX.). 3. Effects of HC emissions from transportation are more severe than those of CO emissions from industry. 4. Effects of HC emissions from transportation are more than 20 times as severe as effects of CO emissions from transportation. 5. The total effect of HC emissions due to all sources is more than 8 times as severe as total effect of NOX emissions due to all sources. MEASUREMENT OF AIR POLLUTION

69 69 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

70 70 Evaluation of Water Testing Equipment How do we evaluate alternative possible water testing systems? Or pollution control systems for oil, chemicals, … A number of systems are tested on different benchmarks. Their scores on each benchmark are normalized relative to the score of one of the systems. The normalized scores of a system are combined by some averaging procedure. If the averaging is the arithmetic mean, then the statement “one system has a higher arithmetic mean normalized score than another system” is meaningless: The system to which scores are normalized can determine which has the higher arithmetic mean..

71 71 Evaluation of Water Testing Equipment Similar methods are used in comparing performance of alternative computer systems or other types of machinery. The following example has numbers taken out of the computer science literature from an article comparing computer systems. However, the same applies to pollution control equipment or other types of equipment. The example is due to Fleming and Wallace (1986).

72 72 Equipment Evaluation Evaluation of Water Testing Equipment 417836639,449772 2447015333,527368 1347013566,000369 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Data from Heath, Comput. Archit. News (1984)

73 73 Equipment Evaluation Normalize Relative to System R 417 1.00 83 1.00 66 1.00 39,449 1.00 772 1.00 244.59 70.84 153 2.32 33,527.85 368.48 134.32 70.85 135 2.05 66,000 1.67 369.45 BENCHMARK R M Z SYSTEMSYSTEM EFGHI

74 74 Equipment Evaluation Take Arithmetic Mean of Normalized Scores 417 1.00 83 1.00 66 1.00 39,449 1.00 772 1.00 244.59 70.84 153 2.32 33,527.85 368.48 134.32 70.85 135 2.05 66,000 1.67 369.45 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Arithmetic Mean 1.00 1.01 1.07

75 75 Equipment Evaluation Take Arithmetic Mean of Normalized Scores 417 1.00 83 1.00 66 1.00 39,449 1.00 772 1.00 244.59 70.84 153 2.32 33,527.85 368.48 134.32 70.85 135 2.05 66,000 1.67 369.45 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Arithmetic Mean 1.00 1.01 1.07 Conclude that system Z is best

76 76 Equipment Evaluation Now Normalize Relative to System M 417 1.71 83 1.19 66.43 39,449 1.18 772 2.10 244 1.00 70 1.00 153 1.00 33,527 1.00 368 1.00 134.55 70 1.00 135.88 66,000 1.97 369 1.00 BENCHMARK R M Z SYSTEMSYSTEM EFGHI

77 77 Equipment Evaluation Take Arithmetic Mean of Normalized Scores 417 1.71 83 1.19 66.43 39,449 1.18 772 2.10 244 1.00 70 1.00 153 1.00 33,527 1.00 368 1.00 134.55 70 1.00 135.88 66,000 1.97 369 1.00 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Arithmetic Mean 1.32 1.00 1.08

78 78 Equipment Evaluation Take Arithmetic Mean of Normalized Scores 417 1.71 83 1.19 66.43 39,449 1.18 772 2.10 244 1.00 70 1.00 153 1.00 33,527 1.00 368 1.00 134.55 70 1.00 135.88 66,000 1.97 369 1.00 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Arithmetic Mean 1.32 1.00 1.08 Conclude that system R is best

79 79 Equipment Evaluation So, the conclusion that a given system is best by taking arithmetic mean of normalized scores is meaningless in this case. Above example from Fleming and Wallace (1986), data from Heath (1984) Sometimes, geometric mean is helpful. Geometric mean is   i s(x i ) n 

80 80 Equipment Evaluation Normalize Relative to System R 417 1.00 83 1.00 66 1.00 39,449 1.00 772 1.00 244.59 70.84 153 2.32 33,527.85 368.48 134.32 70.85 135 2.05 66,000 1.67 369.45 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Geometric Mean 1.00.86.84 Conclude that system R is best

81 81 Equipment Evaluation Now Normalize Relative to System M 417 1.71 83 1.19 66.43 39,449 1.18 772 2.10 244 1.00 70 1.00 153 1.00 33,527 1.00 368 1.00 134.55 70 1.00 135.88 66,000 1.97 369 1.00 BENCHMARK R M Z SYSTEMSYSTEM EFGHI Geometric Mean 1.17 1.00.99 Still conclude that system R is best

82 82 Equipment Evaluation In this situation, it is easy to show that the conclusion that a given system has highest geometric mean normalized score is a meaningful conclusion. Even meaningful: A given system has geometric mean normalized score 20% higher than another machine. Fleming and Wallace give general conditions under which comparing geometric means of normalized scores is meaningful. Research area: what averaging procedures make sense in what situations? Large literature.

83 83 Equipment Evaluation Message from measurement theory: Do not perform arithmetic operations on data without paying attention to whether the conclusions you get are meaningful.

84 84 Equipment Evaluation We have seen that in some situations, comparing arithmetic means is not a good idea and comparing geometric means is. We will see that there are situations where the reverse is true. Can we lay down some guidelines as to when to use what averaging procedure? More on this later.

85 85 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

86 86 Measurement of Noise A sound has physical characteristics: –Intensity (energy transported) –Frequency (in cycles per second) –Duration A sound also has psychological characteristics: –How loud does it seem? –What emotional meaning does it portray? –What images does it suggest?

87 87 Measurement of Noise Since middle of 19 th century, scientists have tried to study the relationships between physical characteristics of stimuli like sounds and their psychological characteristics. Psychophysics is the discipline that studies psychological sensations such as loudness, brightness, apparent length, apparent duration, and their relations to physical stimuli. Not all psychological characteristics have clear relationships to physical ones. E.g., emotional meaning. However, some seem to. We will concentrate on loudness.

88 88 Measurement of Noise Loudness of a sound is different from its disturbing effect. This disturbing effect is called noise. We will concentrate on loudness and equate noise with loudness. Noise has more than just disturbing effects. It has physiological effects too: Affects hearing Affects cardiovascular system May be related to stomach problems May even be related to infertility

89 89 Measurement of Noise Subjective judgments of loudness depend on physical intensity and frequency. Equal loudness contour: Duration of a sound may also enter into loudness.

90 90 Measurement of Noise To simplify matters, one tries to eliminate all physical factors but one. Deal with pure tones, sounds of constant intensity at a fixed frequency. Present them for fixed duration of time. Let I(a) denote the intensity of pure tone a. I(a) is proportional to the root-mean-square pressure p(a). The common unit of measurement of intensity is the decibel (dB). This is 10 log 10 (I/I 0 ), where I 0 is a reference sound. A sound of 1 dB is essentially the lowest audible sound.

91 91 Measurement of Noise Some Sample Decibel Levels: Uncomfortably Loud: Oxygen torch (121 dB) Snowmobile (113 dB) Riveting machine (110 dB) Rock band (108-114 dB) Jet takeoff at 1000 ft. (110 dB) Jet flyover at 1000 ft. (103 dB)

92 92 Measurement of Noise Some Sample Decibel Levels: Very Loud: Electric furnace (100 dB) Power mower (96 dB) Rock drill at 50 ft. (95 dB) Motorcycle at 50 ft. (90 dB) Smowmobile at 50 ft. (90 dB) Food blender (88 dB)

93 93 Measurement of Noise Some Sample Decibel Levels: Moderately Loud: Power mower at 50 ft. (85 dB) Diesel truck at 50 ft. (85 dB) Diesel train at 50 ft. (85dB) Garbage disposal (80 dB) Washing machine (78 dB) Dishwasher (75 dB) Passenger car at 50 ft. (75 dB) Air conditioning unit at 50 ft. (60 dB)

94 94 Measurement of Noise Loudness of a sound a = L(a). Unit of measurement of loudness = the sone (1 sone = loudness of 1000 cps pure tone at 40 dB) Loudness is a psychological scale. What is the relation between L(a) and the physical intensity of a, I(a)? This relation usually called the psychophysical law. L(a) =  (I(a))  is called the psychophysical function. A basic goal of psychophysics is to find the general form of the psychophysical function that applies in many cases.

95 95 Measurement of Noise Fechner’s Law First attempt to specify  for large class of psychological variables: Gustav Fechner (1860). Fechner argued that:  (x) = c log x + k c, k constant. This is called Fechner’s Law For loudness: L(a) = c log I(a) + k Decibel scale a special case: Base of log is 10, c = 10, k = -10 log 10 I 0.

96 96 Measurement of Noise If L(a) = dB(a), then a doubling of the dB level of a sound should lead to a doubling of the perceived loudness. So, 100 dB should sound twice as loud as 50 dB. Is this the case?

97 97 Measurement of Noise Fletcher and Munson (1933) Assumption: loudness proportional to number of auditory nerve impulses reaching the brain Thus: sound delivered to 2 ears should appear twice as loud as same sound delivered to 1 ear. F & M found that to sound equally loud, a pure tone delivered to 1 ear had to be about 10 dB higher than if it were delivered to 2 ears. They concluded: subjective loudness doubles for each 10 dB increase in pressure. Thus, an increase from 50 dB to 60 dB doubles loudness, not an increase from 50 dB to 100 dB.

98 98 Measurement of Noise Fletcher and Munson (1933) This shows that dB does not measure loudness. Much data supports the F & M results. This implies that Fechner’s Law (*) L(a) = c log I(a) + k can’t hold. Proof: Assume (*). We have c  0 since L(a) is not constant. We may use log to base 10 since if any other base, incorporate switch to base 10 into c

99 99 Measurement of Noise Fletcher and Munson (1933) Then: L(a) = c log 10 I(a) + k = c log10 [I(a)/I 0 ] + (k + c log 10 I 0 ) = c log10 [I(a)/I 0 ] + k* = c dB(a) + k*. Let dB(b) = dB(a) + 10. Then by F&M data, L(a) = 2L(b). Thus c dB(b) + k* = 2[c dB(a) + k*] c [dB(a) + 10] + k* = 2[c dB(a) + k*]. Since c  0, dB(a) = [10c - k*]/c. dB(a) is constant. A contradiction.

100 100 Measurement of Noise The Power Law S.S. Stevens (many papers): The fundamental psychophysical law for many psychological and physical variables is a power law:  (x) = cx k Many experiments have estimated that for loudness and intensity, the exponent k is approximately 0.3. Is this consistent with the Fletcher-Munson observation?

101 101 Measurement of Noise The Power Law Note that 0.3  log 10 2 Power law: L(x) = cI(x) dB(x) = 10 log 10 [I(x)/I 0 ] I(x) = I 0 10 (1/10)dB(x) (**)L(x) = cI 0 [10 (1/10)dB(x) ] If dB(b) = dB(a) + 10, then letting x = b in lhs of (**) and taking dB(x) = dB(a) + 10 in rhs, we find that L(b) = 2L(a) l og 10 2 l og 10 2 l og 10 2

102 102 Measurement of Noise The Power Law Data seems suggests that the power law holds for more than 2 dozen variables (at least approximately and in limited intervals), including: Brightness (in brils) Smell Taste (in gusts) Judged temperature Judged duration Presure on palm Judged heaviness Force of handgrip Vocal effort

103 103 Measurement of Noise The Power Law Power law fairly widely accepted for certain psychological/physical variables. It fails for things like pitch as function of frequency, apparent inclination, etc. We will discuss a theory that allows us to derive the power law.

104 104 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

105 105 Excursis: Functional Equations Consider the pyschophysical law: g(x) =  (f(x)) where g(x) is psychological scale, f(x) physical scale. May not know the psychophysical function , but may know some of its properties. For instance, we may discover that  (x+y) =  (x) +  (y) This is a functional equation. Making certain assumptions about domain, range, and continuity properties of unknown function  sometimes allows us to solve such equations.

106 106 Excursis: The Cauchy Equations The equation  (x+y) =  (x) +  (y) is the first of four basic functional equations known as the Cauchy equations (after Augustin Cauchy).

107 107 Excursis: The Cauchy Equations The equation  (x+y) =  (x) +  (y) is the first of four basic functional equations known as the Cauchy equations (after Augustin Cauchy). The Other Cauchy Equations:  (x+y) =  (x)  (y)  (xy) =  (x) +  (y)  (xy) =  (x)  (y)

108 108 Excursis: The Cauchy Equations Theorem: Suppose  :  or  +  is such that  (x+y) =  (x) +  (y) for all x,y, and suppose that  is continuous. Then There is a real number c such that  (x) = cx for all x. Proof: 1.By induction,  (nx) = n  (x) for all positive integers n. 2.Let x = (m/n)t for m,n positive integers, t > 0 real. Then nx = mt, so n  (x) = m  (t). Thus,  ((m/n)t) = (m/n)  (t)

109 109 Excursis: The Cauchy Equations 3. Let c =  (1). Taking t = 1 gives us  (m/n) = c(m/n) 4. Thus,  (r) = cr for all positive rationals r. 5. Given positive real x, let r 1, r 2, …, r n …  x, r i rational 6.Then  (r n ) = cr n implies  (x) = cx. 7.This completes proof if domain is  +. 8.If domain is , then note that  (0) = 0 = c0 follows from the first Cauchy equation. 9.If x < 0, then  (x) =  (0) -  (-x) = -  (-x) = -c(-x) = cx. Q.E.D.

110 110 Excursis: The Cauchy Equations Note: The theorem holds if  is assumed continuous at only one point x* in its domain (Darboux, 1875). To see why, we show that this assumption implies that  is actually continuous over all its domain. We present the proof when the domain is . By continuity, lim t  x*  (t) =  (x*). For any other x, lim u  x  (u) = lim u-x+x*  x*  [(u-x+x*) + (x-x*)] = lim t  x*  [t + (x-x*)] = lim t  x* [  (t) +  (x-x*)] =  (x*) +  (x-x*) =  (x* + x-x*) =  (x).

111 111 Excursis: The Cauchy Equations We have similar results for the other three Cauchy equations. (And in each case, continuity may be replaced by continuity at a point or some other other assumption like monotonicity, or boundedness on an (arbitrarily small, open) n-dimensional interval or on a set of positive measure.) Theorem: Suppose  :  or  +  is such that  (x+y) =  (x)  (y) for all x,y, and suppose that  is continuous. Then either  (x) = 0 for all x or  (x) = e cx for a real constant c.

112 112 Excursis: The Cauchy Equations Theorem: Suppose  :  +  is such that  (xy) =  (x) +  (y) for all x,y, and suppose that  is continuous. Then  (x) = c ln x for a real constant c. Theorem: Suppose  :  is such that  (xy) =  (x) +  (y) for all x,y, and suppose that  is continuous. Then  (x) = 0 for all x.

113 113 Excursis: The Cauchy Equations Theorem: Suppose  :  +  is such that  (xy) =  (x)  (y) for all x,y, and suppose that  is continuous. Then either  (x) = 0 for all x or  (x) = x c for a real constant c.

114 114 Excursis: The Cauchy Equations Theorem: Suppose  :  is such that  (xy) =  (x)  (y) for all x,y, and suppose that  is continuous. Then either (a)  (x) = 0 for all x (b)  (x) = x c, x > 0,  (x) = (-x) c, x < 0,  (0) = 0 for a real constant c. (c)  (x) = x c, x > 0,  (x) = -(-x) c, x < 0,  (0) = 0 for a real constant c

115 115 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

116 116 Consider the psychophysical law: g(x) =  (f(x)) Luce's idea (“Principle of Theory Construction”): If you know the scale type of f and g and you assume that an admissible transformation of f leads to an admissible transformation of g, you can derive the form of . The Possible Psychophysical Laws R. Duncan Luce

117 117 Consider the psychophysical law: g(x) =  (f(x)) Example 1: Suppose f and g are both ratio scales. Admissible transformations of scale: multiplication by a positive constant. Multiplying the independent variable by a positive constant  leads to multiplying the dependent variable by a positive constant A that depends on . This leads to the functional equation: (&)  (  x) = A(  )  (x), A(  ) > 0. The Possible Psychophysical Laws

118 118 This leads to the functional equation: (&)  (  x) = A(  )  (x),  > 0, A(  ) > 0. By solving this functional equation, Luce proved the following theorem: Theorem (Luce 1959): Suppose the psychophysical function  is continuous and suppose f takes on all positive real values and g takes on positive real values. Then  (x) = cx k Thus, if both the independent and dependent variables are ratio scales, the only possible psychophysical law is the power law. The Possible Psychophysical Laws

119 119 Aside: Solution to the functional equation: (&)  (  x) = A(  )  (x),  > 0, A(  ) > 0. 1.Since g takes on positive values,  (1) > 0. 2.By (&),  (  ) = A(  )  (1), so A(  ) =  (  )/  (1). 3.Hence,  (  x) =  (  )  (x)/  (1). 4. Let  (x) = ln[  (x)/  (1)]. 5. Then  (  x) = ln[  (  x)/  (1)] = ln[  (  )  (x)/  (1)  (1)] = ln[  (  )/  (1)] +ln[  (x)/  (1)] =  (  ) +  (x) 6. Thus,  satisfies the third Cauchy equation. The Possible Psychophysical Laws

120 120 Aside: Solution to the functional equation: (&)  (  x) = A(  )  (x),  > 0, A(  ) > 0. 4. Let  (x) = ln[  (x)/  (1)]. 6. Thus,  satisfies the third Cauchy equation. 7. By our earlier results,  (x) = k ln x = ln x k 8. Thus,  (x) =  (1)e  (x) = cx k The Possible Psychophysical Laws

121 121 Example 2: f is a ratio scale, g is an interval scale. Functional Equation:  (  x) = A(  )  (x) + B(  ),  > 0, A(  ) > 0. Example 3: f is an interval scale, g is a ratio scale. Functional Equation:  (  x+  ) = A( ,  )  (x),  > 0, A( ,  ) > 0. Example 4: f is an interval scale, g is an interval scale. Functional Equation:  (  x+  ) = A( ,  )  (x) + B( ,  ),  > 0, A( ,  ) > 0. The Possible Psychophysical Laws

122 122 We present Luce’s solutions to these functional equations using the hypothesis that  is continuous, that f takes on all positive real values if it is a ratio scale and all real values if it is an interval scale, and g takes on positive real values if it is a ratio scale and takes on real values if it is an interval scale. Later, in generalizing these theorems, we will note that continuity is too strong an assumption and can be replaced by any of a much weaker class of assumptions. The Possible Psychophysical Laws

123 123 Example 2: f is a ratio scale, g is an interval scale. Functional Equation:  (  x) = A(  )  (x) + B(  ),  > 0, A(  ) > 0. Luce’s Solution:  (x) = c ln x + k or  (x) = cx k + q The Possible Psychophysical Laws

124 124 Example 3: f is an interval scale, g is a ratio scale. Functional Equation:  (  x+  ) = A( ,  )  (x),  > 0, A( ,  ) > 0. Luce’s Solution:  is constant The Possible Psychophysical Laws

125 125 Example 4: f is an interval scale, g is an interval scale. Functional Equation:  (  x+  ) = A( ,  )  (x) + B( ,  ),  > 0, A( ,  ) > 0. Luce’s Solution:  (x) = cx + k The Possible Psychophysical Laws

126 126 The results we have derived are very general. While we applied them to psychophysical scaling, one can interpret them as limiting in very strict ways the “possible scientific laws” Other examples of power laws: –V = (4/3)  r 3 Volume V, radius r are ratio scales –Newton’s Law of gravitation: F = G(mm*/r 2 ), where F is force of attraction, G is gravitational constant, m,m* are fixed masses of bodies being attracted, r is distance between them. –Ohm’s Law: Under fixed resistance, voltage is proportional to current (voltage, current are ratio scales) The Possible Scientific Laws

127 127 If a body of constant mass is moving at a velocity v, then its energy is cv 2 + d, c, d constants. – Energy is an interval scale and velocity a ratio scale – This illustrates Luce’s second functional equation. If the temperature of a perfect gas is constant, then as a function of pressure p, the entropy of the gas is c log p + d, c, d constant. – Pressure defines a ratio scale and entropy an interval scale – This illustrates Luce’s second functional equation. The Possible Scientific Laws

128 128 We showed that if the physical and psychological variables define ratio scales, then under some reasonable assumptions (like continuity), they are related by a power law. To apply this in practice, one has to ask: How do you know the scale type? The most sure-fire way is to prove a representation theorem and a uniqueness theorem. Recall: A representation theorem gives necessary (and sufficient) conditions for there to be a homomorphism from one relational system into another. A uniqueness theorem defines the class of admissible transformations for a homomorphism from into. The Power Law Revisited AAB B

129 129 Example of a representation theorem. Consider homomorphisms from (A,R,  ) into ( ,>,+). f: A   aRb  f(a) > f(b) f(a  b) = f(a) + f(b). Representation Theorem (Hölder 1901): Every Archimedean ordered group (A,R,  ) is homomorphic to ( ,>,+). The Power Law Revisited

130 130 Axioms for an Archimedean Ordered Group (A,R,  ) (1). (A,  ) is a group. (2). (A,R) is a (strict) simple (linear) order. ((A,R) is asymmetric, transitive, and complete.) (3). Monotonicity: aRb  (a  c)R(b  c)  (c  a)R(c  b) (4). Archimedean: If e is the group identity and aRe, then there is a positive integer n such that naRb. (Here, na is a  a  a  …  a n times.) Comment: The Archimedean axiom is what makes comparisons possible. The Power Law Revisited

131 131 Axioms for an Archimedean Ordered Group (A,R,  ) Which of the axioms are necessary? (1). Associativity, identity, inverse are not. (2). Strict simple order is not; there could be ties. (3). Monotonicity is. (4). The Archimedean axiom as stated is not; it doesn't even make sense without e. Roberts and Luce (1968) gave conditions necessary and sufficient for a homomorphism from (A,R,  ) into ( ,>,+). The Power Law Revisited

132 132 Measurement of Mass Hölder’s theorem applies to measurement of mass. R = “heavier than”,  is combination. The axioms for a homomorphism are satisfied (in an idealized world). The Power Law Revisited

133 133 Uniqueness Theorem: Uniqueness Theorem (Roberts and Luce, 1968): Every homomorphism from (A,R,  ) into ( ,>,+) is regular and defines a ratio scale. Thus, measurement of mass defines a ratio scale. The Power Law Revisited

134 134 Measurement of Loudness For measurement of loudness, we can think of R as the binary relation “louder than” but what about the operation  ? The axioms even for R are often violated in practice. E.g., there are a, b so that neither aRb nor bRa. (This violates completness) E.g., transitivity of “equally loud” is violated. The Power Law Revisited

135 135 Semiorders Suppose R is “sounds louder than”. Suppose (A,R) is a strict simple order. Then “sounds equally loud” corresponds to the relation: aIb  ~aRb & ~bRa It is easy to see that if (A,I) is transitive. (This follows by the conditions for a strict simple order, which by definition is asymmetric, transitive, and complete.) (Also follows if we assume that aRb  f(a) > f(b).) The Power Law Revisited

136 136 Arguments against transitivity of “equally loud” (or more generally of “tying” in a measurement context) go back to Armstrong in the 1930's, Wiener in the 1920's, even Poincaré (in the 19 th century). One argument against transitivity: Let 2* = 2 dB pure tone, 12* = 12 dB pure tone. The second sounds louder than the first. They do not sound equally loud..  2*12* The Power Law Revisited

137 137 Increase the loudness.001 dB at a time. For any two successive sounds, they sound “equally loud.” Then transitivity would imply that the first and last sound equally loud.  2*12* The Power Law Revisited

138 138 So, how does one conclude that the sone scale is a ratio scale – something one needs to derive the power law for loudness? The Power Law Revisited

139 139 So, how does one conclude that the sone scale is a ratio scale – something one needs to derive the power law? Stevens’ arguments (one of many): Because subjects can do “magnitude estimation,” and are comfortable with ratios of sounds (a sounds twice as loud as b). Some attempt to put magnitude estimation procedure on firm axiomatic foundation. But not totally satisfactory. One can also try to replace “scale type” assumption with “meaningfulness” assumption – as we will see next. The Power Law Revisited

140 140 Outline 1.Introduction to Measurement Theory 2.Theory of Uniqueness of Scales of Measurement/Scale Types 3.Meaningful Statements 4.Averaging Judgments of Loudness 5.Measurement of Air Pollution: A Combined Pollution Index 6.Evaluation of Water Testing Equipment: “Merging Normalized Scores” 7.Measurement of Noise: Introduction to Psychophysical Scaling 8.Excursis: Functional Equations 9.The Possible Psychophysical Laws/The Power Law 10.How to Average Scores

141 141 Sometimes arithmetic means are not a good idea. Sometimes geometric means are. Are there situations where the opposite is the case? Or some other method is better? Can we lay down some guidelines about when to use what averaging or merging procedure? Methods we have described will help. Let a 1, a 2, …, a n be “scores” or ratings, e.g., scores on benchmarks for water or air pollution equipment, loudness ratings, etc. Let u = F(a 1,a 2, …, a n ) F is an unknown averaging function – sometimes called a merging function, and u is the average or merged score. How Should We Average Scores?

142 142 Approaches to finding acceptable merging functions: (1) axiomatic (2) scale types (3) meaningfulness How Should We Average Scores?

143 143 An Axiomatic Approach Theorem (Fleming and Wallace). Suppose F:(  + ) n   + has the following properties: (1). Reflexivity: F(a,a,...,a) = a (2). Symmetry: F(a 1,a 2,…,a n ) = F(a  (1),a  (2),…,a  (n) ) for all permutations  of {1,2,…,n} (3). Multiplicativity: F(a 1 b 1,a 2 b 2,…,a n b n ) = F(a 1,a 2,…,a n ) F(b 1,b 2,…,b n ) Then F is the geometric mean. And conversely. How Should We Average Scores?

144 144 A Functional Equations Approach Using Scale Type or Meaningfulness Assumptions Unknown function u = F(a 1,a 2,…,a n ) Recall: Luce's idea (“Principle of Theory Construction”): If you know the scale types of the a i and the scale type of u and you assume that an admissible transformation of each of the a i leads to an admissible transformation of u, you can derive the form of F. How Should We Average Scores?

145 145 A Functional Equations Approach Example: a 1, a 2, …, a n are independent ratio scales, u is a ratio scale. F: (  + ) n   + F(a 1,a 2,…,a n ) = u  F(  1 a 1,  2 a 2,…,  n a n ) =  u,  1 > 0,  2 > 0,  n > 0,  > 0,  depends on a 1, a 2, …, a n. Thus we get the functional equation: (*) F(  1 a 1,  2 a 2,…,  n a n ) = A(  1,  2,…,  n )F(a 1,a 2,…,a n ), A(  1,  2,…,  n ) > 0 How Should We Average Scores?

146 146 A Functional Equations Approach (*) F(  1 a 1,  2 a 2,…,  n a n ) = A(  1,  2,…,  n )F(a 1,a 2,…,a n ), A(  1,  2,…,  n ) > 0 Theorem (Luce 1964): If F: (  + ) n   + is continuous and satisfies (*), then there are > 0, c 1, c 2, …, c n so that How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = ¸ a c 1 1 a c 2 2 ::: a c n n :

147 147 (Aczél, Roberts, Rosenbaum 1986): It is easy to see that the assumption of continuity can be weakened to continuity at a point, monotonicity, or boundedness on an (arbitrarily small, open) n-dimensional interval or on a set of positive measure. Call any of these assumptions regularity. How Should We Average Scores? Janos Acz é l “Mr Functional Equations”

148 148 Theorem (Aczél and Roberts 1989): If in addition F satisfies reflexivity and symmetry, then = 1 and c 1 = c 2 = … = c n = 1/n, so F is the geometric mean. Aside: Solution to (*) without Regularity (Aczél, Roberts, and Rosenbaum): How Should We Average Scores? M i > 0, M i ( xy ) = M i ( x ) M i ( y ), ¸ > 0 F ( a 1 ; a 2 ;:::; a n ) = ¸ ¦ n i = 1 M i ( a i )

149 149 Second Example: a 1, a 2, …, a n are ratio scales, but have the same unit; u is a ratio scale. New Functional Equation: (**) F(  a 1,  a 2,…,  a n ) = A(  )F(a 1,a 2,…,a n ), A(  ) > 0 Regular Solutions to (**) (Aczél, Roberts, and Rosenbaum): f > 0, arbitrary regular function on  n-1 c arbitrary constant. (n = 1: f is a positive constant) How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = a c 1 f ( a 2 a 1,..., a n a 1 )

150 150 Solutions to (**) without Regularity (Aczél, Roberts, and Rosenbaum): f > 0, arbitrary function M > 0, M(xy) = M(x)M(y) How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = M ( a 1 ) f ( a 2 a 1,..., a n a 1 )

151 151 Third Example: a 1, a 2, …, a n are independent ratio scales, u is an interval scale New Functional Equation: (***) F(  1 a 1,  2 a 2,…,  n a n ) = A(  1,  2,…,  n )F(a 1,a 2,…,a n ) + B(  1,  2,…,  n ) A(  1,  2,…,  n ) > 0 How Should We Average Scores?

152 152 Regular solutions to (***) (Aczél, Roberts, and Rosenbaum):, b, c 1, c 2, …, c n arbitrary constants OR b, c 1, c 2, …, c n arbitrary constants How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = n X i = 1 c i l oga i + b F ( a 1 ; a 2 ;:::; a n ) = ¸ ¦ n i = 1 a c i i + b

153 153 Solutions to (***) without Regularity (Aczél, Roberts, and Rosenbaum): M i > 0, M i (xy) = M i (x)M i (y), b arbitrary constants OR L i (xy) = L i (x) + L i (y) b arbitrary constant How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = ¸ ¦ n i = 1 M i ( a i ) + b F ( a 1 ; a 2 ;:::; a n ) = n X i = 1 L i ( a i ) + b

154 154 Sometimes You Get the Arithmetic Mean Fourth Example: a 1, a 2, …, a n interval scales with the same unit and independent zero points; u an interval scale. Functional Equation: (****) F(  a 1 +  1,  a2+  2,…,  a n +  n) = A( ,  1,  2,…,  n )F(a 1,a 2,…,a n ) + B( ,  1,  2,…,  n ) A( ,  1,  2,…,  n ) > 0 How Should We Average Scores?

155 155 Functional Equation: (****) F(  a 1 +  1,  a 2 +  2,…,  a n +  n) = A( ,  1,  2,…,  n )F(a 1,a 2,…,a n ) + B( ,  1,  2,…,  n ) A( ,  1,  2,…,  n ) > 0 Solutions to (****) without Regularity Assumed (Aczél, Roberts, and Rosenbaum): 1, 2, …, n, b arbitrary constants How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = n X i = 1 ¸ i a i + b

156 156 Note that all solutions are regular. Theorem (Aczél and Roberts): (1). If in addition F satisfies reflexivity, then (2). If in addition F satisfies reflexivity and symmetry, then i = 1/n for all i, and b = 0, i.e., F is the arithmetic mean. How Should We Average Scores? P n i = 1 ¸ i = 1, b = 0 :

157 157 Meaningfulness Approach While it is often reasonable to assume you know the scale type of the independent variables a 1, a 2, a n, it is not so often reasonable to assume that you know the scale type of the dependent variable u. However, it turns out that you can replace the assumption that the scale type of u is xxxxxxx by the assumption that a certain statement involving u is meaningful. How Should We Average Scores?

158 158 Back to First Example: a 1, a 2, …, a n are independent ratio scales. Instead of assuming u is a ratio scale, assume that the statement F(a 1,a 2, …, a n ) = kF(b 1, b 2, …, b n ) is meaningful for all a 1, a 2, …, a n, b 1, b 2, …, b n and k > 0. Then we get the same results as before: Theorem (Roberts and Rosenbaum 1986): Under these hypotheses and regularity, If in addition F satisfies reflexivity and symmetry, then F is the geometric mean. How Should We Average Scores? F ( a 1 ; a 2 ;:::; a n ) = ¸ a c 1 1 a c 2 2 ::: a c n n :

159 159 Averaging of measurements or judgments or estimates is commonly carried out in a variety of applied areas. It is certainly relevant not only to air, water, and noise pollution, but to decision making about other kinds of pollution, such as visual pollution, thermal pollution, land pollution, radioactive pollution, etc. How Should We Average Scores?

160 160 There is much more analysis of a similar nature that can be done with similar ideas. Let your imagination and your students’ imaginations run free!

161 161 Thanks, Bill Lucas, for teaching me that and a lot more.


Download ppt "1 Measurement of Pollution Fred Roberts, DIMACS, Rutgers University."

Similar presentations


Ads by Google