# AN EXPLANATION TO THOSE WHO HAVE PROBLEMS UNDERSTANDING HOW IT IS POSSIBLE TO MEASURE VARIABILITY OVER PROBABILITIES: Let two persons, A and B, play chess.


Let two persons, A and B, play chess. Assume that in the long run A wins two games out of three (for simplicity we ignore draws). Let player B arbitrarily serve as the origin of our scale. We define, for the present, the basic unit of the scale as the difference between A and B, that is, as the difference in their probabilities of winning (to simplify again, we ignore that the probability of winning is reciprocal here).

Now let all members of a finite population play chess against each other, and assume further that only one dimension is involved in winning a chess game. Surely we will find a person C who wins two out of three games in the long run against player A. Player C is then one unit, or one level, 'better than' player A and two units or levels better than player B. We can go on until there remains a group of persons in which no one is able to win at least two out of three games against another member of this group of excellent chess players.

At the other end of the dimension (our poor chess players), there may be a person D who loses two games out of three against player B. Here, too, we can go on as in the first case until there remains a group in which nobody wins two out of three games against another member of the group of poorest chess players.

Because we have only a finite number of levels of ability, we are able to count them. This number is a simple index of the differences between people, that is, of their variability (Müller, 2001). We can apply the same scaling method to the same sample for another ability, say playing tennis. Again, we are able to count the levels of ability. Now we can compare the number of levels of the two abilities, because the basic unit of comparison is the same.
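The counting of levels can be sketched in a few lines of Python. This is only an illustration under assumptions that the text does not state: each player has a single ability value on a log-odds scale, the probability of winning follows a logistic function of the ability difference, and one unit equals ln 2 (odds of 2:1, i.e. two wins out of three). The names `win_probability`, `level` and `count_levels`, and the ability values themselves, are hypothetical.

```python
import math

# Assumed unit: the log-odds of winning two games out of three (odds 2:1).
UNIT = math.log(2.0)


def win_probability(theta_x, theta_y):
    """Long-run probability that x beats y under the assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-(theta_x - theta_y)))


def level(theta, theta_origin):
    """Whole units above (+) or below (-) the player chosen as origin."""
    # The small epsilon guards against floating-point rounding at exact multiples.
    return math.floor((theta - theta_origin) / UNIT + 1e-9)


def count_levels(thetas, theta_origin):
    """Number of ability levels spanned by a finite group of players."""
    levels = [level(t, theta_origin) for t in thetas]
    return max(levels) - min(levels) + 1


# Hypothetical abilities: B is the origin, A is one unit above B,
# C is one unit above A, and D is one unit below B.
theta_b = 0.0
theta_a = theta_b + UNIT
theta_c = theta_a + UNIT
theta_d = theta_b - UNIT

chess_levels = count_levels([theta_d, theta_b, theta_a, theta_c], theta_b)
tennis_levels = count_levels([theta_b, theta_a], theta_b)  # a narrower ability
```

Under these assumptions A beats B with probability 2/3, and C, two units above B, beats B with probability 4/5. The two counts (`chess_levels` and `tennis_levels`) are comparable because both are expressed in the same unit.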
Although there is no reason why a group of persons should not show the same variability in different abilities, it is quite improbable.

THREE CONSEQUENCES FOR PERSONALITY PSYCHOLOGY

I TEST/TRAIT DESCRIPTION: With the interpretation of Rasch variances as a standardized and comparable measure of variability, Rasch variances become a basic descriptive property of the measured trait. In Figure 4, the Rasch variances of two different psychological dimensions are given: (1) motivation, as measured by the Multi-Motiv-Gitter (MMG; Schmalt, Sokolowski & Langens, 2000; N=205; r_it=.81; number of items=47, reduced for comparison), and (2) an aspect of intelligence, as measured by the Standard Progressive Matrices (SPM, 2002; n=205, reduced for comparison from an N of 1500; r_it=.83; number of items=47, reduced for comparison). The comparison of the estimated person parameter distributions shows that people differ more in a cognitive aspect of behaviour (RV=1.16) than in a motivational one (RV=0.82); the difference is highly significant according to Bartlett's test of homogeneity of variances (df = 1, χ² = 12.03, p<.0005).

FIGURE 4 Different Rasch variances of the SPM and MMG

II DIAGNOSTICS AND INTERPRETATION OF PERSON ABILITIES ON NORM SCALES: One purpose of testing is to locate the position of a person in a reference group. The distance from the mean should be expressed in comparable Rasch Units. Figure 5 illustrates two hypothetical psychological dimensions with different Rasch variances. We assume that a test score lies, for example, in the last 5% of speech ability and likewise in the last 5% of arithmetic ability.

FIGURE 5 Influence of comparable Rasch variances on diagnostic decisions

In the first case the deviation from the mean does not seem to be severe, because the test score is only two Rasch Units away from the mean of the reference group. But in the second case, the test score is four levels away from the mean.
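The contrast between the two cases can be sketched numerically. The following is a minimal illustration, assuming (beyond what the text states) that person parameters are normally distributed within the reference group; the function name `distance_in_rasch_units` and the two variances are hypothetical and were chosen only so that the distances come out near two and four units.

```python
import math
from statistics import NormalDist


def distance_in_rasch_units(percentile, rasch_variance):
    """Distance from the group mean, in Rasch Units, of a score at a percentile.

    Assumes normally distributed person parameters (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(percentile)       # standardized position
    return abs(z) * math.sqrt(rasch_variance)  # rescaled to Rasch Units


# The same relative position (5th percentile) on two hypothetical dimensions.
d_speech = distance_in_rasch_units(0.05, rasch_variance=1.5)
d_arithmetic = distance_in_rasch_units(0.05, rasch_variance=6.0)

print(f"speech:     {d_speech:.2f} Rasch Units from the mean")
print(f"arithmetic: {d_arithmetic:.2f} Rasch Units from the mean")
```

Both scores share the same standardized position, yet the second lies twice as far from the mean once distances are expressed in a common unit.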
The measure of a person's ability should therefore be stated not only in the familiar standardized scores, but also in comparable Rasch Units.

III A NEW WAY TO TEST CONVERGENT CONTENT VALIDITY: Traditional methods of testing content and convergent validity are based on standardized score variance (Cronbach, 1951, 1971; MTMM, Campbell & Fiske, 1959). With comparable Rasch variances and the concept of representative item sampling from the universe of a valid item population (Klauer, 1984; Fitzpatrick, 1983), content validity can also be tested by comparing the Rasch variances of different tests with statistical tests of homogeneity of variances (Bartlett, 1954; for a further discussion of testing equivalent variances, see Olejnik & Algina, 1988).

THE QUESTION: To date, the variances of two different measures used to measure a trait cannot be compared, because a shared reference scale for both distributions is missing (see Figure 1). The basic assumption of Müller (2001) is that people do not differ to the same degree on different psychological dimensions.

FIGURE 1 Two person-parameter distributions on an unknown shared scale

THE PROBLEM: All common raw-score transformations (sigma values, percent range, stanine, etc.) force any distribution to the same standardized variance (see Figure 2).

FIGURE 2 Raw-score transformation to a standard distribution with fixed variance

THE APPROACH: To solve the problem stated above, we have to ensure that two assumptions are met:

I An interval scale within each person parameter scale ('Within-Assumption')

II An identical unit definition to allow comparisons between person parameter scales ('Between-Assumption')

Müller (2001) found that these assumptions are met within the Rasch model (see equation 1). The basic idea is to interpret a Rasch Unit as a defined difference between the probabilities of two persons solving a reference item. This is illustrated in Figure 3.
The difference between the probabilities of solving an item for two persons a fixed distance apart stays constant if the item on which the comparison takes place moves in difficulty together with one of the person parameters.

EQUATION 1 (the Rasch model)

The Rasch variance RV (equation 2) can therefore be interpreted as a standardized measure of the variability of person parameters.

EQUATION 2 (Rasch variance RV, in Rasch Units)

FIGURE 3 Relationship between the unit of the person parameter scale and constant differences in the probability of solving a reference item

The Contribution of the Interpretation of Rasch Variance to Personality Psychology. Dr. Jörg M. Müller, Universität Tübingen (http://www.joergmmueller.de/default.htm)

Figure 1 labels: shared scale (x-axis), density (y-axis); distribution of person parameters on psychological dimension I; distribution of person parameters on psychological dimension II.

Figure 2 labels: frequency against raw scores in test A (person variability in trait U); frequency against raw scores in test B (person variability in trait V).

Figure 3 labels: person parameter (x-axis), probability of solving an item (y-axis); item i with item parameter 0; item m with item parameter 1; persons B, A and C; difference between two persons in the probability of solving an item at one Rasch Unit distance.

Figure 4 labels: sigma-scores (x-axis), density (y-axis); MMG; SPM.

Figure 5 labels: Rasch Units or levels of ability (x-axis), density (y-axis); distribution of person parameters in speech ability; distribution of person parameters in arithmetic ability; mean; distances Da and Ds; individual test score in speech ability; individual test score in arithmetic ability.
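The constancy of the probability difference can be checked with a short sketch. It uses the standard form of the Rasch model, P(solve) = exp(θ − β) / (1 + exp(θ − β)), with person parameter θ and item parameter β; these symbols, the function name `p_solve`, and the person positions B = 0, A = 1, C = 2 are assumptions made for illustration, since the poster's own notation did not survive in the transcript.

```python
import math


def p_solve(theta, beta):
    """Standard Rasch model: probability that a person with parameter theta
    solves an item with parameter beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Assumed positions on the person parameter scale, one Rasch Unit apart.
theta_b, theta_a, theta_c = 0.0, 1.0, 2.0

# Compare A with B on item i (item parameter 0, located at B) ...
diff_on_item_i = p_solve(theta_a, 0.0) - p_solve(theta_b, 0.0)

# ... and C with A on item m (item parameter 1, located at A).
diff_on_item_m = p_solve(theta_c, 1.0) - p_solve(theta_a, 1.0)

print(f"A vs B on item i: {diff_on_item_i:.4f}")
print(f"C vs A on item m: {diff_on_item_m:.4f}")
```

Both comparisons yield the same difference (about 0.23), because the model depends only on θ − β: shifting the persons and the reference item by the same amount leaves every probability unchanged.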
