

How to convince your friends NOT to misuse raw scores
Benjamin D. Wright
Institute for Objective Measurement & MESA Psychometric Laboratory
COMET, September 23, 1999

THE TROUBLE WITH RAW SCORES
– The BROKEN BUCKET of Missing Data
– The CRUMBLING CATEGORIES of Lumpy Ratings
– The DIRTY DATA of Unpredictable Responses
– The PERVERSE PRECISION of Extreme Scores
– The RUBBER RULER of Irregular Intervals and Squashed Extremes

The BROKEN BUCKET of Missing Data
Compare Two Patients on an 8-item, 7-category Functional Independence Measure (m marks a missing rating):
– Patient A: m m m m 4 = 12
– Patient B: = 16
Which Patient is more Able? Patient B has the higher score, 16 > 12, on all 8 items. BUT on the 4 Items A and B have in common:
– Patient A: m m m m 4 = 12
– Patient B: m m m m 3 = 8
Patient A has the higher score, 12 > 8.
Are you sure you want to misuse missing-data-leaking raw scores for missing-data-impervious measures?
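The comparison above can be sketched in a few lines of Python. The item-level ratings are incomplete in this transcript, so the values below are hypothetical, chosen only to reproduce the slide's totals (12, 16, and 8); the function names are likewise illustrative.

```python
# Hypothetical ratings: the slide's item-level values are incomplete here,
# so these were picked only to match its totals. None marks a missing rating.
patient_a = [None, None, None, None, 2, 3, 3, 4]
patient_b = [2, 2, 2, 2, 2, 2, 1, 3]


def raw_score(ratings):
    """Sum the observed ratings, silently skipping missing ones."""
    return sum(r for r in ratings if r is not None)


def common_item_scores(first, second):
    """Sum each patient's ratings over the items both patients were rated on."""
    pairs = [(a, b) for a, b in zip(first, second)
             if a is not None and b is not None]
    return sum(a for a, _ in pairs), sum(b for _, b in pairs)


total_a = raw_score(patient_a)
total_b = raw_score(patient_b)
common_a, common_b = common_item_scores(patient_a, patient_b)

print("All items:    A =", total_a, " B =", total_b)    # B looks more able
print("Common items: A =", common_a, " B =", common_b)  # A looks more able
```

The ordering of the two patients flips depending on which items are counted, which is the leak the slide describes.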

The CRUMBLING CATEGORIES of Lumpy Ratings
Rating Forms Offer Equally Spaced Categories, But Raters Reply with Unequally Spaced Responses.
The Measure Distance of One More Point from Category 1 to 2 can be FOUR times BIGGER than The Measure Distance of One More Point from Category 3 to 4!
Are you sure you want to mistake Lumpy Ratings for Equal Interval Measures?

The DIRTY DATA of Unpredictable Responses
When Item Responses are Arranged from Easy Items to Hard Items you Expect Response Patterns like:
– = 46 and = 19
BUT suppose you get:
– {1}5 5 4 = 41 ? or {7}1 1 1 = 24 ?
What then? Raw Scores are Blind to Unpredictable Responses. Only Quality Control of Well-Constructed Measures Tells you about Response Surprises.
Are you sure you want to suffer raw-score-dirty-data blindness instead of enjoying data-vigilant measures?

The PERVERSE PRECISION of Extreme Scores
The Statistical Precision of a Raw Score is MAXIMUM at exactly the place where the Information a Raw Score Provides is MINIMUM.
When a person gets the lowest possible score, their raw score precision is perfect. We know exactly the score their low ability implies. But we have no idea how far Below that Score their ability might be!
It is the same with the highest possible score. We know exactly the score their high ability implies. But we have no idea how far Above that Score their ability might be!
They are Off-Our-Scale and Our Precision for their Unknown Measure is ZERO!
Are you sure you want to mistake imprecise raw scores for precise measures?
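A rough numerical sketch of this contrast follows. It rests on assumptions that are not in the slides: a hypothetical test of 10 equally difficult dichotomous items, a binomial standard error for the raw score, and the usual Rasch standard error for the measure (one over the square root of the summed item information). The function names are illustrative.

```python
import math


def raw_score_se(r, n_items):
    """Binomial standard error of raw score r (in score points)."""
    p = r / n_items
    return math.sqrt(n_items * p * (1 - p))


def measure_se(r, n_items):
    """Standard error of the implied measure (in logits).

    Assumes equally difficult items, so the test information at the
    estimate is n_items * p * (1 - p) = r * (n_items - r) / n_items.
    """
    info = r * (n_items - r) / n_items
    return math.inf if info == 0 else 1 / math.sqrt(info)


n_items = 10  # hypothetical test length
for r in range(n_items + 1):
    print(r, round(raw_score_se(r, n_items), 3), measure_se(r, n_items))
```

Under these assumptions the raw-score error shrinks to zero at scores of 0 and 10, exactly where the error of the measure becomes unbounded.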

The RUBBER RULER of Irregular Intervals and Squashed Extremes
When our items bunch in clumps of equally difficult items, then a count of one more right answer within a clump implies only a little increase in our ability. But when we leap ahead and our next right answer is in a distinctly harder clump, then we see that this one more right implies a large increase in our ability.
As for the ends of the test, where one more right is from 0 to 1 or from all but one to all, the implied change in our ability is infinite.
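The squashed extremes can be illustrated with a small sketch. It assumes, hypothetically, a 10-item test whose items are all equally difficult, so that the measure implied by raw score r is the log-odds log(r / (L - r)). This simplification shows only the stretching toward the ends of the test, not the clumping of items described above.

```python
import math


def logit_measure(r, n_items):
    """Measure (in logits) implied by raw score r on equally difficult items."""
    if r == 0:
        return -math.inf
    if r == n_items:
        return math.inf
    return math.log(r / (n_items - r))


n_items = 10  # hypothetical test length
# Change in the implied measure for each "one more right", from r to r + 1.
steps = [logit_measure(r + 1, n_items) - logit_measure(r, n_items)
         for r in range(n_items)]

for r, step in enumerate(steps):
    print(f"{r} -> {r + 1}: {step:.3f} logits")
```

One more right answer in the middle of the test moves the implied measure far less than one more right answer near either end, and the first and last steps are infinite.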


WHAT ARE VARIABLES?
Length and weight may be real variables. But we construct their units of measure. Inches and ounces are our creations, our own imaginative constructions.
A variable is an amount of something which we can always picture as a distance from Less to More.
We can arrange to experience evidence of this "something". But its measurement line and its units of measurement are up to us to construct.

EVIDENCE OF A VARIABLE?
The variable and its evidence could be:
– length: benchmarks exceeded
– health: symptoms absent
– ability: problems solved
– skill: tasks completed
– attitude: assertions condoned
We can arrange to provoke occurrences of evidence and count how many pieces occur. But these counts are not measures.

REQUIREMENTS FOR MEASUREMENT
Pieces of evidence must be concrete to be observed. This necessary reality keeps them uneven in size. To measure we need an even abstraction, a line marked out in abstractly equal units.
Pieces of evidence are unstable. They appear and disappear by accident. They are only probable signs of the variable which they are designed to manifest. To measure, we must find a way to connect the pieces of evidence we can arrange to observe to the probabilities of the measures we want.

WHAT IS MEASUREMENT?
DISTANCE (Length) was our First Variable.
COUNTING Steps and Fingers was our First Measuring Operation.
The Trouble with Counting is its UNEQUAL UNITS.
How many apples fill a basket? How many oranges squeeze a glass?
You may not believe it. But can we mix apples and oranges? We do it all of the time, by WEIGHING them!

CONSTRUCTING MEASURES
But weighing is a constructed abstraction. There are no tangible equal units. We have to invent them. Equal feet are abstracted from real feet. Equal pounds are abstracted from real weights.
We construct our instrumentation of the variables length and weight to approximate units equal enough to serve our practical purposes.
We measure so that we can use the past to plan and navigate the future. But the future is by definition UNCERTAIN.

HANDLING UNCERTAINTY
Imagine two batters: Smith bats .400 and Jones bats .200. So which one will hit at their next at-bat? No way to know ahead of time. Even Smith has only a 4 out of 10 record. We can’t wait to find out. So which one shall we send to the plate?
Smith's odds for a hit are 2/3; Jones' odds for a hit are only 1/4. Smith's odds for a hit are 8/3 times better than Jones'.
Even though we know nothing for sure, does any doubt remain as to who to send to bat?
That's how we handle uncertainty. We use past experience to estimate PROBABILITIES and use these probabilities to foresee the future.
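The odds arithmetic can be checked with exact fractions. This is only a worked version of the slide's numbers; the helper name `odds` is illustrative.

```python
from fractions import Fraction


def odds(p):
    """Convert a probability of success into odds of success, p / (1 - p)."""
    return p / (1 - p)


smith = Fraction(400, 1000)  # batting average .400
jones = Fraction(200, 1000)  # batting average .200

smith_odds = odds(smith)
jones_odds = odds(jones)
odds_ratio = smith_odds / jones_odds

print(smith_odds, jones_odds, odds_ratio)  # prints: 2/3 1/4 8/3
```

Note that the batting averages differ by a factor of 2, while the odds differ by a factor of 8/3.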

COUNTING ABSTRACT UNITS
To finish this job we have to construct a reproducible transition from counting concrete events (like right answers, observed or reported symptoms, relative agreements, frequency or importance categories) to counting abstract units of equal size and wide generality.
How can we do this?

INVERSE PROBABILITY
To deal with the uncertainty, we ask Bernoulli, Bayes and Laplace, and interpret our observation $X$ as evidence of its occurrence probability $P_x$.
To construct unit equality, we ask Campbell, Thurstone, Rasch and Luce & Tukey, and define $P_x$ to satisfy the equation:
$\log[P_{nix}/(1-P_{nix})] = B_n - D_i$
– $P_{nix}$ is the probability of a successful response $X_{ni}$ being produced by person $n$ to item $i$
– $B_n$ is the ability of person $n$
– $D_i$ is the difficulty of item $i$
The construction of equal-size, and hence additive, units is called CONJOINT ADDITIVITY.
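Solving the equation above for the probability gives the logistic form P = exp(B - D) / (1 + exp(B - D)). A minimal sketch follows; the function names and the ability and difficulty values are hypothetical, chosen only for illustration.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of success when log[P / (1 - P)] = ability - difficulty."""
    return 1 / (1 + math.exp(-(ability - difficulty)))


def log_odds(p):
    """Log-odds (logit) of a probability."""
    return math.log(p / (1 - p))


# Hypothetical measures in logits: two persons, two items.
abilities = {"person_1": 2.0, "person_2": 0.5}
difficulties = {"item_1": -1.0, "item_2": 1.5}

for item, d in difficulties.items():
    gap = (log_odds(rasch_probability(abilities["person_1"], d))
           - log_odds(rasch_probability(abilities["person_2"], d)))
    # The log-odds gap between the two persons is the same on every item.
    print(item, round(gap, 6))
```

The gap between the two persons is 1.5 logits whichever item is used, which is one way to read the additivity the slide names.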

CONJOINT ADDITIVITY
For situations where $X_{ni}$ occurs in incremental steps, such as $X_{ni} = 0, 1, 2, 3, \ldots, M$, this simple solution generalizes to
– $\log[P_{nix}/P_{ni(x-1)}] = B_n - D_i - F_{ix}$
For situations where $X_{nijk} = 0, \ldots, M$ occurs as the result of
– a Rater $j$ rating the performance of
– a Person $n$ on
– a Task $k$
this MEASUREMENT MODEL becomes
– $\log[P_{nijkx}/P_{nijk(x-1)}] = B_n - D_i - C_j - A_k - F_{ix}$
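The adjacent-category equation determines every category probability once the probabilities are required to sum to one: the probability of category x is proportional to the exponential of the summed terms (B - D - F) for steps 1 through x. The sketch below assumes this normalization and reads F as a step term for each category, which the slides do not spell out; the threshold values are hypothetical.

```python
import math


def category_probabilities(ability, difficulty, thresholds):
    """Probabilities of categories 0..M under log[P_x / P_(x-1)] = B - D - F_x.

    thresholds holds the step terms F_1..F_M (an assumed reading of F).
    """
    cumulative = [0.0]  # log-weight of category 0
    for f in thresholds:
        cumulative.append(cumulative[-1] + ability - difficulty - f)
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values in logits.
ability, difficulty = 1.0, 0.2
thresholds = [-1.0, 0.0, 1.0]

probs = category_probabilities(ability, difficulty, thresholds)
for x, p in enumerate(probs):
    print(x, round(p, 4))
```

For the rated-performance model, the same function could be reused by subtracting the rater term and the task term from the ability before calling it, since they enter the equation in the same additive way.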

CLOSING THE DEAL
Raw scores cause problems with:
– missing data
– lumpy ratings
– unpredictable responses
– extreme scores
– irregular intervals and squashed extremes
Rasch measurement provides a solution to these problems by:
– abstracting units of measurement
– using probabilities to predict futures
– constructing equal-sized intervals