# What on earth is a p value, a Process sigma, Cronbach’s alpha, the Black-Scholes formula, a Priority in AHP, or the Sunday Times score for Portsmouth.


What on earth is a p value, a Process sigma, Cronbach’s alpha, the Black-Scholes formula, a Priority in AHP, or the Sunday Times score for Portsmouth University?

On the interpretability of measurements based on mathematical models.

Michael Wood, June 2011

http://userweb.port.ac.uk/~woodm/presentations.htm

Management makes use of many measurements based on mathematical models, but these are often difficult to interpret sensibly. This talk will look at some examples of such measurements, and the consequences of the problems of their interpretation – including the employment of unnecessary academics to teach what should be obvious, and supporting the bad decisions which led to the recent financial crash. I will then discuss how these, and other, measurements could be redesigned to make them more useful and user-friendly.

I’ll look at four examples:

1. Six sigma and the process sigma measurement
2. Null hypothesis significance tests and p values
3. University league tables
4. Risk measurements and the normal (Gaussian) distribution

Four examples … with some imaginary dialogues between the expert and a naive user...

Process sigma – the measurement linked to the Six Sigma philosophy

Expert: The process sigma for this process is 4.833

User: What on earth does this mean?

Expert: It means there are 430 dpmo (defects per million opportunities). Use this Sigma calculator

User: So why not just say 430 dpmo? Keep it simple!

Expert: But this would be dumbing down. Life is difficult and we mustn’t join the modern trend of trying to make it easier.

User: Why not? The complicated version adds nothing except confusing the uninitiated. (Similar comments apply to Cpk.)

Expert: ... which must be a good thing!
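The link between the two numbers in this dialogue can be sketched in a few lines. This is a minimal sketch, assuming the conventional Six Sigma definition in which the process sigma is a normal z value plus a 1.5 sigma "long-term shift"; the slide does not state the formula, and the function names are made up for illustration.

```python
from statistics import NormalDist

# Assumption: the conventional 1.5 sigma long-term shift used by most
# Six Sigma calculators. The slide itself does not give the formula.
SHIFT = 1.5


def dpmo_from_sigma(process_sigma: float, shift: float = SHIFT) -> float:
    """Defects per million opportunities implied by a process sigma."""
    z = process_sigma - shift
    defect_rate = 1.0 - NormalDist().cdf(z)  # upper tail of a standard normal
    return defect_rate * 1_000_000


def sigma_from_dpmo(dpmo: float, shift: float = SHIFT) -> float:
    """Inverse conversion: process sigma implied by a dpmo figure."""
    return NormalDist().inv_cdf(1.0 - dpmo / 1_000_000) + shift


if __name__ == "__main__":
    # Should come out at roughly 430, the figure quoted in the dialogue.
    print(round(dpmo_from_sigma(4.833)))
```

Under that assumption the conversion runs in both directions, which supports the user's point: the process sigma carries no information beyond the dpmo figure.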

p values

Expert: We’ve done a survey and found that women are more intelligent than men. p value is 0.004.

User: What does the p value mean?

Expert: It tells us how sure we can be about our results taking sampling error into account.

User: 0.004 is very small. Not very impressive!

Expert: It’s a bit difficult to explain p values to someone like you, but smaller is better. Less than 5% means you can be fairly sure women are cleverer than men, less than 1% is almost conclusive.

User: Sounds like you’re trying to confuse me …

Reverse measure of wrong thing, misinterpreted

Statman bits. User friendly units - \$/inch, etc.

… p values

User: I’m told that if the p value is 0.004 this means that we can be 99.8% confident that women really are more intelligent based on this data. Isn’t that a better way to put it?

Expert: No, that’s a common misunderstanding... you need to go on a course, although I’m not sure you’ll take it in...

User: There are lots of common misunderstandings, but I’m sure about the 99.8% confident...
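What a p value actually measures may be easier to see in a small computation. The sketch below is an exhaustive permutation test on made-up scores (the survey data are not given in the slides, so `women` and `men` are hypothetical). It counts how often a difference in group means at least as large as the observed one would arise if the group labels were irrelevant, which is the "reverse measure" the notes complain about: a probability of the data given no real difference, not a probability that the hypothesis is true.

```python
from itertools import combinations

# Hypothetical scores, invented purely for illustration.
women = [104, 108, 110, 112, 115]
men = [95, 98, 100, 101, 103]


def permutation_p_value(group_a, group_b):
    """Two-sided p value for a difference in means, by exhaustive relabelling."""
    pooled = group_a + group_b
    total = sum(pooled)
    n_a, n_b = len(group_a), len(group_b)

    def scaled_diff(sum_a):
        # Difference in means multiplied by n_a * n_b, kept in integers.
        return abs(sum_a * n_b - (total - sum_a) * n_a)

    observed = scaled_diff(sum(group_a))
    splits = 0
    at_least_as_extreme = 0
    # Try every way of choosing which values are labelled as group A.
    for chosen in combinations(range(len(pooled)), n_a):
        splits += 1
        if scaled_diff(sum(pooled[i] for i in chosen)) >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / splits


if __name__ == "__main__":
    print(permutation_p_value(women, men))
```

With these invented scores every woman outscores every man, so only 2 of the 252 possible relabellings are as extreme as the observed split, and the function returns 2/252.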

University League tables

Expert: The Sunday Times score for Portsmouth University is 599.

User: What does that mean?

Expert: Well … e.g. Southampton got 783 points so Southampton is obviously a better place to study

User: What are the points based on?

Expert: Lots of things: e.g. Student satisfaction, Research quality

User: So do Southampton do better on these two?...

... University League tables

Expert: Actually Portsmouth do a little better on student satisfaction (174 vs 169 out of 250), but Southampton do better on research quality (136 vs 112 out of 200)

User: But student satisfaction is more important to students than research quality...

Expert: You’ve got to balance the two. The experts at the Sunday Times have done this.

User: But different people may want different things...
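The user's objection can be made concrete with the two component scores quoted above. This is a rough sketch only: it uses just these two criteria (the real table combines many more), rescales each to a fraction of its maximum, and the weighting scheme and function names are assumptions for illustration rather than the Sunday Times method.

```python
# Component scores quoted in the dialogue, as fractions of the maximum.
scores = {
    "Portsmouth": {"satisfaction": 174 / 250, "research": 112 / 200},
    "Southampton": {"satisfaction": 169 / 250, "research": 136 / 200},
}


def weighted_score(university: str, satisfaction_weight: float) -> float:
    """Blend the two criteria; the remaining weight goes to research quality."""
    s = scores[university]
    return (satisfaction_weight * s["satisfaction"]
            + (1 - satisfaction_weight) * s["research"])


def preferred(satisfaction_weight: float) -> str:
    """University with the higher blended score for a given weighting."""
    return max(scores, key=lambda u: weighted_score(u, satisfaction_weight))


if __name__ == "__main__":
    for w in (0.5, 0.9):
        print(w, preferred(w))
```

On these two criteria alone, equal weights favour Southampton, while a reader who puts 90% of the weight on student satisfaction would prefer Portsmouth. A single published total hides that choice of weights.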

Measurements of risk

Muddled Michael has a habit of losing his car keys when he goes on holiday. He reckons he has a 25% chance of losing his keys. He decides to consult an expert on risk …

Easy! If he takes 9 spare keys with him, then the probability of losing all 10 keys is $0.25^{10}$, which is about one chance in a million … which seems an acceptable risk.

Michael puts all 10 keys on the same key ring (he doesn’t want to confuse himself by putting them in different places) and goes on holiday.

The problem here is that the maths assumes that losing each key is an independent event. In fact if he loses one key he will probably lose the rest as well, so a more realistic estimate of losing all his keys is 25%!

There are similar assumptions underlying most risk calculations – but if the calculations are more complicated it is easy not to notice.
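The gap between the two answers is easy to check. The sketch below is purely illustrative (the variable names are invented): it computes the probability of losing all ten keys under the expert's independence assumption and under the opposite extreme, where all keys share one ring and are lost together.

```python
from fractions import Fraction

p_lose_one = Fraction(1, 4)  # Michael's 25% chance of losing a key
n_keys = 10

# Expert's model: each key is lost independently of the others.
p_all_if_independent = p_lose_one ** n_keys

# Same key ring: the keys are lost together or not at all.
p_all_if_same_ring = p_lose_one

if __name__ == "__main__":
    print(p_all_if_independent)         # exact fraction
    print(float(p_all_if_independent))  # about one in a million
    print(float(p_all_if_same_ring))
```

The arithmetic in the expert's answer is correct; the independence assumption behind it is what fails.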

Risk and the weather

The probability of more than 1 mm of rain falling in Southampton in one day is 31.5% (estimated from the Met Office graph, based on 1971-2000 data).

Then, theoretically, the probability of a week when it rains every day is $0.315^{7}$, which suggests that this happens about every 9 years.

– Two weeks with rain every day is a “once in 29000 years” event.

Almost certainly happens more often – last time was 20-30 November 2009, and the time before was 10-16 of the same month (Southampton Weather website).

The theory is wrong because the assumptions are wrong!
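The "every 9 years" and "once in 29000 years" figures can be reproduced as follows. This is a sketch under two assumptions: that days are independent (the assumption the slide is criticising), and that any of the 365 days in a year can start a wet run, which is a guess at how the slide's figures were derived rather than something the slide states.

```python
P_RAIN = 0.315       # daily probability of more than 1 mm of rain (from the slide)
DAYS_PER_YEAR = 365  # assumption: a wet run can start on any day of the year


def years_between_wet_runs(run_length: int, p_rain: float = P_RAIN) -> float:
    """Expected years between runs of `run_length` wet days, if days were independent."""
    p_run = p_rain ** run_length  # independence assumption
    expected_runs_per_year = DAYS_PER_YEAR * p_run
    return 1 / expected_runs_per_year


if __name__ == "__main__":
    print(f"one week:  {years_between_wet_runs(7):.1f} years")
    print(f"two weeks: {years_between_wet_runs(14):,.0f} years")
```

Real wet days cluster, because weather systems persist for several days, so the independence model understates how often long wet spells occur.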

Risk and the normal distribution

Very similar assumptions underlie the normal (Gaussian) distribution. This assumes that the variable depends on a large number of small independent factors. If not, the predictions can be misleading, especially for rare events.

Many finance measurements depend on the normal distribution and similar assumptions – e.g. the Black-Scholes formula. OK in normal times, but tends to seriously underestimate the probability of big falls.

“If the Dow Jones Industrial average moved in accordance with a normal distribution, it would have moved by 4.5% or more on only six days between 1996 and 2003 …. In reality … 366 times” (Mandelbrot cited by Buckley, 2011, p. 140).

Black Monday (1987) was a 20 sd event, a once in a million year event, experienced several times by people much younger than a million years (Buckley, 2011, p. 141).

Measures “understood” but not assumptions … trust in a misunderstood version …
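To see why the normal model makes large moves look almost impossible, it helps to tabulate how fast its tails shrink. The sketch below computes the two-sided probability of a move of at least k standard deviations under a standard normal distribution; it uses no market data, and the function name is invented for illustration.

```python
import math


def two_sided_tail(k: float) -> float:
    """P(|Z| >= k) for a standard normal variable Z."""
    # erfc stays accurate far out in the tail, where 1 - cdf would round to zero.
    return math.erfc(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (2, 3, 5, 10, 20):
        print(f"{k:>2} sd: {two_sided_tail(k):.3e}")
```

Each extra standard deviation cuts the probability by a rapidly growing factor, so a model that is roughly right for everyday moves can be wrong by many orders of magnitude for crashes.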

What can go wrong?

1. Unnecessary time and effort expended
   – E.g. 50% of time spent on stats courses could be saved by redesigning concepts? Big savings in time and effort possible!
2. Failure to understand
   a) Complete
   b) Subtleties
3. Misunderstanding
   a) Of basic concept
   b) Of assumptions leading to misleading uses

... for example...

P values
– Massive amount of wasted time and energy (think of all those journal articles), general confusion, misinterpretations like significant=important

University league tables
– scores taken too seriously, specific requirements ignored, creates uniformity because everyone thinks the same; rational world would be more varied

Risk
– ignoring unrealistic assumptions led to over-confidence in mathematical measures which helped the financial crash...

Principles for designing measurements for understanding

Remember most measurements determined by historical accident – therefore can probably be improved for current users and uses. Design not discovery.

Name should reflect meaning of result, not the method used to get there

Make sure the direction is intuitive, use units and percentages as appropriate

Must be an accurate description of meaning of measurement in users’ language

Users must understand key assumptions (which are not irrelevant technicalities). If possible users should follow general idea of derivation.

Reasons for the persistence of strange measurements

Aim often ticking a box, not understanding
– Users don’t see problem

Interests of experts and teachers
– Mystification is good for business! Some measurements (e.g. process sigma) invented solely for this purpose?

The dumbing down myth
– Increased user-friendliness should lead to more, not less, powerful use of measurements
– We need to dumb up so that even the dumb won’t do dumb things

References

Buckley, Adrian (2011). Financial Crisis: causes, context and consequences. Harlow: Pearson Education.

I Six Sigma (2011). Sigma calculator available at http://www.isixsigma.com

Met Office graph

Southampton Weather website
