Mathematical Ideas that Shaped the World Bayesian Statistics
Plan for this class Why is our intuition about probability so bad? What is the chance that two people in this room were born a few days apart? What is conditional probability? If someone's DNA is found at a crime scene, what is the chance they are guilty? How can we spot bad statistics in the media?
An unfortunate truth Humans have an extraordinarily bad intuition about probability.
Winning the lottery What do you think your chances of winning the lottery are? Say whether winning the lottery is more or less likely to happen than this collection of events…
Is winning the lottery more or less likely? Chance of getting 12 heads in a row when flipping a fair coin. LESS MORE 1 in 4,096
Is winning the lottery more or less likely? Dying from a road accident in 1 year LESS MORE 1 in 24,000
Is winning the lottery more or less likely? Dying in the next flight you take LESS MORE 1 in 25 million
Is winning the lottery more or less likely? Being struck by lightning LESS MORE 1 in 1 million
Is winning the lottery more or less likely? Dying from a shark attack LESS MORE 1 in 300 million
Is winning the lottery more or less likely? Dying in the next hour from any cause whatsoever LESS MORE 1 in 2 million
Conclusion Winning the lottery has surprisingly bad odds: 1 in 13,983,816. Yet many people are convinced that it is likely to happen to them one day. We mix up the probability of someone winning the lottery (which is quite likely) with the probability of us winning the lottery.
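The figure of 1 in 13,983,816 is what you get for a lottery in which 6 numbers are drawn from 49. That format is an assumption here, since the slide does not state it. A minimal sketch of the count, next to the 12-heads comparison from the quiz:

```python
from math import comb

# Assumption: a 6-from-49 lottery (the format is not stated in the slides).
lottery_combinations = comb(49, 6)  # ways to choose 6 numbers out of 49

# 12 heads in a row with a fair coin: each extra flip halves the chance.
twelve_heads_outcomes = 2 ** 12

print(f"Lottery jackpot: 1 in {lottery_combinations:,}")
print(f"12 heads in a row: 1 in {twelve_heads_outcomes:,}")
```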
The birthday problem How many people need to be in a room together so that there is a more than 50% chance of two people having the same birthday? A) 300 B) 183 C) 91 D) 23
Number of people | Probability that 2 people share a birthday
50 | 97%
57 | 99%
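The birthday probabilities can be checked with a short calculation. This is a sketch that assumes 365 equally likely birthdays and ignores leap years; the function name is made up for illustration. It works out the chance that everyone has a different birthday and subtracts that from 1.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday."""
    all_different = 1.0
    for k in range(people):
        # Person k+1 must avoid the k birthdays already taken.
        all_different *= (days - k) / days
    return 1.0 - all_different

for n in (22, 23, 50, 57):
    print(n, f"{shared_birthday_probability(n):.1%}")
```

Under these assumptions, 23 is the smallest group for which the chance passes 50%, and 50 and 57 people give roughly 97% and 99%.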
The birthday graph
In this room? What is the chance that two people in this room have birthdays less than 3 days apart (ignoring the year)? Answer: more than 50%
Monty Hall Behind 1 door is a sheep. Behind the other 2 doors are other, non-sheepy, animals. You choose a door. I open a different door showing a non-sheep. Given the choice now of sticking with your choice or switching, what should you do?
Suppose you choose Door 1…
Door 1 | Door 2 | Door 3 | Stick | Switch
Sheep! | Not a sheep | Not a sheep | Sheep! | No sheep
Not a sheep | Sheep! | Not a sheep | No sheep | Sheep!
Not a sheep | Not a sheep | Sheep! | No sheep | Sheep!
If you stick with your choice, you only win 1 time out of 3.
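The same three cases can be enumerated in code. This is a sketch, assuming the sheep is equally likely to be behind each door and that the host always opens a non-sheep door other than yours; the function name is made up for illustration.

```python
from fractions import Fraction

DOORS = (1, 2, 3)

def win_chances(first_choice: int = 1):
    """Exact chance of winning the sheep by sticking or by switching."""
    stick_wins = 0
    switch_wins = 0
    for sheep_door in DOORS:  # each position is equally likely
        # The host opens a door that is neither your pick nor the sheep.
        openable = [d for d in DOORS if d != first_choice and d != sheep_door]
        opened = openable[0]  # if two doors qualify, either gives the same result
        switched_to = next(d for d in DOORS if d not in (first_choice, opened))
        stick_wins += first_choice == sheep_door
        switch_wins += switched_to == sheep_door
    return Fraction(stick_wins, len(DOORS)), Fraction(switch_wins, len(DOORS))

stick, switch = win_chances(first_choice=1)
print(f"Stick: {stick}, switch: {switch}")
```

Sticking wins in 1 case out of 3, so switching wins in the other 2.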
Conditional probability Conditional probability is the chance of something happening given that another event has already happened. For example: you throw two dice. What is the probability of the first die being a 6 given that the sum of the two dice is 8? What if the sum of the two dice was 6 or 7?
How to think about conditional probability Conditional probability is all about updating your odds in light of new evidence. There are a priori odds: the initial probability of an event. E.g. the probability of rolling a 6 is a priori 1 in 6. After new evidence, you have a posteriori odds. E.g. the probability of the first die being a 6, given that the sum of two dice is 8, is 1 in 5.
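The dice example can be verified by listing all 36 equally likely rolls and keeping only those consistent with the evidence. This is a sketch assuming two fair dice; the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) pairs.
ROLLS = list(product(range(1, 7), repeat=2))

def first_is_six_given_sum(total: int) -> Fraction:
    """P(first die is 6 | the two dice add up to `total`)."""
    matching = [roll for roll in ROLLS if sum(roll) == total]
    six_first = [roll for roll in matching if roll[0] == 6]
    return Fraction(len(six_first), len(matching))

for total in (8, 7, 6):
    print(f"sum = {total}: {first_is_six_given_sum(total)}")
```

A sum of 8 leaves 5 possible rolls, one of which starts with a 6, giving 1 in 5. A sum of 7 gives 1 in 6, the same as with no evidence at all, and a sum of 6 rules out a 6 entirely.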
Boy or girl? I know a friend who has 2 children. At least one of the children is a boy. What is the chance that the other child is also a boy? Answer: 1 in 3
Explanation A priori, there are 4 possible combinations of children: Boy – Boy, Boy – Girl, Girl – Boy, Girl – Girl. From our new evidence, we know that Girl – Girl is not possible, leaving only 3 options. Of these 3 options, only one of them is Boy – Boy.
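The counting argument can be written out directly. This is a sketch, assuming boys and girls are equally likely and the two children are independent; the names are made up for illustration. It also covers the case where the evidence is specifically about the older child.

```python
from fractions import Fraction
from itertools import product

# (older child, younger child): 4 equally likely families.
families = list(product("BG", repeat=2))

def chance_both_boys(condition) -> Fraction:
    """P(both children are boys | the family satisfies `condition`)."""
    possible = [f for f in families if condition(f)]
    both_boys = [f for f in possible if f == ("B", "B")]
    return Fraction(len(both_boys), len(possible))

at_least_one_boy = chance_both_boys(lambda f: "B" in f)
older_is_boy = chance_both_boys(lambda f: f[0] == "B")
print(f"At least one boy: {at_least_one_boy}, older child is a boy: {older_is_boy}")
```

"At least one boy" rules out only one family, while "the older child is a boy" rules out two, which is why the two answers differ.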
A paradox? If you know that the oldest child is a boy, the probability of the other child being a boy is 50%. If you know that the youngest child is a boy, the probability of the other child being a boy is 50%. Surely the first boy must be either the youngest or the oldest?!
Homework I know a friend who has two children. At least one of the children is a boy who was born on a Tuesday. What is the chance that the other child is also a boy?
Confusion of the inverse People have a tendency to assume that a conditional probability and its inverse are similar. For example: If sheep enjoy eating grass, then an animal who likes grass is likely to be a sheep. If most accidents happen within 20 miles of home, then you are safest when you are far from home.
Manipulating statistics A. Taillandier (1828) found that 67% of prisoners were illiterate. What stronger proof could there be that ignorance, like idleness, is the mother of all vices? But what proportion of illiterate people were criminals?
Bayesian statistics The first person we know who looked seriously into conditional probabilities was Thomas Bayes. He was the first person to write down a formula connecting the two inverse conditional probabilities. Bayesian statistics is all about updating the odds of an event after receiving new evidence.
Thomas Bayes (1702 – 1761) Son of a London Presbyterian minister. Studied logic and theology at the University of Edinburgh. In 1722 returned to London to assist his father before becoming a minister of his own church in Tunbridge Wells, Kent, in 1733.
Thomas Bayes (1702 – 1761) During his lifetime, Bayes only published two papers. One was on Divine Benevolence. The other was a defence of The Doctrine of Fluxions against the attack of George Berkeley. His most famous paper was published in 1764, called An Essay towards solving a problem in the Doctrine of Chances.
Bayes' Theorem P(A|B) = P(B|A) × P(A) / P(B). P(A) is the prior probability of A. P(B) is the prior probability of B. P(A|B) is the probability of A happening, given that B has happened. P(B|A) is the probability of B happening, given that A has happened.
Importance of Bayes' Theorem Bayes' Theorem is especially useful in medicine and in law. Most doctors get the following question wrong. Let's see what you think!
A test for breast cancer 1% of women aged 40 will get breast cancer. Out of the women who have breast cancer, 80% of them will have a positive test result. Out of the women who don't have breast cancer, 10% of them will get a positive result. If a woman tests positive for breast cancer, what is the chance she actually has it?
Doing the numbers Consider 10,000 women. 100 of them will have breast cancer: 80 of them test positive and 20 of them test negative. 9,900 of them don't have breast cancer: 990 of them test positive and 8,910 of them test negative. In total there are (80 + 990) = 1,070 positive results, of which only 80 have cancer. That's about 7.5%.
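The same answer follows from Bayes' Theorem, with A = "has cancer" and B = "tests positive". This is a sketch using the figures from the question; the function and parameter names are made up for illustration.

```python
def posterior(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """P(condition | positive test) = P(positive | condition) * P(condition) / P(positive)."""
    # P(positive) adds up positives from people with and without the condition.
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive

p = posterior(prior=0.01, true_positive_rate=0.80, false_positive_rate=0.10)
print(f"Chance of cancer given a positive test: {p:.1%}")
```

The result equals 80 out of 1,070, matching the head count: the few true positives are swamped by false positives from the much larger healthy group.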
The prosecutor's fallacy Suppose a prosecutor in a court case finds a piece of evidence, e.g. a DNA sample. They argue that the probability of finding this evidence if the defendant were innocent is tiny. Therefore the defendant is very unlikely to be innocent. Where is the fallacy in this argument?
The prosecutor's fallacy If the a priori chance of the defendant's guilt is very low, then it may still be low after presentation of this evidence. Just like with the cancer example, a false positive may be much more likely than a true positive in the absence of other evidence.
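To see how a tiny match probability can still leave guilt unlikely, here is a sketch with made-up numbers. The population size and match probability are hypothetical and not taken from any real case. It also assumes exactly one true source, that the true source always matches, and that nothing else is known about the defendant.

```python
from fractions import Fraction

def chance_of_guilt_given_match(possible_sources: int, random_match: Fraction) -> Fraction:
    """P(defendant is the source | match), with a uniform prior over `possible_sources`."""
    prior = Fraction(1, possible_sources)
    # A match comes either from the true source or from an innocent person matching by chance.
    p_match = prior * 1 + (1 - prior) * random_match
    return prior * 1 / p_match

# Hypothetical figures, chosen only for illustration.
p = chance_of_guilt_given_match(
    possible_sources=1_000_000,
    random_match=Fraction(1, 100_000),
)
print(f"Chance of guilt given only the match: {float(p):.1%}")
```

With these hypothetical inputs about 10 innocent people would be expected to match as well, so the match alone leaves the chance of guilt at roughly 9%, even though the chance of an innocent person matching is only 1 in 100,000.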
Exhibit 1: Sally Clark, 1999 Convicted of murdering both her sons. Paediatrician Roy Meadow argued that the chance of both children dying naturally was 1 in 73 million. He didn't take into account that a double murder would have been even more unlikely. Conviction overturned in 2003.
Exhibit 2: Denis Adams, 1996 Convicted of rape based on DNA found at the scene of the crime. Probability of a match said to be 1 in 20 million. There was no other evidence to convict: the victim did not identify Adams in a line-up and Adams had an alibi. The defence team instructed the jury in the use of Bayes' Theorem. The judge questioned its appropriateness. After 2 appeals, the conviction still stood.
A rule against Bayes In 2010 a convicted killer known as T appealed against his conviction. Part of the evidence was based on the special markings on his Nike trainers. The data on how many pairs of such trainers existed was unreliable. It has now been ruled that Bayes' Theorem is not allowed in court unless the underlying statistics are firm.
Quotes about statistics "98% of all statistics are made up." "The average human has one breast and one testicle." "Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital." "There are three kinds of lies: lies, damned lies, and statistics."
Misuse of statistics We are going to look at some examples of bad statistics in the media. What things should we look out for to spot bad maths and stats?
Strange patterns Matt Parker, of Queen Mary University of London, looked at 800 ancient sites. 3 sites, around Birmingham, formed a perfect equilateral triangle. Extending the base of this triangle links up 2 more sites, more than 150 miles apart, with an accuracy of 0.05%.
What to watch out for Events assumed to be independent (e.g. 6 double yolks article). Patterns found using large amounts of data (e.g. ancient sat-nav article). Other factors not taken into account (e.g. perfect whist deal article). Confusion of the inverse. Omission of relevant data. Misleading labelling of graphs.
Lessons to take home Don't play the lottery. Think very carefully when you are asked a question about probability. Don't confuse conditional probabilities with their inverses. Ask questions whenever you see statistics in the media! (And write in to report bad journalism!)