# The Right Questions about Statistics: How hypothesis testing works

Maths Learning Centre, The University of Adelaide


A hypothesis test is designed to DECIDE the answer to a YES OR NO question using DATA. You calculate whether your data is likely or unlikely given a particular answer.

Is the median number of chapters in a novel equal to 20? Randomly choose 12 books.

| Book# | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chapters | 21 | 16 | 14 | 18 | 14 | 16 | 19 | 15 | 17 | 13 | 19 | 20 |

That’s too hard. So let’s turn all this data into just one number...

“TEST STATISTIC”: c = 0.91

How likely is this to happen if the median for all books really is 20? To answer this, I need to know what all the other possibilities are and compare...
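The slides report c = 0.91 without saying how it is calculated. As an assumption, the sketch below takes c to be the proportion of sampled books with fewer than 20 chapters, after setting aside any book with exactly 20. This definition reproduces the value shown, but it is an inference from the numbers, and the function name `sign_statistic` is only illustrative.

```python
def sign_statistic(values, hypothesised_median):
    """Proportion of non-tied values that fall below the hypothesised median.

    Assumed definition: the slides only report the resulting number.
    """
    # Values equal to the hypothesised median say nothing about direction,
    # so they are set aside before counting.
    non_tied = [v for v in values if v != hypothesised_median]
    below = sum(1 for v in non_tied if v < hypothesised_median)
    return below / len(non_tied)


chapters = [21, 16, 14, 18, 14, 16, 19, 15, 17, 13, 19, 20]
c = sign_statistic(chapters, 20)
print(round(c, 2))  # 0.91 (10 of the 11 non-tied books are below 20)
```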

“NULL HYPOTHESIS”: Suppose the median number of chapters for all books really is 20. And imagine what would happen if we took a different sample.

Chapters for Book# 1 to 12 in each different sample, with the test statistic c for that sample:

| Sample | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | c |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 22 | | 17 | 15 | 19 | 17 | 18 | 20 | 25 | 22 | 19 | 21 | 0.55 |
| 2 | 23 | 17 | 26 | 21 | 25 | 21 | 24 | 21 | 12 | 20 | 15 | 21 | 0.27 |
| 3 | 16 | 22 | 15 | 14 | 18 | 17 | 19 | 30 | 21 | 20 | 29 | 22 | 0.55 |
| 4 | 16 | 14 | 29 | 15 | 25 | 14 | 15 | 17 | 13 | 25 | 23 | 20 | 0.64 |
| 5 | 14 | 21 | 20 | 19 | 26 | 15 | 19 | 12 | 27 | 18 | 22 | 26 | 0.55 |
| 6 | 15 | | 21 | 20 | 12 | 21 | 22 | 18 | 15 | | 13 | 15 | 0.73 |
| 7 | 29 | 18 | 16 | 25 | 13 | 26 | 30 | 20 | 29 | 12 | 24 | 25 | 0.36 |
| 8 | 20 | 18 | 12 | 21 | 17 | 15 | 17 | 18 | 30 | 13 | 23 | 15 | 0.73 |
| 9 | 32 | 19 | 28 | 20 | 14 | 25 | 27 | 24 | 12 | 19 | 21 | 15 | 0.45 |
| 10 | 25 | 17 | | 23 | 17 | 16 | 23 | 18 | 23 | 20 | 13 | 15 | 0.64 |

And so on for more samples (Book# 1 to 12, Chapters ##): c = 0.27, c = 0.55, c = 0.64, c = 0.45, c = 0.18, c = 0.82, c = 0.55
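The idea of taking sample after sample while the null hypothesis is true can be imitated on a computer. The sketch below is only an illustration under stated assumptions: each sample has 11 books that do not have exactly 20 chapters, each of those books is equally likely to fall below or above 20 when the median really is 20, and c is the proportion that fall below. The function name `simulate_c`, the number of samples and the seed are all hypothetical choices.

```python
import random
from collections import Counter


def simulate_c(n_non_tied=11, n_samples=10_000, seed=0):
    """Simulate the test statistic c for many samples under the null hypothesis."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        # Under the null hypothesis, each non-tied book is below the median
        # with probability 0.5 (assumption stated in the lead-in).
        below = sum(rng.random() < 0.5 for _ in range(n_non_tied))
        results.append(below / n_non_tied)
    return results


simulated = simulate_c()
counts = Counter(round(c, 2) for c in simulated)
for value in sorted(counts):
    # Share of simulated samples that produced each value of c.
    print(f"c = {value:.2f}: {counts[value] / len(simulated):.3f}")
```

Values of c near 0.5 turn up often, while values near 0 or 1 are rare.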

“DISTRIBUTION OF TEST STATISTIC”: This describes all the possibilities for what the test statistic could have been and how likely they all are. We can usually go straight to this distribution if we use some statistical theory.

[Figure: distribution of the test statistic c, with axis ticks at 0.2, 0.4, 0.6, 0.8 and 1.0]
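As a sketch of what "going straight to the distribution" might look like for this example: a later slide labels the distribution B(11, 0.5), so the code below lists the binomial probabilities for each possible value of c. It assumes that c is the number of books below 20 divided by 11 (the slides do not spell this out); `null_distribution` is an illustrative name.

```python
from math import comb


def null_distribution(n=11, p=0.5):
    """Return {c: probability}, where c = k / n and k follows B(n, p)."""
    return {k / n: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}


dist = null_distribution()
for c, prob in dist.items():
    print(f"c = {c:.2f}  probability = {prob:.4f}")
```

The probabilities add up to 1, and the extreme values such as c = 0.91 or c = 1.00 each get only a small share.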

“P VALUE”: NOW: How likely is my test statistic if the median for all books really is 20? That is, what percentage of the possible test statistics are at least as bad as mine? p = 0.01. I need to choose a cut-off for when I declare this to be unlikely. Let’s choose 0.05. p = 0.01 is less than 0.05, so our test statistic is unlikely. So conclude the median is not 20.
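A minimal sketch of how p = 0.01 could arise, under two assumptions that the slides do not state outright: that c = 0.91 means 10 of the 11 books without exactly 20 chapters fell below 20, and that "at least as bad" counts both tails of B(11, 0.5). The function name `two_sided_sign_p_value` is illustrative.

```python
from math import comb


def two_sided_sign_p_value(below, n):
    """Two-sided p-value for `below` out of n under B(n, 0.5)."""
    # How far the observed count sits from the centre of the distribution.
    observed_distance = abs(below - n / 2)
    # Add up every outcome that is at least that far from the centre.
    total = sum(
        comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= observed_distance
    )
    return total / 2**n


p = two_sided_sign_p_value(below=10, n=11)
print(round(p, 2))  # 0.01 (24 of the 2048 equally likely outcomes)
print(p < 0.05)     # True, so the data counts as unlikely at the 0.05 cut-off
```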

Let’s go over that again… “HYPOTHESIS TEST” (this one is a “SIGN TEST”)

Is the median number of chapters = 20? Take a random sample of books.

| Book# | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chapters | 21 | 16 | 14 | 18 | 14 | 16 | 19 | 15 | 17 | 13 | 19 | 20 |

- Calculate a test statistic: c = 0.91
- Figure out the distribution if the median really is 20: B(11, 0.5)
- Calculate a P-value: p = 0.01
- Decide the answer: The median for all books is not 20.

Another hypothesis test: “ONE-SAMPLE T-TEST”

Is the mean weight equal to 300 g? Take a random sample of books.

| Book | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight (g) | 242 | 366 | 424 | 312 | 307 | 238 | 317 | 265 | 317 | 314 | 217 | 379 |

- Calculate a test statistic: t = 0.46
- Figure out the distribution if the mean really is 300 g: t(11)
- Calculate a P-value: p = 0.65
- Decide the answer: The mean for all books could be 300 g.
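The t statistic for the weights can be checked with the usual one-sample formula, t = (sample mean − 300) / (s / √n). The sketch below uses only the Python standard library, so the two-sided p-value is obtained by numerically integrating the t(11) density with Simpson's rule. That integration is a convenience chosen for this illustration, and the helper names `t_pdf` and `two_sided_t_p_value` are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    constant = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return constant * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_t_p_value(t, df, steps=1000):
    """P(|T| >= |t|), found by Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    area = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 1 - 2 * area


weights = [242, 366, 424, 312, 307, 238, 317, 265, 317, 314, 217, 379]
n = len(weights)
t = (mean(weights) - 300) / (stdev(weights) / sqrt(n))
p = two_sided_t_p_value(t, df=n - 1)
print(round(t, 2), round(p, 2))  # 0.46 0.65
```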

Another hypothesis test: “EXACT TEST FOR ONE PROPORTION”

Is the percentage of books with “the” in the title 50%? Take a random sample of books.

| Book | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Title has “the” | No | Yes | No | | | Yes | No | | | No | Yes | No |

- Calculate a test statistic: x = 3
- Figure out the distribution if the percentage really is 50%: B(12, 0.5)
- Calculate a P-value: p = 0.09
- Decide the answer: The percentage with “the” could be 50%.

Another hypothesis test: “CHI-SQUARED TEST FOR ONE STANDARD DEVIATION”

Is the standard deviation of weights equal to 40 g? Take a random sample of books.

| Book | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Weight (g) | 242 | 366 | 424 | 312 | 307 | 238 | 317 | 265 | 317 | 314 | 217 | 379 |

- Calculate a test statistic: χ² = 22.9
- Figure out the distribution if the standard deviation really is 40 g: χ²(11)
- Calculate a P-value: p = 0.03
- Decide the answer: The standard deviation of weights is not 40 g.

Another hypothesis test: “TWO-SAMPLE T-TEST”

Are SciFi books the same thickness as other books? Take a random sample of books.

| Book | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genre | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other |
| Thickness (cm) | 4.2 | 2.4 | 3.6 | 2.7 | 3.9 | 2.9 | 3.7 | 3.0 | 4.0 | 2.0 | 3.6 | 2.8 |

- Calculate a test statistic: t = 6.62
- Figure out the distribution if the average thickness really is the same: t(10)
- Calculate a P-value: p = 0.0001
- Decide the answer: SciFi books are not the same thickness as other books.
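The value t = 6.62 is what the pooled-variance (equal-variance) two-sample t statistic gives for these thicknesses. The slides do not say which version of the test was used, so treat the pooled formula below as an assumption, and `pooled_t_statistic` as an illustrative name. With 6 + 6 books this version of the test has 6 + 6 − 2 = 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Combine the two sample variances, weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, df


scifi = [4.2, 3.6, 3.9, 3.7, 4.0, 3.6]
other = [2.4, 2.7, 2.9, 3.0, 2.0, 2.8]
t, df = pooled_t_statistic(scifi, other)
print(round(t, 2), df)  # 6.62 10
```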

Another hypothesis test: “CHI-SQUARED TEST FOR ASSOCIATION”

Are SciFi books more likely to be written by men? Take a random sample of books.

| Book | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Genre | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other | SciFi | Other |
| Author Gender | Male | Female | Male | Male | Female | Male | Male | Female | Male | Female | Male | Female |

- Calculate a test statistic: χ² = 1.37
- Figure out the distribution if SciFi books really are not more likely to be written by men: χ²(1)
- Calculate a P-value: p = 0.24
- Decide the answer: SciFi books are not more likely to be written by men.
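Counting the sample gives 5 male and 1 female author for SciFi, and 2 male and 4 female for the other books. The reported χ² = 1.37 matches the chi-squared statistic with Yates' continuity correction, so the sketch below assumes that correction was used (the slides do not say). The function name `yates_chi_squared` is illustrative, and the p-value uses the fact that for 1 degree of freedom P(χ² > x) = erfc(√(x/2)).

```python
from math import erfc, sqrt


def yates_chi_squared(table):
    """Chi-squared statistic with Yates' correction, and its p-value, for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if genre and gender were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected
    # Upper-tail probability of a chi-squared distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Rows: SciFi, Other. Columns: Male, Female (counted from the sample above).
table = [[5, 1], [2, 4]]
chi2, p = yates_chi_squared(table)
print(round(chi2, 2), round(p, 2))  # 1.37 0.24
```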

So this is how you do a hypothesis test:

1. Have a yes-or-no question.
2. Collect data.
3. Calculate a test statistic.
4. Figure out the distribution if you assume a particular answer.
5. Calculate a p-value.
6. Decide the answer based on the p-value.

And this is what a hypothesis test means: It tells you if your data is likely or unlikely given a particular situation (“null hypothesis”). A low p-value means your data is unlikely and you don’t believe you’re in that situation. A high p-value means your data is likely and you do believe you could be in that situation.
