Presentation on theme: "Chapter 1 Introduction to the Statistical Process"— Presentation transcript:
1Chapter 1 Introduction to the Statistical Process
2Statistics vs. Anecdotal Evidence Section 1.1: Introduction to Statistics. Smoking causes cancer. Seat belts save lives.
3Autism and Vaccines Nelson says it wasn't long after her son Parker's shots at 15 months that she noticed something was wrong. "He had run a slight fever after the vaccinations, but I didn't think anything of it," said Nelson. "You know kids run fevers all the time, but about a week after that he just completely stopped talking." After months of worrying, wondering, and going back and forth with doctors, an official diagnosis was made: autism. Nelson believes it started with the vaccines. "Gradually, I started piecing it together. He got sick after his vaccinations and about a week later everything changed. He was a completely different little boy then," said Nelson.
4What is Statistics? Statistics is the discipline that guides us to produce or collect data, which are then analyzed in order to draw inferences or make predictions. Numerical summaries such as means, percentages, and standard deviations are called statistics.
5Descriptive Statistics Descriptive Statistics refers to methods for summarizing data. These summaries consist of graphs (histograms, scatterplots, pie charts, etc.) and numbers (means, standard deviations, regression equations, percentages, etc.).
6Inferential Statistics Inferential statistics refers to methods of making decisions or predictions about a population or a process, based on data obtained from a sample. We will use tests of significance and confidence intervals to achieve this.
7This semester, we will be looking at and conducting a number of studies
8Statistical Process 1. Ask a research question (research conjecture). 2. Design a study. 3. Collect data. 4. Explore the data. 5. Draw inferences. 6. Formulate conclusions. 7. Communicate findings. Logic of Inference: Significance, Estimation. Scope of Inference: Generalize, Cause/Effect.
9Physicians’ Health Study I 1. Research Question: Will taking aspirin help reduce heart attacks? 2. Design Study: Started in 1982 with 22,071 male physicians. Half took a 325 mg aspirin every other day (the other half took a placebo).
10Physicians’ Health Study I 3. Collect Data: Intended to go until 1995, the aspirin study was stopped in 1988 after 189 heart attacks occurred in the placebo group and 104 in the aspirin group. The study also tested beta carotene: hoped to be a wonder drug, it was found to provide no benefit or harm. This result allowed investigators to turn to other, more promising agents.
11Physicians’ Health Study I 4. Explore Data: 1.7% in the placebo group had heart attacks while only 0.9% in the aspirin group had heart attacks. (45% reduction in heart attacks for the aspirin group) 5. Draw Inferences: The likelihood of the difference between the proportions of heart attacks in each group being as large as it was just by chance is very, very small.
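The percentages on this slide can be checked with a few lines of arithmetic. The sketch below assumes the two groups were exactly equal halves of the 22,071 physicians, which the slides only state loosely ("half"); the variable names are illustrative.

```python
# Counts from the slides; the equal split is an assumption of this sketch.
total_physicians = 22_071
group_size = total_physicians / 2  # assumed size of each group

placebo_attacks = 189
aspirin_attacks = 104

placebo_rate = placebo_attacks / group_size
aspirin_rate = aspirin_attacks / group_size

# Relative reduction: how much lower the aspirin rate is, as a share of the placebo rate.
relative_reduction = 1 - aspirin_rate / placebo_rate

print(f"placebo group: {placebo_rate:.1%}")
print(f"aspirin group: {aspirin_rate:.1%}")
print(f"relative reduction: {relative_reduction:.0%}")
```

Because both rates share the same assumed denominator, the relative reduction depends only on the two heart attack counts.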
12Physicians’ Health Study I 6. Formulate Conclusions: They concluded that taking aspirin does reduce the likelihood of heart attacks in middle-aged and older males. 7. Report Findings:
13Terminology The individual entities on which data are recorded are called observational units. The recorded characteristics of the observational units are the variables of interest. What are the observational units and variables in the Physicians’ Health Study?
14Logic of Statistical Inference Section 1.2: Introduction to the Logic of Statistical Inference
15Dolphin Communication Can dolphins communicate abstract ideas? In an experiment done in the 1960s, Doris was instructed which of two buttons to push. She then had to communicate this to Buzz (who could not see Doris). If he picked the correct button, both dolphins would get a reward. What are the observational units and variables in this study?
16Dolphin Communication In one set of trials, Buzz chose the correct button 15 out of 16 times. Based on these results, do you think Buzz knew which button to push, or was he just guessing? How might we justify an answer? How might we model this situation?
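One way to model "just guessing" is to treat each of Buzz's 16 choices as a fair coin flip and see how often 15 or more correct turns up by chance. The sketch below is a minimal illustration, not the course applet; the function name, seed, and number of repetitions are assumptions.

```python
import random

def simulate_null(n_trials, p_success, n_reps, seed=0):
    """Return the number of successes in each of n_reps simulated sets of trials."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        sum(rng.random() < p_success for _ in range(n_trials))
        for _ in range(n_reps)
    ]

# Buzz: 16 trials, 15 correct. Under pure guessing each trial is a fair coin flip.
observed_correct = 15
null_counts = simulate_null(n_trials=16, p_success=0.5, n_reps=10_000)

# How many simulated sets of 16 guesses did at least as well as Buzz?
as_extreme = sum(count >= observed_correct for count in null_counts)
proportion = as_extreme / len(null_counts)
print(f"{as_extreme} of {len(null_counts)} simulated sets had 15 or more correct")
```

If guessing rarely produces 15 or more correct, guessing is a poor explanation for what Buzz did.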
19Exploration 1.2: Can Chimps Solve Problems? Sarah, a 30-year-old chimp, is shown videos of a person struggling with some problem (can’t reach a banana, cage door locked, record player not working, etc.). She is then shown two pictures, one of the solution and one not. She then picks one of the pictures. Does Sarah understand the solution to these problems or is she just randomly picking a picture?
20Exploration 1.2 (pg 15) Read the first paragraph. State the research question. (This is a broad statement.) State the research conjecture. (This is more specific to our test.) Sarah correctly picked 7 of the 8 pictures. Is this unlikely if she is just guessing? Continue working on the exploration.
21Section 1.3: Statistical Significance: Other Random Choice Models
22Can dogs sniff out cancer? Marine sniffing samples
23Can Dogs Sniff Out Cancer? 1. Research Question: Can dogs detect a patient with cancer by smelling their breath? 2. Design a study: Five breath bags were shown to Marine, one from a cancer patient and four from non-cancer patients. 3. Collect data: Marine completed 33 attempts at this procedure. 4. Explore the data: Marine identified the correct bag 30 out of 33 times.
24Can Dogs Sniff Out Cancer? How is the chance model we will use for this situation different from our previous ones? Can we use coins again?
25Can Dogs Sniff Out Cancer? 5. Draw Inferences: Three S Strategy. Statistic: Compute the statistic from the observed data. Simulate: Identify a model that represents a chance explanation. Use the model to simulate data that “could have happened” when the chance model is true. Calculate the value of the statistic from the could-have-been data. Repeat the simulation process to generate a distribution of the could-have-been values for the statistic. Strength of evidence: Consider whether the value of the observed statistic is unlikely to occur when the chance model is true.
26Can Dogs Sniff Out Cancer? We have the statistic: Marine made the correct identification 30 out of 33 times. How could we set up a simulation? Tactile (how could this be done?) or applet. Strength of evidence: Is 30 out of 33 very unlikely under the chance model?
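The Three S Strategy for Marine can be sketched in code. Unlike the coin-flip model, a random pick here is correct only 1 time in 5, because there are five bags. This is a minimal sketch standing in for the applet; the constant names, seed, and 5,000 repetitions are assumptions.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

N_BAGS = 5        # one cancer sample among five bags
N_ATTEMPTS = 33   # attempts Marine completed
OBSERVED = 30     # Statistic: correct identifications in the real study

def one_simulated_study():
    """Count correct picks in 33 attempts when each pick is a random bag."""
    # Treat bag 0 as the cancer sample; a random pick is correct 1 time in 5.
    return sum(rng.randrange(N_BAGS) == 0 for _ in range(N_ATTEMPTS))

# Simulate: build the distribution of could-have-been counts.
could_have_been = [one_simulated_study() for _ in range(5_000)]

# Strength of evidence: how often does chance alone do as well as Marine?
at_least_as_extreme = sum(count >= OBSERVED for count in could_have_been)
print("largest simulated count:", max(could_have_been))
print("proportion at least as extreme:", at_least_as_extreme / len(could_have_been))
```

A tactile version of the same model could use a five-sector spinner or a die that is re-rolled on a six.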
27Can Dogs Sniff Out Cancer? 6. Formulate conclusions: Can we conclude that Marine can identify cancerous breath? Can we conclude that all dogs can do this? Some dogs? 7. Communicate findings: “Marine, the dog that can sniff out bowel cancer,” by Jeremy Laurance, Health Editor: A labrador retriever called Marine has been trained to sniff out cancer with stunning accuracy, researchers report today.
28Terminology: Hypotheses The null hypothesis is the chance explanation. Typically the alternative hypothesis is what the researchers think is true. Null hypothesis: Marine is randomly choosing which bag to sit next to. Alternative hypothesis: Marine is not randomly choosing which bag to sit next to.
29Terminology: Null Distribution We will refer to the distribution of chance outcomes as the null distribution. For Marine, we should have gotten a null distribution similar to the following.
30Terminology: P-value The p-value is the proportion of outcomes in the null distribution that are at least as extreme as the value of the statistic actually observed in the study. What was our p-value for Marine? Were they all the same? Were they all close to the same?
31Guidelines for evaluating strength of evidence from p-values: p-value > 0.10, not much evidence against the null hypothesis; 0.05 < p-value < 0.10, moderate evidence against the null hypothesis; 0.01 < p-value < 0.05, strong evidence against the null hypothesis; 0.001 < p-value < 0.01, very strong evidence against the null hypothesis; p-value < 0.001, extremely strong evidence against the null hypothesis.
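These guidelines amount to a simple lookup. The sketch below encodes them as a hypothetical helper function; the slide does not say which category a p-value exactly on a cutoff belongs to, so assigning it to the weaker category is an assumption.

```python
def strength_of_evidence(p_value):
    """Map a p-value to the verbal guideline for evidence against the null."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    # Boundary values (0.10, 0.05, ...) are not assigned on the slide;
    # this sketch places them in the weaker category.
    if p_value >= 0.10:
        return "not much evidence"
    if p_value >= 0.05:
        return "moderate evidence"
    if p_value >= 0.01:
        return "strong evidence"
    if p_value >= 0.001:
        return "very strong evidence"
    return "extremely strong evidence"

for p in (0.2, 0.07, 0.03, 0.005, 0.0001):
    print(p, "->", strength_of_evidence(p))
```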
32Terminology: Statistically Significant If the observed results provide strong evidence that the data did not arise by random chance alone, then the research result is called statistically significant. Are Marine’s results statistically significant?
33Let’s play some rock-paper-scissors Rock smashes scissors. Paper covers rock. Scissors cut paper. Play the novice version at least 30 times and keep track of all your choices.
35Criminal Justice System vs. Significance Tests Innocent until proven guilty. We assume a defendant is innocent and the prosecution has to collect evidence to try to prove the defendant is guilty. Likewise, we assume our chance model (or null hypothesis) is true and we collect data and calculate a sample proportion. We then show how unlikely our proportion is if the chance model is true.
36Criminal Justice System vs. Significance Tests If the prosecution shows lots of evidence that goes against this assumption of innocence (DNA, witnesses, motive, contradictory story, etc.), then the jury concludes that the assumption of innocence is wrong. If, after we collect data, we find that the likelihood (p-value) of such a proportion is so small that it would rarely occur by chance if the null hypothesis is true, then we conclude that our assumption of the chance model being true is wrong.
37Review For Sarah the chimp, you could have gotten a null distribution similar to the one shown here. What does a single dot represent? What does the whole distribution represent? What is the p-value for this simulation? What does this p-value mean?
38More Review The null hypothesis is the chance explanation. Typically the alternative hypothesis is what the researchers think is true. Three S Strategy: Statistic, Simulate, Strength of evidence. The p-value is the proportion of outcomes in the null distribution that are at least as extreme as the value of the statistic actually observed in the study.
39Still More Review A small p-value gives evidence against the null and for the alternative. If the observed results provide strong evidence that the data did not arise by random chance alone, then the research result is called statistically significant.
41Ron Artest, choker at the line? In the basketball season, Ron Artest made 68.8% of his free throws, similar to his career average. In his first 15 attempts in the playoffs, he made only 7 free throws (46.7%). Is this evidence that he is “choking” and performing significantly worse than during the regular season?
42Ron Artest Example What are the observational units? Artest’s 15 free throw attempts. What is the variable? Whether or not he makes the free throw. What is the statistic of interest? 7/15
43Notation Our sample proportion (statistic) can be described using the symbol p̂ (p-hat). A parameter is a numerical summary of a variable that is either an unobservable long-run outcome or a value for an entire population. It can be described using the symbol 𝜋 (pi). In our example, 𝜋 = 0.688 and p̂ = 0.467.
44Hypotheses Null hypothesis: Ron Artest’s performance at the free throw line during the 2010 NBA finals is the same as his regular season performance; his probability of making a basket in the playoffs is 0.688. Alternative hypothesis: Ron Artest’s performance at the free throw line during the 2010 NBA finals is worse than his regular season performance; his probability of making a basket in the playoffs is less than 0.688.
45Simulated Chance Model Coins, cards, dice, spinners, etc. don’t really work well here to develop a chance model of a 68.8% success rate. But we can still use the magic of an applet. (While this will be a different applet than the first two we used, it is essentially the same.)
46Ron Artest Continued So we have moderate evidence against the null. Let’s see what would happen if we had more data. Suppose he continued to shoot 46.7% from the free throw line, so that he made 7 of his next 15 attempts as well, for a total of 14 out of 30. Let’s return to the applet to see how our p-value would change.
47Ron Artest Continued As the sample size increases, there is less variability in our null distribution. It is still centered around 0.688, but it becomes narrower and narrower. As a result, the observed proportion of 0.467 gets further and further out in the tail and thus the p-value gets smaller. This should make intuitive sense in that with a larger sample size, we have more evidence.
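The effect of sample size on the p-value can be illustrated numerically. The sketch below swaps the applet's simulation for an exact binomial calculation, which is an assumption of this example rather than the method used in class; the function name is hypothetical.

```python
from math import comb

def lower_tail_p_value(successes, n, pi):
    """Exact probability of `successes` or fewer makes in n attempts."""
    return sum(
        comb(n, k) * pi**k * (1 - pi) ** (n - k)
        for k in range(successes + 1)
    )

PI_NULL = 0.688  # probability of a made free throw under the null hypothesis

p_15 = lower_tail_p_value(7, 15, PI_NULL)   # 7 of 15 (46.7%)
p_30 = lower_tail_p_value(14, 30, PI_NULL)  # 14 of 30 (same 46.7%, twice the attempts)

print(f"n = 15: p-value = {p_15:.3f}")
print(f"n = 30: p-value = {p_30:.3f}")
```

The sample proportion is identical in both cases; only the number of attempts changes, and the larger sample gives the smaller p-value.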
48Ron Artest Continued Besides a larger sample size, how else could we get more evidence against the null? Artest could make fewer shots. Is that what really happened? No. Artest made 4 of his next 5 shots for a total of 11 out of 20 (55%) for the playoffs. Let’s return to the applet and see how this changes our p-value.
49Exploration 1.4 Shaky Putting? Phil Mickelson is one of the best golfers in the world. He’s won the Masters Tournament three times. However, 2011 was not his best year. He seemed to struggle with his putting and switched to a “belly putter” late in the year.
50Exploration 1.4 Was Mickelson a poor putter in 2011? In this exploration, you will compare Mickelson’s 2011 record of putting from 10 feet away from the hole with that of all other professional golfers that year. Was he significantly worse than his peers?
55Helper or Hinderer? Sixteen babies were shown two demonstrations, one with a helper toy and one with a hinderer toy. Which toy was used and the order were random. When presented with the two toys (with which was on the left and which on the right chosen at random), 14 of the babies chose the helper toy. How is this experiment different from any we have looked at so far?
56Helper or Hinderer? The key difference is that each attempt was made by a different baby. Our chance model implies that each baby has the same chance of choosing the helper toy (50%). It could be that some babies randomly choose and some do not. We will talk about this in our conclusion. Let’s run the test.
57Helper or Hinderer? Null Hypothesis: Each baby is randomly choosing one of two toys. (The babies choose the helper toy 50% of the time in the long run.) Alternative Hypothesis: The babies are not randomly choosing, but show a preference for the helper toy. (The babies choose the helper toy more than 50% of the time in the long run.) We can use any applet to test this. Remember that our sample proportion is 14 out of 16.
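Because the null model here is a fair 50/50 choice, the p-value can also be found by counting instead of simulating. The sketch below is an alternative to the applet, not the method from class: under the null, every pattern of 16 choices is equally likely, so it counts the patterns with 14 or more helper choices. Variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n_babies = 16
observed_helper = 14

# Under the null, each of the 2**16 possible choice patterns is equally likely.
# Count the patterns in which 14, 15, or 16 babies choose the helper toy.
favorable = sum(comb(n_babies, k) for k in range(observed_helper, n_babies + 1))
p_value = Fraction(favorable, 2**n_babies)

print(f"{favorable} of {2**n_babies} patterns, p-value = {float(p_value):.4f}")
```

A simulation with an applet should give a proportion close to this exact value, though it will vary a little from run to run.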
58Helper or Hinderer? So what can we conclude? Do all the babies prefer the helper toy? Do some of the babies prefer the helper toy? Because we had a low p-value, we can conclude that not all the babies are randomly choosing and that at least some of them prefer the helper toy. Can we make conclusions beyond these 16 babies?
59Which Tire? Two students missed a chemistry exam because of excessive partying, but blamed their absence on a flat tire. The professor allowed them to take a make-up exam, and he sent them to separate rooms to take it. The first question, worth 5 points, was quite easy. The second question, worth 95 points, asked: Which tire was flat?
60Which Tire? How would you answer this question? Driver’s side front, passenger’s side front, driver’s side rear, or passenger’s side rear?
61Exploration 1.5: Tire Story Falls Flat We will use the data from class to determine if students have a preference for picking one of the four tires. This is similar to the helper-hinderer example because our observational units are different people. Let’s work exploration 1.5 (page 50).