Presentation on theme: "The Lady Tasting Coffee: A Case Study in Experimental Design"— Presentation transcript:
1 The Lady Tasting Coffee: A Case Study in Experimental Design Pictures are from various statistics and biology books that cite or mention Fisher.
2 The History of Experimentation Experimentation characterizes modern science. Galileo reportedly dropped balls of various masses from the Leaning Tower of Pisa. Assuming the story of Galileo’s Pisa experiment is true: How many balls did he drop? How many times did he repeat the comparison? What were his independent and dependent variables? How did he measure the time to impact? We don’t know the answers to these questions. Take Home Message: Experimental design was haphazard prior to the 1920s. Information on the picture: Picture Name: pisa1.jpg. Photographer: Clara Jackson. Caption: The Leaning Tower of Pisa. Location: Pisa, Italy. Date Taken: March 2003. Bibliography: Jackson, Clara. pisa1.jpg. March Pics4Learning. 5 May 2008 <http://pics.tech4learning.com>
3 Ronald Aylmer Fisher Considered by some scientists to be the father of modern statistics. Poor eyesight; did a lot of math in his head without paper or pencil. In 1919, he began working as a statistician at the Rothamsted Agricultural Experiment Station in the United Kingdom. Published many papers and wrote several books on experimental design and evolution.
5 Same field, same treatment, but plant performance is uneven... At Rothamsted, Fisher recognized a problem with some of the agricultural experiments: within the same field, under the same treatment, plant performance was uneven. Fisher’s Solution: Replicate and randomize to spread variation evenly among treatments. Picture labels: Thick Growth, Thin Growth.
6 Lessons Learned at Rothamsted Experiments at Rothamsted prior to Fisher generally involved two fields (containing hundreds of plants), each receiving a treatment. Example: two levels of nitrogen (N) fertilizer. Problem: So much variability exists within a field itself that it is difficult or impossible to tease out the effect of the treatment. Picture labels: Field with High N, Field with Low N. Note: Rothamsted Agricultural Experiment Station was, and still is, a household name among agricultural scientists, especially in Fisher’s day.
7 Fisher’s Solution at Rothamsted Old Problematic Design: One large field receiving high nitrogen (N), one large field receiving low nitrogen (N). (Today this design is sometimes called “pseudoreplication” if the experimenter attempts to say that the sample size is the number of plants.) New Improved Design: Many small plots, randomly receiving high N or low N; plots can also be blocked to help tease out the variation due to location and local conditions. Reference: Hurlbert, S. H. (1984). Pseudoreplication and the design of ecological field experiments. Ecological Monographs 54(2):
8 Examples of Correct & Incorrect Ways to Randomize Treatments Correct: Use a random number table. Pick treatments from a hat. Flip a coin. Incorrect: Haphazardly decide which experimental units should receive which treatments. (Problem: too tempting for the experimenter to introduce bias.) Use a net to grab the goldfish in an ecology study. (Problem: might pick just the easiest-to-catch, sickly animals.) Alternate treatments (every other one). (Problem: that’s systematic, not random; who knows what other factors vary in the same systematic way.) Assign people to a drug study on the basis of their last name. (Problem: could be related to a person’s ancestry.)
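The "pick treatments from a hat" approach can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name randomize_treatments, the plot names, and the seed are all hypothetical, and the sketch assumes equal numbers of units per treatment.

```python
import random

def randomize_treatments(units, treatments, seed=None):
    """Assign treatments to experimental units completely at random.

    Assumes (for illustration) that every treatment gets the same
    number of units, like drawing labelled slips from a hat.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    reps = len(units) // len(treatments)
    # Build the 'hat': one slip per unit, equal counts of each treatment.
    labels = [t for t in treatments for _ in range(reps)]
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical example: eight plots, two nitrogen levels.
plots = [f"plot{i}" for i in range(1, 9)]
assignment = randomize_treatments(plots, ["high N", "low N"], seed=1)
for plot, treatment in assignment.items():
    print(plot, treatment)
```

Because the shuffle, not the experimenter, decides which plot gets which treatment, there is no opportunity for the haphazard or systematic biases listed above.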
9 Fisher, Randomization, Replication & Blocking No replication (or pseudoreplication) (Rothamsted, pre-Fisher): Field with Low N, Field with High N. Replicated with complete randomization: Field broken up into smaller plots. Treatments are applied to plots rather than to an entire field; this improves replication & interspersion of treatments. Replicated, randomized and blocked design: Field broken up into smaller plots & plots are grouped. Dashed rectangle is a block. Plots are blocked by location or other condition; treatments are applied randomly to plots within blocks. Note to instructor: Instructors may wish to eliminate this slide and the subsequent related ones. These slides were included because the ANOVA and the F ratio (“F” for Fisher) are two major contributions of Fisher to the field (yes, the pun is intended!) of statistics. Fisher’s essay about a lady tasting tea may not seem to be related to Analysis of Variance. Yet even this essay mentions sources of variation: p. 1517, “let us imagine all causes of disturbance” (in other words, all sources of variation), “the strength of the infusion, the quantity of milk, the temperature at which it is tasted, etc.” Fisher’s essay on the lady tasting tea is his attempt to teach people about experimental design by setting up a scenario that is familiar to the average person (a tea party). In the same way that Fisher’s tea party helps us understand experimental design, students will better understand Fisher’s contributions if they understand the context that was familiar to Fisher when he developed his ideas. It may be easier for students to understand the importance of partitioning variation through ANOVA techniques if they envision agricultural experiments that have a strong spatial component.
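The replicated, randomized and blocked design can be sketched as a separate random draw inside each block. This is an illustrative sketch only: the function name randomize_within_blocks, the block and plot names, and the seed are assumptions, not taken from the slides.

```python
import random

def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomize treatments separately inside each block.

    `blocks` maps a block name to the list of plots it contains.
    Every treatment appears equally often in every block, so
    block-to-block differences cannot favor one treatment.
    """
    rng = random.Random(seed)
    layout = {}
    for block, plots_in_block in blocks.items():
        if len(plots_in_block) % len(treatments) != 0:
            raise ValueError(f"block {block!r} cannot hold the treatments equally")
        reps = len(plots_in_block) // len(treatments)
        labels = [t for t in treatments for _ in range(reps)]
        rng.shuffle(labels)  # independent shuffle for each block
        layout[block] = dict(zip(plots_in_block, labels))
    return layout

# Hypothetical field: four blocks (grouped by location), two plots each.
blocks = {
    "block A": ["A1", "A2"],
    "block B": ["B1", "B2"],
    "block C": ["C1", "C2"],
    "block D": ["D1", "D2"],
}
layout = randomize_within_blocks(blocks, ["high N", "low N"], seed=7)
for block, plots_in_block in layout.items():
    print(block, plots_in_block)
```

Compared with complete randomization, the only change is that the shuffle is repeated per block, which guarantees each block contains both treatments.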
10 Another of Fisher’s Contributions to Statistics: The Analysis of Variance (ANOVA) Allows scientists to mathematically partition variation among different sources (treatments, blocks, plots, for example). Some of Fisher’s contributions to the field of statistics grew out of his experience with spatial agricultural experiments at Rothamsted.
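To make "partitioning variation" concrete, here is a minimal sketch of the sums-of-squares arithmetic for a balanced randomized block design with one plot per treatment in each block. The yield numbers are invented for illustration, and the function name partition_variation is hypothetical; this is the standard textbook calculation, not a reconstruction of anything shown on the slides.

```python
def partition_variation(y):
    """Split total variation into treatment, block, and residual parts.

    y[i][j] is the response for treatment i in block j
    (balanced design, one plot per treatment per block).
    """
    a, b = len(y), len(y[0])  # number of treatments, number of blocks
    grand = sum(sum(row) for row in y) / (a * b)
    treat_means = [sum(row) / b for row in y]
    block_means = [sum(y[i][j] for i in range(a)) / a for j in range(b)]

    ss_total = sum((y[i][j] - grand) ** 2 for i in range(a) for j in range(b))
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_block = a * sum((m - grand) ** 2 for m in block_means)
    ss_resid = sum(
        (y[i][j] - treat_means[i] - block_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    # F ratio for treatments: treatment mean square over residual mean square.
    f_treat = (ss_treat / (a - 1)) / (ss_resid / ((a - 1) * (b - 1)))
    return {"total": ss_total, "treatment": ss_treat, "block": ss_block,
            "residual": ss_resid, "F_treatment": f_treat}

# Invented plot yields: rows are treatments (high N, low N), columns are blocks.
yields = [
    [12.0, 14.0, 11.0, 15.0],  # high N
    [9.0, 10.0, 8.0, 11.0],    # low N
]
ss = partition_variation(yields)
for source, value in ss.items():
    print(source, value)
```

In a balanced design the treatment, block, and residual sums of squares add up exactly to the total, which is what lets the analysis attribute variation to each source.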
11 From: Sokal, Robert R., & F. James Rohlf, Biometry: The Principles and Practice of Statistics in Biological Research, San Francisco: W.H. Freeman.
12 Why do these two plants differ in growth? Is it because of block, treatment, or extraneous variation within plots? At Rothamsted, Fisher saw firsthand that the purpose of good experimental design is not to eliminate variation entirely, but rather to try to ensure that extraneous variation is spread evenly among treatments. In the case of ANOVA, the experimental design can enable the variation to be partitioned mathematically during analysis. Variation in growth of plants can be partitioned into different sources of variation: 1. Variation in soil moisture, texture, etc. within a plot. 2. Variation between treatments (high N and low N). 3. Variation in soil moisture, texture, sunlight, etc., among blocks. In this example there are 6 plants per plot. Treatments are applied to individual plots, not to individual plants, so the plot is the experimental unit. The number of experimental units (the true sample size) in this experiment is 4 plots per treatment. An example of pseudoreplication would be if an experimenter tried to claim that the experimental unit is the individual plant and that the sample size is 24 plants per treatment.
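The distinction between plants and experimental units can be sketched in code: the 6 plant measurements in each plot are first collapsed to one plot mean, and only the 4 plot means per treatment enter the analysis. The heights below are invented for illustration, and the helper name plot_means is hypothetical.

```python
from statistics import mean

def plot_means(plants_by_plot):
    """Collapse plant-level measurements to one value per plot,
    because the plot (not the plant) received the treatment."""
    return {plot: mean(heights) for plot, heights in plants_by_plot.items()}

# Invented heights (cm) for the high-N treatment: 4 plots, 6 plants each.
high_n = {
    "plot 1": [30.0, 32.0, 31.0, 29.0, 33.0, 31.0],
    "plot 2": [27.0, 28.0, 26.0, 29.0, 27.0, 28.0],
    "plot 3": [34.0, 35.0, 33.0, 36.0, 34.0, 35.0],
    "plot 4": [29.0, 30.0, 31.0, 30.0, 29.0, 31.0],
}

means = plot_means(high_n)
n_units = len(means)                                # true sample size: plots
n_plants = sum(len(h) for h in high_n.values())     # pseudoreplicated count

print("plot means:", means)
print("experimental units per treatment:", n_units)
print("plants per treatment (not the sample size):", n_plants)
```

Treating the plant count as the sample size would overstate the amount of independent evidence, which is the pseudoreplication error described above.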
13 The Design of Experiments (1935) One of the first chapters of this textbook by Fisher is the essay “Mathematics of a Lady Tasting Tea.” Note to instructor: In the first chapter of The Design of Experiments, Fisher titled the essay “The principles of experimentation, illustrated by a psycho-physical experiment,” but it is the same essay. “Mathematics of a Lady Tasting Tea” is a revised title used by Fisher when his essay was included in a four-volume set called The World of Mathematics. The World of Mathematics is still published by Dover Publications and contains a variety of interesting historical essays and papers.
15 So, you think statistics is boring . . . Statisticians and the history of statistics are far from boring. Other interesting trivia on Fisher: charming but had a terrible temper (and a big ego); smoked a pipe and argued professionally in the 1950s that smoking did not cause cancer; supported eugenics. Source of picture: Parascandola, M. (2004). "Two approaches to etiology: the debate over smoking and lung cancer in the 1950s." Endeavour 28(2):
16 Take Home Messages The 1920s were a rich time for the development of concepts of modern experimental design. Fisher was one of a number of statisticians who greatly affected the development of modern statistics. Fisher’s experience at Rothamsted Agricultural Experiment Station influenced his vision of experimental design and helped him develop the concept of ANOVA. Fisher’s essay on a lady tasting tea eloquently outlines some important issues in experimental design.
17 To learn more, read the biographies of statisticians as you learn their techniques The Student’s t-test: “Student” is the pseudonym of William Sealy Gosset, a contemporary of Fisher who worked for Guinness, the Irish brewery. Other techniques: Many statistical techniques are named after interesting historical people: Bayes, Bernoulli, Cochran, Cox, Kolmogorov, Mann, Pearson, Smirnov, Tukey, Whitney, and Wilcoxon, to name just a few. You are more likely to remember specific statistical techniques if you know about the people who created them. Don’t be afraid to look at the original works published by these famous statisticians.
18 Examples of statistical techniques or tests named after people important in the history of statistics. Names below include: Cochran, Cox, Friedman, Gosset, Kolmogorov, Kruskal, Mann, Smirnov, Spearman, Wallis, Whitney, & Wilcoxon.
19 Recommended Reading Salsburg, D. The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. Henry Holt and Company, NY. Stigler, S. M. Statistics on the Table: The History of Statistical Concepts and Methods. Harvard University Press, Cambridge, MA.