Collecting Data in Reasonable Ways

Chapter 1 Collecting Data in Reasonable Ways Created by Kathy Fritz

Data and conclusions based on data are everywhere
Newspapers, magazines, online reports, and professional publications

Population – the entire collection of individuals or objects that you want to learn about
Sample – the part of the population that is selected for study

Population and Sample
The distinction between population and sample is basic to statistics. To make sense of any sample result, you must know what population the sample represents.
Diagram: Population → collect data from a representative sample → Sample → make an inference about the population.

The Idea of a Sample Survey
We often draw conclusions about a whole population on the basis of a sample. Choosing a sample from a large, varied population is not that easy. Step 1: Step 2: Step 3:

In the article, “The ‘CSI Effect’ – Does It Really Exist
In the article, “The ‘CSI Effect’ – Does It Really Exist?” (National Institute of Justice [2008]: 1-7), the author speculates that watching crime scene investigation TV shows may be associated with the kind of high-tech evidence that jurors expect to see in criminal trials. Do people who watch such shows on a regular basis have higher expectations than those who do not watch them? Observational study – The goal of an observational study is to

Suppose a chemistry teacher wants to see the effects on students’ test scores if the lab time were increased from 3 hours to 6 hours. Experiment -

A big difference between an experiment and an observational study is . . .
in an experiment, the person carrying out the study in an observational study,

Observational Studies
Purpose is to collect data that will allow you to learn about a single population or about how two or more populations will differ.
Allows you to answer questions like

Census versus Sample
Why might we prefer to select a sample rather than perform a census?
Measurements that require destroying the item
Difficult to find entire population
Limited resources

When you answer questions like
“What is the proportion of . . .?” “What is the average of . . .?” you are interested in a population characteristic. A population characteristic is a number that describes the entire population. A statistic is a number that describes a sample. It is important that a sample be representative of the population. How can we be sure that the sample is representative of the population? One way is to take a simple random sample.

How to Sample Well: Random Sampling
The statistician’s remedy is to allow impersonal chance to choose the sample. A sample chosen by chance rules out both favoritism by the sampler and self-selection by respondents. Random sampling, the use of chance to select a sample, is the central principle of statistical sampling. Definition: A simple random sample (SRS) In practice, people use random numbers generated by a computer or calculator to choose samples. If you don’t have technology handy, you can use a table of random digits.

Simple Random Sample
A sample of size n is selected from the population in a way that ensures that
Suppose you want to select 10 employees from all employees of a large design firm. Number each employee with a unique number. Use a random digit table, a random number generator, or numbers selected from a hat to select the 10 employees for the sample.
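The employee example can be sketched with a random number generator. This is a minimal illustration, not a prescribed procedure: the roster size of 250 and the function name `simple_random_sample` are assumptions made for the example, and it relies on the standard property of an SRS that every possible group of n individuals is equally likely to be chosen.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct members; every group of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical roster: employees numbered 1..250 (the firm size is an assumption).
employees = list(range(1, 251))
chosen = simple_random_sample(employees, 10, seed=1)
print(sorted(chosen))
```

Fixing the seed only makes the illustration repeatable; in a real study the seed would not be chosen to steer the result.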

How to Choose an SRS Using Table D
Definition: A table of random digits is a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:
• Each entry in the table is equally likely to be any of the 10 digits.
• The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.
Step 1: Label. Give each member of the population a numerical label of the same length.
Step 2: Table. Read consecutive groups of digits of the appropriate length from Table D. Your sample contains the individuals whose labels you find.
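The label-then-read procedure can be sketched in a few lines. The digit string below is made up for illustration and is not a row of Table D; the population size of 30 and the helper name `srs_from_digits` are also assumptions. Groups that match no label, or that repeat a label already chosen, are skipped.

```python
def srs_from_digits(digits, population_size, n):
    """Pick n distinct labels in 1..population_size by reading digit groups in order."""
    width = len(str(population_size))  # all labels share the same length
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip groups that are not valid labels or were already selected.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

row = "48150637219954022871306617"  # illustrative digits, not from any textbook table
print(srs_from_digits(row, population_size=30, n=4))  # -> [15, 6, 21, 2]
```

Reading two digits at a time gives 48, 15, 06, 37, 21, 99, 54, 02, so the sample is the individuals labeled 15, 06, 21, and 02.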

How to use a Random digit table
The following is part of the random digit table found in the back of your textbook: Row 6 9 3 8 7 5 2 4 1

Sampling with replacement
Sampling in which an individual or object, This allows an object or individual to be selected more than once for a sample.

Sampling without replacement
Sampling in which an individual or object, That is once an individual or object is selected, they are not replaced and cannot be selected again.
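The difference between the two schemes can be seen in a short sketch. The five-item population is hypothetical; `random.choices` draws with replacement and `random.sample` draws without replacement.

```python
import random

rng = random.Random(7)
population = ["A", "B", "C", "D", "E"]  # hypothetical individuals

# With replacement: the same individual may appear more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: once selected, an individual cannot be selected again.
without_replacement = rng.sample(population, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```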

Can a smaller sample size be representative of the population?
Samples of varying sizes Notice that the sample of size 50 is still representative of the population.

Other Sampling Methods
The basic idea of sampling is straightforward: take an SRS from the population and use your sample results to gain information about the population. Sometimes there are statistical advantages to using more complex sampling methods. One common alternative to an SRS involves sampling important groups (called strata) within the population separately. These “sub-samples” are combined to form one stratified random sample. Definition: To select a stratified random sample,

Stratified Random Sample
Suppose we wanted to estimate the average cost of malpractice insurance for doctors in a particular city. We could view the population of all doctors in this city as falling into one of four subgroups: (1) surgeons, (2) internists and family practitioners, (3) obstetricians, and (4) all other doctors.
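A stratified random sample for the doctor example might be drawn as follows. This is a sketch under stated assumptions: the number of doctors in each subgroup and the choice of five doctors per stratum are invented for illustration, and `stratified_sample` is a hypothetical helper.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a separate simple random sample from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum[name])
            for name, members in strata.items()}

# Hypothetical stratum sizes; only the four subgroup names come from the example.
sizes = {"surgeons": 40, "internists and family practitioners": 120,
         "obstetricians": 30, "all other doctors": 210}
strata = {name: [f"{name} #{i}" for i in range(1, size + 1)]
          for name, size in sizes.items()}

sample = stratified_sample(strata, {name: 5 for name in sizes}, seed=2)
for name, doctors in sample.items():
    print(name, "->", len(doctors), "doctors sampled")
```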

Cluster Sampling Cluster sampling
Clusters are then selected at random, Suppose that a large urban high school has 600 senior students, all of whom are enrolled in a first period homeroom. There are 24 senior homerooms, each with approximately 25 students. The school administrators want to select a sample of roughly 75 seniors to participate in a survey.
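The homeroom example can be sketched as a cluster sample. Two assumptions are made here: each homeroom has exactly 25 students (the text says approximately 25), and three homerooms are selected, which gives the roughly 75 seniors the administrators want. Student labels are hypothetical.

```python
import random

rng = random.Random(3)

# 24 homerooms (the clusters), each with 25 hypothetical student labels.
homerooms = {h: [f"H{h:02d}-S{s:02d}" for s in range(1, 26)] for h in range(1, 25)}

# Select whole clusters at random, then include every student in them.
selected_rooms = rng.sample(sorted(homerooms), 3)
cluster_sample = [student for h in selected_rooms for student in homerooms[h]]

print("homerooms chosen:", sorted(selected_rooms))
print("students surveyed:", len(cluster_sample))
```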

Stratification Clustering
Be careful not to confuse clustering and stratification! Stratification Clustering Sample

Example: Sampling at a School Assembly
Describe how you would use the following sampling methods to select 80 students to complete a survey. (a) Simple Random Sample (b) Stratified Random Sample (c) Cluster Sample

1 in k Systematic Sampling
Selects an ordered arrangement from a population by
Suppose you wish to select a sample of faculty members from the faculty phone directory. You would first randomly select a faculty member from the first 20 (k = 20) faculty members listed in the directory. Then select every 20th faculty member after that on the list.
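The directory example can be sketched as a 1 in 20 systematic sample. The directory of 200 names is hypothetical, and `systematic_sample` is an illustrative helper rather than a standard function.

```python
import random

def systematic_sample(ordered_list, k, seed=None):
    """Pick a random start among the first k entries, then take every k-th entry."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k entries
    return ordered_list[start::k]

# Hypothetical directory of 200 faculty members, in listed order.
directory = [f"Faculty {i}" for i in range(1, 201)]
sample = systematic_sample(directory, k=20, seed=5)
print(sample)
```

With 200 names and k = 20, any starting point yields a sample of 10 faculty members.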

How to Sample Badly
How can we choose a sample that we can trust to represent the population? There are a number of different methods to select samples.
Definition: Choosing individuals who are easiest to reach results in a convenience sample.
Convenience samples often produce unrepresentative data…why?
Definition: The design of a statistical study shows bias if it systematically favors certain outcomes.

Convenience Sampling
Selecting individuals or objects that are easy or convenient to sample.
Suppose your statistics professor asked you to gather a sample of 20 students from your college. You survey 20 students in your next class, which is music theory. Will this sample be representative of the population of all students at your college? Why or why not?

Voluntary response is a type of convenience sampling

Identify the sampling method
1)The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into 6 subgroups of similar types (small public, small private, medium public, medium private, large public, and large private). Then they randomly selected 3 colleges from each group.

Identify the sampling design
2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.

Identify the sampling design
3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & He then gives a survey to that customer, and to every 10th customer after them, to fill it out before they leave.

Consider the following example: In 1936, Franklin Delano Roosevelt had been President for one term. The magazine, The Literary Digest, predicted that Alf Landon would beat FDR in that year's election by 57 to 43 percent. The Digest mailed over 10 million questionnaires to names drawn from lists of automobile and telephone owners, and over 2.3 million people responded - a huge sample. At the same time, a young man named George Gallup sampled only 50,000 people and predicted that Roosevelt would win. Gallup's prediction was ridiculed as naive. After all, the Digest had predicted the winner in every election since 1916, and had based its predictions on the largest response to any poll in history. But Roosevelt won with 62% of the vote. The size of the Digest's error is staggering.
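A small simulation can illustrate why a huge sample drawn from the wrong list can lose to a much smaller random sample. Everything numeric below is invented for the sketch and is not the historical data: a population of 200,000 voters, 40% of whom appear on the sampling list, with support for the incumbent set lower among those on the list.

```python
import random

rng = random.Random(1936)

# Hypothetical voters: (on_list, supports_incumbent). All rates are assumptions.
population = []
for _ in range(200_000):
    on_list = rng.random() < 0.40
    support_rate = 0.45 if on_list else 0.73
    population.append((on_list, rng.random() < support_rate))

true_share = sum(vote for _, vote in population) / len(population)

# Large sample, but drawn only from people on the list (selection bias).
listed_votes = [vote for on_list, vote in population if on_list]
biased_estimate = sum(rng.sample(listed_votes, 50_000)) / 50_000

# Much smaller simple random sample from the whole population.
all_votes = [vote for _, vote in population]
srs_estimate = sum(rng.sample(all_votes, 2_000)) / 2_000

print(f"true share:          {true_share:.3f}")
print(f"biased, n = 50,000:  {biased_estimate:.3f}")
print(f"SRS, n = 2,000:      {srs_estimate:.3f}")
```

In this sketch the large sample stays far from the true share no matter how big it is, because the people on the list differ systematically from the people who are not.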

Sources of bias Selection bias

Sources of bias Measurement or Response bias
Improperly calibrated scale is used to weigh items
Tendency of people not to be completely honest when asked about illegal behavior or unpopular beliefs
Appearance or behavior of the person asking the questions
Questions on a survey are worded in a way that tends to influence the response

Sources of bias Nonresponse

Sources of bias Will increasing the sample size reduce the effects of bias in the study?

Identify one potential source of bias.
Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at a local college. You collect register receipts for students as they leave the bookstore during lunch one day.

Identify one potential source of bias.
To find the average value of a home in Plano, one averages the price of homes that are listed for sale with a realtor.

The article “What People Buy from Fast-Food Restaurants: Caloric Content and Menu Item Selection” (Obesity [2009]: ) reported that the average number of calories consumed at lunch in New York City fast-food restaurants was 827. The researchers selected 267 fast-food locations at random. The paper states that at each of these locations “adult” customers were approached as they entered the restaurant and asked to provide their food receipt when exiting and to complete a brief survey. Will this study result in data that is representative of the population?

Question | Response
What is the population of interest? |
Was the sample selected in a reasonable way? |
Is the sample likely to be representative of the population of interest? |
Are there any obvious sources of bias? |

Inference for Sampling
The purpose of a sample is to give us information about a larger population. The process of drawing conclusions about a population on the basis of sample data is called inference.
Why should we rely on random sampling?
To eliminate bias in selecting samples from the list of available individuals.
The laws of probability allow trustworthy inference about the population.
Results from random samples come with a margin of error that sets bounds on the size of the likely error.
Larger random samples give better information about the population than smaller samples.
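The claim that larger random samples give better information can be checked with a short simulation. This is a sketch under assumptions: the population proportion of 0.6, the sample sizes 100 and 1,000, and the 300 repetitions are arbitrary choices, and independent draws are used as a stand-in for an SRS from a very large population.

```python
import random
import statistics

rng = random.Random(0)

def sample_proportions(p, n, reps):
    """Simulate reps random samples of size n and return each sample proportion."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Spread (standard deviation) of the sample proportion at two sample sizes.
spread_small = statistics.pstdev(sample_proportions(0.6, 100, 300))
spread_large = statistics.pstdev(sample_proportions(0.6, 1000, 300))

print(f"n = 100:  spread of sample proportions = {spread_small:.3f}")
print(f"n = 1000: spread of sample proportions = {spread_large:.3f}")
```

The sample proportions from the larger samples cluster more tightly around the population value, which is what a smaller margin of error means.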

Experimental Design

How to Experiment Badly
Many laboratory experiments use a very simple design: Experimental Units → Treatment → Measure Response. In the lab environment, simple designs often work well. Field experiments and experiments with animals or people deal with more variable conditions. Outside the lab, badly designed experiments often yield worthless results because of confounding.

Experiments
Suppose we are interested in determining the effect of room temperature on the performance on a first-semester calculus exam. So we decide to perform an experiment. What variable will we “measure”? What variable will “explain” the results on the calculus exam?

The Language of Experiments
An experiment is a statistical study in which we actually do something (a treatment) to people, animals, or objects (the experimental units) to observe the response. Here is the basic vocabulary of experiments.

Room temperature experiment continued . . .
We decide to use two temperature settings, 65° and 75°. How many treatments would our experiment have?

Suppose we have 10 sections of first-semester calculus that have agreed to participate in our study. On who or what will we impose the treatments? Should the instructors of these sections be allowed to select the room temperature to which their sections are assigned?

Observational Study vs. Experiment
Observational studies of the effect of an explanatory variable on a response variable often fail because of confounding between the explanatory variable and one or more other variables. Well-designed experiments take steps to prevent confounding.

Designing Strategies for Single Comparative Experiments
The goal of a single comparative experiment is to determine the effects of the treatment on the response variable. To do this: You must consider other potential sources of variability in the response

How to Experiment Well
The remedy for confounding is to perform a comparative experiment in which some units receive one treatment and similar units receive another. Most well-designed experiments compare two or more treatments. Comparison alone isn’t enough: if the treatments are given to groups that differ greatly, bias will result. The solution to the problem of bias is random assignment.

Principles of Experimental Design
The basic principles for designing experiments are as follows:
1. Comparison.
2. Random assignment.
3. Control.
4. Replication.
Randomized comparative experiments are designed to give good evidence that differences in the treatments actually cause the differences we see in the response.

Remember – the explanatory variable is the room temperature setting, 65° and 75°. The response variable is the grade on the calculus exam. Are there other variables that could affect the response? Students should answer with things like time of day, instructor, textbook, and ability level of students in the sections.
Instructor? Textbook? Time of day? Ability level of students?

The experimenter cannot control who the instructors are. Therefore, the instructor is a potentially confounding variable.

What about other variables that we cannot control directly or that we don’t even think about?

To randomly assign the 10 sections of first-semester calculus to the 2 treatment groups, we would first number the classes 1-10. Sections assigned Treatment 1 (65°) Treatment 2 (75°)
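The random assignment of the 10 numbered sections can be sketched as a shuffle followed by a split. Assigning exactly five sections to each temperature is an assumption made for this illustration.

```python
import random

rng = random.Random(10)

sections = list(range(1, 11))  # classes numbered 1-10
rng.shuffle(sections)          # chance, not the instructors, decides the order

# Assumption: five sections per treatment.
assignment = {
    "Treatment 1 (65°)": sorted(sections[:5]),
    "Treatment 2 (75°)": sorted(sections[5:]),
}
for treatment, assigned in assignment.items():
    print(treatment, assigned)
```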

What to do with Potentially Confounding Variables
Potentially confounding variables that are known and incorporated into the experimental design:
Use: Direct control –
Blocking –
Potentially confounding variables that are NOT known or NOT incorporated into the experimental design:
Use: Random assignment

Types of Experimental Design
Completely Randomized Design
Experimental Units → Random Assignment:
  Treatment A → Measure response for A
  Treatment B → Measure response for B
Compare treatments

Blocking
When a population consists of groups of individuals that are “similar within but different between,” a stratified random sample gives a better estimate than a simple random sample. This same logic applies in experiments. Completely randomized designs are the simplest statistical designs for experiments. But just as with sampling, there are times when the simplest method doesn’t yield the most precise results.
A block
In a randomized block design

Randomized Block Design
Experimental Units → Create blocks
Block 1 → Random Assignment:
  Treatment A → Measure response for A
  Treatment B → Measure response for B
  Compare treatments for block 1
Block 2 → Random Assignment:
  Treatment A → Measure response for A
  Treatment B → Measure response for B
  Compare treatments for block 2
Compare the results from the 2 blocks

Matched Pairs Design
A common type of randomized block design for comparing two treatments is a matched pairs design. The idea is to create blocks by matching pairs of similar experimental units.
A matched pairs design
Chance is used to determine which unit in each pair gets each treatment. Sometimes, a “pair” in a matched-pairs design consists of a single unit that receives both treatments. Since the order of the treatments can influence the response, chance is used to determine which treatment is applied first for each unit.

Can moving their hands help children learn math?
An experiment was conducted to compare two different methods for teaching children how to solve math problems of the form = ___ + 8. One method involved having students point to the on the left side of the equal sign with one hand and then point to the blank on the right side of the equal sign before filling in the blank to complete the equation. The other method did not involve using these hand gestures. To compare the two methods, 128 children, ages 9 and 10, were randomly assigned to the two experimental conditions.

A Diagram of the math experiment:
This is a completely randomized experiment. The 128 children are randomly assigned into the two treatment groups.
128 children → Random Assignment:
  Math with hand gestures → Measure number correct on math test
  Math without hand gestures → Measure number correct on math test
Compare number correct for those who used hand gestures and those who did not

Can moving their hands help children learn math?
Suppose that you were worried that gender might also be related to performance on the math test. One possibility would be to use direct control of gender – use only boys or only girls. But, any conclusions can ONLY be generalized to the group that was used. Another strategy is to incorporate blocking into the design. The researchers could create two blocks, one consisting of girls and one consisting of boys. Then, once the blocks are formed, randomly assign the girls to the two treatments and randomly assign the boys to the two treatments. This is an example of a

A Diagram of the math experiment:
The 128 students are blocked by gender. Then the students are randomly assigned to the two treatments.
128 children → Create blocks
81 girls → Random Assignment:
  Math with hand gestures → Measure number correct
  Math with no hand gestures → Measure number correct
  Compare number correct with hand gestures and without hand gestures for girls
47 boys → Random Assignment:
  Math with hand gestures → Measure number correct
  Math with no hand gestures → Measure number correct
  Compare number correct with hand gestures and without hand gestures for boys
Compare number correct with hand gestures and without hand gestures
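The blocked design in the diagram can be sketched as a separate random assignment inside each block. The child labels and the helper `assign_within_block` are hypothetical, and splitting each block as evenly as possible between the two treatments is an assumption, since the source does not give the group sizes.

```python
import random

def assign_within_block(units, treatments, rng):
    """Shuffle one block, then alternate treatments so the groups are nearly equal."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

rng = random.Random(128)
treatments = ["hand gestures", "no hand gestures"]

girls = [f"G{i}" for i in range(1, 82)]  # block 1: 81 girls
boys = [f"B{i}" for i in range(1, 48)]   # block 2: 47 boys

# Randomization is carried out separately within each block.
girls_plan = assign_within_block(girls, treatments, rng)
boys_plan = assign_within_block(boys, treatments, rng)

for name, plan in (("girls", girls_plan), ("boys", boys_plan)):
    counts = {t: sum(1 for v in plan.values() if v == t) for t in treatments}
    print(name, counts)
```

Because every treatment group contains both a girls' subgroup and a boys' subgroup, gender cannot be confounded with the teaching method.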

Experiments: What Can Go Wrong?
The logic of a randomized comparative experiment depends on our ability to treat all the subjects the same in every way except for the actual treatments being compared. Good experiments, therefore, require careful attention to details to ensure that all subjects really are treated identically. The response to a dummy treatment is called the In a double-blind experiment,

Revisit the room temperature experiment . . .
In the room temperature experiment, we have only 2 treatment groups, 65° and 75°. We do NOT have a control group. A control group

Consider Anna, a waitress
Consider Anna, a waitress. She decides to perform an experiment to determine if writing “Thank you” on the receipt increases her tip percentage. She plans on having two groups. On one group she will write “Thank you” on the receipt and on the other group she will not write “Thank you” on the receipt.

Would the students in each section of calculus know to which treatment group, 65° or 75°, they were assigned? If the students knew about the experiment, they would probably know which treatment group they were in.

Using Volunteers as Subjects in an Experiment
Remember – Using volunteers in observational studies is never a good idea!
However – It is common practice to use volunteers as subjects in an experiment.
Random assignment of the volunteers to treatments allows for cause-and-effect conclusions, but using volunteers limits the ability to generalize to the population.

The ONLY way to show a cause-effect relationship is with a well-designed, well-controlled experiment!!!

Chilling newborns? Let’s examine this experiment.
Researchers for the National Institute of Child Health and Human Development studied 208 infants whose brains were temporarily deprived of oxygen as a result of complications at birth (The New England Journal of Medicine, October 13, 2005). These babies were subjects in an experiment to determine if reducing body temperature for three days after birth improved their chances of surviving without brain damage. The experiment was summarized in a paper that stated “infants were randomly assigned to usual care (control group) or whole-body cooling.” The researchers recorded whether the infant survived (yes/no) and if the infant had brain damage (yes/no). Let’s examine this experiment.

Question | Response
What question is the experiment trying to answer? |
What are the experimental conditions (treatments) for the experiment? |
What is the response? |
What are the experimental units and how were they selected? |

Table Continued . . . Question Response
Does the design incorporate random assignment of experimental units to the different experimental conditions? If not, are there potentially confounding variables that would limit the use of these data? |
Does the experiment incorporate a control group and/or a placebo group? If not, would the experiment be improved by including them? |
Does the experiment involve blinding? If not, would the experiment be improved by making it single- or double-blind? |

Spanking Lowers a Child’s IQ (Los Angeles Times)
Spanking can lower IQ (NBC4i, Columbus, Ohio) Smacking hits kids’ IQ (newscientist.com) In this study, the investigators followed 806 kids ages 2 to 4 and 704 kids ages 5 to 9 for 4 years. IQ was measured at the beginning of the study and again 4 years later. The researchers found that at the end of the study, the average IQ of the younger kids who were not spanked was 5 points higher than that of kids who were spanked. For the older group, the average IQ of kids who were not spanked was 2.8 points higher.

Drawing Conclusions from Statistical Studies
Type of Conclusion | Reasonable When
Generalize from sample to population | Random selection is used to obtain the sample
Change in response is caused by experimental conditions (cause-and-effect conclusion) | There is random assignment of experimental units to experimental conditions
This table implies the following:
For observational studies, it is not possible to reach cause-and-effect conclusions, but it is possible to generalize from the sample to the population of interest if the study design incorporated random selection.
For experiments, it is possible to reach cause-and-effect conclusions if the study design incorporated random assignment to experimental conditions.
If an experiment incorporates both random assignment to experimental conditions and random selection of experimental units from some population, it is possible to reach cause-and-effect conclusions and to generalize to the population from which the experimental units were selected.

Inference for Experiments
In an experiment, researchers usually hope to see a difference in the responses so large that it is unlikely to happen just because of chance variation. We can use the laws of probability, which describe chance behavior, to learn whether the treatment effects are larger than we would expect to see if only chance were operating. If they are, we call them statistically significant. An observed effect so large that it would rarely occur by chance is called statistically significant. A statistically significant association in data from a well-designed experiment does imply causation.
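One way to see what "larger than we would expect if only chance were operating" means is a permutation test: re-do the random assignment many times and count how often chance alone produces a difference at least as large as the one observed. The exam scores below are invented for illustration, and this sketch is one common approach, not necessarily the method the presentation has in mind.

```python
import random

def permutation_p_value(group_a, group_b, reps=2000, seed=0):
    """Estimate how often a random re-assignment gives a difference this large."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # mimic a fresh random assignment
        new_a, new_b = pooled[:len(group_a)], pooled[len(group_a):]
        diff = abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / reps

# Hypothetical exam scores for two treatment groups.
treatment_1 = [78, 85, 90, 72, 88, 81]
treatment_2 = [70, 65, 74, 68, 72, 60]
p_value = permutation_p_value(treatment_1, treatment_2)
print(f"estimated p-value: {p_value:.4f}")
```

A small value means the observed difference would rarely arise from the random assignment alone, which is the sense in which an effect is called statistically significant.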

Revisit . . . Spanking Lowers a Child’s IQ (Los Angeles Times) Spanking can lower IQ (NBC4i, Columbus, Ohio) Smacking hits kids’ IQ (newscientist.com) In this study, the investigators followed 806 kids ages 2 to 4 and 704 kids ages 5 to 9 for 4 years. IQ was measured at the beginning of the study and again 4 years later. The researchers found that at the end of the study, the average IQ of the younger kids who were not spanked was 5 points higher than that of kids who were spanked. For the older group, the average IQ of kids who were not spanked was 2.8 points higher.

Scope of Inference
What type of inference can be made from a particular study? The answer depends on the design of the study.
Well-designed experiments randomly assign individuals to treatment groups. However, most experiments don’t select experimental units at random from the larger population. That limits such experiments to inference about cause and effect.
Observational studies don’t randomly assign individuals to groups, which rules out inference about cause and effect. Observational studies that use random sampling can make inferences about the population.

The Challenges of Establishing Causation
A well-designed experiment tells us that changes in the explanatory variable cause changes in the response variable. Lack of realism can limit our ability to apply the conclusions of an experiment to the settings of greatest interest. In some cases it isn’t practical or ethical to do an experiment. Consider these questions: Does texting while driving increase the risk of having an accident? Does going to church regularly help people live longer? Does smoking cause lung cancer? It is sometimes possible to build a strong case for causation in the absence of experiments by considering data from observational studies.

The Challenges of Establishing Causation
When we can’t do an experiment, we can use the following criteria for establishing causation.
The association is strong.
The association is consistent.
Larger values of the explanatory variable are associated with stronger responses.
The alleged cause precedes the effect in time.
The alleged cause is plausible.

Data Ethics
Complex issues of data ethics arise when we collect data from people. Here are some basic standards of data ethics that must be obeyed by all studies that gather data from human subjects, both observational studies and experiments.
Basic Data Ethics
All planned studies must be reviewed in advance by an institutional review board charged with protecting the safety and well-being of the subjects.
All individuals who are subjects in a study must give their informed consent before data are collected.
All individual data must be kept confidential. Only statistical summaries for groups of subjects may be made public.

Avoid These Common Mistakes

Drawing a cause-and-effect conclusion from an observational study.

Generalizing results of an experiment that uses volunteers as subjects. Only do this if it can be convincingly argued that the group of volunteers is representative of the population of interest.

Generalizing conclusions based on data from a poorly designed observational study. Generalizing from a sample to a population is justified only when the sample is representative of the population. This would be the case if the sample was a random sample from the population, and there were no major potential sources of bias.

Generalizing conclusions based on an observational study that used voluntary response or convenience sampling to a larger population. This is almost never reasonable!
