Chapter 4: Gathering Data

1 Chapter 4: Gathering Data
Section 4.1 Should We Experiment or Should We Merely Observe?

2 Learning Objectives: Population versus Sample
Types of Studies: Experimental and Observational Comparing Experimental and Observational Studies

3 Learning Objective 1: Population and Sample
Population: all the subjects of interest. We use statistics to learn about the population, the entire group of interest. Sample: a subset of the population. Data are collected for the sample because we typically cannot measure all subjects in the population.

4 Learning Objective 2: Type of Study: Observational Study
In an observational study, the researcher observes values of the response variable and explanatory variables for the sampled subjects, without anything being done to the subjects (such as imposing a treatment)

5 Learning Objective 2: Observational Study – Sample Survey
A sample survey selects a sample of people from a population and interviews them to collect data. A sample survey is a type of observational study. A census is a survey that attempts to count the number of people in the population and to measure certain characteristics about them

6 Learning Objective 2: Type of Study: Experiment
A researcher conducts an experiment by assigning subjects to certain experimental conditions and then observing outcomes on the response variable The experimental conditions, which correspond to assigned values of the explanatory variable, are called treatments

7 Learning Objective 2: Example
Headline: “Student Drug Testing Not Effective in Reducing Drug Use” Facts about the study: 76,000 students nationwide Schools selected for the study included schools that tested for drugs and schools that did not test for drugs Each student filled out a questionnaire asking about his/her drug use

8 Learning Objective 2: Example
Conclusion: Drug use was similar in schools that tested for drugs and schools that did not test for drugs

9 Learning Objective 2: Example
This study was an observational study. For it to be an experiment, the researcher would have had to assign each school to use or not use drug testing, rather than leaving this decision to the school.

10 Learning Objective 3: Comparing Experiments and Observational Studies
An experiment reduces the potential for lurking variables to affect the result. Thus, an experiment gives the researcher more control over outside influences. Only an experiment can establish cause and effect; observational studies cannot. Experiments are not always possible because of ethical reasons, time considerations and other factors.

11 Chapter 4 Gathering Data
Section 4.2 What are Good Ways and Poor Ways to Sample?

12 Learning Objectives: Sampling Frame & Sampling Design
Simple Random Sample (SRS) Random number table Margin of Error Convenience Samples Types of Bias in Sample Surveys Key Parts of a Sample Survey

13 Learning Objective 1: Sampling Frame & Sampling Design
The sampling frame is the list of subjects in the population from which the sample is taken. Ideally, it lists the entire population of interest. The sampling design determines how the sample is selected. Ideally, it should give each subject an equal chance of being selected to be in the sample.

14 Learning Objective 2: Simple Random Sampling, SRS
Random Sampling is the best way of obtaining a sample that is representative of the population A simple random sample of ‘n’ subjects from a population is one in which each possible sample of that size has the same chance of being selected

15 Learning Objective 2: SRS Example
Two club officers are to be chosen for a New Orleans trip. There are 5 officers: President, Vice-President, Secretary, Treasurer and Activity Coordinator. The 10 possible samples are: (P,V) (P,S) (P,T) (P,A) (V,S) (V,T) (V,A) (S,T) (S,A) (T,A). For an SRS, each of the ten possible samples has an equal chance of being selected. Thus, each sample has a 1 in 10 chance of being selected. Each officer appears in 4 of the 10 samples, so each officer has a 4 in 10 (that is, 2 in 5) chance of being selected.
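The counting in this example can be checked with a short sketch. It enumerates every pair of officers and counts how often each officer appears; the single-letter labels are the abbreviations used on the slide, and the variable names are illustrative only.

```python
from itertools import combinations

officers = ["P", "V", "S", "T", "A"]

# Every possible sample of 2 officers out of 5.
samples = list(combinations(officers, 2))

# Under simple random sampling each possible sample is equally likely.
prob_each_sample = 1 / len(samples)

# Chance that a given officer is picked = share of samples containing them.
prob_officer = {
    o: sum(o in s for s in samples) / len(samples) for o in officers
}

print(len(samples))       # 10
print(prob_each_sample)   # 0.1
print(prob_officer["P"])  # 0.4
```

Every officer gets the same 0.4, which is what makes the design fair to individuals as well as to samples.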

16 Learning Objective 3: SRS: Table of Random Numbers
Table E on pg. A6 of text

17 Learning Objective 3: Using Random Numbers to select a SRS
To select a simple random sample Number the subjects in the sampling frame using numbers of the same length (number of digits) Select numbers of that length from a table of random numbers or using a random number generator Include in the sample those subjects having numbers equal to the random numbers selected

18 Learning Objective 3: Choosing a simple random sample
We need to select a random sample of 5 from a class of 20 students. List and number all members of the population, which is the class of 20. The number 20 is two-digits long. Parse the list of random digits into numbers that are two digits long. Here we chose to start with line 103, for no particular reason.

19 1 Alison, 2 Amy, 3 Brigitte, 4 Darwin, 5 Emily, 6 Fernando, 7 George, 8 Harry, 9 Henry, 10 John, 11 Kate, 12 Max, 13 Moe, 14 Nancy, 15 Ned, 16 Paul, 17 Ramon, 18 Rupert, 19 Tom, 20 Victoria
Choose a random sample of size 5 by reading through the list of two-digit random numbers, starting with line 2 and on. The first five random numbers matching numbers assigned to people make the SRS. The first individual selected is Amy, number 02. That's it from line 2. Move to line 3. Then Moe (13), Darwin (04), Henry (09), and Ned (15). Remember that 1 is 01, 2 is 02, etc. If you were to hit 09 again before getting five people, don't sample Henry twice; just keep going.
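The selection procedure can be sketched in code. The digit string below is made up for illustration (it is not taken from Table E), and `srs_from_digits` is a hypothetical helper name. The sketch reads two-digit labels, skips labels that match no subject, and skips anyone already chosen.

```python
def srs_from_digits(digits, names, k):
    """Pick k distinct names by reading fixed-width labels from a digit string."""
    width = len(str(len(names)))  # 20 subjects -> two-digit labels
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Skip labels with no subject (00, 21-99) and subjects already chosen.
        if 1 <= label <= len(names) and names[label - 1] not in chosen:
            chosen.append(names[label - 1])
        if len(chosen) == k:
            break
    return chosen

names = [
    "Alison", "Amy", "Brigitte", "Darwin", "Emily", "Fernando", "George",
    "Harry", "Henry", "John", "Kate", "Max", "Moe", "Nancy", "Ned", "Paul",
    "Ramon", "Rupert", "Tom", "Victoria",
]

# Hypothetical random digits, read as 45 02 71 96 13 04 62 09 87 13 15.
digits = "4502719613046209871315"

sample = srs_from_digits(digits, names, 5)
print(sample)  # ['Amy', 'Moe', 'Darwin', 'Henry', 'Ned']
```

The second 13 in the digit string is ignored because Moe is already in the sample. In practice a random number generator such as `random.sample(names, 5)` does the same job without a table.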

20 Learning Objective 4: Margin of Error
Sample surveys are commonly used to estimate population percentages. These estimates include a margin of error, which tells us how well the sample estimate predicts the population percentage. When a SRS of n subjects is used, the margin of error is approximately (1/√n) × 100%.

21 Learning Objective 4: Example: Margin of Error
A survey result states: “The margin of error is plus or minus 3 percentage points.” This means: “It is very likely that the reported sample percentage is no more than 3 percentage points lower or 3 percentage points higher than the population percentage.”
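As a rough illustration, the sketch below assumes the common rule-of-thumb approximation for a simple random sample, margin of error ≈ (1/√n) × 100%. The function names are hypothetical, and the approximation is a simplification rather than an exact formula.

```python
import math

def approx_margin_of_error(n):
    """Approximate margin of error, in percentage points, for an SRS of size n.

    Assumes the rule of thumb 1/sqrt(n) * 100%.
    """
    return 100 / math.sqrt(n)

def sample_size_for(margin_pct):
    """Smallest n whose approximate margin of error is at most margin_pct."""
    return math.ceil((100 / margin_pct) ** 2)

print(round(approx_margin_of_error(1000), 1))  # 3.2
print(sample_size_for(3))                      # 1112
```

Under this approximation, a reported margin of about 3 percentage points corresponds to a sample of roughly 1,000 to 1,100 subjects.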

22 Learning Objective 5: Convenience Samples: Poor Ways to Sample
Convenience Sample: a type of survey sample that is easy to obtain Unlikely to be representative of the population Often severe biases result from such a sample Results apply ONLY to the observed subjects

23 Learning Objective 5: Convenience Samples: Poor Ways to Sample
Volunteer Sample: most common form of convenience sample Subjects volunteer for the sample Volunteers do not tend to be representative of the entire population

24 Learning Objective 6: Types of Bias in Sample Surveys
Bias: tendency to systematically favor certain parts of the population over others
Sampling bias: bias resulting from the sampling method, such as using nonrandom samples or having undercoverage
Nonresponse bias: occurs when some sampled subjects cannot be reached, refuse to participate, or fail to answer some questions
Response bias: occurs when the subject gives an incorrect response or the question is misleading
A large sample does not guarantee an unbiased sample!

25 Learning Objective 7: Key Parts of a Sample Survey
Identify the population of all subjects of interest Construct a sampling frame which attempts to list all subjects in the population Use a random sampling design to select n subjects from the sampling frame Be cautious of sampling bias due to nonrandom samples We can make inferences about the population of interest when sample surveys that use random sampling are employed.

26 Chapter 4 Gathering Data
Section 4.3 What Are Good Ways and Poor Ways to Experiment?

27 Learning Objectives: Identify the elements of an experiment
Experiments 3 Components of a good experiment Blinding the Study Define Statistical Significance Generalizing Results of the Study

28 Learning Objective 1: Elements of an Experiment
Experimental units: the subjects of an experiment; the entities that we measure in an experiment
Treatment: a specific experimental condition imposed on the subjects of the study; the treatments correspond to assigned values of the explanatory variable
Explanatory variable: defines the groups to be compared with respect to values on the response variable
Response variable: the outcome measured on the subjects to reveal the effect of the treatment(s)

29 Learning Objective 2: Experiments
An experiment deliberately imposes treatments on the experimental units in order to observe their responses. The goal of an experiment is to compare the effect of the treatment on the response. Experiments that are randomized occur when the subjects are randomly assigned to the treatments; randomization helps to eliminate the effects of lurking variables

30 Learning Objective 3: 3 Components of a Good Experiment
Control/Comparison group: allows the researcher to analyze the effectiveness of the primary treatment Randomization: eliminates possible researcher bias, balances the comparison groups on known as well as on lurking variables Replication: allows us to attribute observed effects to the treatments rather than ordinary variability

31 Learning Objective 3: Principle 1: Control or Comparison Group
A placebo is a dummy treatment, e.g., a sugar pill. Many subjects respond favorably to any treatment, even a placebo. A control group typically receives a placebo. A control group allows us to analyze the effectiveness of the primary treatment. A control group need not receive a placebo. Clinical trials often compare a new treatment for a medical condition, not with a placebo, but with a treatment that is already on the market.

32 Learning Objective 3: Principle 1: Control or Comparison Group
Experiments should compare treatments rather than attempt to assess the effect of a single treatment in isolation Is the treatment group better, worse, or no different than the control group? Example: 400 volunteers are asked to quit smoking and each start taking an antidepressant. In 1 year, how many have relapsed? Without a control group (individuals who are not on the antidepressant), it is not possible to gauge the effectiveness of the antidepressant.

33 Learning Objective 3: Placebo effect
Placebo effect (power of suggestion) The “placebo effect” is an improvement in health due not to any treatment but only to the patient’s belief that he or she will improve.

34 Learning Objective 3: Principle 2: Randomization
To have confidence in our results we should randomly assign subjects to the treatments. In doing so, we Eliminate bias that may result from the researcher assigning the subjects Balance the groups on variables known to affect the response Balance the groups on lurking variables that may be unknown to the researcher
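Random assignment can be sketched as a shuffle followed by dealing subjects out to the treatments in turn. The function name, the subject IDs, and the treatment labels below are all hypothetical; the seed is there only so the assignment can be reproduced.

```python
import random

def randomize(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments in (nearly) equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance, not the researcher, decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical example: 20 subject IDs split between two treatments.
groups = randomize(range(1, 21), ["treatment", "placebo"], seed=1)
print({t: len(members) for t, members in groups.items()})
```

Because the shuffle ignores every characteristic of the subjects, known and unknown variables tend to balance across groups on average.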

35 Learning Objective 3: Principle 3: Replication
Replication is the process of assigning several experimental units to each treatment The difference due to ordinary variation is smaller with larger samples We have more confidence that the sample results reflect a true difference due to treatments when the sample size is large Since it is always possible that the observed effects were due to chance alone, replicating the experiment also builds confidence in our conclusions
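A small simulation can illustrate why larger groups help. In the sketch below, both groups are drawn from the same hypothetical normal distribution, so any difference between the group means is ordinary variation only. The function name and the choice of a standard normal response are assumptions made for illustration.

```python
import random

def mean_abs_difference(n, repeats=200, seed=0):
    """Average |mean(A) - mean(B)| when A and B have no true difference."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        total += abs(sum(group_a) / n - sum(group_b) / n)
    return total / repeats

small = mean_abs_difference(10)    # 10 units per treatment
large = mean_abs_difference(1000)  # 1,000 units per treatment
print(small > large)  # True
```

With more units per treatment, chance differences shrink, so an observed difference of a given size is harder to explain by chance alone.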

36 Learning Objective 4: Blinding the Experiment
Ideally, subjects are unaware, or blind, to the treatment they are receiving If an experiment is conducted in such a way that neither the subjects nor the investigators working with them know which treatment each subject is receiving, then the experiment is double-blinded A double-blinded experiment controls response bias from the respondent and experimenter

37 Learning Objective 5: Define Statistical Significance
If an experiment (or other study) finds a difference in two (or more) groups, is this difference really important? If the observed difference is larger than what would be expected just by chance, then it is labeled statistically significant. Rather than relying solely on the label of statistical significance, also look at the actual results to determine if they are practically significant.

38 Learning Objective 6: Generalizing Results
Recall that the goal of experimentation is to analyze the association between the treatment and the response for the population, not just the sample However, care should be taken to generalize the results of a study only to the population that is represented by the study.

39 Chapter 4 Gathering Data
Section 4.4 What are Other Ways to Conduct Experimental and Observational Studies?

40 Learning Objectives Sample Surveys: Other Random Sampling Designs
Types of Observational Studies: Prospective and Retrospective Multifactor Experiment Matched pairs design Randomized block design

41 Learning Objective 1: Sample Surveys: Random Sampling Designs
It is not always possible to conduct an experiment so it is necessary to have well designed, informative studies that are not experimental, e.g., sample surveys that use randomization Simple Random Sampling Cluster Sampling Stratified Random Sampling

42 Learning Objective 1: Sample Surveys: Cluster Random Sample
Steps Divide the population into a large number of clusters, such as city blocks Select a simple random sample of the clusters Use the subjects in those clusters as the sample
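The cluster sampling steps can be sketched as follows. The city blocks and household labels are invented for illustration, and `cluster_sample` is a hypothetical helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Take an SRS of clusters and return every subject in the chosen clusters.

    clusters: dict mapping a cluster id to its list of subjects.
    """
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)  # SRS of the clusters
    subjects = [s for c in picked for s in clusters[c]]
    return picked, subjects

# Hypothetical population: 50 city blocks with 4 households each.
blocks = {f"block{b}": [f"block{b}-house{h}" for h in range(4)] for b in range(50)}

picked, subjects = cluster_sample(blocks, 5, seed=2)
print(len(picked), len(subjects))  # 5 20
```

Only a list of clusters is needed up front, not a list of every subject, which is why this design suits cases where a reliable sampling frame is unavailable.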

43 Learning Objective 1: Sample Surveys: Cluster Random Sample
Preferable when A reliable sampling frame is unavailable The cost of selecting a SRS is excessive Disadvantage Usually need a larger sample size than with a SRS in order to achieve a particular margin of error

44 Learning Objective 1: Sample Surveys: Stratified Random Sample
Steps Divide the population into separate groups, called strata Select a simple random sample from each stratum Combine the samples from all strata to form the complete sample

45 Learning Objective 1: Sample Surveys: Stratified Random Sample
Advantage is that you can include in your sample enough subjects in each stratum you want to evaluate Disadvantage is that you must have a sampling frame and know the stratum to which each subject belongs

46 Learning Objective 1: Stratified Random Sample - Example
Suppose a university has the following student demographics: Undergraduate 55%, Graduate 20%, First Professional 5%, Special 20%. In order to ensure proper coverage of each demographic, a stratified random sample of 100 students could be chosen as follows: select a SRS of 55 undergraduates, a SRS of 20 graduates, a SRS of 5 first professional students, and a SRS of 20 special students; combine these 100 students.
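A minimal sketch of this allocation is shown below. It assumes a hypothetical sampling frame of 1,000 labeled students and uses the sample sizes from the example (55, 20, 5, 20); `stratified_sample` is an illustrative name, not a library function.

```python
import random

def stratified_sample(strata, allocation, seed=None):
    """Draw an SRS of the requested size from each stratum and combine them."""
    rng = random.Random(seed)
    sample = []
    for stratum, n in allocation.items():
        sample.extend(rng.sample(strata[stratum], n))
    return sample

# Hypothetical sampling frame: every student is labeled with a stratum.
sizes = {"undergraduate": 550, "graduate": 200,
         "first professional": 50, "special": 200}
strata = {name: [f"{name}-{i}" for i in range(size)]
          for name, size in sizes.items()}

allocation = {"undergraduate": 55, "graduate": 20,
              "first professional": 5, "special": 20}

sample = stratified_sample(strata, allocation, seed=0)
print(len(sample))  # 100
```

Unlike a plain SRS of 100, this design guarantees exactly 5 first professional students rather than leaving their count to chance.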

47 Learning Objective 1: Comparing Random Sampling Methods

48 Learning Objective 2: Types of Observational Studies
An observational study can yield useful information when an experiment is not practical. Types of observational studies: Sample Survey: attempts to take a cross section of a population at the current time Retrospective study: looks into the past Prospective study: follows its subjects into the future Causation can never be definitively established with an observational study, but well designed studies can provide supporting evidence for the researcher’s beliefs

49 Learning Objective 2: Retrospective Case-Control Study
A case-control study is a retrospective observational study in which subjects who have a response outcome of interest (the cases) and subjects who have the other response outcome (the controls) are compared on an explanatory variable

50 Learning Objective 2: Example: Case-Control Study
Response outcome of interest: lung cancer. The cases have lung cancer. The controls do not have lung cancer. The two groups are compared on the explanatory variable smoker/nonsmoker.

51 Learning Objective 2: Example: Prospective Study
Nurses’ Health Study: Began in 1976 with 121,700 female nurses aged 30 to 55; questionnaires are filled out every two years Purpose was to explore the relationships among diet, hormonal factors, smoking habits and exercise habits and the risk of coronary heart disease, pulmonary disease and stroke Nurses are followed into the future to determine whether they eventually develop an outcome such as lung cancer and whether certain explanatory variables are associated with it

52 Learning Objective 3: Multifactor Experiments
A Multifactor experiment uses a single experiment to analyze the effects of two or more explanatory variables on the response Categorical explanatory variables in an experiment are often called factors We are often able to learn more from a multifactor experiment than from separate one-factor experiments since the response may vary for different factor combinations

53 Learning Objective 3: Example: Multifactor experiment
Examine the effectiveness of both Zyban and nicotine patches on quitting smoking Two factor experiment 4 treatments

54 Learning Objective 3: Example: Multifactor experiment
subjects: a certain number of undergraduate students all subjects viewed a 40-minute television program that included ads for a digital camera some subjects saw a 30-second commercial; others saw a 90-second version same commercial was shown either 1, 3, or 5 times during the program there were two factors: length of the commercial (2 values), and number of repetitions (3 values)

55 Learning Objective 3: Example: Multifactor experiment
The 6 combinations of one value of each factor form six treatments:

                        Factor B: Repetitions
Factor A: Length      1 time    3 times    5 times
30 seconds               1         2          3
90 seconds               4         5          6

Subjects assigned to Treatment 3 see a 30-second ad five times during the program. After viewing, all subjects answered questions about recall of the ad, their attitude toward the commercial, and their intention to purchase the product; these were the response variables.
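The treatment numbering in this example can be reproduced by crossing the two factors. The sketch below is illustrative; the variable names are not from the slides.

```python
from itertools import product

lengths = ["30 seconds", "90 seconds"]  # Factor A: length of the commercial
repetitions = [1, 3, 5]                 # Factor B: number of showings

# Each combination of one value per factor is one treatment.
treatments = dict(enumerate(product(lengths, repetitions), start=1))

for number, (length, times) in treatments.items():
    print(number, length, times)

print(treatments[3])  # ('30 seconds', 5)
```

The same pattern gives the 2 × 2 = 4 treatments of any two-factor experiment in which each factor has two values.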

56 Learning Objective 4: Matched Pairs Design
In a matched pairs design, the subjects receiving the two treatments are somehow matched (same person, husband/wife, two plots in the same field, etc.) In a crossover design, the same individual is used for the two treatments Randomly assign the two treatments to the two matched subjects, or randomize the order of applying the treatments in a crossover design The number of replicates equals the number of pairs Helps to reduce effects of lurking variables
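Randomizing within pairs amounts to one coin flip per pair. In the sketch below the pair labels and treatment names are hypothetical; for a crossover design the same flip would decide which treatment a person receives first.

```python
import random

def assign_matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """For each matched pair, randomly decide which member gets which treatment."""
    rng = random.Random(seed)
    plan = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # the coin flip for this pair
        plan.append({first: order[0], second: order[1]})
    return plan

# Hypothetical matched pairs, e.g. two plots in the same field.
pairs = [("plot1a", "plot1b"), ("plot2a", "plot2b"), ("plot3a", "plot3b")]
plan = assign_matched_pairs(pairs, seed=4)
print(plan)
```

Each pair always receives both treatments, so the number of replicates equals the number of pairs.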

57 Learning Objective 5: Randomized Block Design
A block is a set of experimental units that are matched with respect to one or more characteristics A Randomized Block Design, RBD, is when the random assignment of experimental units to treatments is carried out separately within each block

58 Learning Objective 5: Example: Randomized Block Design
Block = gender; 3 treatments = 3 types of therapy The men (as well as the women) are randomly assigned to the 3 treatments; differences can be compared with respect to gender as well as therapy type

59 Learning Objective 5: Randomized Block Design
RBD eliminates variability in the response due to the blocking variable; allows for better comparisons to be made among the treatments of interest A matched pairs design is a special case of a RBD with two observations in each block
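A randomized block design can be sketched by running a separate random assignment inside each block. The subject labels and therapy names below are invented; the structure (blocks by gender, 3 treatments) follows the example.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # fresh randomization for every block
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical subjects: 6 men and 6 women, 3 types of therapy.
blocks = {
    "men": [f"M{i}" for i in range(1, 7)],
    "women": [f"W{i}" for i in range(1, 7)],
}
therapies = ["therapy 1", "therapy 2", "therapy 3"]

assignment = randomized_block_design(blocks, therapies, seed=5)
for unit, (block, therapy) in sorted(assignment.items()):
    print(unit, block, therapy)
```

With two units per block and two treatments, the same function reduces to the matched pairs design.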

