Lecture 2 Outline: Tue, Sep 9 Chapter 1.2: Statistical Inference and Study Design –Types of Inference –Observational Studies vs. Randomized Experiments.


1 Lecture 2 Outline: Tue, Sep 9 Chapter 1.2: Statistical Inference and Study Design –Types of Inference –Observational Studies vs. Randomized Experiments –Confounding Variables –Design of Experiments –Inference to Populations: random sampling studies vs. non-random sampling studies

2 Drawing Conclusions An inference is a conclusion from the data about some broader context that the data represent. –e.g., one egg in a container is rotten -- the rest are rotten; when we flick on a light switch, the light turns on -- flicking on the light switch causes the light to turn on. A statistical inference is an inference justified by a probability model linking the data to a broader context. Statistical inferences include measures of uncertainty about the conclusions.

3 Two “broader contexts” in statistics Population inference: an inference about population characteristics, like the difference between two population means Causal inference: an inference that a subject would have received a different numerical outcome had the subject belonged to a different group.

4 Causal Questions Medicine: How effective is a new drug? What is the effect of smoking on one’s chance of developing cancer? Psychology: What change in an individual’s normal solitary performance and behavior occurs when other people are present? What changes in an individual’s moral behavior occur when the individual is commanded by authority? Economics: What is the effect of a change in taxes on labor supply and investment behavior? What is the effect of a change in the minimum wage on employment? Education: What is the effect of smaller class sizes on achievement?

5 Types of Causal Studies Observational study: Study in which group status is observed, i.e., beyond the control of the researcher. Controlled experiment: Study in which group status is controlled by the researcher. –Randomized experiment: Study in which group status is assigned by a chance mechanism.

6 Examples of Causal Studies Motivation and creativity study (case study 1.1.1). Sex discrimination study (case study 1.1.2). Comparison of chromosomal aberrations of Japanese atomic bomb survivors near the blast and those far from the blast. Comparison of death rates in the Navy and out of the Navy during the Spanish-American War. Comparison of heart attack rates of menopausal women taking estrogen and women not taking estrogen.

7 Causal Inference Main lesson: statistical inferences of causation can be made from randomized experiments, but not from observational studies. In an observational study, one cannot rule out the possibility that confounding variables are responsible for group differences in the observed outcome. In an observational study, one cannot rule out the possibility of reverse causality or simultaneous causality. Which came first – the chicken or the egg? Beta-carotene intake and morbidity.

8 Confounding Variables A confounding variable is a variable that is related to both group membership and the outcome. Its presence makes it hard to establish the outcome as being a direct consequence of group membership. Examples: –Sex discrimination study –Death rates in and out of Navy study Although it is possible to control for known confounding variables (via multiple regression), in an observational study we can never be sure that there are not unknown confounding variables that are responsible for group differences in outcome.
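The effect of a confounding variable can be illustrated with a small simulation. The sketch below is not taken from either case study: all numbers are hypothetical, and the binary confounder `z`, the membership probabilities, and the outcome model are assumptions chosen only to make the point visible. Group membership has no effect on the outcome here, yet the naive group comparison shows a large difference that disappears once we compare within levels of the confounder.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

rows = []
for _ in range(20_000):
    z = rng.random() < 0.5                            # hypothetical binary confounder
    in_group = rng.random() < (0.8 if z else 0.2)     # membership depends on z
    outcome = (10.0 if z else 0.0) + rng.gauss(0, 1)  # outcome depends on z only
    rows.append((z, in_group, outcome))

def mean_diff(rows):
    """Mean outcome of group members minus mean outcome of non-members."""
    members = [y for _, g, y in rows if g]
    others = [y for _, g, y in rows if not g]
    return sum(members) / len(members) - sum(others) / len(others)

# Naive comparison ignores the confounder.
naive = mean_diff(rows)

# Comparing within each level of z removes the confounder's influence.
within = [mean_diff([r for r in rows if r[0] == level]) for level in (False, True)]

print(f"naive difference: {naive:.2f}")
print(f"differences within levels of z: {within[0]:.2f}, {within[1]:.2f}")
```

This only works because `z` is known and measured. An unknown confounder cannot be stratified on, which is the limitation the slide describes.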

9 Association Is Not Causation There is a close relationship between the salaries of Presbyterian ministers in Massachusetts and the price of rum in Havana. Are the ministers benefiting from the rum trade or supporting it? A study showed that cigarette smokers have lower college grades than non-smokers. Does the road to good grades lie in giving up smoking?

10 Do Observational Studies Have Value – Yes! Establishing causation is not always the goal. Establishing causation may be done in other ways. –Experiments not always practical or ethical Analysis of observational data may lend evidence toward causal theories and suggest the direction of future research.

11 Criteria for Establishing Causation From Obs. Studies The association is strong. The association is consistent. Higher doses are associated with stronger responses. The alleged cause precedes the effect in time. The alleged cause is plausible. Examples: –Smoking and lung cancer –Radiation from atomic bomb and cancer

12 Design of Experiments Principles of statistical design of experiments to make causal inferences about different treatments (policies) –Control: Make sure that all other factors besides the treatments are kept the same in the different groups (e.g., use placebo, double blinding). –Randomization: Use an impersonal chance mechanism to assign units to treatments. –Replication: Use many units to reduce chance variation in results.
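The randomization principle can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical list of 20 subjects and two treatments; the function name `randomize` and the unit labels are made up for the example.

```python
import random

def randomize(units, treatments, seed=None):
    """Assign units to treatments by a chance mechanism, in near-equal group sizes."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # impersonal chance mechanism: a random ordering
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out to the treatments in turn.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical units; a seed is used only so the assignment can be reproduced.
units = [f"subject_{i}" for i in range(1, 21)]
assignment = randomize(units, ["treatment", "control"], seed=0)

for name, members in assignment.items():
    print(name, len(members), members[:3], "...")
```

Because the researcher has no say in which unit lands in which group, any pre-existing differences between units are spread across the groups by chance alone.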

13 Logic of Controlled Randomized Experiment Randomization produces groups that should be similar in all respects before the treatment is applied. Control in design (i.e., use of the placebo, double blinding, judges see poems in random order) ensures that influences other than the treatment operate equally on the groups. Therefore differences between the treatment and control groups must be due either to the treatment or to the play of chance in the random assignment of units to the groups. Statistical inference provides a method for describing how confident we can be that an observed difference between the treatment and control group did not arise due to chance.

14 Statistical Inference in the Motivation-Creativity Study The creativity scores tended to be larger in the “intrinsic” group than in the “extrinsic” group. Either the intrinsic questionnaire caused a higher score or else the more creative writers happened to be placed in the “intrinsic” group by chance. The p-value associated with this latter possibility, i.e., the probability that random assignment alone would produce a difference at least as large as the one observed, is 0.011.
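One way to attach a probability to the "play of chance" explanation is a randomization test: re-deal the observed scores to the two groups many times and count how often chance alone gives a difference at least as large as the one observed. The sketch below is a Monte Carlo approximation under stated assumptions. The scores are hypothetical and are NOT the data from case study 1.1.1, so the result should not be expected to match the 0.011 reported on the slide, and the slide does not say which procedure produced that value.

```python
import random

def mean(values):
    return sum(values) / len(values)

def randomization_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided Monte Carlo randomization test for a difference in means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # re-deal the units to the two groups by chance
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles

# Hypothetical creativity scores, for illustration only.
intrinsic = [22.0, 19.5, 24.1, 20.3, 23.7, 21.8]
extrinsic = [17.2, 18.9, 15.4, 20.1, 16.8, 19.0]

observed, p_value = randomization_p_value(intrinsic, extrinsic)
print(f"observed difference in means: {observed:.2f}")
print(f"approximate two-sided p-value: {p_value:.4f}")
```

A small p-value means that few chance re-assignments produce a difference as large as the observed one, which makes the "chance" explanation hard to sustain.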

15 Inference to Populations Goal: Make conclusions about aspects of a population (e.g., mean income in U.S.) based on a sample. Two types of sampling designs: –Random sampling study: Units are selected by the investigator from a well-defined population through a chance mechanism, with each unit having a known (>0) chance of being selected. –Non-random sampling study: Units are selected in a way other than through a chance mechanism.

16 Statistical Inference for Populations Statistical inference can be made for random sampling studies (Ch. 1.4.1) by using the sampling design. –Simple random sample of size n: Subset of the population of size n selected in such a way that every subset of size n has the same chance of being selected. –The sample might be nonrepresentative (i.e., have markedly different characteristics than the population), but we can use probability to find the chance of it being nonrepresentative. Statistical inference for populations cannot be made for non-random sampling studies. A non-random sampling study can be nonrepresentative (biased) in unknown ways.
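The claim that probability tells us how likely a simple random sample is to be nonrepresentative can be checked by simulation. The sketch below assumes a hypothetical population of 10,000 incomes (the distribution, the margin of 3,000, and the helper names are all made up for illustration). Python's `random.sample` draws without replacement so that every subset of the requested size is equally likely.

```python
import random

rng = random.Random(1)  # fixed seed for reproducibility

# Hypothetical population of incomes; not real data.
population = [rng.gauss(50_000, 15_000) for _ in range(10_000)]
true_mean = sum(population) / len(population)

def srs_mean(pop, n, rng):
    """Mean of a simple random sample of size n drawn without replacement."""
    sample = rng.sample(pop, n)
    return sum(sample) / n

def chance_off_by(pop, n, margin, reps, rng):
    """Estimated chance that the sample mean misses the population mean by more than margin."""
    target = sum(pop) / len(pop)
    misses = sum(abs(srs_mean(pop, n, rng) - target) > margin for _ in range(reps))
    return misses / reps

chance_small = chance_off_by(population, n=25, margin=3_000, reps=2_000, rng=rng)
chance_large = chance_off_by(population, n=400, margin=3_000, reps=2_000, rng=rng)

print(f"population mean: {true_mean:,.0f}")
print(f"chance of missing by more than 3,000 with n=25:  {chance_small:.3f}")
print(f"chance of missing by more than 3,000 with n=400: {chance_large:.3f}")
```

The same calculation is not available for a non-random sample, because there is no known chance mechanism to simulate.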

17 The Literary Digest Poll In the 1936 presidential election, the Literary Digest predicted an overwhelming victory for Landon over Roosevelt. Roosevelt won the election by a landslide – 62% to 38%. What went wrong? The sample was taken by mailing questionnaires to 10 million people whose names and addresses came from sources like telephone books and club membership lists. 2.4 million people returned the questionnaires.

18 Biased Samples Selection Bias: When the procedure for selecting a sample results in samples that are systematically different from the population. When a selection procedure is biased, taking a large sample does not help. This just repeats the basic mistake on a large scale. Causes of selection bias: –Voluntary Response Sample –Undercoverage –Nonresponse
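The point that a large sample does not rescue a biased selection procedure can be seen in a short simulation. Everything below is hypothetical: the population size, the 62% level of true support, and especially the response probabilities (0.15 and 0.35) are assumptions made for illustration and are not estimates of how any real poll behaved.

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical population: True means the person supports candidate A.
N = 200_000
population = [rng.random() < 0.62 for _ in range(N)]
true_share = sum(population) / N

def biased_sample(pop, p_supporter, p_other, rng):
    """Include each person with a probability that depends on their opinion."""
    return [v for v in pop if rng.random() < (p_supporter if v else p_other)]

def share(sample):
    return sum(sample) / len(sample)

# Large sample from a biased procedure (assumed response probabilities).
big_biased = biased_sample(population, p_supporter=0.15, p_other=0.35, rng=rng)

# Much smaller simple random sample.
small_srs = rng.sample(population, 1_000)

print(f"true share supporting A:          {true_share:.3f}")
print(f"biased sample (n={len(big_biased):,}): {share(big_biased):.3f}")
print(f"simple random sample (n=1,000):   {share(small_srs):.3f}")
```

The biased sample is many times larger than the random one, yet its estimate stays far from the truth, because adding more units selected the same way repeats the same systematic error.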

19 Statistical Inferences Permitted by Study Designs Examples: –Motivation and Creativity study –Sex Discrimination study –Researchers measured the lead content in teeth and IQ scores for all 3,229 children attending first and second grade between 1975 and 1978 in Chelsea and Somerville, Mass. IQ scores for those with low lead concentrations were found to be significantly higher than for those with high lead concentrations. Conceptual Exercises 1-12 in Ch. 1 relate to statistical inferences permitted by study designs.

