# Estimating the Effects of Treatment on Outcomes with Confidence

Sebastian Galiani, Washington University in St. Louis


Parameters of Interest. Two parameters of interest are widely used in the literature: the Average Treatment Effect (ATE) and the Average Treatment Effect on the Treated (ATT). Under randomization and full compliance, they coincide.
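In potential-outcomes notation, the two parameters can be written as below. The symbols $Y_i(1)$ and $Y_i(0)$ (the outcome of unit $i$ with and without treatment) and $D_i$ (the treatment indicator) are notation assumed here for illustration; they do not appear in the slides.

```latex
\begin{align}
\text{ATE} &= E\left[\,Y_i(1) - Y_i(0)\,\right] \\
\text{ATT} &= E\left[\,Y_i(1) - Y_i(0) \mid D_i = 1\,\right]
\end{align}
```

When $D_i$ is randomly assigned and everyone complies, $D_i$ is independent of the potential outcomes, so the conditioning on $D_i = 1$ changes nothing and the two expressions are equal.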

Randomization. In the absence of difficulties such as noncompliance or loss to follow-up, assumptions play a minor role in randomized experiments, and no role at all in randomized tests of the hypothesis of no treatment effect. In contrast, inference in a nonrandomized experiment requires assumptions that are not at all innocuous.

Quasi-Experimental Designs. If randomization is not feasible, we need to rely on quasi-experimental methods. In our case, the most promising strategy would be a generalized difference-in-differences strategy.

Parameters of Interest. We might want to answer the following questions: What is the effect of the intervention on a given outcome in a given population? What is the effect of the intervention on a given outcome for those who self-select as users of the facilities? The power for identifying the first parameter might be lower than the power for the second parameter, which is identified by IV methods.

Distance to Facilities and Sampling. Think about stratifying the sample by distance to facilities and over-sampling households residing near facilities.

Testing Absence of Treatment Effects: Type I Error. Once we have chosen a Type I error rate $\alpha$, the null hypothesis ($\mu_T - \mu_C = 0$) is rejected whenever the test statistic satisfies $|t| > t^*_{\alpha/2}$, where $t^*_{\alpha/2}$ is the critical value of $t$ that leaves probability $\alpha/2$ in each tail of the distribution of $t$ under the null.

[Figure: distribution of $t$ under the null, centered at 0. No rejection between $-t^*_{\alpha/2}$ and $t^*_{\alpha/2}$; reject the null beyond them.]
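The rejection rule can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: it assumes the sample is large enough to replace the t distribution by a standard normal (so that only the standard library is needed), and the function names are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha: float) -> float:
    """Two-sided critical value, using a standard normal approximation
    to the t distribution (an assumption made for this sketch)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_null(t_stat: float, alpha: float = 0.05) -> bool:
    """Reject the null of no treatment effect when |t| exceeds the critical value."""
    return abs(t_stat) > critical_value(alpha)


print(round(critical_value(0.05), 2))        # 1.96
print(reject_null(2.3), reject_null(-1.1))   # True False
```

With few degrees of freedom the exact t critical value is larger than the normal one, so this approximation rejects slightly too often.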

Type II Error. Now consider an alternative hypothesis: $\mu_T - \mu_C = d$. Under this alternative hypothesis, the t-statistic will have a different distribution. If the alternative hypothesis is true, we want to reject the null hypothesis as often as possible. To fail to do so would be a Type II error. We want to restrict the probability of this type of error to $\beta$. Then $\beta$ will be the Type II error rate of the test, and $1-\beta$ will be its power. The power of a statistical hypothesis test measures the test's ability to reject the null hypothesis when it is actually false.

Type II Error. There is a trade-off between Type I and Type II errors.

[Figure: distribution of $t$ under the null and under the alternative, with the critical values $\pm t^*_{\alpha/2}$, the no-rejection and rejection regions, and the power $1-\beta$ marked.]

Type I and II Errors. Both errors can be reduced simultaneously if the dispersion of the statistic is reduced.

[Figure: distribution of $t$ under the null and under the alternative, with the critical values $\pm t^*_{\alpha/2}$, the no-rejection and rejection regions, and the power $1-\beta$ marked.]
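Both statements about the two error types can be checked numerically. The sketch below assumes the test statistic is approximately normal with unit variance, centered at 0 under the null and at `d / se` under the alternative; `d` (the true effect), `se` (the standard error of the estimated effect), and the function name are illustrative assumptions.

```python
from statistics import NormalDist


def power_two_sided(d: float, se: float, alpha: float = 0.05) -> float:
    """Probability of rejecting the null when the true effect is d.

    Normal approximation: under the alternative the statistic is
    distributed N(d / se, 1).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = d / se
    upper_tail = 1 - z.cdf(z_crit - shift)   # P(t >  z_crit)
    lower_tail = z.cdf(-z_crit - shift)      # P(t < -z_crit)
    return upper_tail + lower_tail


# Trade-off: a stricter Type I error rate lowers power (raises beta).
print(power_two_sided(1.0, 0.5, alpha=0.05) > power_two_sided(1.0, 0.5, alpha=0.01))  # True

# Less dispersion raises power at the same alpha.
print(power_two_sided(1.0, 0.4) > power_two_sided(1.0, 0.5))  # True
```

When `d` is zero the function returns `alpha`, which is the Type I error rate, as it should.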

Simple Clinical Trial. In this design, m members are allocated to each condition: treatment and control. The observed value for the i-th member in the l-th condition is a function of the grand mean and the effect of the l-th condition; any difference between the observed and the predicted value is allocated to the residual error. The intervention effect is $C_l$. Its estimate is the difference between the two condition means: $\hat{\Delta} = \bar{Y}_T - \bar{Y}_C$.

Simple Clinical Trial. The null hypothesis is $H_0: C_l = 0$. Let $\hat{\sigma}^2_{\hat{\Delta}}$ estimate the variance of the estimated intervention effect $\hat{\Delta}$. Assuming that the residual error follows a Gaussian distribution, the intervention effect is evaluated using a t-statistic with the appropriate degrees of freedom. The researcher determines the desired Type I and II error rates (say 5% and 20%, respectively). The researcher expects a negative intervention effect but would be concerned about a positive effect; as a result, she chooses a two-tailed test.

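Simple Clinical Trial. Given the random assignment of members to treatment and control conditions, it is reasonable to assume that the two study conditions are independent. Then the variance of the estimated intervention effect $\hat{\Delta}$ is the sum of the variances of the two condition means:

$$\sigma^2_{\hat{\Delta}} = \sigma^2_{\bar{Y}_T} + \sigma^2_{\bar{Y}_C}$$

The estimated variance of a single condition mean is $\hat{\sigma}^2_{\bar{Y}} = \hat{\sigma}^2_e / m$, where $\hat{\sigma}^2_e$ is the estimated residual variance and m is the number of members in the condition. If we assign the same number of members to each condition and the variances in the two conditions are assumed to be equal:

$$\hat{\sigma}^2_{\hat{\Delta}} = \frac{2\,\hat{\sigma}^2_e}{m}$$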

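Simple Clinical Trial. Then, the t-statistic is estimated as:

$$\hat{t} = \frac{\hat{\Delta}}{\sqrt{2\,\hat{\sigma}^2_e / m}}$$

where $\hat{\Delta}$ is the estimated intervention effect, $\hat{\sigma}^2_e$ is the estimated residual variance, and m is the number of members per condition. The parameters appearing in this formula are relatively easy to estimate using data from previous reports, from analyses of existing data, or from preliminary studies.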

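Simple Clinical Trial. The true Type I and II error rates will be $\alpha$ and $\beta$, respectively, if the expected effect $\Delta$ and the standard error of its estimate $\sigma_{\hat{\Delta}}$ satisfy:

$$\frac{|\Delta|}{\sigma_{\hat{\Delta}}} = t^*_{\alpha/2} + t^*_{\beta}$$

[Figure: distribution of $t$ under $H_0$ and under $H_A$, with the critical values $\pm t^*_{\alpha/2}$, the no-rejection and rejection regions, the power $1-\beta$, and the point $-t^*_{\beta}$ marked.]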

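General Expression. Or, equivalently:

$$\Delta^2 = \sigma^2_{\hat{\Delta}}\,\left(t^*_{\alpha/2} + t^*_{\beta}\right)^2$$

where $\Delta$ is the expected treatment effect and $\sigma^2_{\hat{\Delta}}$ is the variance of its estimate. This expression is general to any design. We need:
– Desired Type I and II error rates.
– The expected magnitude of the treatment effect.
– The expression for the variance of the estimated treatment effect, which is a function of the sample size.
We can express any of these variables in terms of the others.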
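One way to read "express any of these variables in terms of the others" is to fix the error rates and the standard error of the estimated effect, and solve for the smallest effect the design can detect. The sketch below does that under a normal approximation to the t critical values (an assumption of this example); the function name and arguments are hypothetical.

```python
from statistics import NormalDist


def minimum_detectable_effect(se_effect: float,
                              alpha: float = 0.05,
                              beta: float = 0.20) -> float:
    """Smallest |effect| detectable with Type I rate alpha and Type II rate beta,
    given the standard error of the estimated effect (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(1 - beta)         # quantile matching power 1 - beta
    return (z_alpha + z_beta) * se_effect


# With 5% and 20% error rates the multiplier is about 2.8 standard errors.
print(round(minimum_detectable_effect(1.0), 2))  # 2.8
```

Any design-specific formula for the standard error can be plugged in as `se_effect`; only that term changes across designs.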

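Simple Clinical Trial. Sample size (members per condition), for an expected effect $\Delta$ and residual variance $\sigma^2_e$:

$$m = \frac{2\,\sigma^2_e\,\left(t^*_{\alpha/2} + t^*_{\beta}\right)^2}{\Delta^2}$$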
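A small worked example of the sample-size calculation for the simple trial, assuming equal allocation, equal variances in the two conditions, and normal quantiles in place of t quantiles (the last is a simplification made here, not something the slides state). The names `delta` and `sigma2` are illustrative.

```python
import math
from statistics import NormalDist


def members_per_condition(delta: float, sigma2: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Members needed in each condition to detect an effect of size delta
    when the residual variance is sigma2 (two-sided test, normal approximation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)
    return math.ceil(2 * sigma2 * z_sum ** 2 / delta ** 2)


print(members_per_condition(delta=0.5, sigma2=1.0))    # 63
print(members_per_condition(delta=0.25, sigma2=1.0))   # 252
```

Halving the expected effect roughly quadruples the required sample, because the effect enters the formula squared.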

Sample Size in GRT (group-randomized trial). Assume that there is only one individual per household. The outcome is the probability that an individual has diarrhea, observed for individual i in group k assigned to condition l. Within each condition, the variance of any given observation is $\sigma^2_y = \sigma^2_e + \sigma^2_g$, where $\sigma^2_e$ stands for the variance within groups and $\sigma^2_g$ for the variance between groups.

Sample Size in GRT. Consider first the group mean. If that mean were based on m independent observations, the variance of that mean would be estimated as $\hat{\sigma}^2_y / m$, where $\hat{\sigma}^2_y$ is the estimated variance of a single observation. However, because the members within an identifiable group almost always show positive intra-class correlation, those observations are not independent. In fact, only the variance attributable to the individual effect will vanish as m increases. The variance attributable to the group effect will remain unaffected.

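Sample Size. Then, the variance of the group mean is:

$$\sigma^2_{\bar{y}_g} = \frac{\sigma^2_y}{m}\,\left[1 + (m-1)\,\mathrm{ICC}\right]$$

where $\sigma^2_y$ is the variance of a single observation, m stands for the number of households per group, and ICC for the intra-group correlation. The variance of the condition mean is:

$$\sigma^2_{\bar{y}_c} = \frac{\sigma^2_y}{m\,g}\,\left[1 + (m-1)\,\mathrm{ICC}\right]$$

where g is the number of groups in each condition.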

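Sample Size. When ICC > 0, the variance of the condition mean is always larger in a GRT than in a study based on random assignment of the same number of individuals to the study conditions. The statistic of interest is the difference between the condition means, $\hat{\Delta} = \bar{Y}_T - \bar{Y}_C$. With m households per group, g groups per condition, and observation variance $\sigma^2_y$, the variance of the statistic is:

$$\sigma^2_{\hat{\Delta}} = \frac{2\,\sigma^2_y}{m\,g}\,\left[1 + (m-1)\,\mathrm{ICC}\right]$$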
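The size of the penalty from group randomization can be illustrated with the usual variance inflation factor $1 + (m-1)\,\mathrm{ICC}$. The sketch below compares the variance of a condition mean under group randomization with the variance under individual randomization of the same number of people; the function names and the input values are hypothetical.

```python
def design_effect(m: int, icc: float) -> float:
    """Variance inflation from sampling m correlated members per group."""
    return 1 + (m - 1) * icc


def condition_mean_variance(sigma2_y: float, m: int, g: int, icc: float) -> float:
    """Variance of a condition mean with g groups of m members each."""
    return sigma2_y / (m * g) * design_effect(m, icc)


# Illustrative inputs: unit variance, 10 groups of 20 households, ICC = 0.05.
grt = condition_mean_variance(sigma2_y=1.0, m=20, g=10, icc=0.05)
individual = condition_mean_variance(sigma2_y=1.0, m=20, g=10, icc=0.0)

print(round(design_effect(20, 0.05), 2))   # 1.95
print(round(grt, 5), round(individual, 5)) # 0.00975 0.005
```

Even a small ICC nearly doubles the variance here, because the inflation grows with the number of members per group.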

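Sample Size. Given a moderate number of groups per condition, the t-statistic to assess the difference between condition means $\hat{\Delta}$ is:

$$\hat{t} = \frac{\hat{\Delta}}{\sqrt{\dfrac{2\,\hat{\sigma}^2_y}{m\,g}\,\left[1 + (m-1)\,\mathrm{ICC}\right]}}$$

It follows a Student's t distribution with $g_T + g_C - 2$ degrees of freedom.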

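Sample Size. Solving the general expression for the sample size, with expected effect $\Delta$, observation variance $\sigma^2_y$, and intra-group correlation ICC:
– Number of groups per condition:

$$g = \frac{2\,\sigma^2_y\,\left[1 + (m-1)\,\mathrm{ICC}\right]\left(t^*_{\alpha/2} + t^*_{\beta}\right)^2}{m\,\Delta^2}$$

– Number of households per group (it requires a couple of iterations):

$$m = \frac{2\,\sigma^2_y\,(1 - \mathrm{ICC})\left(t^*_{\alpha/2} + t^*_{\beta}\right)^2}{g\,\Delta^2 - 2\,\sigma^2_y\,\mathrm{ICC}\left(t^*_{\alpha/2} + t^*_{\beta}\right)^2}$$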
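The two sample-size calculations for a group-randomized trial can be sketched as follows. This is an illustration under stated assumptions: the variance of the estimated effect is taken to be $2\sigma^2_y[1+(m-1)\mathrm{ICC}]/(mg)$, and normal quantiles stand in for t quantiles so that no iteration over the degrees of freedom is needed. Function names and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def _z_sum_squared(alpha: float, beta: float) -> float:
    """(z_{alpha/2} + z_beta)^2 under the normal approximation."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)) ** 2


def groups_per_condition(delta, sigma2_y, m, icc, alpha=0.05, beta=0.20) -> int:
    """Groups needed per condition when each group has m households."""
    k = _z_sum_squared(alpha, beta)
    return math.ceil(2 * sigma2_y * (1 + (m - 1) * icc) * k / (m * delta ** 2))


def households_per_group(delta, sigma2_y, g, icc, alpha=0.05, beta=0.20) -> int:
    """Households needed per group when each condition has g groups."""
    k = _z_sum_squared(alpha, beta)
    denom = g * delta ** 2 - 2 * sigma2_y * icc * k
    if denom <= 0:
        # The between-group variance alone already exceeds what g groups can absorb.
        raise ValueError("too few groups: no number of households reaches the target power")
    return math.ceil(2 * sigma2_y * (1 - icc) * k / denom)


print(groups_per_condition(delta=0.25, sigma2_y=1.0, m=20, icc=0.05))   # 25
print(households_per_group(delta=0.25, sigma2_y=1.0, g=25, icc=0.05))   # 20
```

With exact t quantiles the critical values depend on the degrees of freedom, $g_T + g_C - 2$, so one would recompute them after each solution and repeat until the answer stabilizes. The `ValueError` branch reflects the fact that adding households cannot compensate for having too few groups.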

Sample Size. When each household has more than one observation, we need to apply a correction that depends on a, the number of observations per household, and on $\rho$, the intra-household correlation. See the extreme cases $\rho = 0$ and $\rho = 1$.

Pretest-Posttest: Repeat Observations on Groups. Data are collected in each condition before and after the intervention has been delivered in the intervention condition. There are repeated observations on the same groups.

Pretest-Posttest: Repeat Observations on Groups. The model:

$$Y_{i:jk:l} = \mu + C_l + T_j + TC_{jl} + G_{k:l} + TG_{jk:l} + e_{i:jk:l}$$

The observed value for the i-th member nested within the k-th group and l-th condition and observed at the j-th time is expressed as a function of the grand mean ($\mu$), the effect of the l-th condition ($C_l$), the effect of the j-th time ($T_j$), the joint effect of condition and time ($TC_{jl}$), the realized value of the k-th group ($G_{k:l}$), and the joint effect of group and time ($TG_{jk:l}$). Differences between this predicted value and the observed value are allocated to the residual error ($e_{i:jk:l}$).

Pretest-Posttest: Repeat Observations on Groups. Treatment effect: difference in differences (Dif-in-dif). There are two sources of variation among the groups:
– Variation due to the group effect
– Variation due to the group x time interaction
The first difference eliminates the first source of variation. The group mean is: This model can be easily transformed into the basic GRT.
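A minimal sketch of the difference-in-differences calculation on group means: the first difference (post minus pre, within each group) removes the fixed group effect, and the second difference compares the average change across conditions. The function name and the numbers are hypothetical, chosen only to show the arithmetic.

```python
from statistics import mean


def diff_in_diff(pre_t, post_t, pre_c, post_c):
    """Difference-in-differences estimate from per-group means.

    Each argument is a list with one mean per group; pre and post lists
    must list the same groups in the same order.
    """
    # First difference: change within each group (removes the group effect).
    change_t = [post - pre for pre, post in zip(pre_t, post_t)]
    change_c = [post - pre for pre, post in zip(pre_c, post_c)]
    # Second difference: treatment change minus control change.
    return mean(change_t) - mean(change_c)


# Hypothetical diarrhea prevalence by group, before and after the intervention.
effect = diff_in_diff(
    pre_t=[0.30, 0.28, 0.32], post_t=[0.20, 0.22, 0.21],
    pre_c=[0.31, 0.29, 0.30], post_c=[0.28, 0.27, 0.29],
)
print(round(effect, 3))  # -0.07
```

Because the group-level changes are the unit of analysis, the variation that remains among them is the group x time interaction plus sampling error.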

Pretest-Posttest: Repeat Observations on Groups. The variance of the group mean is: Following the same steps as before… The variance of the intervention effect can be written as: Sample size can be solved as before.

Pretest-Posttest: Repeat Observations on Members. Data are collected in each condition before and after the intervention has been delivered in the intervention condition. There are repeated observations on the same members.

Pretest-Posttest: Repeat Observations on Members. The model:

$$Y_{ij:k:l} = \mu + C_l + T_j + TC_{jl} + G_{k:l} + M_{i:k:l} + TG_{jk:l} + MT_{ij:k:l} + e_{ij:k:l}$$

The observed value for the i-th member nested within the k-th group and l-th condition and observed at the j-th time is expressed as a function of the grand mean ($\mu$), the effect of the l-th condition ($C_l$), the effect of the j-th time ($T_j$), the joint effect of condition and time ($TC_{jl}$), the realized value of the k-th group ($G_{k:l}$), the realized value of the i-th member ($M_{i:k:l}$), the joint effect of group and time ($TG_{jk:l}$), and the joint effect of member and time ($MT_{ij:k:l}$). Differences between this predicted value and the observed value are allocated to the residual error ($e_{ij:k:l}$).

Pretest-Posttest: Repeat Observations on Members. Treatment effect: difference in differences (Dif-in-dif). There are three sources of variation among the members:
– Variation due to the member effect
– Variation due to the member x time interaction
– Error term
The first difference eliminates the first source of variation.

Pretest-Posttest: Repeat Observations on Members. Taking differences by members: This model can be easily transformed into the basic GRT. The variance of the intervention effect can be written as: Sample size can be solved as before.
