Pre-analysis plans Module 8.3. Recap on statistics If we find a result is significant at the 5% level, what does this mean? – there is a 5% or less probability.

Presentation on theme: "Pre-analysis plans Module 8.3. Recap on statistics If we find a result is significant at the 5% level, what does this mean? – there is a 5% or less probability."— Presentation transcript:

Pre-analysis plans Module 8.3

Recap on statistics If we find a result is significant at the 5% level, what does this mean? – there is a 5% or less probability it is the result of chance If we test 10 independent hypotheses there is a 40% chance we will fail to reject the null at the 5% level at least once – i.e. 40% chance find one hypothesis is significant at the 5% level

Publication bias: what is it? Evaluations with positive or negative significant impacts more likely to get published If enough evaluations are run on a given type of program, some will give positive result by chance Those with a vested interest may even deliberately run many studies and publicize only some of the results Published results will suggest the program is effective even if it isn’t

Study registers as a solution All studies are registered before the results are known Shows how many studies examined a type of program If 20 studies are started on a given question and only one positive study is published, we may worry the others found no result Incentives for registration – Some medical journals require registration before study started for publication New registries for RCTs in social sciences – E.g by the American Economic Association

Coverage of registries Q: what are the advantages of having a registry that only covers RCTs? Q: Is publication bias only a problem for RCTs? – If not why not have a registry for non RCTs? Q: Should registration only be allowed before the start of a study?

Data mining: what is it? Looking at the data many different ways, trying to find the result you want If test impact of program on many different outcomes, some will show positive (or negative) impact by chance If test impact of program on many different subgroups, some will show positive (or negative) impact by chance We may be falsely accused of data mining – E.g. we test one subgroup and report the results but readers think maybe we tested many and only reported the one that was significant

Preanalysis plans as a solution Write down in advance how the data will be analyzed – What outcomes are of primary interest – What subgroups will be examined Register the plan with some external organization When presenting results, show all those covered in the PAP – highlight any deviations from or additions to the PAP

When is a PAP most useful? When a study has a large number of outcomes with no obvious hierarchy of which are the most important When researchers know they are interested in differential impact on different subgroups (heterogeneous treatment effects) When researchers are concerned others will push them to find positive impacts When want to adjust statistical tests for multiple hypothesis tests

Incentives for immunization (no PAP) Program to increase immunization in rural India with two arms – Predictable and reliable immunization camps – Camps plus 1kg lentils for each shot and set of plates on completion of full immunization One main outcome (immunization), more than one indicator – Ever received any immunization – % with full immunization Showing two indicators illustrated why the program worked – Regular camps increased children with at least one shot – Incentive increased % with full immunization – Suggests unwillingness to immunize not main barrier to full immunization, instead incentive helped maintain persistence

Results: number of immunizations Source: Banerjee et al 2010

Results fully immunized Source: Banerjee et al 2010

Community driven development (with PAP) Evaluation of the impact of a CDD program including: – Quality and quantity of public goods – Social capital – Trust – Participation Each outcome area had many indicators, eg many public goods – >300 outcome indicators in total Several important subgroups to examine: – Women and youth primary targets – Program implemented in two very different regions

When should a PAP be written? Before the baseline? – Prevents us from learning about what questions have low response rates or inconsistent answers Before the program starts? – prevents researchers taking advantage of random shocks which happen to mainly impact treatment or comparison – Also prevents researchers thinking of new hypotheses, e.g. unthought-of of negative consequences Before looking at any data? – Can be useful to look at comparison data to determine appropriate control variables – drop variables where little chance of improvement (e.g. 95% of control already do)

What should be covered in a PAP? Main outcome measures Which outcome measures are primary and which secondary The precise composition Subgroups to be analyzed Direction of expected impact if we want to use a one- sided test Primary specification to be used – What control variables to include – How is the outcome variable specified, in logs, changes etc

Disadvantages of a PAP Any analysis that is not included in the PAP has less credibility – Only do a PAP if you have the time to think it through carefully Sometimes patterns in the data tell consistent stories we never thought of – We might want the flexibility to pursue these With complex evaluations it can be hard to think through all outcome combinations and how analysis should proceed with each – We may do different secondary analysis if the impact is positive vs. negative – One option is to do PAP in stages, look at some data, then write another PAP

Download ppt "Pre-analysis plans Module 8.3. Recap on statistics If we find a result is significant at the 5% level, what does this mean? – there is a 5% or less probability."

Similar presentations