David Yanagizawa-Drott
Assistant Professor of Public Policy
Evidence for Policy Design, Harvard Kennedy School
August 28, 2012
Lecture Overview
How to evaluate an evaluation
Internal validity
What type of evaluation? What is used as the counterfactual? What are the implicit assumptions?
External validity
Evaluating an Evaluation
Many impact evaluations are not rigorously designed to best estimate true program impact. It is key to know what the causal impact of a program is, in order to decide when and how much to invest in it. As consumers of evaluations, we must be careful about the evidence we accept.
Evaluating an Evaluation
Evaluation Example
Intervention: Suppose ADB would like to know the impact of this workshop on all of you
Ask Yourself
1. What type of evaluation? Simple Difference
2. What represents the counterfactual? The outcome of interest for the comparison group after the intervention.
3. What assumptions must hold? Participants and comparison group are similar except for program participation.
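As a minimal illustration of the Simple Difference calculation, the sketch below uses hypothetical test scores; the estimate is simply the difference in mean outcomes between participants and the comparison group.

    # Simple Difference sketch with hypothetical post-workshop test scores.
    import numpy as np

    participant_scores = np.array([78, 85, 82, 90, 74])  # hypothetical workshop participants
    comparison_scores = np.array([70, 72, 68, 75, 71])   # hypothetical non-participants

    # The estimate attributes the entire gap to the program, which is only
    # valid if the two groups are otherwise similar.
    simple_difference = participant_scores.mean() - comparison_scores.mean()
    print(f"Simple Difference estimate: {simple_difference:.1f} points")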
Which of the following might be a problem for the evaluation's internal validity?
1. Similarly high proportion of women
2. Staying on lower floors
3. Less evaluation experience
4. Also Asian policymakers
Pre-Post Evaluation
Instead of a Simple Difference study, suppose you are assessing an evaluation that used the Pre-Post method. The comparison group:
Test results of participants right before the workshop
Pre-Post Evaluation
Ask Yourself
1. What type of evaluation? Pre-Post Evaluation
2. What represents the counterfactual? The same individuals before the intervention.
3. What assumptions must hold? No other factor contributed to any change in the measured outcome over time.
Threat: the change in outcome may simply reflect processes that occur over time and are unrelated to the program.
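A minimal Pre-Post sketch, using hypothetical scores for the same individuals before and after the program; the estimate is the average change over time.

    # Pre-Post sketch with hypothetical scores for the same five individuals.
    import numpy as np

    pre_scores = np.array([60, 65, 58, 70, 62])   # right before the workshop
    post_scores = np.array([75, 80, 70, 85, 78])  # the same individuals afterwards

    # Attributes the whole change to the program; only valid if nothing else
    # affected the outcome over the same period.
    pre_post_estimate = (post_scores - pre_scores).mean()
    print(f"Pre-Post estimate: {pre_post_estimate:.1f} points")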
Evaluation Example
Intervention: Suppose the agriculture ministry has implemented a large-scale, six-month training program on how to use new hybrid seeds. Outcome of interest: Farm yields
Difference-in-Differences
Now suppose you are assessing an evaluation that used D-in-D. The comparison group:
Villages that did not participate in the training program
Their yields are measured before and after the training
Comparison of change in yields across the two types of villages
Difference-in-Differences
Estimated Impact: 2 tons per hectare
[Figure: yield trends over time for participating and non-participating villages]
Ask Yourself
1. What type of evaluation? Difference-in-Differences
2. What represents the counterfactual? The trend in outcome over the same period in the comparison group.
3. What assumptions must hold? Parallel trends assumption: outcomes would have followed the same trend in the absence of the program.
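The arithmetic behind a D-in-D estimate can be written out directly. The sketch below uses hypothetical mean yields chosen to reproduce the 2 tons per hectare figure above.

    # Difference-in-Differences sketch with hypothetical mean yields (tons/ha).
    participating_before, participating_after = 4.0, 7.0
    nonparticipating_before, nonparticipating_after = 3.5, 4.5

    change_participating = participating_after - participating_before              # 3.0
    change_nonparticipating = nonparticipating_after - nonparticipating_before     # 1.0

    # Impact = change in participating villages minus change in comparison villages.
    did_estimate = change_participating - change_nonparticipating
    print(f"D-in-D estimate: {did_estimate:.1f} tons per hectare")  # 2.0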
Difference-in-Differences
Suppose the program targeted villages that had been experiencing declining yields in previous years.
Is the fact that villages with declining yields were targeted a threat to internal validity?
1. Yes
2. No
3. Don't know
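One way to probe this concern, assuming yield data from the years before the program are available, is to compare pre-program trends in the two groups. The sketch below uses hypothetical pre-program yields.

    # Pre-trend check with hypothetical pre-program yields (tons/ha) over four years.
    import numpy as np

    participating_yields = np.array([5.0, 4.6, 4.3, 4.0])     # declining before the program
    nonparticipating_yields = np.array([3.4, 3.4, 3.5, 3.5])  # roughly flat

    # Average year-on-year change in each group before the program started.
    trend_participating = np.diff(participating_yields).mean()
    trend_nonparticipating = np.diff(nonparticipating_yields).mean()
    print(f"Pre-program trend, participating:     {trend_participating:+.2f} tons/ha per year")
    print(f"Pre-program trend, non-participating: {trend_nonparticipating:+.2f} tons/ha per year")
    # Clearly different pre-trends cast doubt on the parallel trends assumption.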
Randomized Evaluation
Now suppose you are assessing an evaluation that is a randomized experiment. The comparison group:
Villages that were randomly selected for the control group, and did not receive the training
Randomized Evaluation
1. What type of evaluation? Randomization
2. What represents the counterfactual? The participants assigned to the control group.
3. What assumptions must hold? Randomization worked; the two groups are identical on average on observed and unobserved factors.
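With random assignment, the impact estimate is again a simple difference in means, but the control group is now a valid counterfactual by design. A minimal sketch with hypothetical yields:

    # Randomized evaluation sketch with hypothetical village-level yields (tons/ha).
    import numpy as np

    treatment_yields = np.array([6.8, 7.2, 6.5, 7.0, 6.9])  # randomly assigned to training
    control_yields = np.array([4.9, 5.1, 4.8, 5.2, 5.0])    # randomly assigned to control

    # Because assignment was random, the groups are comparable on average,
    # so the difference in means estimates the program impact.
    impact_estimate = treatment_yields.mean() - control_yields.mean()
    print(f"Randomized evaluation estimate: {impact_estimate:.1f} tons per hectare")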
Selective Attrition
Attrition is selective when it is related to being assigned to the program and is unaccounted for in the analysis.
Evaluating an evaluation
Making bigger assumptions (which are less likely to hold) increases the probability of incorrectly estimating the impact.
Always assess the assumptions necessary for an evaluation to see if they are likely to hold.
Across impact evaluation methods, assumptions are most easily assessed and most likely to hold in a randomized evaluation.
Evaluation Example
Intervention: The government decides to offer a subsidy for voluntary health insurance to people in rural areas. All households are eligible. Outcome of interest: Out-of-pocket health spending.
Evaluation method: Two years into the program, an evaluation compares spending for households that signed up for the program to spending for households that didn't sign up.
Estimated Impact: 25% reduction in spending
Is it likely that the evaluation produced a credible estimate of the program's impact? A. Yes B. No C. Don't know
The estimate is unlikely to capture the true program effect. Selection: Relatively sick people are more likely to sign up for the program. It could also be that poor people demand health insurance more. The comparison group is therefore not valid.
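The selection problem can be made concrete with a small, entirely hypothetical simulation: households that are sicker (an unobserved factor) are more likely to sign up, so the naive enrolled-versus-not comparison is biased even when the true effect of enrollment is set to zero.

    # Selection bias sketch: simulated households, true enrollment effect set to zero.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    sickness = rng.normal(size=n)                                  # unobserved health status
    signed_up = (sickness + rng.normal(size=n)) > 0.5              # sicker households enroll more
    spending = 100 + 40 * sickness + rng.normal(scale=10, size=n)  # enrollment itself adds nothing

    naive_gap = spending[signed_up].mean() - spending[~signed_up].mean()
    print(f"Naive enrolled-vs-not gap: {naive_gap:+.1f} (nonzero despite a zero true effect)")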
Lecture Overview
How to evaluate an evaluation
Internal validity
What type of evaluation? What is used as the counterfactual? What are the implicit assumptions?
External validity
External Validity
Population
Setting
Scale
Take-Aways
Some impact evaluation methods are generally considered more reliable than others.
Some methods, such as Simple Difference and Pre-Post, can easily provide false conclusions about program impact.
Always identify and test your assumptions in order to assess the strength of an impact evaluation.
When evaluating an evaluation, first assess the internal validity, then the external validity.