Today, I share yet another problem that I encountered in real life (slightly modified), but which would make an excellent addition to a problem set.
Exercise 1 (Inverse propensity weighted regression): You want to test the effect of treatment 1 on an individual’s responsiveness to treatment 2. You therefore run a series of experiments where you randomize both treatments 1 and 2. However, due to operational constraints, the probability of receiving treatment 1 differs across experiments (but the probability of getting treatment 2 is stable). Let respectively be dummies for treatments 1 and 2.
- Model this problem as in Rubin (1974), and give a counterexample to show that in general, fitting the model
using OLS and looking at
will give the wrong answer.
- Let
index the experiments you are trying to pool together and let
be the probability of ending up in treatment within a given experiment. Consider the following procedure: Estimate the model
via OLS on the treated individuals across experiments. Next, estimate the model
via OLS. Show that under the standard regularity conditions,
consistently estimates the difference in differences in treatment effects between
and
. Will this estimator be unbiased?
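For intuition on the first part, here is a minimal simulation sketch of one possible counterexample. Everything in it is an assumption for illustration: it assumes the pooled model is the usual saturated regression of the outcome on the two dummies and their product, with the product coefficient as the quantity of interest. The two experiments, their treatment 1 probabilities (0.2 and 0.8), and the treatment 2 effects (0 and 2) are made-up numbers. Treatment 1 does nothing at all in this setup, so the true interaction is zero in both experiments.

```python
import numpy as np

def pooled_interaction_coefficient(n_per_experiment=200_000, seed=0):
    """Pool two hypothetical experiments and fit y ~ 1 + t1 + t2 + t1*t2 by OLS."""
    rng = np.random.default_rng(seed)
    p_t1 = [0.2, 0.8]       # assumed P(treatment 1) in each experiment
    t2_effect = [0.0, 2.0]  # assumed effect of treatment 2 in each experiment
    t1_parts, t2_parts, y_parts = [], [], []
    for p, d in zip(p_t1, t2_effect):
        t1 = (rng.random(n_per_experiment) < p).astype(float)
        t2 = (rng.random(n_per_experiment) < 0.5).astype(float)
        # Treatment 1 has no effect here, so the true interaction is zero.
        y = d * t2 + rng.normal(size=n_per_experiment)
        t1_parts.append(t1)
        t2_parts.append(t2)
        y_parts.append(y)
    t1 = np.concatenate(t1_parts)
    t2 = np.concatenate(t2_parts)
    y = np.concatenate(y_parts)
    X = np.column_stack([np.ones_like(y), t1, t2, t1 * t2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[3]  # coefficient on the t1 * t2 product

print(pooled_interaction_coefficient())
```

In this setup the pooled coefficient lands near 1.2 rather than 0: the treatment 1 group is drawn mostly from the second experiment (weights 0.2 and 0.8), the control group mostly from the first (weights 0.8 and 0.2), so the two groups see different average treatment 2 effects (1.6 versus 0.4) even though treatment 1 changes nothing.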
Exercise 2: Consider the setting above.
- Now, suppose you just want to estimate the treatment effect for treatment 1. Show that the following “regression through the mean” approach will be consistent: First, fit the model
via OLS. Second, fit the model
via OLS. Show that $\hat\alpha_t - \hat\alpha_c$ consistently estimates the pooled average treatment effect across experiments.
- The above estimator is almost the standard inverse propensity weighted difference in means. Give expressions for
and
.
- Use the previous part and Slutsky’s theorem to show the asymptotic equivalence of the two estimators. Are there any settings where we think one will perform better in finite samples?
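To see why the weighting matters for treatment 1 alone, the sketch below compares a naive pooled difference in means with two textbook inverse propensity weighted estimators: the unnormalized (Horvitz-Thompson) version and the self-normalized (Hájek) version. It is an illustration under made-up assumptions, not the regression procedure from the exercise: two experiments with treatment 1 probabilities 0.2 and 0.8, baseline outcomes 0 and 2, and a constant treatment effect of 1.

```python
import numpy as np

def compare_estimators(n_per_experiment=200_000, seed=0):
    """Return (naive, horvitz_thompson, hajek) estimates of the treatment 1 effect."""
    rng = np.random.default_rng(seed)
    p_t1 = [0.2, 0.8]      # assumed P(treatment 1) in each experiment
    baseline = [0.0, 2.0]  # assumed mean untreated outcome in each experiment
    true_effect = 1.0      # assumed constant effect of treatment 1
    t_parts, p_parts, y_parts = [], [], []
    for p, b in zip(p_t1, baseline):
        t = rng.random(n_per_experiment) < p
        y = b + true_effect * t + rng.normal(size=n_per_experiment)
        t_parts.append(t)
        p_parts.append(np.full(n_per_experiment, p))
        y_parts.append(y)
    t = np.concatenate(t_parts)
    p = np.concatenate(p_parts)
    y = np.concatenate(y_parts)

    # Naive: ignore that the propensity differs across experiments.
    naive = y[t].mean() - y[~t].mean()
    # Horvitz-Thompson: weight by 1/p (treated) and 1/(1-p) (control), divide by n.
    ht = np.mean(y * t / p) - np.mean(y * ~t / (1 - p))
    # Hajek: same weights, but normalize by the realized sum of weights.
    hajek = (np.average(y[t], weights=1 / p[t])
             - np.average(y[~t], weights=1 / (1 - p[~t])))
    return naive, ht, hajek

naive, ht, hajek = compare_estimators()
print(f"naive={naive:.3f}  HT={ht:.3f}  Hajek={hajek:.3f}")
```

With these numbers the naive difference converges to 2.2 (the treated group over-represents the high-baseline experiment), while both weighted versions converge to the true effect of 1. The two weighted estimators differ only in how they normalize, which is a natural place to look when thinking about finite-sample behavior.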
I never thought I would be using my causal inference skills to try to correct for someone else messing up their experiment design.