Propensity Score Perplexities

Today, I share yet another problem that I encountered in real life (slightly modified), but which would make an excellent addition to a problem set.

Exercise 1 (Inverse propensity weighted regression): You want to test the effect of treatment 1 on an individual’s responsiveness to treatment 2. You therefore run a series of experiments where you randomize both treatments 1 and 2. However, due to operational constraints, the probability of receiving treatment 1 differs across experiments (but the probability of getting treatment 2 is stable). Let D_1, D_2 respectively be dummies for treatments 1 and 2.

  1. Model this problem as in Rubin (1974), and give a counterexample to show that in general, fitting the model Y = \alpha + \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_1 D_2 + \varepsilon using OLS on the pooled data and looking at \beta_3 will give the wrong answer.
  2. Let e index the experiments you are trying to pool together and let p_e be the probability of receiving treatment 1 within experiment e. Consider the following procedure: Estimate the model \frac{Y_{i,e,t}}{p_e} = \alpha_{t} + \beta_t \frac{D_2}{p_e} + \varepsilon via OLS on the individuals who received treatment 1, pooled across experiments. Next, estimate the model \frac{Y_{i,e,c}}{1 - p_e} = \alpha_{c} + \beta_c \frac{D_2}{1 - p_e} + \varepsilon via OLS on the individuals who did not receive treatment 1. Show that under the standard regularity conditions, \hat\beta_t - \hat\beta_c consistently estimates the difference in the effect of treatment 2 between D_1 = 1 and D_1 = 0. Will this estimator be unbiased?
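
A quick simulation can make the problem in part 1 concrete. The sketch below is not a solution to the exercise, and every number in it is invented for illustration: two equally sized experiments, P(D_1 = 1) of 0.2 and 0.8, an effect of treatment 2 of 1 and 3 respectively, and no effect of treatment 1 at all, so the true interaction is zero. Under those assumptions the treatment 1 group is drawn 80% from the second experiment and the control group 80% from the first, so the pooled interaction coefficient converges to (0.2 * 1 + 0.8 * 3) - (0.8 * 1 + 0.2 * 3) = 1.2 rather than 0. For comparison, the sketch also fits the same model by ordinary inverse-propensity-weighted least squares. Note that this is the textbook weighting scheme swapped in for illustration, not the two-regression procedure described in part 2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: two equally sized experiments; only P(D1 = 1) differs.
n_per_exp = 100_000
p1 = np.array([0.2, 0.8])     # assumed P(D1 = 1) in each experiment
tau2 = np.array([1.0, 3.0])   # assumed effect of treatment 2 in each experiment
TRUE_INTERACTION = 0.0        # by construction, treatment 1 does nothing

e = np.repeat([0, 1], n_per_exp)          # experiment index per individual
d1 = rng.binomial(1, p1[e])               # treatment 1, experiment-specific odds
d2 = rng.binomial(1, 0.5, size=e.size)    # treatment 2, stable probability
y = tau2[e] * d2 + rng.normal(size=e.size)

# Saturated design matrix: intercept, D1, D2, D1 * D2.
X = np.column_stack([np.ones(e.size), d1, d2, d1 * d2])


def wls(X, y, w):
    """Weighted least squares, computed by scaling each row by sqrt(w)."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef


# Unweighted pooled OLS (all weights equal to one).
beta_ols = wls(X, y, np.ones(e.size))

# Standard inverse propensity weights for treatment 1.
w_ipw = np.where(d1 == 1, 1.0 / p1[e], 1.0 / (1.0 - p1[e]))
beta_ipw = wls(X, y, w_ipw)

print("true interaction:        ", TRUE_INTERACTION)
print("pooled OLS interaction:  ", beta_ols[3])
print("IPW-weighted interaction:", beta_ipw[3])
```

The weights undo the fact that the D_1 = 1 and D_1 = 0 groups are drawn from different mixes of experiments, which is the only source of the spurious interaction in this toy setup.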

Exercise 2: Consider the setting above.

  1. Now, suppose you just want to estimate the treatment effect for treatment 1. Show that the following “regression through the mean” approach will be consistent: First, fit the model \frac{Y_{i,e,t}}{p_e} = \frac{\alpha_{t}}{p_e} + \varepsilon via OLS. Second, fit the model \frac{Y_{i,e,c}}{1 - p_e} = \frac{\alpha_{c}}{1-p_e} + \varepsilon via OLS. Show that \hat\alpha_t - \hat\alpha_c consistently estimates the pooled average treatment effect across experiments.
  2. The above estimator is almost the standard inverse propensity weighted difference in means. Give expressions for \frac{\hat\mu_{IPWE,t}}{\hat\alpha_t} and \frac{\hat\mu_{IPWE,c}}{\hat\alpha_c}.
  3. Use the previous part and Slutsky’s theorem to show the asymptotic equivalence of the two estimators. Are there any settings where we think one will perform better in finite samples?
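
For readers who want something to check their algebra against, here is a minimal sketch of the standard inverse propensity weighted difference in means that part 2 refers to, in two common forms: the Horvitz-Thompson version, which divides by the total sample size n, and the self-normalized (Hájek) version, which divides by the sum of the weights. Reading \hat\mu_{IPWE} as the Horvitz-Thompson form is an assumption, as are the three experiments, their sizes, propensities, baselines, and effects, all of which are made up. The sketch deliberately does not implement the regression from part 1; it only shows the benchmark that regression is being compared with.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: three experiments with different sizes and propensities.
n_e = np.array([20_000, 50_000, 30_000])   # assumed experiment sizes
p_e = np.array([0.1, 0.5, 0.9])            # assumed P(D1 = 1) per experiment
base = np.array([0.0, 2.0, 4.0])           # assumed baseline outcome per experiment
tau1 = np.array([1.0, 2.0, 3.0])           # assumed effect of treatment 1 per experiment

e = np.repeat(np.arange(3), n_e)           # experiment index per individual
p = p_e[e]                                 # each individual's known propensity
d1 = rng.binomial(1, p)
y = base[e] + tau1[e] * d1 + rng.normal(size=e.size)
n = e.size

# Pooled average treatment effect: size-weighted average of the per-experiment effects.
true_ate = np.average(tau1, weights=n_e)

# Naive pooled difference in means, ignoring the differing propensities.
naive = y[d1 == 1].mean() - y[d1 == 0].mean()

# Horvitz-Thompson form: weighted sums divided by the total sample size.
mu_t_ht = np.sum(d1 * y / p) / n
mu_c_ht = np.sum((1 - d1) * y / (1 - p)) / n
ht = mu_t_ht - mu_c_ht

# Hajek form: weighted sums divided by the sum of the weights.
mu_t_hajek = np.sum(d1 * y / p) / np.sum(d1 / p)
mu_c_hajek = np.sum((1 - d1) * y / (1 - p)) / np.sum((1 - d1) / (1 - p))
hajek = mu_t_hajek - mu_c_hajek

print("true pooled ATE: ", true_ate)
print("naive difference:", naive)
print("Horvitz-Thompson:", ht)
print("Hajek:           ", hajek)
```

The two weighted estimators differ only in their denominators, and the ratio of those denominators is the kind of quantity a law of large numbers plus Slutsky argument handles. One concrete difference worth keeping in mind for the finite-sample question: adding a constant to every Y leaves the Hájek estimate unchanged but generally moves the Horvitz-Thompson one.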

I never thought I would be using my causal inference skills to try to correct for someone else messing up their experiment design.
