Upper bound of the bias between the ATEs using estimated IPW and true IPW


This is a causal inference problem.

Assume we have $n$ i.i.d. subjects $i=1,\dots,n$. For each subject we observe a feature vector $X_i \in \mathbb{R}^p$, a response $Y_i$, and a treatment assignment $W_i \in \{0,1\}$. The potential outcomes are $Y_i(0)$ and $Y_i(1)$, and $Y_i=Y_i(W_i)$.

ATE is the average treatment effect. Under unconfoundedness, $\{Y_i(0),Y_i(1)\}\perp W_i \mid X_i$, we have ATE $=\tau=E[Y_i(1)-Y_i(0)]=E\left[\frac{W_iY_i}{e(X_i)}-\frac{(1-W_i)Y_i}{1-e(X_i)}\right]$, where $e(x)=P[W_i=1\mid X_i=x]$ is the propensity score.

The inverse-propensity weighted (IPW) estimator is unbiased if we know the propensity score $e(\cdot)$: $\hat\tau_{IPW}^{\ast}=\frac{1}{n}\sum^n_{i=1}\left(\frac{W_iY_i}{e(X_i)}-\frac{(1-W_i)Y_i}{1-e(X_i)}\right)$, with $E[\hat\tau_{IPW}^{\ast}]=\tau$. $\hat\tau_{IPW}^{\ast}$ is the oracle IPW estimator.

If we first estimate the propensity score by $\hat e(\cdot)$, the inverse-propensity weighted estimator becomes $\hat\tau_{IPW}=\frac{1}{n}\sum^n_{i=1}\left(\frac{W_iY_i}{\hat e(X_i)}-\frac{(1-W_i)Y_i}{1-\hat e(X_i)}\right)$.

Since $\hat e(\cdot)$ is estimated, it differs from the true propensity score $e(\cdot)$. Therefore, the IPW estimator of the ATE based on the estimated propensity score, $\hat\tau_{IPW}$, differs from the one based on the true propensity score, $\hat\tau_{IPW}^{\ast}$.

The difference between them is $\hat\tau_{IPW}-\hat\tau_{IPW}^{\ast}=\frac{1}{n}\sum_{i=1}^n((\frac{W_i}{\hat e(X_i)}-\frac{1-W_i}{1-\hat e(X_i)})-(\frac{W_i}{e(X_i)}-\frac{1-W_i}{1-e(X_i)}))Y_i$.

Assume that both $e(X_i)$ and $\hat e(X_i)$ satisfy the overlap condition, which means $0<\eta\leq e(X_i),\ \hat e(X_i)\leq1-\eta<1$. Then, by Cauchy-Schwarz,

$\left|\hat\tau_{IPW}-\hat\tau_{IPW}^{\ast}\right|\leq\sqrt{\frac{1}{n}\sum_{i=1}^n\left(\left(\frac{W_i}{\hat e(X_i)}-\frac{1-W_i}{1-\hat e(X_i)}\right)-\left(\frac{W_i}{e(X_i)}-\frac{1-W_i}{1-e(X_i)}\right)\right)^2}\cdot\sqrt{\frac{1}{n}\sum_{i=1}^nY_i^2}$ (1)

$\asymp \sqrt{\frac{1}{n}\sum_{i=1}^n(\hat e(X_i)-e(X_i))^2}$ (2)
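To make the quantities in (1) and (2) concrete, here is a minimal numerical sketch. Everything about the setup is an assumption for illustration only: the data-generating process, the overlap constant $\eta=0.1$ (enforced by clipping both $e$ and $\hat e$), the hand-rolled Newton logistic fit used as $\hat e$, and the helper names `simulate`, `fit_logistic`, `ipw_weights`, and `compare` are all hypothetical and do not come from the references.

```python
import numpy as np

ETA = 0.1  # assumed overlap constant: eta <= e, e_hat <= 1 - eta


def simulate(n, rng):
    """Draw (X, W, Y, e) from a hypothetical model with known propensity e."""
    X = rng.normal(size=(n, 2))
    logits = 0.5 * X[:, 0] - 0.25 * X[:, 1]
    e = np.clip(1.0 / (1.0 + np.exp(-logits)), ETA, 1.0 - ETA)
    W = rng.binomial(1, e)
    Y = X[:, 0] + 2.0 * W + rng.normal(size=n)
    return X, W, Y, e


def fit_logistic(X, W, n_iter=25):
    """Estimate the propensity by logistic regression (Newton's method)."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (W - p)
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z
        beta += np.linalg.solve(hess, grad)
    p = 1.0 / (1.0 + np.exp(-Z @ beta))
    # Clip so that the estimate also satisfies the assumed overlap condition.
    return np.clip(p, ETA, 1.0 - ETA)


def ipw_weights(W, e):
    """Per-subject IPW weight W/e - (1 - W)/(1 - e)."""
    return W / e - (1 - W) / (1 - e)


def compare(n=5000, seed=0):
    rng = np.random.default_rng(seed)
    X, W, Y, e = simulate(n, rng)
    e_hat = fit_logistic(X, W)
    delta = ipw_weights(W, e_hat) - ipw_weights(W, e)
    weight_rms = np.sqrt(np.mean(delta ** 2))
    return {
        "gap": np.mean(delta * Y),                          # tau_hat - tau_hat_star
        "bound": weight_rms * np.sqrt(np.mean(Y ** 2)),     # right-hand side of (1)
        "weight_rms": weight_rms,                           # first factor in (1)
        "rmse": np.sqrt(np.mean((e_hat - e) ** 2)),         # quantity in (2)
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.5f}")
```

In this sketch `weight_rms` and `rmse` should differ only by a factor between $1$ and $1/\eta^2$, which is what (2) would suggest; it is a sanity check on one simulated dataset, not a proof.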

I know the inequality in (1) follows from Cauchy-Schwarz. Can anyone show a proof of the asymptotic relation in (2)? Thanks a lot.
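For what it is worth, here is a sketch of one possible route. It assumes that $\hat e$ also satisfies $\eta\leq\hat e(X_i)\leq 1-\eta$ and that $0<E[Y_i^2]<\infty$; treat both as assumptions of the sketch. Putting the weight difference over a common denominator gives the exact identity

$$\left(\frac{W_i}{\hat e(X_i)}-\frac{1-W_i}{1-\hat e(X_i)}\right)-\left(\frac{W_i}{e(X_i)}-\frac{1-W_i}{1-e(X_i)}\right)=\left(e(X_i)-\hat e(X_i)\right)\left[\frac{W_i}{e(X_i)\,\hat e(X_i)}+\frac{1-W_i}{(1-e(X_i))(1-\hat e(X_i))}\right].$$

Since $W_i\in\{0,1\}$, exactly one term in the bracket is active, and under the assumed overlap for both $e$ and $\hat e$ it lies in $\left[(1-\eta)^{-2},\ \eta^{-2}\right]$. Squaring and averaging then gives

$$\frac{1}{(1-\eta)^{2}}\sqrt{\frac{1}{n}\sum_{i=1}^n\left(\hat e(X_i)-e(X_i)\right)^2}\ \leq\ \sqrt{\frac{1}{n}\sum_{i=1}^n\left(\cdots\right)^2}\ \leq\ \frac{1}{\eta^{2}}\sqrt{\frac{1}{n}\sum_{i=1}^n\left(\hat e(X_i)-e(X_i)\right)^2},$$

where $(\cdots)$ is the weight difference on the left-hand side of the identity. By the law of large numbers, $\frac{1}{n}\sum_{i=1}^nY_i^2\to E[Y_i^2]$, a positive constant, so the second factor in (1) is bounded above and below by constants for large $n$. Together this would make the right-hand side of (1) of the same order as (2), with constants depending only on $\eta$ and $E[Y_i^2]$.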

Possible references are https://web.stanford.edu/~swager/stats361.pdf (pages 14 and 53) and https://drive.google.com/file/d/11eBqcaexYWX0OYACwUQ6fRHK2Or5foLa/view (page 7).