On the bounds of estimated conditional correlations and a follow-up question on the inferred properties of underlying structural parameters.

  • Framework

It is assumed that the data are Gaussian and follow the structural equation model (additive noise model) $$ Y = \sum_{j=1}^{p} X_j \theta_j + \epsilon, \qquad \|\theta\|_0 = s \ll n, $$ where $p>n$. Our goal is to find the set of covariates $X_j$ such that $\theta_j \neq 0$, thereby reducing the dimensionality of the predictors.

In the model selection process, a covariate $X_j$ is NOT selected if the conditional correlation with the target variable $Y$ given some set of controls $X_{\mathbf K}$ equals zero. The decision rule for selecting a covariate is as follows: for $j \in \{1,\dots,p\}$ and $\mathbf{K}\subseteq\{1,\dots,p\}\setminus \{j\}$, reject the null hypothesis $\mathrm{cor}(Y,X_j\mid X_\mathbf{K})=0$ against the two-sided alternative $\mathrm{cor}(Y,X_j\mid X_\mathbf{K})\neq 0$ if \begin{align} \sqrt{n-|\mathbf{K}|-3}\,\bigl|\mathrm{Z}(Y,X_j\mid X_\mathbf{K})\bigr| > \Phi^{-1}(1-\alpha /2), \end{align} where $\alpha$ is the significance level, $\Phi(\cdot)$ denotes the cdf of the standard Gaussian, and $\mathrm{Z}(\cdot)$ denotes Fisher's z-transform of the estimated conditional correlation $\hat r$, \begin{align} \mathrm{Z}(Y,X_j\mid X_\mathbf{K})=\frac{1}{2} \log\left(\frac{1+\hat{r}(Y,X_j\mid X_\mathbf{K})}{1-\hat{r}(Y,X_j\mid X_\mathbf{K})}\right). \end{align}
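For concreteness, the test can be sketched numerically. This is a minimal illustration (assuming NumPy/SciPy; the function and variable names are my own), with the conditional correlation estimated as the correlation of the OLS residuals of $Y$ and $X_j$ after regressing each on $X_{\mathbf K}$:

```python
import numpy as np
from scipy.stats import norm

def fisher_z_partial_corr_test(Y, Xj, XK, alpha=0.05):
    """Test H0: cor(Y, Xj | XK) = 0 via Fisher's z-transform.

    The partial correlation r_hat is computed as the correlation of the
    OLS residuals of Y and Xj after regressing each on XK (with intercept).
    Returns (r_hat, reject).
    """
    n = len(Y)
    k = XK.shape[1]
    D = np.column_stack([np.ones(n), XK])          # controls plus intercept
    resid = lambda v: v - D @ np.linalg.lstsq(D, v, rcond=None)[0]
    rY, rX = resid(Y), resid(Xj)
    r_hat = rY @ rX / np.sqrt((rY @ rY) * (rX @ rX))
    z = 0.5 * np.log((1 + r_hat) / (1 - r_hat))    # Fisher's z-transform
    # two-sided test at level alpha
    reject = np.sqrt(n - k - 3) * abs(z) > norm.ppf(1 - alpha / 2)
    return r_hat, reject
```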

  • Is the following correct?

It follows that the decision rule implies a threshold $t_{\alpha}:= \tanh\left(\frac{\Phi^{-1}(1-\alpha/2)}{\sqrt{n-|\mathbf K|-3}}\right)$ for the absolute value of the estimated conditional correlation, which satisfies $t_{\alpha}=O(\sqrt{1/n})$ for $|\mathbf K|\ll n$ and fixed $\alpha > 0$.
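One way to see this is to rewrite the decision rule in terms of $\hat r$, using that Fisher's z-transform is $\operatorname{artanh}$ and that $\tanh(x)\le x$ for $x\ge 0$:

```latex
\begin{align*}
\sqrt{n-|\mathbf K|-3}\,\bigl|\operatorname{artanh}\hat r\bigr| &> \Phi^{-1}(1-\alpha/2)\\
\iff\quad |\hat r| &> \tanh\!\left(\frac{\Phi^{-1}(1-\alpha/2)}{\sqrt{n-|\mathbf K|-3}}\right) =: t_\alpha,\\
t_\alpha \;\le\; \frac{\Phi^{-1}(1-\alpha/2)}{\sqrt{n-|\mathbf K|-3}}
  &= O\!\left(\sqrt{1/n}\right) \quad\text{for } |\mathbf K|\ll n,\ \alpha \text{ fixed}.
\end{align*}
```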

  • Given the above holds, can the following be inferred?

Notational remark: $M_\mathbf{K}=I-X_\mathbf{K}(X_\mathbf{K}'X_\mathbf{K})^{-1}X_\mathbf{K}'$ denotes the annihilator (residual-maker) matrix of $X_\mathbf{K}$.

If $|\hat r(Y,X_j|X_{\mathbf K})| \leq t_{\alpha}$, then $\hat r(Y,X_j|X_{\mathbf K})= O(\sqrt{1/ n})$. Since $\hat r(Y,X_j|X_{\mathbf K})$ is a multiplicative function of $\theta_j$, $\mathrm{Var}(M_{\mathbf K}X_j)^{-1}$, and $\mathrm{Cov}(M_{\hat {\mathbf S}}X_j, M_{\mathbf K}X_k)$ with $k\in \mathbf K$, and the latter two factors are each bounded by a constant (by assumption), we have that $\theta_j=O(\sqrt{1/n})$.
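As a quick sanity check of the $O(\sqrt{1/n})$ scaling that this inference rests on: with $\alpha$ fixed (taken here as $0.05$) and a fixed control-set size (taken here as $|\mathbf K|=5$, both illustrative choices), the quantity $t_\alpha \sqrt{n}$ should stabilize near $\Phi^{-1}(1-\alpha/2) \approx 1.96$ as $n$ grows:

```python
import numpy as np
from scipy.stats import norm

def threshold(n, k, alpha=0.05):
    # t_alpha = tanh( Phi^{-1}(1 - alpha/2) / sqrt(n - |K| - 3) )
    return np.tanh(norm.ppf(1 - alpha / 2) / np.sqrt(n - k - 3))

# t_alpha * sqrt(n) should approach Phi^{-1}(0.975) ~ 1.96
for n in (100, 1_000, 10_000, 100_000):
    print(n, threshold(n, k=5) * np.sqrt(n))
```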