Derive the Hajek projection of $T_n$.


Let $X_1, \dots , X_n$ be i.i.d. copies of $X$ with distribution function $F$ and density $f$. Let $(X_{1:n}, \dots , X_{i:n}, \dots , X_{n:n})$ be the order statistics. For a given $p \in (0, 1)$, consider the Harrell-Davis estimator defined by

$$T_n = \sum_{i=1}^n c_{ni}X_{i:n},$$

where

$$c_{ni} = \frac{\Gamma(n + 1)}{\Gamma(k)\Gamma(n - k + 1)} \int_{(i-1)/n}^{i/n} u^{k-1}(1-u)^{n-k} \, du, \quad i = 1,\dots , n, \quad k = [np].$$
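To make the weights concrete, here is a minimal Python sketch of the definition above. It relies on two assumptions that are not stated in the question: $[np]$ is read as the floor of $np$, and $1 \le k \le n$. Since $k$ is an integer, the integrand is the Beta$(k, n-k+1)$ density, whose CDF equals the binomial tail $\sum_{j=k}^n \binom{n}{j} x^j (1-x)^{n-j}$, so only the standard library is needed. The function names are hypothetical.

```python
from math import comb


def beta_cdf_int(x, k, n):
    """CDF of Beta(k, n - k + 1) at x, for integer 1 <= k <= n.

    Uses the identity I_x(k, n - k + 1) = P(Binomial(n, x) >= k).
    """
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))


def hd_weights(n, p):
    """Weights c_{n1}, ..., c_{nn} from the definition above."""
    k = int(n * p)  # assumption: [np] means floor(np)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= [np] <= n")
    return [
        beta_cdf_int(i / n, k, n) - beta_cdf_int((i - 1) / n, k, n)
        for i in range(1, n + 1)
    ]


def harrell_davis(sample, p):
    """T_n = sum_i c_{ni} X_{i:n}: a weighted sum of the order statistics."""
    xs = sorted(sample)
    return sum(c * x for c, x in zip(hd_weights(len(xs), p), xs))
```

The weights are nonnegative and sum to 1, so $T_n$ is a weighted average of the order statistics with most of the weight near $X_{k:n}$.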

I have three questions regarding this. I cannot seem to find clear answers to them in my textbook.

1st question: Which parameter $\theta$ is estimated? I may have misinterpreted something, but it is unclear to me what we are estimating.

2nd question: What is the definition of the Hajek projection? I get that the Hajek projection has to do with the Hoeffding Decomposition, but I would really like a definition of the Hajek projection without delving into Hoeffding Decomposition.

3rd question: How does one derive the Hajek projection of $T_n$ in this example?


There are 1 best solutions below


Answering in an itemized list:

  1. $\theta$ is the expectation of your estimator, $T_n$. I'm not familiar with the Harrell-Davis estimator, but a quick Google search indicates that it estimates the quantiles of a distribution, so here the target would be the $p$-th quantile of $F$.
  2. The Hajek projection of a random variable $T$ is its $L^2$ projection onto the class $\mathcal{S}$, the set of all variables of the form $\sum_{i=1}^n g_i(X_i)$, where the $X_i$ are independent and each $g_i(X_i)$ is square-integrable. In other words, it is the element of $\mathcal{S}$ closest to $T$ in mean square. If you've had a class in linear modeling, you will have seen the "Projection Theorem" for vectors and matrices. The Hajek projection is a more general form of this.
  3. To calculate it directly, the Hajek projection principle can be used: $$\hat{S} = \sum_{i=1}^n E \left[ T_n | X_i \right] - \left(n-1\right) E\left[ T_n \right].$$ Alternatively, though probably not what is intended here, one could express $T_n$ as a U-statistic and use the projection formula specific to U-statistics.
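The formula in item 3 can be checked numerically on a toy case. The sketch below is only an illustration under stated assumptions: it replaces the continuous $F$ of the question with a hypothetical three-point discrete distribution so that every conditional expectation can be computed by exhaustive enumeration, fixes $n = 4$ and $k = 2$ arbitrarily, and uses hypothetical function names. It builds $\hat{S}$ from $E[T_n \mid X_i]$ and $E[T_n]$ and then verifies the defining property of a projection, namely that $T_n - \hat{S}$ is orthogonal to every function of a single $X_j$.

```python
from itertools import product
from math import comb, prod


def hd_weights(n, k):
    """Weights c_{ni} for integer k, via the Beta/binomial-tail identity."""
    def cdf(x):
        return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(k, n + 1))
    return [cdf(i / n) - cdf((i - 1) / n) for i in range(1, n + 1)]


def hajek_projection(stat, support, probs, n):
    """Return (s_hat, mean, outcomes) for i.i.d. discrete X_1, ..., X_n.

    s_hat(xs) = sum_i E[T | X_i = xs[i]] - (n - 1) * E[T].
    """
    pmf = dict(zip(support, probs))
    # All n-tuples of values with their joint probabilities.
    outcomes = [(xs, prod(pmf[x] for x in xs)) for xs in product(support, repeat=n)]
    mean = sum(w * stat(xs) for xs, w in outcomes)
    # cond[i][x] = E[T | X_i = x], by summing over outcomes with xs[i] == x.
    cond = [
        {x: sum(w * stat(xs) for xs, w in outcomes if xs[i] == x) / pmf[x]
         for x in support}
        for i in range(n)
    ]

    def s_hat(xs):
        return sum(cond[i][x] for i, x in enumerate(xs)) - (n - 1) * mean

    return s_hat, mean, outcomes


# Toy setup (all values are illustrative assumptions).
n, k = 4, 2
weights = hd_weights(n, k)
support, probs = [0, 1, 2], [0.2, 0.5, 0.3]


def t_n(xs):
    return sum(c * x for c, x in zip(weights, sorted(xs)))


s_hat, mean, outcomes = hajek_projection(t_n, support, probs, n)

# E[(T_n - S_hat) * 1{X_j = x}] should vanish for every j and x.
max_residual = max(
    abs(sum(w * (t_n(xs) - s_hat(xs)) for xs, w in outcomes if xs[j] == x))
    for j in range(n)
    for x in support
)
mean_of_projection = sum(w * s_hat(xs) for xs, w in outcomes)
```

Here `max_residual` should be zero up to floating-point error, and `mean_of_projection` should agree with `mean`, reflecting that $\hat{S}$ and $T_n$ share the same expectation. For a continuous $F$, the same recipe applies with the sums replaced by integrals.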