Can we leverage a priori information about monotonicity of parameters to improve estimation?


Suppose $(X_t)_{t \in\mathbb{N}}$ and $(Y_t)_{t \in \mathbb{N}}$ are two independent sequences of i.i.d. Bernoulli random variables, the first one of parameter $x \in [0,1]$ and the second one of parameter $y \in [x,1]$ (so, in particular $\forall t \in \mathbb{N}, \mathbb{P}[X_t = 1] =x = (1-\mathbb{P}[X_t = 0])$ and $\forall t \in \mathbb{N}, \mathbb{P}[Y_t = 1] =y = (1-\mathbb{P}[Y_t = 0])$).

Our goal is to estimate $y-x$ using $X_1,\dots,X_{T_1},Y_1,\dots,Y_{T_2}$, where $T_1,T_2 \in \mathbb{N}$.

My first guess was to use the estimator $$\frac{1}{T_2} \sum_{t=1}^{T_2} Y_t - \frac{1}{T_1} \sum_{t=1}^{T_1} X_t$$ because, for any $\varepsilon > 0$, Hoeffding's inequality gives \begin{equation*} \mathbb{P}\Bigg[\bigg| \Big(\frac{1}{T_2} \sum_{t=1}^{T_2} Y_t - \frac{1}{T_1} \sum_{t=1}^{T_1} X_t\Big) -(y-x)\bigg| \ge \varepsilon\Bigg] \le 2 \exp\Bigg(-\frac{2\varepsilon^2}{\frac{1}{T_1}+\frac{1}{T_2}}\Bigg) \;, \end{equation*} which, for any $\delta \in (0,1)$, can be restated as \begin{equation*} \mathbb{P}\Bigg[\bigg| \Big(\frac{1}{T_2} \sum_{t=1}^{T_2} Y_t - \frac{1}{T_1} \sum_{t=1}^{T_1} X_t\Big) -(y-x)\bigg| \ge \sqrt{\frac{1}{2}\Big( \frac{1}{T_1}+\frac{1}{T_2}\Big)\log\Big(\frac{2}{\delta}\Big)}\Bigg] \le \delta\;. \end{equation*}
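As a sanity check, here is a minimal Monte Carlo sketch of this bound (the parameter values, sample sizes, and number of trials are arbitrary choices of mine) estimating the failure probability of the naive estimator, which should come out no larger than $\delta$:

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = 0.3, 0.5          # true parameters, with x <= y
T1, T2 = 50, 80
delta = 0.1
# deviation threshold from the restated Hoeffding bound
eps = np.sqrt(0.5 * (1 / T1 + 1 / T2) * np.log(2 / delta))

n_trials = 20_000
X = rng.random((n_trials, T1)) < x   # Bernoulli(x) samples, one row per trial
Y = rng.random((n_trials, T2)) < y   # Bernoulli(y) samples
est = Y.mean(axis=1) - X.mean(axis=1)
fail_rate = np.mean(np.abs(est - (y - x)) >= eps)
print(fail_rate, "<=", delta)
```

In practice the empirical failure rate comes out far below $\delta$, since Hoeffding's inequality ignores the variances $x(1-x)$ and $y(1-y)$.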

However, due to noise, it could very well happen that $\frac{1}{T_2} \sum_{t=1}^{T_2} Y_t - \frac{1}{T_1} \sum_{t=1}^{T_1} X_t <0$, in which case our estimate is inconsistent with the prior information that $y\ge x$.

Since $\big(\frac{1}{T_1} \sum_{t=1}^{T_1} X_t,\frac{1}{T_2}\sum_{t=1}^{T_2} Y_t\big)$ is the solution to the unconstrained minimization problem for the function $$ (x',y') \mapsto \sum_{t=1}^{T_1} (X_t-x')^2+ \sum_{t=1}^{T_2}(Y_t-y')^2 \;,$$

I thought a better idea could be to return the estimate $\hat{y}_{T_1,T_2}-\hat{x}_{T_1,T_2}$, where $(\hat{x}_{T_1,T_2},\hat{y}_{T_1,T_2})$ solves the corresponding constrained minimization problem, i.e., $$ (\hat{x}_{T_1,T_2},\hat{y}_{T_1,T_2}) \in \operatorname*{argmin}_{\substack{(x',y') \in [0,1]^2 \\ x' \le y'}} \Big(\sum_{t=1}^{T_1} (X_t-x')^2+ \sum_{t=1}^{T_2}(Y_t-y')^2\Big) \;. $$ By construction, this guarantees $\hat{y}_{T_1,T_2} - \hat{x}_{T_1,T_2} \ge 0$.
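For what it's worth, this constrained problem has a simple closed form: it is isotonic least squares with two blocks, so if the sample means already satisfy $\bar{X} \le \bar{Y}$ they are returned unchanged, and otherwise both estimates collapse to the pooled mean (pool-adjacent-violators). A minimal sketch:

```python
import numpy as np

def constrained_estimate(X, Y):
    """Project the pair of sample means onto {x' <= y'} in least squares,
    i.e. two-block pool-adjacent-violators."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    xbar, ybar = X.mean(), Y.mean()
    if xbar <= ybar:
        # constraint inactive: the unconstrained minimizer is feasible
        return xbar, ybar
    # constraint active: both estimates collapse to the pooled mean,
    # which minimizes T1*(xbar - m)^2 + T2*(ybar - m)^2 over m
    m = (X.sum() + Y.sum()) / (X.size + Y.size)
    return m, m
```

Note that the pooled mean automatically lies in $[0,1]$, so the box constraint never binds, and the returned estimate of $y-x$ is $\max(\bar{Y}-\bar{X},0)$.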

I'm wondering whether a guarantee analogous to the one derived from Hoeffding's inequality for the estimator $\frac{1}{T_2} \sum_{t=1}^{T_2} Y_t - \frac{1}{T_1} \sum_{t=1}^{T_1} X_t$ also holds for the estimator $\hat{y}_{T_1,T_2}-\hat{x}_{T_1,T_2}$, possibly relying on different probabilistic arguments. Here is the question:

Do there exist constants $c_1,c_2>0$ such that for every $x,y \in [0,1]$ with $x\le y$, for any $T_1,T_2 \in \mathbb{N}$ and for any $\delta \in (0,1)$ we have that $$\mathbb{P}\Bigg[\bigg|(\hat{y}_{T_1,T_2}-\hat{x}_{T_1,T_2}) -(y-x)\bigg| \ge \sqrt{c_1\Big( \frac{1}{T_1}+\frac{1}{T_2}\Big)\log\Big(\frac{c_2}{\delta}\Big)}\Bigg] \le \delta\;.$$

In this case, might we also guarantee that $c_1 \le \frac{1}{2}$ and $c_2 \le 2$?
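One can at least probe this empirically. Below is a minimal Monte Carlo sketch (the parameter values and sample sizes are arbitrary choices of mine), using the fact that the constrained minimizer yields $\hat{y}_{T_1,T_2}-\hat{x}_{T_1,T_2} = \max(\bar{Y}-\bar{X},0)$, at the boundary case $x=y$ where the clipping matters most:

```python
import numpy as np

rng = np.random.default_rng(1)
x = y = 0.2              # boundary case x == y, where clipping is most active
T1, T2 = 40, 40
delta = 0.05
eps = np.sqrt(0.5 * (1 / T1 + 1 / T2) * np.log(2 / delta))

n_trials = 20_000
X = rng.random((n_trials, T1)) < x
Y = rng.random((n_trials, T2)) < y
xbar, ybar = X.mean(axis=1), Y.mean(axis=1)
# constrained estimator of y - x: clip the naive difference at zero
diff = np.maximum(ybar - xbar, 0.0)
fail = np.mean(np.abs(diff - (y - x)) >= eps)
print(fail, "<=", delta)
```

This of course proves nothing in general; note only that when $x=y$ the clipped deviation $|\max(\bar{Y}-\bar{X},0) - 0|$ is never larger than the naive one $|\bar{Y}-\bar{X}|$, so at this boundary case the Hoeffding bound carries over directly.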