I was reading this paper on uncertainty in machine learning. My issue is with the mathematical definition of average calibration and the estimator presented in Section 3.
We are given two random variables $\mathbf{X}$, $\mathbf{Y}$ with realized values $x, y$. For any random variable, $\mathbb{F}$ is the true CDF, its inverse $\mathbb{Q}$ is the quantile function, and $f$ is the corresponding density function; $\hat{\mathbb{F}}$ and $\hat f$ denote estimates of these functions.
Average calibration is defined as the probability of observing the target below the quantile prediction, averaged over $\mathbb{F}_{X}$:
$$p^{obs}_{avg}(p) := \mathbb{E}_{x \sim \mathbb{F}_{X}} \Bigl[ \mathbb{F}_{\mathbf{Y}|x}\bigl( \hat{\mathbb{Q}}_p(x) \bigr) \Bigr], \quad \forall p \in (0,1) $$
Given a finite dataset $D$, they say this quantity can be estimated with
$$\hat p^{obs}_{avg}(D,p) = \frac{1}{N}\sum_{i=1}^N \mathbb{I}\{y_i \leq \hat{\mathbb{Q}}_p(x_i)\} $$
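To make the estimator concrete to myself, I wrote a small simulation with my own toy model (not taken from the paper): if the quantile predictions $\hat{\mathbb{Q}}_p$ are the true conditional quantiles, the empirical fraction seems to land near $p$.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)

# My own toy model (an assumption, not from the paper):
# X ~ Uniform(0, 1),  Y | X = x ~ Normal(x, 1)
N = 100_000
p = 0.8
x = rng.uniform(0.0, 1.0, N)
y = rng.normal(loc=x, scale=1.0)

# An oracle quantile predictor: the true conditional p-quantile of Y | X = x
q_hat = x + NormalDist().inv_cdf(p)

# The estimator: fraction of targets falling below the predicted quantile
p_obs = np.mean(y <= q_hat)
print(float(p_obs))  # ≈ 0.8, up to Monte Carlo noise
```

So numerically the estimator behaves as claimed, at least in this case, but I don't see the general argument.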
They also add that "It may be possible to have an uninformative, yet average calibrated model. For example, quantile predictions that match the true marginal quantiles of $\mathbb{F}_{\mathbf{Y}}$ will be average calibrated, but will hardly be useful since they do not depend on the input $x$."
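For this quoted remark, here is what I tried, again with my own toy model: a constant prediction equal to the marginal $p$-quantile appears to be average calibrated, even though its conditional coverage clearly varies with $x$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Same toy model as before (my own assumption, not from the paper):
# X ~ Uniform(0, 1),  Y | X = x ~ Normal(x, 1)
def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    return x, rng.normal(loc=x, scale=1.0)

p = 0.8

# Estimate the marginal p-quantile of Y from an independent sample;
# the resulting predictor is constant and ignores x entirely
x_fit, y_fit = sample(200_000)
q_marginal = np.quantile(y_fit, p)

# Evaluate on fresh data
x, y = sample(200_000)
print(np.mean(y <= q_marginal))           # average coverage ≈ 0.8
print(np.mean(y[x < 0.1] <= q_marginal))  # over-covers for small x
print(np.mean(y[x > 0.9] <= q_marginal))  # under-covers for large x
```

So the average coverage comes out right while the conditional coverage is wrong on both ends, which matches the "uninformative yet average calibrated" description, but again I don't see why this holds in general.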
I don't understand why that estimator should converge to $p^{obs}_{avg}$, nor why this last statement should be true.