About setwise convergence on open sets of nearest neighbor algorithm.

Question

About setwise convergence on open sets of nearest neighbor algorithm.

105 Views Asked by Bumbble Comm At 11 May 2026 - 6:22

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and $(\mathcal{X},d)$ be a complete, separable, locally compact metric space. Suppose that $X,X_1,X_2,X_3,... : \Omega\to\mathcal{X}$ are $\mathbb{P}$-i.i.d. random variables.

Define: $$\forall m\in\mathbb{N}, \pi_m: \mathcal{X}\times\mathcal{X}^m\to\{1,...,m\}, (x,x_1,...,x_m)\mapsto \min\left(\operatorname{argmin}_{k\in\{1,...,m\}}\left(d\left(x,x_1\right),...,d\left(x,x_m\right)\right)\right).$$ Define: $$\forall m\in\mathbb{N}, Z_m:\Omega\to\mathcal{X}, \omega\mapsto X_{\pi_m\left((X(\omega),X_1(\omega),...,X_m(\omega)\right)}(\omega).$$

If $A$ is a open set of $(\mathcal{X},d)$, is it true that: $$\limsup_{m\to+\infty}\mathbb{P}_{Z_m}(A)\le\mathbb{P}_{X}(A)?$$

Edit 1: or maybe that there exists a constant $C>0$ independent of $A$ such that: $$\limsup_{m\to+\infty}\mathbb{P}_{Z_m}(A)\le C\cdot \mathbb{P}_{X}(A)?$$

If it is false in general, what if we add the hypothesis that $\mathcal{X}=\mathbb{R}^n$, $d$ is the Euclidean distance and $\mathbb{P}_X$ is absolutely continuous w.r.t. Lebesgue measure in $\mathbb{R}^n$?

Edit 2: in this last case, we have that if $B$ is a ball of $\mathbb{R}^n$, then $\mathbb{P}_X(\partial B)=0$ so, since $\mathbb{P}_{Z_m}\to \mathbb{P}_{X}$ in distribution (as explained below by WoolierThanThou), we have that $\mathbb{P}_{Z_m}(B)\to\mathbb{P}_{X}(B), m\to \infty$. Now, since by Besicovitch covering theorem there exist an universal constant $C_n\in\mathbb{N}$ such that every open subset $A$ is the union of at most $C_n$ open sets $A_1,...,A_{C_n}$ each of them is a disjoint countable union of open balls, say $A_i = \cup_{j\in I_i} B_{i,j}$, we have that: $$\mathbb{P}_{Z_m}(A)\le \sum_{i=1}^{C_n}\mathbb{P}_{Z_m}(A_i)= \sum_{i=1}^{C_n}\sum_{j\in I_i}\mathbb{P}_{Z_m}(B_{i,j}) = (*)$$ Now, if only we could exchange the limit and the series, we obtain that $$(*)\to \sum_{i=1}^{C_n}\sum_{j\in I_i}\mathbb{P}_{X}(B_{i,j})= \sum_{i=1}^{C_n}\mathbb{P}_{X}(A_i)\le \sum_{i=1}^{C_n}\mathbb{P}_{X}(A) = C_n \mathbb{P}_{X}(A) $$ Can anyone see a reason why could we exchange the limit an the series?

Original Q&A

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2019-10-09 09:31:00

Well, $Z_m$ converges to $X$ in probability! For any, $\varepsilon>0,$ we have, by independence and identical distribution,

$\mathbb{P}(d(Z_m,X)\geq \varepsilon)=\mathbb{P}( \cap_{k=1}^m (d(X_k,X)\geq \varepsilon))=\mathbb{P}( d(X_1,X)\geq \varepsilon)^m$,

which goes to $0$, so long as we can show that $\mathbb{P}( d(X_1,X)\geq \varepsilon)\neq 1$. Clearly, the universe is broken if this isn't true, but you can note that if $(x_n)_{n\in \mathbb{N}}$ is a dense, countable subset of $\mathcal{X}$, then $\mathcal{X}=\cup_{n=1}^{\infty}B(x_n,\frac{\varepsilon}{2}),$ and hence, there must be some $n_0\in\mathbb{N}$ such that $\mathbb{P}(X\in B(x_{n_0},\frac{\varepsilon}{2}))\neq 0.$ Hence, by the fact that they're i.i.d., $\mathbb{P}(X,X_1\in B(x_{n_0},\frac{\varepsilon}{2}))\neq 0$, establishing the desired.

Of course, one might ask, does convergen in probability still imply weak convergence of the underlying measures? Well, assume $Y_n\to Y$ in probability with $Y_n$ and $Y$ $\mathcal{X}$-valued and fix $C\subseteq \mathcal{X}$ closed. Then, for any $\varepsilon>0$, denote $C_{\varepsilon}=\cup_{x\in C} B(x,\varepsilon)$

$$\mathbb{P}(Y_n\in C)\leq \mathbb{P}(d(Y_n,Y)\geq \varepsilon)+\mathbb{P}(Y\in C_{\varepsilon})$$

By continuity of measures (and the fact that $C$ is closed), we in particular find that

$$\limsup_{n\to\infty}\mathbb{P}(Y_n\in C)\leq \limsup_{n\to\infty} \mathbb{P}(Y\in C_{1/n})=\mathbb{P}(Y\in C),$$

which is equivalent to weak convergence of distributions in a complete, separable space.

Adding these things up, we get for $A\subseteq \mathcal{X}$ open, by weak convergence, that $\liminf_{m\to\infty} \mathbb{P}(Z_m\in A)\geq \mathbb{P}(X\in A)$, so your hypothesis is only true if $\lim_{m\to\infty}\mathbb{P}(Z_m\in A)=\mathbb{P}(X\in A)$.

**Bumbble Comm** · Answer 2 · 2019-10-27 00:23:48

Here's a proof for $\mathbb{R}^n$ with Euclidean distance and Lebesgue measure restricted to disk of radius $\frac{1}{\sqrt{\pi}}$.

The idea of the proof (specifically to contrast the flawed proof above) is to use the symmetry/averaging in $x$. In fact, averaging in $x$ allows for a simpler expression for $P(Z_m \in A)$.

Note $P(Z_m \in A) = \int_{x \in \Omega} \int_{x_1,\dots,x_m \in \Omega} 1_{\text{closest $x_i$ to $x$ is in $A$}} = \int_{x \in \Omega} m\int_{x_1,\dots,x_m \in \Omega} 1_{\text{$x_1$ is closest to $x$ and $x_1 \in A$}} = \int_{x \in \Omega} m \int_{x_1 \in A} \mu(B_{d(x,x_1)}(x)^c)^{m-1}d\mu(x_1)d\mu(x) = m\int_{x_1 \in A} \int_{x \in \Omega} (1-|x-x_1|^2)^{m-1}d\mu(x)d\mu(x_1) \le m\int_{x_1 \in A}\int_0^{2\pi}\int_0^1 (1-r^2)^{m-1}r dr d\theta d\mu(x_1) = \mu(A)$.

The following answer is (seriously) flawed. It proves that, in some metric spaces, there is some $x$ so that if you always do nearest neighbor algorithm relative to $x$, the $Z^x_m$ measure of a set $A$ can be arbitrarily (multiplicatively) larger than the $X$-measure of $A$. However, as Bob pointed out, in this problem, you do nearest neighbor algorithm relative to an $X$-randomly chosen point of $\mathcal{X}$.

All of it is false.

Let $\mathcal{X} = \{0 \le |z| < 1\} \subseteq \mathbb{R}^2$, $d$ the standard Euclidean metric, and $P$ pushforward to the Lebesgue measure (i.e. we choose points randomly). Let $A_N = \cup_{k \ge N} \{\frac{1}{2^{10^{2k+1}}} < |z| < \frac{1}{2^{10^{2k}}}\}$. Clearly $A_N$ is open for each $N$. Also, $\mu(A_N) \to 0$ as $N \to \infty$. But, for each $N \ge 1$, we have that $\limsup_{m \to \infty} P_{Z_m}(A) = 1$. Indeed, with $x$ the origin, $$P(|Z^x_m| \in [\frac{1}{2^{10^{2k+1}}},\frac{1}{2^{10^{2k}}}]) \ge 1-P(|Z^x_m| < \frac{1}{2^{10^{2k+1}}})-P(|Z^x_m| > \frac{1}{2^{10^{2k}}})$$ $$\ge 1-m\frac{\pi}{(2^{10^{2k+1}})^2}-(1-\frac{\pi}{(2^{10^{2k}})^2})^m,$$ so we can take $m = 2^{3\cdot 10^{2k}}$ to see that $P(Z^x_m)$ is exponentially close to $1$. $\square$

About setwise convergence on open sets of nearest neighbor algorithm.

There are 2 best solutions below

Related Questions in PROBABILITY-THEORY

Related Questions in MEASURE-THEORY

Related Questions in PROBABILITY-LIMIT-THEOREMS

Related Questions in GEOMETRIC-MEASURE-THEORY

Related Questions in GEOMETRIC-PROBABILITY

Trending Questions

Popular # Hahtags

Popular Questions