Weighted median of distribution functions

26 Views Asked by At

I am working on the following barycenter problem: Suppose we are given $N>1$ probability measures on $\mathbb{R}$ with cumulative distribution functions $F_1,\dots,F_N$ and weights $a_1, \dots, a_N \in \mathbb{R}_{>0}$ summing up to $1$. We are interested in the minimization problem $$ \arg\min_{y \in \mathbb{R}} \sum_{i=1}^{N} a_i |y-F_i(x)|$$ for $x\in\mathbb{R}$ and set $F(x) = \min \{ \arg\min_{y \in \mathbb{R}} a_i \sum_{i=1}^{N} |y-F_i(x)| \}$. Then $F(x)$ is the weighted median of the $F_i(x)$, i.e. if $\sigma$ is a permutation such that $F_{\sigma(1)}(x) \leq F_{\sigma(2)}(x) \leq \dots \leq F_{\sigma(N)}(x)$, then $F(x) = F_{\sigma(k^*)},$ where $k^* = \min \{k \in \{1,\dots,N\}\colon \sum_{i=1}^{k}a_{\sigma(k)} \geq \frac{1}{2}\}.$ This way we can ensure that we always have $F(x)=F_i(x)$ for some $i \in \{1,\dots,N\}$.

I would like to show that $F$ is again a cdf so that it defines a probability measure. I have already shown right-continuity and that $\lim_{x \to \infty}F(x) = 1, \lim_{x \to -\infty}F(x) = 0$, but I have struggles showing that $F$ is non-decreasing. It is clear to me that this is the case on intervals on which the $F_i$ do not cross each other (the $F_i(x)$ do not change their order), but I don't know how to formally show monotonicity when the $F_i$ cross each other. I have already an answer to the case with uniform weights (Is the median of CDFs again a CDF?), but the method does not seem to work for the more general case. Maybe it is also not true and there exists a counterexample.

1

There are 1 best solutions below

0
On

I think I was able to find an answer to my problem by considering different cases:

As in the answer to the question Is the median of CDFs again a CDF?, we set $F_i(x) = x_i, F_i(y) = y_i, i=1,\dots,N$ and take two permutations $\sigma$ and $\tau$ such that $x_{\sigma(1)}\leq \dots \leq x_{\sigma(N)}$ and $y_{\tau(1)} \leq y_{\tau(2)} \leq \dots \leq y_{\tau(N)}$. Furthermore, we set $$k_\sigma = \min\left\lbrace k \in \{1,\dots,N\}\colon \sum_{i=1}^{k} a_{\sigma(i)}\right\rbrace, \, k_\tau = \min\left\lbrace k \in \{1,\dots,N\}\colon \sum_{i=1}^{k} a_{\tau(i)}\right \rbrace.$$

We want to show that $F(x)=x_{\sigma(k_\sigma)}\leq y_{\tau(k_\tau)}=F(y).$

If $k_\sigma \leq k_\tau$, then by the answer of Is the median of CDFs again a CDF?, we have: $$ x_{\sigma(k_\sigma)} \leq x_{\sigma(k_\tau)} \leq y_{\tau(k_\tau)}.$$ If $k_\sigma > k_\tau$, we consider two cases:

  1. In case $\left\lbrace \tau(i)\colon i \in \lbrace 1, \dots k_\tau \rbrace \right \rbrace \not\subset \left\lbrace \sigma(i)\colon i \in \lbrace 1, \dots k_\sigma-1 \rbrace \right \rbrace$, there must be $k^\prime \in \lbrace k_\sigma,\dots,N\rbrace$ with $\sigma(k^\prime) \in \left\lbrace \tau(i)\colon i \in \lbrace 1, \dots k_\tau \rbrace \right \rbrace,$ so we have $$ x_{\sigma(k_\sigma)} \leq x_{\sigma(k^\prime)} \leq y_{\sigma(k^\prime)} \leq \max_{1\leq i\leq k_\tau} y_{\tau(i)} = y_{\tau(k_\tau)}. $$
  2. In case $\left\lbrace \tau(i)\colon i \in \lbrace 1, \dots k_\tau \rbrace \right \rbrace \subset \left\lbrace \sigma(i)\colon i \in \lbrace 1, \dots k_\sigma-1 \rbrace \right \rbrace$, we obtain using the definition of $k_\sigma$: $$ \frac{1}{2} > \sum_{i=1}^{k_\sigma -1} a_{\sigma(i)} \geq \sum_{i=1}^{k_\tau} a_{\tau(i)}. $$ However, this is a contradiction to the definition of $k_\tau$ and thus, this case is not valid.

Therefore, we always have $x_{\sigma(k_\sigma)}\leq y_{\tau(k_\tau)}$ and the claim is shown.