Sinkhorn theorem for doubly stochastic matrices


I was reading about doubly stochastic matrices and got stuck on the uniqueness part of the original proof of the Sinkhorn theorem: I'm not able to follow the logic. Could someone help me figure it out? I state the theorem and the proof below as they are written in the original paper.


Theorem: To a given strictly positive $N \times N$ matrix $A$ corresponds exactly one doubly stochastic matrix $T=D_1AD_2$ where $D_1$ and $D_2$ are diagonal matrices with positive diagonals. The matrices $D_1$ and $D_2$ are themselves unique up to a scalar factor.


Uniqueness proof: If there exist two different pairs of diagonal matrices $C_1, C_2$ and $D_1, D_2$ such that both $C_1 A C_2$ and $D_1 A D_2$ are doubly stochastic, then there exist a positive doubly stochastic matrix $P$ and diagonal matrices

$$B_1 = \mbox{diag}[b_{11}, b_{12},\dots, b_{1N}]$$

and

$$B_2 = \mbox{diag}[b_{21}, b_{22}, \dots, b_{2N}]$$

which are not multiples of the identity matrix, for which $B_1 P B_2$ is also doubly stochastic. But this is impossible, since by convexity one obtains:

$$\min_jb_{2j} \le \frac{1}{b_{1i}} \le \max_j b_{2j}$$

and

$$\min_i b_{1i} \le \frac{1}{b_{2j}} \le \max_i b_{1i}$$

which leads to a contradiction if $b_{1i} b_{2j} \ne 1$ for some $i$ and $j$. It follows that $C_1 = p D_1$, $C_2 = p^{-1} D_2$ for some $p > 0$.


What I do not understand: why does the existence of $C_1, C_2$ and $D_1, D_2$ imply the existence of $P$, $B_1$ and $B_2$? What is the convexity argument? Why is there a contradiction? And how does all this imply the claim?

Thanks.


There are 2 answers below.

Best answer:

Clearly, the author means $P=D_1AD_2$, $B_1=C_1D_1^{-1}$, $B_2=D_2^{-1}C_2$. Indeed, with these choices $B_1PB_2=C_1D_1^{-1}(D_1AD_2)D_2^{-1}C_2=C_1AC_2$, so both $P$ and $B_1PB_2$ are doubly stochastic.

I'm not sure what he/she exactly meant by "convexity", but here is an alternative proof. Let $b_{1i_1} = \min_i b_{1i}$ and $b_{2j_2} = \max_j b_{2j}$, and observe that the $(i,j)$-th entry of $B_1PB_2$ is $b_{1i}b_{2j}p_{ij}$. Since $B_1PB_2$ is row stochastic, $\sum_j b_{1i}b_{2j}p_{ij}=1$, and in turn $\sum_j b_{2j}p_{ij}=\frac1{b_{1i}}$ for every $i$. Therefore,
$$ b_{2j_2} = b_{2j_2}\sum_j p_{i_1j} \ge \sum_j b_{2j}p_{i_1j} = \frac1{b_{1i_1}}.\tag{1} $$
By similar reasoning, using the column stochasticity of $B_1PB_2$ instead, we get
$$ b_{1i_1} = b_{1i_1}\sum_i p_{ij_2} \le \sum_i b_{1i}p_{ij_2} = \frac1{b_{2j_2}}.\tag{2} $$
Now $(1)$ gives $b_{1i_1}b_{2j_2}\ge1$ while $(2)$ gives $b_{1i_1}b_{2j_2}\le1$, so $b_{1i_1}b_{2j_2}=1$ and equality must hold in both $(1)$ and $(2)$. Since every $p_{ij}$ is strictly positive, equality in $(1)$ forces all the $b_{2j}$ to be equal, and equality in $(2)$ forces all the $b_{1i}$ to be equal. That is, $B_1$ and $B_2$ are scalar multiples of the identity matrix (and in fact $B_2=B_1^{-1}$). Since $B_1=C_1D_1^{-1}$ and $B_2=D_2^{-1}C_2$, it follows that $D_1$ and $D_2$ are respectively scalar multiples of $C_1$ and $C_2$.
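For readers who want to experiment, here is a minimal numerical sketch in Python/NumPy (not part of the original argument; the matrix `A`, the iteration count, and the starting vectors are illustrative choices). It runs the standard alternating row/column normalization, often called the Sinkhorn-Knopp iteration, and then checks the uniqueness statement: two runs from different starting points produce scaling vectors that agree up to one scalar $p$ and its reciprocal.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
A = rng.uniform(0.5, 2.0, size=(N, N))   # a strictly positive N x N matrix

def sinkhorn(A, d2=None, iters=2000):
    """Alternating row/column normalization (Sinkhorn-Knopp iteration).

    Returns positive vectors d1, d2 such that diag(d1) @ A @ diag(d2)
    is numerically doubly stochastic."""
    d2 = np.ones(A.shape[1]) if d2 is None else d2.astype(float)
    for _ in range(iters):
        d1 = 1.0 / (A @ d2)      # rescale so every row sum equals 1
        d2 = 1.0 / (A.T @ d1)    # rescale so every column sum equals 1
    return d1, d2

d1, d2 = sinkhorn(A)
T = np.diag(d1) @ A @ np.diag(d2)
print(np.allclose(T.sum(axis=0), 1), np.allclose(T.sum(axis=1), 1))  # True True

# Uniqueness up to a scalar: a second run from a random starting point
# gives scaling vectors d1/p and p*d2 for a single scalar p > 0.
e1, e2 = sinkhorn(A, d2=rng.uniform(0.1, 10.0, size=N))
p = d1 / e1
print(np.allclose(p, p[0]), np.allclose(e2 / d2, p[0]))  # True True
```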

Another answer:

Let $P = (p_{ij})$ be a doubly stochastic matrix and $C = (c_{ij})$ another matrix, and consider their Hadamard (entrywise) product $P\odot C=(p_{ij} c_{ij})$. Each row of $P$ is a vector of convex-combination weights, so we have

$$\sum_{j} p_{ij} c_{ij}\le \max_{j} c_{ij} \quad\text{for every } i,$$ and so

$$\min_i \Big(\sum_{j} p_{ij} c_{ij}\Big) \le \min_i \max_j c_{ij} $$

Similarly

$$\max_j \min_i c_{ij} \le \max_j (\sum_{i} p_{ij} c_{ij})$$

Now let's assume moreover that

$$\min_i (\sum_{j} p_{ij} c_{ij}) = \max_j (\sum_{i} p_{ij} c_{ij}) ( = s) $$

(in general one only has $\le$, since the row sums and the column sums of a matrix have the same total). This condition is equivalent to all the rows and all the columns of $(p_{ij} c_{ij})$ having the same sum $s$. We conclude

$$\max_j \min_i c_{ij} \le s \le \min_i \max_j c_{ij}$$

Taken alone, the inequality between the two extremes, $\max_j \min_i c_{ij} \le \min_i \max_j c_{ij}$, is nothing new: it is the usual max-min inequality. However, let's assume moreover that

$$\max_j \min_i c_{ij}= \min_i \max_j c_{ij}$$

in other words, that the matrix $(c_{ij})$ has a saddle point $(i_0, j_0)$ (and assume also that all $p_{ij}\ne 0$, hence $>0$). From the above we conclude that the $i_0$-th row and the $j_0$-th column of $C$ are constant ($=s$): indeed, $\sum_j p_{i_0 j} c_{i_0 j}=s$ with every $c_{i_0 j}\le \max_j c_{i_0 j}=s$ and every weight $p_{i_0 j}>0$ forces $c_{i_0 j}=s$ for all $j$, and similarly for the column $j_0$.

Now we can prove uniqueness. Assume that $P$ is doubly stochastic with all entries $>0$, that $C= (\alpha_i \beta_j)$ with all $\alpha_i, \beta_j> 0$, and that the matrix $P\odot C$ has all its row sums and column sums equal. The matrix $C$ has a saddle point: indeed, $\min_i \max_j \alpha_i \beta_j = \big(\min_i \alpha_i\big)\big(\max_j \beta_j\big) = \max_j \min_i \alpha_i \beta_j$. So $C$ has one constant row and one constant column. Since $C$ is the rank-one matrix $(\alpha_i\beta_j)$, a constant row forces all the $\beta_j$ to be equal and a constant column forces all the $\alpha_i$ to be equal, so all of the entries of $C$ are the same.
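Here is a small numerical check of the two ingredients of this argument, in Python/NumPy as above (the particular $P$, $\alpha$, $\beta$ below are illustrative choices, not forced by the argument): a positive rank-one matrix always has a saddle point, and the convex row/column averages are trapped between the corresponding extremes.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
alpha = rng.uniform(0.2, 3.0, size=N)
beta = rng.uniform(0.2, 3.0, size=N)
C = np.outer(alpha, beta)          # rank-one matrix c_ij = alpha_i * beta_j

# Positive rank-one matrices have a saddle point:
# min_i max_j c_ij == (min alpha)(max beta) == max_j min_i c_ij
print(np.isclose(C.max(axis=1).min(), C.min(axis=0).max()))   # True

# A strictly positive doubly stochastic weighting (illustrative choice)
P = 0.5 * np.eye(N) + 0.5 / N      # p_ij = delta_ij / 2 + 1 / (2N) > 0
assert np.allclose(P.sum(axis=0), 1) and np.allclose(P.sum(axis=1), 1)

# Convex averages sit between the extremes:
#   min_i sum_j p_ij c_ij <= min_i max_j c_ij
#   max_j min_i c_ij      <= max_j sum_i p_ij c_ij
H = P * C                          # Hadamard product (p_ij c_ij)
print(H.sum(axis=1).min() <= C.max(axis=1).min(),
      C.min(axis=0).max() <= H.sum(axis=0).max())             # True True
```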