Gradient of the Coefficient of Variation


I have a d×n matrix M. What is the best 1×d vector W that minimizes CoV(WM), the coefficient of variation of W×M, given that the entries of W sum to 1?

$$\underset{W}{\operatorname{argmin}}\frac{SD(WM)}{Mean(WM)}$$ $$\sum_{i=1}^{d} w_{i}=1$$

I can rewrite this as: $$\underset{W}{\operatorname{argmin}} \frac{\sqrt{\frac{1}{n}\left\|W M-\frac{1}{n} W M A A^{T}\right\|_{2}^{2}}}{\frac{1}{n} W M A}$$ $$\sum_{i=1}^{d} w_{i}=1$$

A is an n×1 vector of ones.
I am hoping for a closed form for W*, or at least the gradient, so I can use gradient descent to find the minimum.
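As a quick sanity check, the direct definition of the objective and the rewritten matrix form agree numerically. The sketch below uses a hypothetical random instance; the sizes, seed, and data are made up for illustration:

```python
import numpy as np

# Hypothetical small instance: the sizes, seed, and data are arbitrary.
rng = np.random.default_rng(0)
d, n = 3, 5
M = rng.random((d, n)) + 1.0          # positive entries keep Mean(WM) away from zero
W = np.full((1, d), 1.0 / d)          # any W whose entries sum to 1
A = np.ones((n, 1))                   # the n×1 vector of ones

y = (W @ M).ravel()
cov_direct = y.std() / y.mean()       # SD(WM)/Mean(WM); np.std uses the 1/n convention

# The same value via the rewritten matrix form of the objective.
mean = float(W @ M @ A) / n
cov_matrix = np.sqrt(np.sum((W @ M - mean) ** 2) / n) / mean
print(cov_direct, cov_matrix)
```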

Best Answer

$\def\o{{\tt1}}\def\r#1{\color{red}{#1}}\def\g#1{\color{green}{#1}}$ The following derivation is long, but it ends in a closed-form solution: $$W^T = (MCM^T)^+MA + \Big(I-(MCM^T)^+MCM^T\Big)q$$ where $C=\big(I-\frac{1}{n}AA^T\big)$ is the centering matrix, $H^+$ denotes the pseudo-inverse of a matrix $H$, and $q$ is an arbitrary vector (zero for the least-norm solution). Since the objective is unchanged when $W$ is rescaled, this $W^T$ can afterwards be divided by the sum of its entries to satisfy the constraint $\sum_i w_i=1$ without changing the objective value.
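A minimal numerical sketch of this closed form (with $q=0$), assuming a hypothetical random instance; after rescaling so the weights sum to one, the result can be compared against other feasible choices of $W$:

```python
import numpy as np

# Hypothetical random instance; sizes, seed, and data are arbitrary.
rng = np.random.default_rng(1)
d, n = 4, 50
M = rng.random((d, n)) + 1.0                # positive entries keep the mean away from zero

A = np.ones((n, 1))                         # the n×1 vector of ones from the question
C = np.eye(n) - (A @ A.T) / n               # centering matrix C = I - (1/n) A A^T
w = np.linalg.pinv(M @ C @ M.T) @ (M @ A)   # least-norm solution (q = 0)
W_star = (w / w.sum()).T                    # rescale so the weights sum to 1

def cov_ratio(W, M):
    """Coefficient of variation of the 1×n row vector W @ M."""
    y = (W @ M).ravel()
    return y.std() / y.mean()

print(cov_ratio(W_star, M))
```

On generic data like this, `W_star` should attain a coefficient of variation no larger than any other feasible weight vector with positive mean.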


Introduce an *unconstrained* vector $x$ and use it to construct a column vector $w$ which satisfies the constraint. $$\eqalign{ &w = \frac{x}{{\tt1}^Tx} \quad\implies\quad {\tt1}^Tw = \frac{{\tt1}^Tx}{{\tt1}^Tx} \doteq {\tt1} \\ }$$ Then for algebraic convenience, define some auxiliary variables $$\eqalign{ {\tt1} &= A,\quad &J = AA^T \\ C &= I-\tfrac{1}{n}J\quad &({\rm Centering\, Matrix}) \\ w &= W^T \quad &({\rm column\, vector\, constructed\, from\, }x)\\ y &= M^Tw \quad&\implies dy = M^Tdw \\ z &= Cy \quad&\implies dz = CM^Tdw \\ \alpha &= {\tt1}^Tx \quad&\implies d\alpha = {\tt1}^Tdx \\ \beta &= {\tt1}^Ty \quad&\implies d\beta = {\tt1}^Tdy = {\tt1}^TM^Tdw \\ w &= \alpha^{-1}x \quad&\implies dw = \alpha^{-1}dx - x\alpha^{-2}d\alpha \\ & \quad&\implies dw = \alpha^{-1}(I - w{\tt1}^T)\,dx \\ }$$ Note that $C^T=C=C^2\;$ and $\;\beta={\tt1}^TM^Tw=\r{w^TM{\tt1}}=WMA$
These properties will be used in several of the steps below.
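Both properties are easy to confirm numerically; the sketch below uses arbitrary stand-in data:

```python
import numpy as np

# Numeric check of C^T = C = C^2 and beta = 1^T M^T w = W M A.
# Sizes, seed, and data are arbitrary stand-ins.
rng = np.random.default_rng(2)
d, n = 3, 6
M = rng.random((d, n))
w = rng.random((d, 1))                # stands in for W^T

A = np.ones((n, 1))
C = np.eye(n) - (A @ A.T) / n         # centering matrix
beta = float(A.T @ M.T @ w)           # beta = 1^T M^T w
print(np.allclose(C, C @ C), np.isclose(beta, float(w.T @ M @ A)))
```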

Use the new variables to simplify the vector appearing in the numerator. $$\eqalign{ &\Big(WM-\tfrac{1}{n}WMAA^T\Big)^T = \Big(M^Tw - \tfrac{1}{n}JM^Tw\Big) = Cy = z \\ }$$ Call the objective function $\phi$, and start by differentiating its square. $$\eqalign{ \phi^2 &= n\beta^{-2}z^Tz \\ 2\phi\,d\phi &= 2n\beta^{-2}z^Tdz - 2n\beta^{-3}z^Tz\,d\beta \\ d\phi &= n\phi^{-1}\beta^{-3}z^T\Big(\beta\,dz - z\,d\beta\Big) \\ &= n\phi^{-1}\beta^{-3}z^T\Big(\beta\,CM^T - z{\tt1}^TM^T\Big)\,dw \\ &= n\phi^{-1}\alpha^{-1}\beta^{-3}z^T\Big(\beta\,CM^T - z{\tt1}^TM^T\Big)\,\Big(I - w{\tt1}^T\Big)\,dx \\ \frac{\partial\phi}{\partial x} &= n\phi^{-1}\alpha^{-1}\beta^{-3}\Big(I - {\tt1}w^T\Big)\,\Big(\beta\,MC - M{\tt1}z^T\Big)z \\ }$$ Set the gradient to zero. $$\eqalign{ \Big({\tt1}w^T\Big)\,\Big(\beta\,MC - M{\tt1}z^T\Big)\,z &= I\Big(\beta\,MC - M{\tt1}z^T\Big)\,z \\ }$$ Eliminate $z$ in favor of $w$. $$\eqalign{ \Big({\tt1}w^T\Big)\,\Big(\beta MC - M{\tt1}w^TMC\Big)\,CM^Tw &= \Big(\beta MC - M{\tt1}w^TMC\Big)\,CM^Tw \\ \Big({\tt1}w^T\Big)\,\Big(\beta I - M{\tt1}w^T\Big)\,MCM^Tw &= \Big(\beta I - M{\tt1}w^T\Big)\,MCM^Tw \\ \Big(\r{\beta}{\tt1}w^T - {\tt1}\r{w^TM{\tt1}}w^T\Big)\,MCM^Tw &= \Big(\beta I - M{\tt1}w^T\Big)\,MCM^Tw \\ 0 &= \Big(\beta I - M{\tt1}w^T\Big)\,MCM^Tw \\ M{\tt1}w^T\g{MCM^Tw} &= \beta\,\g{MCM^Tw} \\ \Big((M{\tt1})w^T\Big)\g{v} &= \beta\g{v} \\ Bv &= \beta v \\ }$$ The last line is an eigenvalue equation. Since the matrix $B$ is rank-${\tt1}$, there is only one non-trivial eigenvector, which miraculously allows for a closed-form solution to the problem. $$\eqalign{ v &= M{\tt1} \qquad\qquad\big({\rm eigenvector\,of\,}B\big) \\ (MCM^T)w &= M{\tt1} \\ w &= (MCM^T)^+M{\tt1} + \Big(I-(MCM^T)^+MCM^T\Big)q \\ }$$ where $H^+$ denotes the pseudo-inverse of $H$ and $q$ is an arbitrary vector.
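To sanity-check the conclusion, the sketch below (hypothetical random data, $q=0$, generic full-rank $MCM^T$) verifies that the gradient expression vanishes at the closed-form $w$ and that $Bv=\beta v$ holds there; the positive scalar prefactor $n\phi^{-1}\alpha^{-1}\beta^{-3}$ is dropped since it does not affect where the gradient is zero:

```python
import numpy as np

# Hypothetical random instance; sizes and seed are arbitrary.
rng = np.random.default_rng(3)
d, n = 4, 30
M = rng.random((d, n)) + 1.0

one_n = np.ones((n, 1))
one_d = np.ones((d, 1))
C = np.eye(n) - (one_n @ one_n.T) / n
w = np.linalg.pinv(M @ C @ M.T) @ (M @ one_n)   # closed form with q = 0
w = w / w.sum()                       # rescale; the objective is scale-invariant in w

z = C @ M.T @ w                       # z = C M^T w
beta = float(one_n.T @ M.T @ w)       # beta = 1^T M^T w
# gradient direction (the positive scalar prefactor is dropped)
g = (np.eye(d) - one_d @ w.T) @ (beta * M @ C - M @ one_n @ z.T) @ z

v = M @ C @ M.T @ w                   # v = MCM^T w
B = M @ one_n @ w.T                   # the rank-1 matrix B
print(np.abs(g).max(), np.abs(B @ v - beta * v).max())
```

Both printed values should be at floating-point noise level, confirming the stationarity condition and the eigenvalue relation at the closed-form solution.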