Gradient of Coefficient of Variation of weighted geometric mean


This question is another version of the one here, which I've previously asked. The difference is that instead of WM we now have exp(WM). exp(W×log(M)) is equivalent to the weighted geometric mean of M when W sums to 1 (Wikipedia); the exp operation is applied element-wise. I have simply replaced log(M) with M for simplicity.
I have a d×n matrix called M. What is the best 1×d vector W that minimizes CoV(exp(WM)), the coefficient of variation of exp(W×M), given that W sums to 1?

$$\underset{W}{\operatorname{argmin}}\frac{\operatorname{SD}(\exp(WM))}{\operatorname{Mean}(\exp(WM))}$$ $$\sum_{i=1}^{d} w_{i}=1$$

I can rewrite this as: $$\underset{W}{\operatorname{argmin}} \frac{\sqrt{\frac{1}{n}\left\|\exp(WM)-\frac{1}{n} \exp(WM) AA^T\right\|_{2}^{2}}}{\frac{1}{n} \exp(WM) A}$$ $$\sum_{i=1}^{d} w_{i}=1$$

Here $A$ is an n×1 matrix of ones.
I hope to get a closed form for $W^*$, or at least the gradient, so I can use gradient descent to find the minimum.
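As a sanity check, the matrix form above can be compared numerically against a direct computation of the coefficient of variation (a small NumPy sketch; the sizes $d$, $n$ and the random $M$ are placeholders of my choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 50
M = rng.normal(size=(d, n))      # d x n data matrix
W = rng.normal(size=(1, d))
W = W / W.sum()                  # enforce sum(w_i) = 1

v = np.exp(W @ M).ravel()        # exp(WM), a vector of length n

# Direct coefficient of variation (population SD, matching the 1/n factor)
cov_direct = v.std() / v.mean()

# Matrix form: sqrt((1/n)||v - (1/n) v A A^T||^2) / ((1/n) v A), A = ones(n,1)
A = np.ones(n)
mean = (v @ A) / n
cov_matrix = np.sqrt(np.sum((v - mean) ** 2) / n) / mean
```

Both quantities agree, confirming that the rewritten objective is the CoV.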


Edit: Here is my attempt, based on this answer, but I couldn't solve it. I think the main problem is the Hadamard product ($\circ$), which appears when differentiating $\exp(WM)$.

Introduce an unconstrained vector $x$ and use it to construct a column vector $w$ which satisfies the constraint. $$\eqalign{ &w = \frac{x}{{\tt1}^Tx} \quad\implies\quad {\tt1}^Tw = \frac{{\tt1}^Tx}{{\tt1}^Tx} \doteq {\tt1} \\ }$$ Then for algebraic convenience, define some auxiliary variables $$\eqalign{ {\tt1} &= A,\quad &J = AA^T \\ C &= I-\tfrac{1}{n}J\quad &({\rm Centering\ Matrix}) \\ w &= W^T \quad &({\rm column\ vector\ constructed\ from\ }x)\\ B &= d\times 1\ {\rm matrix\ of\ ones} \\ Q &= M\circ By^T \\ y &= \exp(M^Tw) \quad&\implies dy = \Big(M^T\circ \exp(M^Tw)B^T\Big)\,dw = Q^Tdw \\ z &= Cy \quad&\implies dz = CQ^Tdw \\ \alpha &= {\tt1}^Tx \quad&\implies d\alpha = {\tt1}^Tdx \\ \beta &= {\tt1}^Ty \quad&\implies d\beta = {\tt1}^Tdy = {\tt1}^TQ^Tdw \\ w &= \alpha^{-1}x \quad&\implies dw = \alpha^{-1}dx - x\alpha^{-2}d\alpha \\ & \quad&\implies dw = \alpha^{-1}(I - w{\tt1}^T)\,dx \\ }$$ Note that $C^T=C=C^2\;$ and $\;\beta={\tt1}^T\exp(M^Tw)=\exp(w^TM){\tt1}=\exp(WM)A$;
these properties will be used in several of the steps below.

Use the new variables to simplify the vector appearing in the numerator. $$\eqalign{ &\Big(\exp(WM)-\tfrac{1}{n}\exp(WM)AA^T\Big)^T = \Big(\exp(M^Tw) - \tfrac{1}{n}J\exp(M^Tw)\Big) = Cy = z \\ }$$ Call the objective function $\phi$, and start by differentiating its square. $$\eqalign{ \phi^2 &= n\beta^{-2}z^Tz \\ 2\phi\,d\phi &= 2n\beta^{-2}z^Tdz - 2n\beta^{-3}z^Tz\,d\beta \\ d\phi &= n\phi^{-1}\beta^{-3}z^T\Big(\beta\,dz - z\,d\beta\Big) \\ &= n\phi^{-1}\beta^{-3}z^T\Big(\beta\,CQ^T - z{\tt1}^TQ^T\Big)\,dw \\ &= n\phi^{-1}\alpha^{-1}\beta^{-3}z^T\Big(\beta\,CQ^T - z{\tt1}^TQ^T\Big)\,\Big(I - w{\tt1}^T\Big)\,dx \\ \frac{\partial\phi}{\partial x} &= n\phi^{-1}\alpha^{-1}\beta^{-3}\Big(I - {\tt1}w^T\Big)\,\Big(\beta\,QC - Q{\tt1}z^T\Big)z \\ }$$ Set the gradient to zero. $$\eqalign{ \Big({\tt1}w^T\Big)\,\Big(\beta\,QC - Q{\tt1}z^T\Big)\,z &= I\Big(\beta\,QC - Q{\tt1}z^T\Big)\,z \\ }$$ Eliminate $z$. $$\eqalign{ \Big({\tt1}w^T\Big)\,\Big(\beta QC - Q{\tt1}y^TC\Big)\,Cy &= \Big(\beta QC - Q{\tt1}y^TC\Big)\,Cy \\ \Big({\tt1}w^T\Big)\,Q\Big(\beta I - {\tt1}y^T\Big)\,Cy &= Q\Big(\beta I - {\tt1}y^T\Big)\,Cy \\ }$$
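The gradient expression $\frac{\partial\phi}{\partial x}$ can be verified against central finite differences before it is trusted in an optimizer (a NumPy sketch; the random $M$, its $0.1$ scaling, and the problem sizes are placeholders of my choosing, the scaling just keeps the exponentials well-conditioned):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 5, 40
M = rng.normal(size=(d, n)) * 0.1            # d x n placeholder data
one_n, one_d = np.ones(n), np.ones(d)
C = np.eye(n) - np.outer(one_n, one_n) / n   # centering matrix C = I - (1/n)J

def phi(x):
    w = x / x.sum()                  # w = x / (1^T x), so 1^T w = 1
    y = np.exp(M.T @ w)
    z = C @ y
    beta = one_n @ y
    return np.sqrt(n * (z @ z)) / beta        # phi = SD/Mean

def grad(x):
    alpha = x.sum()
    w = x / alpha
    y = np.exp(M.T @ w)
    z = C @ y
    beta = one_n @ y
    Q = M * y                        # Q = M Diag(y), broadcasting y over rows
    H = beta * (Q @ C) - np.outer(Q @ one_n, z)      # beta*QC - Q*1*z^T
    return n / (phi(x) * alpha * beta**3) * \
        (np.eye(d) - np.outer(one_d, w)) @ (H @ z)

x = rng.normal(size=d) + 2.0
g = grad(x)

# central finite differences in each coordinate
eps = 1e-6
g_fd = np.array([(phi(x + eps * np.eye(d)[i]) - phi(x - eps * np.eye(d)[i]))
                 / (2 * eps) for i in range(d)])
```

The two vectors match to finite-difference accuracy, which supports the derivation up to this point.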


There is 1 best solution below

On BEST ANSWER

$ \def\a{\alpha} \def\b{\beta} \def\c#1{\color{red}{#1}} \def\o{{\large\tt1}} \def\g{\sigma} \def\L{\left}\def\R{\right}\def\LR#1{\L(#1\R)} \def\Diag#1{\operatorname{Diag}\LR{#1}} $You were doing great until the final step where you "eliminated $z$".
That's not valid, so you must find a different way to solve the equation.
Here is an idea, but not a complete solution.

Define the matrix $$H = {\b QC -Q\o z^T}$$ Then the zero gradient condition becomes $$Hz = \LR{\o w^T}Hz \;=\; \LR{w^THz}\o \;=\; \g\o$$ This eliminates $z$ from one side of the equation $$\eqalign{ \LR{\beta QC -Q\o z^T}z &= \g\o }$$ Other substitutions can be used $$\eqalign{ \b &= \o^Ty \\ Cz &= C^2y = {Cy = z} \\ Y &= \Diag{y} \quad&\implies\quad Q=MY,\quad z=Cy=CY\o \\ }$$ to rewrite the equation entirely in terms of $y$ $\,($or $Y)$ $$\eqalign{ \g\o_d &= M\Diag{y}\Big(\LR{\o^T_ny}C\Diag{y} -\LR{y^TCy}I_n\Big)\o_n \\ &= M\Big(\LR{\o^T_n{Y}\o_n}\c{YCY} - \LR{\o^T_n\c{YCY}\o_n}{Y}\Big)\o_n \\ &= S\o_n \\ S\o_n &= \g\o_d \quad\implies\quad {\frac{S\o_n}{\|S\o_n\|}} = \frac{\o_d}{\sqrt d} \\ }$$ This looks promising. It almost looks like an eigenvalue equation, however...
$S$ is rectangular, so the $\o$ vectors on the RHS and LHS have different lengths.
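The substitution above can be checked numerically: for any $w$ (not only a stationary one), the vector $S\,\o_n$ built from $Y=\Diag{y}$ should coincide with $Hz$. A NumPy sketch, with a random placeholder $M$ of my choosing:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n = 4, 7
M = rng.normal(size=(d, n))
one_n = np.ones(n)
C = np.eye(n) - np.outer(one_n, one_n) / n   # centering matrix

w = rng.uniform(0.1, 1.0, size=d)
w /= w.sum()                                 # feasible weights
y = np.exp(M.T @ w)
z = C @ y
beta = one_n @ y                             # beta = 1^T y
Y = np.diag(y)
Q = M @ Y                                    # Q = M Diag(y)

# H z  versus  S 1_n
H = beta * (Q @ C) - np.outer(Q @ one_n, z)
YCY = Y @ C @ Y
S = M @ ((one_n @ Y @ one_n) * YCY - (one_n @ YCY @ one_n) * Y)
```

Here `H @ z` and `S @ one_n` agree, and $S$ is indeed $d\times n$, which is exactly the dimension mismatch noted above.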

The next step is to come up with an iteration formula akin to the classic power iteration to solve for a $y$ vector (or $S$ matrix) such that the pseudo-eigenvalue equation is satisfied.


Update

Since a closed-form eigenvalue-like solution seems improbable, and since we already know how to calculate the gradient and the cost function for any value of $x,\,$ i.e. $$\eqalign{ \phi(x) &= \LR{n\beta^{-2}z^Tz}^{1/2} \\ g(x) &= n\phi^{-1}\a^{-1}\b^{-3}\Big(I-{\tt1}w^T\Big)\,\Big(\b\,QC - Q{\tt1}z^T\Big)z \\ }$$ a better idea is a numerical solution using a gradient-based method.

The Barzilai-Borwein method is particularly simple and effective.
Initialize $$\eqalign{ x_0 &= random \qquad\qquad\qquad\qquad\qquad\quad \\ }$$ First step $$\eqalign{ g_0 &= g(x_0) \\ x_1 &= x_0 - \LR{\frac{0.05\;\phi(x_0)}{g_0^Tg_0}} g_0 \qquad\qquad\quad \\ k &= 1\\ }$$ Subsequent steps $$\eqalign{ g_k &= g(x_k) \\ x_{k+1} &= x_k - \LR{\frac{\LR{x_k - x_{k-1}}^T\LR{g_k - g_{k-1}}}{\LR{g_k - g_{k-1}}^T\LR{g_k - g_{k-1}}}}g_k \\ k &= k+1 \\ }$$ Stop when $g_k\approx 0$.
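For concreteness, the scheme above can be sketched in NumPy (a minimal, unsafeguarded implementation; the random placeholder $M$, its scaling, the iteration cap, and the tolerance are choices of mine, and `phi_and_grad` re-implements the cost $\phi(x)$ and gradient $g(x)$ derived earlier):

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 5, 40
M = rng.normal(size=(d, n)) * 0.1
one_n, one_d = np.ones(n), np.ones(d)
C = np.eye(n) - np.outer(one_n, one_n) / n

def phi_and_grad(x):
    alpha = x.sum()
    w = x / alpha                    # w = x / (1^T x)
    y = np.exp(M.T @ w)
    z = C @ y
    beta = one_n @ y
    p = np.sqrt(n * (z @ z)) / beta  # cost phi(x)
    Q = M * y                        # Q = M Diag(y)
    H = beta * (Q @ C) - np.outer(Q @ one_n, z)
    g = n / (p * alpha * beta**3) * \
        (np.eye(d) - np.outer(one_d, w)) @ (H @ z)
    return p, g

# Barzilai-Borwein iteration
x = rng.normal(size=d) + 2.0             # x_0 = random (1^T x away from 0)
p0, g = phi_and_grad(x)
x_new = x - (0.05 * p0 / (g @ g)) * g    # first step, as above
for _ in range(500):                     # subsequent steps
    _, g_new = phi_and_grad(x_new)
    if np.linalg.norm(g_new) < 1e-10:    # stop when g_k ~ 0
        break
    s, t = x_new - x, g_new - g
    if t @ t == 0:
        break
    x, g = x_new, g_new
    x_new = x_new - ((s @ t) / (t @ t)) * g_new

w_star = x_new / x_new.sum()             # recover the constrained weights
```

Note that BB steps are non-monotone, so for a production solver one would add a safeguard (e.g. a non-monotone line search); this sketch relies on the method's typical fast convergence on smooth small problems.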