I need some guidance on the proof of one of the equations published by Belge, M., et al. 2002, Inverse Problems, 18(4), p.1161. It discusses a multi-constrained regularization approach:
$$\boldsymbol{f}^{*}\left(\boldsymbol{\alpha}\right) = \arg\min_{\boldsymbol{f}} \left\{\|\boldsymbol{g} - \boldsymbol{H}\boldsymbol{f}\|_2^2+\sum\limits_{i=1}^{M}\alpha_i\,\Phi_i\left(\boldsymbol{R}_i\boldsymbol{f}\right)\right\}, \qquad \boldsymbol{R}_i,\boldsymbol{H}\in \mathbb{R}^{m\times n}$$
where $M$ is the number of constraints, $\boldsymbol{\alpha} = \left[\alpha_1, \alpha_2,\cdots, \alpha_M\right]^T$, the $\boldsymbol{R}_i$ are regularization operators with corresponding regularization parameters $\alpha_i$, $\Phi_i\left(\boldsymbol{R}_i\boldsymbol{f}\right)=\sum_{j=1}^{m}\phi_i\left(\left[\boldsymbol{R}_i\boldsymbol{f}\right]_j\right)$, and $\left[\boldsymbol{R}_i\boldsymbol{f}\right]_j$ denotes the $j$th element of the vector $\boldsymbol{R}_i\boldsymbol{f}$. In addition, each $\phi_i\left(t\right)$ is a continuously differentiable, convex, non-negative ($\phi_i\left(t\right) \geqslant 0\ \forall t$) even function.
By taking the gradient with respect to $\boldsymbol{f}$ and setting the result equal to zero, we obtain the following first-order condition, which must be satisfied by $\boldsymbol{f}^*\left(\boldsymbol{\alpha}\right)$:
$$\boldsymbol{K}_{f^*}\boldsymbol{f}^* = \boldsymbol{H}^T\boldsymbol{g}$$
where
$$\boldsymbol{K}_{f^*}=\boldsymbol{H}^T\boldsymbol{H}+\frac 12 \sum\limits_{i=1}^{M}\alpha_i\boldsymbol{R}_i^T \underset{k=1,\cdots,m}{\rm{diag}} \left[\frac{\phi'_i\left(\left[\boldsymbol{R}_i\boldsymbol{f}^*\right]_k\right)}{\left[\boldsymbol{R}_i\boldsymbol{f}^*\right]_k}\right]\boldsymbol{R}_i$$
Could someone kindly give me a hint as to how the term $\frac 12 \sum\limits_{i=1}^{M}\alpha_i\boldsymbol{R}_i^T \underset{k=1,\cdots,m}{\rm{diag}} \left[\frac{\phi'_i\left(\left[\boldsymbol{R}_i\boldsymbol{f}^*\right]_k\right)}{\left[\boldsymbol{R}_i\boldsymbol{f}^*\right]_k}\right]\boldsymbol{R}_i$ is obtained?
I applied the chain rule for composite functions, but I can't get it right.
If we apply a function $\phi$ element-wise to a vector $w$, the result is a vector $$v = \phi(w)$$ whose differential (also a vector) can be expressed using an element-wise (Hadamard) product, where $\phi'=\phi'(w)$ is likewise evaluated element-wise: $$\eqalign{ dv &= \phi'\circ dw \cr &= {\rm diag}(\phi')\,dw \cr }$$
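This diagonal-Jacobian fact is easy to confirm numerically. Below is a quick NumPy sketch using $\phi(t)=\sqrt{t^2+1}$ as an arbitrary smooth, even, convex choice (my choice, not from the paper): the finite-difference Jacobian of the element-wise map matches ${\rm diag}(\phi'(w))$.

```python
import numpy as np

# phi(t) = sqrt(t^2 + 1): a smooth, even, convex example penalty
phi = lambda t: np.sqrt(t**2 + 1.0)
dphi = lambda t: t / np.sqrt(t**2 + 1.0)  # phi'(t)

rng = np.random.default_rng(0)
w = rng.standard_normal(5)

# Central finite-difference Jacobian of the element-wise map v = phi(w)
eps = 1e-6
J = np.zeros((5, 5))
for j in range(5):
    e = np.zeros(5)
    e[j] = eps
    J[:, j] = (phi(w + e) - phi(w - e)) / (2 * eps)

# The Jacobian is diagonal, with phi'(w) on the diagonal
assert np.allclose(J, np.diag(dphi(w)), atol=1e-6)
```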
For your problem, let $w=Rf$ and assume there is only one constraint.
Let's find the gradient of that constraint $$\eqalign{ \Phi &= 1^Tv \cr d\Phi &= 1^Tdv = 1^T{\rm diag}(\phi')\,dw = 1^T{\rm diag}(\phi')R\,df \cr \frac{\partial\Phi}{\partial f} &= R^T{\rm diag}(\phi')\,1 \cr }$$ Now let's look at the gradient of the original function plus the constraint $$\eqalign{ \lambda &= \|Hf-g\|^2 + \alpha\Phi \cr \frac{\partial\lambda}{\partial f} &= 2H^T(Hf-g) + \alpha R^T{\rm diag}(\phi')\,1 \cr }$$ Setting the gradient to zero yields $$\eqalign{ H^Tg &= H^THf + \frac{1}{2}\alpha R^T{\rm diag}(\phi')\,1 \cr }$$ Now the authors use a simple trick to replace the $1$ on the far RHS with the element-wise identity $$\eqalign{ 1 &= {\rm diag}\Big(\frac{1}{Rf}\Big)\,Rf \cr }$$ (valid componentwise wherever $[Rf]_k\ne 0$; since $\phi$ is even, $\phi'(0)=0$, so at $[Rf]_k=0$ the ratio $\phi'(t)/t$ is understood as its limit when that limit exists). This leaves us with $$\eqalign{ H^Tg &= H^THf + \frac{1}{2}\alpha R^T{\rm diag}\Big(\frac{\phi'}{Rf}\Big)Rf \cr &= Kf \cr }$$ The function with multiple constraints has the same form; just put subscripts on the $(R,\phi,\alpha)$ symbols and sum over them.
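The resulting fixed-point equation $K(f^*)\,f^* = H^Tg$ can be checked numerically for the one-constraint case. A sketch, under my own assumptions: $\phi(t)=\sqrt{t^2+1}$ as the penalty, random small $H$, $R$, $g$, and SciPy's generic BFGS minimizer to locate $f^*$ (this is just a verification tool, not the authors' algorithm).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n = 8, 6
H = rng.standard_normal((m, n))
R = rng.standard_normal((m, n))
g = rng.standard_normal(m)
alpha = 0.7

phi = lambda t: np.sqrt(t**2 + 1.0)
dphi = lambda t: t / np.sqrt(t**2 + 1.0)  # phi'(t)

# Objective ||Hf - g||^2 + alpha * sum_j phi([Rf]_j) and its gradient
obj = lambda f: np.sum((H @ f - g) ** 2) + alpha * np.sum(phi(R @ f))
grad = lambda f: 2.0 * H.T @ (H @ f - g) + alpha * R.T @ dphi(R @ f)

res = minimize(obj, np.zeros(n), jac=grad, method="BFGS", tol=1e-10)
f = res.x

# K(f) as in the boxed formula: diag weights phi'([Rf]_k) / [Rf]_k
u = R @ f
K = H.T @ H + 0.5 * alpha * R.T @ np.diag(dphi(u) / u) @ R

# First-order condition K(f*) f* = H^T g holds at the minimizer
assert np.allclose(K @ f, H.T @ g, atol=1e-6)
```

Note that for this particular $\phi$, the weight simplifies to $\phi'(t)/t = 1/\sqrt{t^2+1}$, so the diagonal entries are bounded and well-behaved even for small $[Rf]_k$.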