Derivative of a vector norm

What is the derivative of the following function with respect to $A$?

$F(A) = \sum_{i} \left\| Y_{i} - AB_{i} \right\|^{2} + \lambda \left\| A - C \right\|^{2}$

Each $Y_i$ and each $B_i$ is a column vector, $C$ is a known matrix, and $\lambda$ is a constant.

PS: The index $i$ ranges over a finite set.

There is 1 best solution below

I'll assume all norms are $2$-norms in the appropriate Euclidean space, so the norm of a matrix is its Frobenius norm, i.e. the $2$-norm on $\mathbb{R}^{n^2}$. We start by calculating several straightforward differentials:

  1. Take $\alpha\in\mathbb{R}$, $y\in\mathbb{R}^n$ and let $f:\mathbb{R}^n\to\mathbb{R}^n$ be defined by $f(x) = \alpha x + y$. Then $Df(x)\equiv \alpha I_n$.
  2. Take $y\in\mathbb{R}^n$ and let $g:\mathbb{R}^n\to\mathbb{R}$ be defined by $g(x) = \langle x,y\rangle$. Then $Dg(x)u = \langle u,y\rangle$.
  3. Let $h:\mathbb{R}^n\to\mathbb{R}$ be defined by $h(x) = \langle x,x\rangle$. Then $Dh(x)u = \langle u,x\rangle + \langle x,u\rangle = 2\langle x,u\rangle$.
  4. Take $y\in\mathbb{R}^n$ and let $r:\mathbb{R}^{n\times n}\to\mathbb{R}^n$ be defined by $r(X) = Xy$. Then $Dr(X)U = Uy$.
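As a quick sanity check, rules 2 to 4 can be compared against central finite differences. This is only a minimal sketch: the dimension `n = 4`, the random test data, and the helper name `directional` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # illustrative dimension (assumption)
x, y, u = rng.standard_normal((3, n))   # three random vectors in R^n
X, U = rng.standard_normal((2, n, n))   # two random n-by-n matrices
eps = 1e-6

def directional(func, point, direction):
    """Central-difference estimate of D func(point) applied to direction."""
    return (func(point + eps * direction) - func(point - eps * direction)) / (2 * eps)

# Rule 2: g(x) = <x, y>  =>  Dg(x)u = <u, y>
err_g = abs(directional(lambda v: v @ y, x, u) - u @ y)
# Rule 3: h(x) = <x, x>  =>  Dh(x)u = 2<x, u>
err_h = abs(directional(lambda v: v @ v, x, u) - 2 * (x @ u))
# Rule 4: r(X) = Xy      =>  Dr(X)U = Uy
err_r = np.max(np.abs(directional(lambda M: M @ y, X, U) - U @ y))

print(err_g, err_h, err_r)  # all three should be at rounding-error level
```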

With these in hand, we can use the chain rule to find the following differentials:

Let $\varphi:\mathbb{R}^{n\times n}\to\mathbb{R}$ be defined by $$\begin{aligned}\varphi(X) &= \|y_1 - Xy_2\|_2^2 = \langle y_1 - Xy_2,\,y_1 - Xy_2\rangle_{\mathbb{R}^n} \\ &= \|y_1\|_2^2 - 2\langle Xy_2, y_1\rangle_{\mathbb{R}^n} + \langle Xy_2,Xy_2\rangle_{\mathbb{R}^n} \\ &= \|y_1\|_2^2 + f\circ g\circ r(X) + h\circ r(X),\end{aligned}$$ where $f(t) = -2t$ (rule 1 in dimension one), $g = \langle\cdot\,, y_1\rangle_{\mathbb{R}^n}$, and $r(X) = Xy_2$. By the chain rule, $$D\varphi(X)U = -2\langle Uy_2,y_1\rangle_{\mathbb{R}^n} + \langle Uy_2,Xy_2\rangle_{\mathbb{R}^n} + \langle Xy_2,Uy_2\rangle_{\mathbb{R}^n} = 2\langle Uy_2, Xy_2-y_1\rangle_{\mathbb{R}^n}.$$
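The closed form for $D\varphi(X)U$ can be checked numerically in the same spirit. The sketch below assumes random data and `n = 4` purely for illustration; `phi`, `closed_form`, and `numeric` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4  # illustrative dimension (assumption)
y1, y2 = rng.standard_normal((2, n))
X, U = rng.standard_normal((2, n, n))

def phi(M):
    """phi(M) = ||y1 - M y2||_2^2."""
    res = y1 - M @ y2
    return res @ res

# Closed form from the derivation: D phi(X) U = 2 <U y2, X y2 - y1>
closed_form = 2 * (U @ y2) @ (X @ y2 - y1)

# Central finite difference of phi at X in the direction U
eps = 1e-6
numeric = (phi(X + eps * U) - phi(X - eps * U)) / (2 * eps)

print(closed_form, numeric)  # the two values should agree closely
```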

Similarly, let $\psi:\mathbb{R}^{n\times n}\to\mathbb{R}$ be defined by $$\begin{aligned}\psi(X) &= \lambda\|X-C\|_2^2 = \lambda\langle X-C,X-C\rangle_{\mathbb{R}^{n^2}} = \lambda\left(\langle X,X\rangle_{\mathbb{R}^{n^2}} -2\langle X,C\rangle_{\mathbb{R}^{n^2}} + \|C\|_2^2\right) \\ &= f_1\circ h(X) + f_2\circ g(X) + \lambda\|C\|_2^2,\end{aligned}$$ where $f_1(t) = \lambda t$ and $f_2(t) = -2\lambda t$ are two distinct instances of $f$, $g = \langle\cdot\,, C\rangle_{\mathbb{R}^{n^2}}$, and $h$ and $g$ now act on $\mathbb{R}^{n\times n}\cong\mathbb{R}^{n^2}$. Then $$D\psi(X)U = \lambda\left(\langle U,X\rangle_{\mathbb{R}^{n^2}} + \langle X,U\rangle_{\mathbb{R}^{n^2}}\right) - 2\lambda\langle U,C\rangle_{\mathbb{R}^{n^2}} = 2\lambda\langle U,X-C\rangle_{\mathbb{R}^{n^2}}.$$

Combining these results by linearity of the differential, with $y_1 = y_i$ and $y_2 = b_i$ for each $i$ (the question's $Y_i$ and $B_i$) and $X$ in place of $A$, gives $$DF(X)U = 2\lambda\langle U,X-C\rangle_{\mathbb{R}^{n^2}} + 2\sum_i\langle Ub_i, Xb_i-y_i\rangle_{\mathbb{R}^n}.$$
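To read off a gradient from this differential, note that $\langle Ub, v\rangle_{\mathbb{R}^n} = \langle U, vb^{\mathsf T}\rangle_{\mathbb{R}^{n^2}}$, so $DF(X)U = \langle U, G\rangle_{\mathbb{R}^{n^2}}$ with $G = 2\lambda(X-C) + 2\sum_i (Xb_i-y_i)b_i^{\mathsf T}$. The sketch below checks this against a finite difference. Stacking the $b_i$ and $y_i$ as the columns of matrices `B` and `Y` is an assumption made for illustration, as are the function names `F` and `grad_F`, the sizes, and the random data.

```python
import numpy as np

def F(A, Y, B, C, lam):
    """F(A) = sum_i ||y_i - A b_i||^2 + lam * ||A - C||_F^2, with y_i, b_i the columns of Y, B."""
    return np.sum((Y - A @ B) ** 2) + lam * np.sum((A - C) ** 2)

def grad_F(A, Y, B, C, lam):
    """Gradient G with DF(A)U = <U, G>; note sum_i (A b_i - y_i) b_i^T = (A B - Y) B^T."""
    return 2 * lam * (A - C) + 2 * (A @ B - Y) @ B.T

rng = np.random.default_rng(2)
n, k, lam = 4, 6, 0.5  # illustrative sizes and constant (assumptions)
A, C, U = rng.standard_normal((3, n, n))
Y, B = rng.standard_normal((2, n, k))

# Directional derivative from the closed-form gradient: <U, G> (Frobenius inner product)
closed_form = np.sum(grad_F(A, Y, B, C, lam) * U)

# Central finite difference of F at A in the direction U
eps = 1e-6
numeric = (F(A + eps * U, Y, B, C, lam) - F(A - eps * U, Y, B, C, lam)) / (2 * eps)

print(closed_form, numeric)  # the two values should agree closely
```

Writing the sum over $i$ as a single matrix product avoids an explicit loop; a loop over columns would give the same result.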