Closed Form 1st and 2nd Derivative of a Single Variable Function with a Matrix Form

89 Views Asked by At

I have a function of a single variable $ g \left( f \right) : \mathbb{R} \to \mathbb{R} $ with the following form:

$$ g \left( f \right) = \frac{1}{2} {\left\| \boldsymbol{X} {\left( \boldsymbol{X}^{T} \boldsymbol{X} \right)}^{-1} \boldsymbol{X}^{T} \boldsymbol{y} - \boldsymbol{y} \right\|}_{2}^{2} $$

Where

  • The matrix $ \boldsymbol{X} \in \mathbb{R}^{N \times 2} $ is defined as $ \boldsymbol{X} \left( f \right) = \begin{bmatrix} \sin \left[ 2 \pi f 0 \right] & \cos \left[ 2 \pi f 0 \right] \\ \sin \left[ 2 \pi f 1 \right] & \cos \left[ 2 \pi f 1 \right] \\ \sin \left[ 2 \pi f 2 \right] & \cos \left[ 2 \pi f 2 \right] \\ \vdots & \vdots \\ \sin \left[ 2 \pi f \left( N - 1 \right) \right] & \cos \left[ 2 \pi f \left( N - 1 \right) \right] \end{bmatrix} $ (Namely the argument of the harmonic signal is $ 2 \pi f n, \; n \in \left\{ 0, 1, 2, \ldots, N - 1 \right\} $.
  • The vector $ \boldsymbol{y} \in \mathbb{R}^{N} $ is given.

I am looking for a closed form expression of the 1st and 2nd derivative in order to optimize it using the Newton method.

How can it be derived efficiently?

2

There are 2 best solutions below

0
On

Solution for the 1st Derivative

By the chain rule:

$$ \frac{\mathrm{d} g }{\mathrm{d} f} = \left \langle \frac{\mathrm{d} g }{\mathrm{d} \boldsymbol{X}}, \frac{\mathrm{d} \boldsymbol{X} }{\mathrm{d} f} \right \rangle $$

Calculating $ \frac{\mathrm{d} \boldsymbol{X} }{\mathrm{d} f} $ is straight forward as:

$$ \frac{\mathrm{d} \boldsymbol{X} }{\mathrm{d} f} = \begin{bmatrix} 2 \pi 0 \cos \left[ 2 \pi f 0 \right] & -2 \pi 0 \sin \left[ 2 \pi f 0 \right] \\ 2 \pi 1 \cos \left[ 2 \pi f 1 \right] & -2 \pi 1 \sin \left[ 2 \pi f 1 \right] \\ 2 \pi 2 \cos \left[ 2 \pi f 2 \right] & -2 \pi 2 \sin \left[ 2 \pi f 2 \right] \\ \vdots & \vdots \\ 2 \pi \left( N - 1 \right) \cos \left[ 2 \pi f \left( N - 1 \right) \right] & -2 \pi \left( N - 1 \right) \sin \left[ 2 \pi f \left( N - 1 \right) \right] \end{bmatrix} $$

The other derivative:

$$ \frac{\mathrm{d} g }{\mathrm{d} \boldsymbol{X}} = \boldsymbol{T}_{3} \boldsymbol{T}_{2} - \boldsymbol{T}_{1} \boldsymbol{T}_{4} + \boldsymbol{X} \boldsymbol{T}_{0} \boldsymbol{X}^{T} \boldsymbol{T}_{3} \boldsymbol{T}_{2} + \boldsymbol{y} \boldsymbol{T}_{4} $$

Where

  • $ \boldsymbol{T}_{0} = {\left( \boldsymbol{X}^{T} \boldsymbol{X} \right)}^{-1} $.
  • $ \boldsymbol{T}_{1} = \boldsymbol{X} \boldsymbol{T}_{0} \boldsymbol{X}^{T} \boldsymbol{y} $.
  • $ \boldsymbol{T}_{2} = \boldsymbol{y}^{T} \boldsymbol{X} \boldsymbol{T}_{0} $.
  • $ \boldsymbol{T}_{3} = \boldsymbol{T}_{1} - \boldsymbol{y} $.
  • $ \boldsymbol{T}_{4} = \left( \boldsymbol{T}_{2} \boldsymbol{X}^{T} - \boldsymbol{y}^{T} \right) \boldsymbol{X} \boldsymbol{T}_{0} $.

I am still looking for the 2nd derivative.

1
On

$ \def\g{\gamma} \def\l{\lambda} \def\o{{\tt1}} \def\e{{e}} \def\Z{Z^T} \def\X{X^{\bf+}} \def\XX{{\X}^T} \def\BR#1{\Big(#1\Big)} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\sym#1{\op{sym}\LR{#1}} \def\skew#1{\op{skew}\LR{#1}} \def\rank#1{\op{rank}\LR{#1}} \def\diag#1{\op{diag}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\deriv#1#2{\frac{d #1}{d #2}} \def\m#1{\left[\begin{array}{r|r}#1\end{array}\right]} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $Let an uppercase letter denote a matrix, lowercase a vector, Greek a scalar, and $\{\e_k\}$ are the standard basis vectors. And use a colon to denote the Frobenius product $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \}\\ A:B &= B:A \;=\; B^T:A^T \\ \LR{AB}:C &= A:\LR{CB^T} \;=\; B:\LR{A^TC} \\ \sym{A}:B &= A:\sym{B} \\ \skew{A}:B &= A:\skew{B} \\ }$$ Use the Greek letter $\phi\ ({\rm rather\ than}\ f)$ for the independent scalar variable, and define the vectors $$\eqalign{ q &= \m{0&1&2&\ldots&\LR{n-\o}}^T \\ }$$ $$\eqalign{ w &= 2\pi q\,\phi \qiq dw = 2\pi q\;d\phi \\ c &= \cos(w) \qiq C = \Diag{c} \\ s &= \sin(w) \qiq S = \Diag{s} \\ ds &= C\,dw \;\qiq dc = -S\,dw \\ Q &= \Diag{q} \\ }$$ and the matrices $$\eqalign{ Z &= c\e_\o^T \;-\; s\e_2^T \\ Y &= yy^T \\ X &= s\e_\o^T \;+\; c\e_2^T \\ dX &= {\c{ds}\,\e_\o^T - \c{dc}\,\e_2^T} \\ &= 2\pi\LR{Cq\e_\o^T - Sq\e_2^T}\,d\phi \\ &= 2\pi\LR{Qc\e_\o^T - Qs\e_2^T}\,d\phi \\ &= 2\pi QZ\,d\phi \\ }$$ and the Hat Matrix (and its complement) $$\eqalign{ H &= X\LR{X^TX}^{-1}X^T \;=\; X\X \\ K &= {I-H} \\ \\ dK &= -dH \\ dH &= \LR{K\:dX\:\X} \;+\; \LR{K\:dX\:\X}^T \\ &= 2\:\sym{K\:dX\:\X} \\ }$$ Note that both $H$ and $K$ are orthoprojectors $$\eqalign{ H^2 &=\; H \;=\; H^T \\ K^2 &=\; K \;=\; K^T \\ KH &=\;\, 0 \;\;=\; HK \\ KX &=\;\, 0 \\ HX &=\: X \\ }$$ For later convenience, define the matrix $$\eqalign{ M &= \X\,YK \\ }$$ Write the scalar-valued function $(\g\ {\rm rather\ than\ g})$ and calculate its derivative wrt $\phi$ $$\eqalign{ \g &= \tfrac12\:Ky:Ky \\ &= \tfrac12\:Y:K \\ d\g &= \tfrac12\:Y:dK \\ &= -\tfrac12\:Y:2\sym{K\,dX\:\X} \\ &= -\c{KY\XX}:dX \\ &= -\c{M^T}\!:dX \\ &= -M^T\!:2\pi QZ\:d\phi \\ \deriv{\g}{\phi} &= -2\pi QM^T\!:Z \\ &= +2\pi MQ:\LR{\e_2\,s^T - \e_\o\,c^T} \\ }$$ Give this scalar-valued derivative a name $(\l)$ and calculate its differential $$\eqalign{ d\l &= 2\pi\:\c{dM}\:Q:\LR{\e_2\,s^T - \e_\o\,c^T} \;+\; 2\pi QM^T\!:\LR{\c{ds}\:\e_2^T - \c{dc}\:\e_\o^T} \\ }$$ To find the second derivative, expand the various differentials in terms of $d\phi$.
It will get messy but not impossibly so.
You'll need the differential of the pseudoinverse, which is ugly but well-known $$\small{ d\X \;=\; -\X\,dX\;\X \;+\; \X\XX dX^T\LR{I-X\X} \;+\; \LR{I-\X X}dX^T \XX\X }$$ In your problem $\,\rank{X}=2,\,$ so the last term disappears leaving $$\small{ d\X \;=\; -\X\,dX\;\X \;+\; \X\XX dX^T\,K \qquad \qquad \qquad \qquad \qquad \qquad \qquad \; }$$

Update

First define $\,W=Z\X,\:$ then assuming I didn't make any dumb algebra errors $$\small\eqalign{ \def\W{W^T} \frac{d^2\g}{d\phi^2} &= 4\pi^2\:Q:\BR{XM+\LR{\W QKY-KYKW-KY\W Q-KQW Y}\W} \\ }$$ You can also reformulate the first derivative as $$\small\eqalign{ \def\W{W^T} \frac{d\g}{d\phi} &= -2\pi\:Q:WYK \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad }$$