How can I find $D$ that maximizes $e^T V(D)^T S V(D) e$? And how can I find the derivative of a vector with respect to a matrix?


I am trying to maximize $f = e^T V(D)^T S V(D) e$ with respect to $D$. Here $e$ is a $(3n+1)\times 1$ vector, $D$ is an $n\times n$ matrix, and $V(D)$ is the $(3n+1)\times(3n+1)$ block-diagonal matrix
$$V(D) = \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & D & 0 & 0 \\ 0 & 0 & D & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
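To make the setup concrete, here is a small NumPy sketch that assembles $V(D)$ and evaluates the objective for random data (the helper name `build_V` and the random test matrices are mine, just for illustration):

```python
import numpy as np

def build_V(D):
    """Assemble the (3n+1)x(3n+1) block-diagonal matrix
    V(D) = blkdiag(I_n, D, D, 1) described above."""
    n = D.shape[0]
    V = np.zeros((3 * n + 1, 3 * n + 1))
    V[:n, :n] = np.eye(n)            # top-left identity block
    V[n:2 * n, n:2 * n] = D          # first D block
    V[2 * n:3 * n, 2 * n:3 * n] = D  # second D block
    V[3 * n, 3 * n] = 1.0            # bottom-right scalar 1
    return V

n = 3
rng = np.random.default_rng(0)
D = rng.standard_normal((n, n))
S = rng.standard_normal((3 * n + 1, 3 * n + 1))
e = rng.standard_normal(3 * n + 1)
f = e @ build_V(D).T @ S @ build_V(D) @ e  # objective value (a scalar)
```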

The approach that comes to my mind is to compute the derivative of $f = e^T V(D)^T S V(D) e$ with respect to $D$. Setting $B = V(D)e$ and applying the chain rule:

$$\frac{\partial f}{\partial D} = \frac{\partial f}{\partial B}\,\frac{\partial B}{\partial V}\,\frac{\partial V}{\partial D} = 2SB\,\frac{\partial (Ve)}{\partial V}\,\frac{\partial V}{\partial D}$$

However, I have trouble calculating $\frac{\partial (Ve)}{\partial V}$ and $\frac{\partial V}{\partial D}$, because each amounts to taking the derivative of a vector (or matrix) with respect to a matrix.

Is this a good way to maximize $f$? And how can I find the derivative of a vector with respect to a matrix?

I would be happy if anyone could help.

1 Answer

$ \def\R#1{{\mathbb R}^{#1}} \def\o{{\tt1}} \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\unvc#1{\op{Unvec}\LR{#1}} \def\vc#1{\op{vec}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\frob#1{\left\| #1 \right\|_F} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\m#1{\left[\begin{array}{r}#1\end{array}\right]} $Let's use a colon to denote the extremely useful Frobenius product $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \frob{A}^2 \qquad \{ {\rm Frobenius\;norm} \}\\ A:B &= B:A \;=\; B^T:A^T \\ \LR{AB}:C &= A:\LR{CB^T} \;=\; B:\LR{A^TC} \\ }$$ Let ${I_n,Z_n\in\R{n\times n}}$ denote the identity matrix and zero matrix, and ${z\in\R{n}}$ the zero vector.
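These Frobenius-product identities are easy to sanity-check numerically. A minimal NumPy sketch (the random shapes are arbitrary; `np.sum(A * B)` is the elementwise-sum form of $A:B$):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 5))
B = rng.standard_normal((4, 5))

# A : B  =  sum_ij A_ij B_ij  =  Tr(A^T B)
frob = np.sum(A * B)
trace_form = np.trace(A.T @ B)

# (A C) : M  =  A : (M C^T)  =  C : (A^T M)
C = rng.standard_normal((5, 3))
M = rng.standard_normal((4, 3))
lhs = np.sum((A @ C) * M)
mid = np.sum(A * (M @ C.T))
rhs = np.sum(C * (A.T @ M))
```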

Use them to construct the following matrix constants $$\eqalign{ B_1 &= \m{I_n\\Z_n\\Z_n\\z^T},\quad &B_2 = \m{Z_n\\I_n\\Z_n\\z^T},\quad B_3 = \m{Z_n\\Z_n\\I_n\\z^T},\quad b_4 = \m{z\\z\\z\\\o} \\\\ B_k &\in\R{m\times n},&b_4\in\R{m},\qquad m= 3n+\o \\ }$$ which allow us to write a nice algebraic expression for $V$ in terms of $D$ $$\eqalign{ V &= \;B_2\;D\;B_2^T +\; B_3\;D\;B_3^T \;\;+\; B_1B_1^T + b_4b_4^T \\ \c{dV} &\c{= B_2\:dD\:B_2^T + B_3\:dD\:B_3^T} \\ }$$ Use the Frobenius product to write a nicer expression for the objective function $$\eqalign{ f &= e^TV^TSVe \\ &= \trace{e^TV^TSVe} \\ &= \trace{ee^TV^TSV} \\ &= ee^T:V^TSV \\ }$$ Now calculate the differential and thence the gradient of $f$ $$\eqalign{ df &= ee^T:\LR{dV^TSV + V^TS\;dV} \\ &= 2ee^T:V^TS\;dV \\ &= 2S^TVee^T:\c{dV} \\ &= 2S^TVee^T:\CLR{B_2\:dD\:B_2^T + B_3\:dD\:B_3^T} \\ &= 2\LR{B_2^TS^TVee^T\:B_2 + B_3^TS^TVee^TB_3}:dD \\ \grad fD &= 2\LR{B_2^TS^TVee^T\:B_2 + B_3^TS^TVee^TB_3} \\ }$$ Setting the gradient to zero and substituting the algebraic expression for $V$ will create a big ugly linear system of equations of the form $$\eqalign{ \sum_j C_jDE_j = F \\ }$$ where the $\{C_j,E_j,F\}$ matrices are various products of the $\{B_k,ee^T,S\}$ matrices.
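The gradient formula above can be verified against central finite differences. Note that the step $df = 2\,ee^T:V^TS\,dV$ implicitly takes $S$ symmetric, which is harmless since the objective only sees the symmetric part of $S$; the sketch below symmetrizes $S$ accordingly (all names and random data here are mine):

```python
import numpy as np

n = 3
m = 3 * n + 1
rng = np.random.default_rng(1)

# Selector matrices from the answer: B_k places an n x n block in block-row k
B1 = np.zeros((m, n)); B1[:n, :] = np.eye(n)
B2 = np.zeros((m, n)); B2[n:2 * n, :] = np.eye(n)
B3 = np.zeros((m, n)); B3[2 * n:3 * n, :] = np.eye(n)
b4 = np.zeros(m); b4[-1] = 1.0

S0 = rng.standard_normal((m, m))
S = S0 + S0.T                       # symmetric S; f only depends on this part
e = rng.standard_normal(m)
D = rng.standard_normal((n, n))

def V_of(D):
    # V = B2 D B2^T + B3 D B3^T + B1 B1^T + b4 b4^T
    return B2 @ D @ B2.T + B3 @ D @ B3.T + B1 @ B1.T + np.outer(b4, b4)

def f(D):
    V = V_of(D)
    return e @ V.T @ S @ V @ e

# Gradient from the answer: 2 (B2^T S^T V ee^T B2 + B3^T S^T V ee^T B3)
V = V_of(D)
ee = np.outer(e, e)
G = 2 * (B2.T @ S.T @ V @ ee @ B2 + B3.T @ S.T @ V @ ee @ B3)

# Central finite differences for comparison
eps = 1e-6
G_fd = np.zeros_like(D)
for i in range(n):
    for j in range(n):
        P = np.zeros((n, n)); P[i, j] = eps
        G_fd[i, j] = (f(D + P) - f(D - P)) / (2 * eps)
```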

This system can be solved using vectorization $$\eqalign{ \vc{F} &= \c{\sum_j \LR{E_j^T\otimes C_j}}\vc{D} \;=\; \c{M}\,\vc{D} \\ \vc{D} &= M^{-\o} \vc{F} \qiq D = \unvc{M^{-\o} \vc{F}} \\ }$$
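A quick NumPy sketch of the vec/Kronecker solve, using random stand-ins for the $\{C_j,E_j\}$ matrices (the actual ones would be built from the $\{B_k,ee^T,S\}$ products). Note that the identity $\operatorname{vec}(CDE) = (E^T\otimes C)\operatorname{vec}(D)$ uses the column-stacking vec, i.e. Fortran order in NumPy:

```python
import numpy as np

n = 3
rng = np.random.default_rng(2)

# Random stand-ins for the {C_j, E_j} coefficient matrices
C = [rng.standard_normal((n, n)) for _ in range(2)]
E = [rng.standard_normal((n, n)) for _ in range(2)]
D_true = rng.standard_normal((n, n))
F = sum(Cj @ D_true @ Ej for Cj, Ej in zip(C, E))

# vec(C D E) = (E^T kron C) vec(D), with column-major (Fortran-order) vec
M = sum(np.kron(Ej.T, Cj) for Cj, Ej in zip(C, E))
vec_D = np.linalg.solve(M, F.flatten(order="F"))
D_rec = vec_D.reshape((n, n), order="F")   # Unvec back to an n x n matrix
```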