I have two matrices $A = \left[ {\begin{array}{*{20}{c}} 3&7&9&1\\ 4&1&2&3\\ 5&6&3&7\\ 2&4&3&7 \end{array}} \right]$ and $B = \left[ {\begin{array}{*{20}{c}} L^3/T^2&0&0&0\\ 0&L^3/T^2&0&0\\ 0&0&L/T&0\\ 0&0&0&1 \end{array}} \right]$.
A is the Fréchet derivative matrix. B has this form because I am trying to introduce non-integer units for time and length.
I am trying to improve the condition number of A by right-multiplying it by B, so I need to minimize the condition number of AB.
The condition number of AB is the product norm(AB)*norm((AB)^(-1)).
Therefore, to find the T and L that minimize the condition number of AB, I need the derivative of norm(AB).
How do I find the derivative of norm(AB)?
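The Euclidean norm of A is ${\rm{norm}}(A) = \sqrt {{\lambda _{\max }}({A^*}A)} $, where $\lambda_{\max}$ is the largest eigenvalue (so the norm equals the largest singular value of A) and $A^*$ is the conjugate transpose of A, which is the ordinary transpose for a real matrix.
The Euclidean norm is defined on Wikipedia: https://en.wikipedia.org/wiki/Matrix_norm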
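For reference, here is a minimal NumPy sketch of the quantities described above: the Euclidean norm as the square root of the largest eigenvalue of $A^TA$, and the resulting condition number of AB. The helper names (`scaling_matrix`, `spectral_norm`, `cond2`) and the sample values L = 2, T = 3 are assumptions made for illustration, not part of the original problem.

```python
import numpy as np

# Matrix A from the question.
A = np.array([[3, 7, 9, 1],
              [4, 1, 2, 3],
              [5, 6, 3, 7],
              [2, 4, 3, 7]], dtype=float)

def scaling_matrix(L, T):
    """Diagonal B from the question: diag(L^3/T^2, L^3/T^2, L/T, 1)."""
    return np.diag([L**3 / T**2, L**3 / T**2, L / T, 1.0])

def spectral_norm(M):
    """Square root of the largest eigenvalue of M^T M (real M assumed)."""
    return np.sqrt(np.linalg.eigvalsh(M.T @ M).max())

def cond2(M):
    """Condition number norm(M) * norm(M^{-1}) in the Euclidean norm."""
    return spectral_norm(M) * spectral_norm(np.linalg.inv(M))

# Sample (hypothetical) values of L and T, only to show the call pattern.
B = scaling_matrix(L=2.0, T=3.0)
print("cond(A)  =", cond2(A))
print("cond(AB) =", cond2(A @ B))
```

The value of `cond2(A)` should agree with NumPy's built-in `np.linalg.cond(A, 2)`, which computes the same quantity from singular values.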
Your question is about the gradient of a condition number based on the Euclidean norm. Since I don't know how to do that, here is the gradient based on a Frobenius norm.
Define the scalars
$$\eqalign{ \alpha^2 &= \|X\|_F^2 &= X:X \cr \beta^2 &= \|X^{-1}\|_F^2 &= X^{-1}:X^{-1} \cr \phi &= {\alpha}{\beta} \cr }$$
where $\phi$ is the condition number in terms of the Frobenius norm, and the colon is a convenient product notation for the trace function, i.e. $$P:Q = {\rm Tr}(P^TQ)$$

Differentiating the first two definitions, and using $d(X^{-1}) = -X^{-1}\,dX\,X^{-1}$ for the second, gives
$$\eqalign{ d\alpha &= \alpha^{-1}X:dX \cr d\beta &= -\beta^{-1}(X^TXX^T)^{-1}:dX \cr }$$

Now we're ready to differentiate $\phi$:
$$\eqalign{ d\phi &= \beta\,d\alpha + \alpha\,d\beta \cr &= \Big(\beta\alpha^{-1}X - \alpha\beta^{-1}(X^TXX^T)^{-1}\Big):dX \cr }$$

In your problem, $X=AB$ and we are interested in finding the gradient wrt $B$. Since $A$ is constant, $dX = A\,dB$, so
$$\eqalign{ d\phi &= \Big(\beta\alpha^{-1}X - \alpha\beta^{-1}(X^TXX^T)^{-1}\Big):A\,dB \cr &= \Big(\beta\alpha^{-1}A^TX - \alpha\beta^{-1}A^T(X^TXX^T)^{-1}\Big):dB \cr \frac{\partial\phi}{\partial B} &= \beta\alpha^{-1}A^TX - \alpha\beta^{-1}A^T(X^TXX^T)^{-1} \cr &= \beta\alpha^{-1}A^TAB - \alpha\beta^{-1}A^T(B^TA^TABB^TA^T)^{-1} \cr &= \beta\alpha^{-1}A^TAB - \alpha\beta^{-1}(B^TA^TABB^T)^{-1} \cr }$$
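As a sanity check, the gradient formula can be compared against central finite differences. The sketch below is only an illustration under stated assumptions: the function names, the step size `h`, and the sample point L = 2, T = 3 are all hypothetical choices. Because $B$ in the question is diagonal and depends only on $L$ and $T$, the sketch also applies the chain rule $\partial\phi/\partial L = \frac{\partial\phi}{\partial B} : \frac{\partial B}{\partial L}$ (and likewise for $T$), which is an extra step not spelled out in the derivation.

```python
import numpy as np

# Matrix A from the question.
A = np.array([[3, 7, 9, 1],
              [4, 1, 2, 3],
              [5, 6, 3, 7],
              [2, 4, 3, 7]], dtype=float)

def scaling_matrix(L, T):
    """Diagonal B from the question: diag(L^3/T^2, L^3/T^2, L/T, 1)."""
    return np.diag([L**3 / T**2, L**3 / T**2, L / T, 1.0])

def frob_cond(X):
    """phi = ||X||_F * ||X^{-1}||_F."""
    return np.linalg.norm(X, "fro") * np.linalg.norm(np.linalg.inv(X), "fro")

def grad_wrt_B(A, B):
    """d(phi)/dB = (beta/alpha) A^T X - (alpha/beta) A^T (X^T X X^T)^{-1}, X = AB."""
    X = A @ B
    alpha = np.linalg.norm(X, "fro")
    beta = np.linalg.norm(np.linalg.inv(X), "fro")
    M = np.linalg.inv(X.T @ X @ X.T)
    return (beta / alpha) * (A.T @ X) - (alpha / beta) * (A.T @ M)

def grad_wrt_LT(A, L, T):
    """Chain rule through the diagonal entries of B(L, T)."""
    g = np.diag(grad_wrt_B(A, scaling_matrix(L, T)))
    dB_dL = np.array([3 * L**2 / T**2, 3 * L**2 / T**2, 1 / T, 0.0])
    dB_dT = np.array([-2 * L**3 / T**3, -2 * L**3 / T**3, -L / T**2, 0.0])
    return g @ dB_dL, g @ dB_dT

# Hypothetical sample point and finite-difference step.
L0, T0, h = 2.0, 3.0, 1e-6
B0 = scaling_matrix(L0, T0)

# Entry-by-entry central differences for the full gradient wrt B.
G = grad_wrt_B(A, B0)
G_fd = np.zeros_like(G)
for i in range(4):
    for j in range(4):
        E = np.zeros((4, 4))
        E[i, j] = h
        G_fd[i, j] = (frob_cond(A @ (B0 + E)) - frob_cond(A @ (B0 - E))) / (2 * h)

# Central differences for the two physical parameters.
phi = lambda L, T: frob_cond(A @ scaling_matrix(L, T))
dL, dT = grad_wrt_LT(A, L0, T0)
fd_L = (phi(L0 + h, T0) - phi(L0 - h, T0)) / (2 * h)
fd_T = (phi(L0, T0 + h) - phi(L0, T0 - h)) / (2 * h)

print("max |G - G_fd| =", np.max(np.abs(G - G_fd)))
print("dphi/dL: formula", dL, "finite difference", fd_L)
print("dphi/dT: formula", dT, "finite difference", fd_T)
```

The pair `(dL, dT)` is what a gradient-based optimizer over $L$ and $T$ would need. Keep in mind that this minimizes the Frobenius-norm condition number, not the Euclidean-norm one asked about in the question.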