Is there a rigorous definition for matrix derivatives?


I know the following definition.

A function $f: \mathbb{R}^n \to \mathbb{R}$ is said to be differentiable at $x$ if there exists a vector $v \in \mathbb{R}^n$ such that $$ \lim_{h \to 0} \frac{f(x+h) - f(x) - v^Th} {\|h\|} = 0. $$ When such a $v$ exists, it is the gradient $\nabla f(x) = \left(\frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_n}(x)\right)$.

Does there exist a similar definition for the "matrix derivative"?

https://en.wikipedia.org/wiki/Matrix_calculus#Derivatives_with_matrices


There are 2 best solutions below

0
On BEST ANSWER

Matrix differentiation is just a particular case of the Fréchet derivative between two Banach spaces, whose definition is very similar to the definition of the derivative of a function $f : \mathbb R^n \to \mathbb R$ given in the question.

Applied to matrix derivatives, you just have to consider a map $f : V \to M$, where $M$ is a linear space of matrices endowed with the norm of your choice and $V$ is a Banach space, which may or may not be finite-dimensional.
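For concreteness, here is the standard definition written out in this notation. The symbol $L$ for the derivative and the norm subscripts are my own notation, not part of the question. The map $f : V \to M$ is Fréchet differentiable at $x \in V$ if there exists a bounded linear map $L : V \to M$ such that $$ \lim_{h \to 0} \frac{\|f(x+h) - f(x) - Lh\|_M}{\|h\|_V} = 0. $$ When such an $L$ exists it is unique, and it is called the derivative of $f$ at $x$. Taking $V = \mathbb R^n$, $M = \mathbb R$ and $Lh = v^T h$ recovers the definition in the question.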

0
On

Yes, you can use a very similar definition. First of all, the map $h \mapsto v^\top h$ represents an arbitrary linear functional on $\mathbb R^n$, that is, a linear map $\mathbb R^n \to \mathbb R$. For matrices, you can replace it with the Frobenius inner product, i.e., $$A \mapsto (A,B)_F := \sum_{i,j = 1}^n A_{ij} B_{ij},$$ where $B \in \mathbb R^{n \times n}$ is fixed. Every linear functional on $\mathbb R^{n \times n}$ is of this form for exactly one $B$.

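Thus, for $f \colon \mathbb R^{n \times n} \to \mathbb R$ you can define $\nabla f(A)$ to be the (unique) matrix $B \in \mathbb R^{n \times n}$, if it exists, that satisfies $$ \lim_{H \to 0} \frac{f(A + H) - f(A) - (B,H)_F}{\|H\|} = 0.$$ Since all norms on $\mathbb R^{n \times n}$ are equivalent, the choice of the norm $\|H\|$ does not affect the definition.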
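As a small worked example (the function is chosen only for illustration), take $f(A) = (A,A)_F = \sum_{i,j} A_{ij}^2$ and use the Frobenius norm $\|H\|_F = \sqrt{(H,H)_F}$. Expanding by bilinearity and symmetry of the inner product gives $$ f(A+H) - f(A) = 2\,(A,H)_F + (H,H)_F. $$ With the candidate $B = 2A$, the quotient in the definition becomes $$ \frac{f(A + H) - f(A) - (2A,H)_F}{\|H\|_F} = \frac{\|H\|_F^2}{\|H\|_F} = \|H\|_F \to 0 \quad \text{as } H \to 0, $$ so $\nabla f(A) = 2A$. This agrees with the entrywise partial derivatives $\partial f / \partial A_{ij} = 2 A_{ij}$.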

This definition extends directly to Hilbert spaces, where you use the inner product given by the Hilbert space structure. In Banach spaces, one uses the duality pairing instead, and this yields a derivative that belongs to the dual space.