Chain rule for a function of a matrix-vector product


Let $\boldsymbol{X}$ be an $n \times p$ matrix and $\boldsymbol{\beta}$ a $p$-dimensional vector. I'd like to calculate

$$ \frac{\partial f(\boldsymbol{X\beta})}{\partial\boldsymbol{\beta}} $$

I tried

$$ f'(\boldsymbol{X\beta}) \boldsymbol{X} $$

but, obviously, the dimensions are not correct.


1
On BEST ANSWER

Take an ordinary scalar function $\phi(z)$ with derivative $\phi'(z)=\frac{d\phi}{dz}$ and apply both element-wise to a vector argument, i.e.
$$ v = X\beta,\quad f = \phi(v),\quad f' = \phi'(v) $$
The differential of such a vector function can be expressed using an elementwise (Hadamard) product $(\odot)$ or, better yet, a diagonal matrix:
$$\begin{aligned} df &= f'\odot dv \\ &= {\rm Diag}(f')\,dv \\ &= {\rm Diag}(f')\,X\,d\beta \end{aligned}$$
Given this differential, the gradient with respect to $\beta$ can be identified as the $n\times p$ matrix
$$ \frac{\partial f}{\partial \beta} = {\rm Diag}(f')\,X $$
An example of the equivalence of the Hadamard product and diagonalization:
$$\begin{aligned}
a &= \pmatrix{a_1\\a_2}, & b &= \pmatrix{b_1\\b_2}, & a\odot b &= \pmatrix{a_1b_1\\a_2b_2} = b\odot a \\
A &= {\rm Diag}(a) = \pmatrix{a_1&0\\0&a_2}, & & & Ab &= \pmatrix{a_1b_1\\a_2b_2} \\
B &= {\rm Diag}(b) = \pmatrix{b_1&0\\0&b_2}, & & & Ba &= \pmatrix{a_1b_1\\a_2b_2}
\end{aligned}$$
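
If you want to sanity-check the formula ${\rm Diag}(f')X$ numerically, here is a minimal sketch in plain Python. The choice $\phi=\tanh$, the sample values of `X` and `beta`, and all helper names (`matvec`, `jacobian_formula`, `jacobian_fd`) are my own assumptions for illustration, not part of the question. It compares the closed-form Jacobian with a central finite-difference approximation.

```python
import math

def phi(z):
    # Assumed example of a scalar function; any differentiable phi would do.
    return math.tanh(z)

def dphi(z):
    # Derivative of tanh.
    return 1.0 - math.tanh(z) ** 2

def matvec(X, beta):
    # v = X beta, with X stored as a list of rows.
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]

def f(X, beta):
    # Element-wise application of phi to v = X beta.
    return [phi(v_i) for v_i in matvec(X, beta)]

def jacobian_formula(X, beta):
    # Diag(phi'(v)) X: row i of X scaled by phi'(v_i).
    v = matvec(X, beta)
    return [[dphi(v_i) * x_ij for x_ij in row] for v_i, row in zip(v, X)]

def jacobian_fd(X, beta, eps=1e-6):
    # Central finite differences, one column of the Jacobian per beta_j.
    n, p = len(X), len(beta)
    J = [[0.0] * p for _ in range(n)]
    for j in range(p):
        plus = list(beta)
        minus = list(beta)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = f(X, plus), f(X, minus)
        for i in range(n):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # n = 3, p = 2 (arbitrary values)
beta = [0.3, -0.2]

J_formula = jacobian_formula(X, beta)
J_fd = jacobian_fd(X, beta)
max_err = max(abs(a - b)
              for row_a, row_b in zip(J_formula, J_fd)
              for a, b in zip(row_a, row_b))
print(len(J_formula), len(J_formula[0]))  # shape of the Jacobian: n rows, p columns
print(max_err)                            # should be tiny if the formula is right
```

The result has $n$ rows and $p$ columns, which is the dimension the attempt $f'(X\beta)X$ in the question could not produce.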

0
On

As you wrote, the chain rule gives

$$\partial[f(X\beta)]=\partial f(X\beta) X$$

for $f:\Bbb R^n\to[0,\infty)$ and $X:\Bbb R^p\to\Bbb R^n$. Here $\partial f(X\beta)$ can be represented by the gradient $\nabla f(X\beta)$, which is a vector in $\Bbb R^n$ (written as a row vector), and $\nabla f(X\beta)X$ is a vector in $\Bbb R^p$, namely the gradient of $f\circ X$ at $\beta$ (as a column vector this is $X^T\nabla f(X\beta)$). Hence

$$\partial f(X\beta) Xh=\nabla f(X\beta)X\cdot h=\nabla(f\circ X)(\beta)\cdot h$$

for any $h\in\Bbb R^p$, where the dot is the Euclidean dot product.
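
As a quick check of this identity for a scalar-valued $f$, here is a small sketch in plain Python. The choice $f(v)=\tfrac12\|v\|^2$ (so that $\nabla f(v)=v$), the sample values, and the helper names are assumptions I picked for illustration. It compares $\nabla(f\circ X)(\beta)\cdot h$, computed as $X^T\nabla f(X\beta)\cdot h$, with a finite-difference directional derivative.

```python
def matvec(X, beta):
    # v = X beta, with X stored as a list of rows.
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, beta)) for row in X]

def f(v):
    # Assumed example: f(v) = 0.5 * ||v||^2, which maps R^n to [0, inf).
    return 0.5 * sum(v_i ** 2 for v_i in v)

def grad_f(v):
    # Gradient of the assumed f, a vector in R^n.
    return list(v)

def grad_composite(X, beta):
    # Gradient of f(X beta) with respect to beta: X^T grad_f(X beta), in R^p.
    g = grad_f(matvec(X, beta))
    n, p = len(X), len(beta)
    return [sum(g[i] * X[i][j] for i in range(n)) for j in range(p)]

def directional_fd(X, beta, h, eps=1e-6):
    # Central finite difference of t -> f(X (beta + t h)) at t = 0.
    plus = [b + eps * h_j for b, h_j in zip(beta, h)]
    minus = [b - eps * h_j for b, h_j in zip(beta, h)]
    return (f(matvec(X, plus)) - f(matvec(X, minus))) / (2 * eps)

X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # n = 3, p = 2 (arbitrary values)
beta = [0.3, -0.2]
h = [1.0, -2.0]                            # arbitrary direction in R^p

grad = grad_composite(X, beta)
lhs = sum(g_j * h_j for g_j, h_j in zip(grad, h))  # gradient dotted with h
rhs = directional_fd(X, beta, h)                   # numerical directional derivative
gap = abs(lhs - rhs)
print(len(grad))  # p entries, one per component of beta
print(gap)        # should be tiny if the identity holds
```

Note the gradient here lives in $\Bbb R^p$, matching the dimension of $\beta$, in contrast to the $n\times p$ Jacobian you get when $f$ is applied element-wise.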