Matrix dimensionality after chain-rule with sine function


I have the equation $\vec{y} = \sin(\textbf{A}\vec{x}+\vec{b})$ where $\vec{x},\vec{y},\vec{b} \in \mathbb{R}^n$ and $\textbf{A} \in \mathbb{R}^{n\times n}$. If I differentiate with respect to $\vec{x}$, I expect $\frac{d\vec{y}}{d\vec{x}} \in \mathbb{R}^{n\times n}$ (https://en.wikipedia.org/wiki/Matrix_calculus#Vector-by-vector). But following this through with the chain rule gives $$\frac{d\vec{y}}{d\vec{x}} = \cos(\textbf{A}\vec{x}+\vec{b})\frac{d}{d\vec{x}}(\textbf{A}\vec{x}+\vec{b}) = \cos(\textbf{A}\vec{x}+\vec{b})\left(\frac{d\textbf{A}\vec{x}}{d\vec{x}}+\frac{d\vec{b}}{d\vec{x}}\right).$$ Since $\frac{d\vec{b}}{d\vec{x}} = 0$, the two factors in the result appear to be $\cos(\textbf{A}\vec{x}+\vec{b}) \in \mathbb{R}^{n}$ and $\frac{d\textbf{A}\vec{x}}{d\vec{x}} \in \mathbb{R}^{n\times n}$. However, the product of these two factors does not have the expected dimensions. Am I missing something?


There is 1 best solution below

On BEST ANSWER

Let $$\eqalign{ z &= Ax+b \cr w &= \cos(z) \cr }$$ The only way you can return a vector from the $\sin$ function is if you apply it element-wise. That being the case, you must use the Hadamard (aka element-wise) product, denoted by ($\circ$), in the differentiation.

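Instead of the chain rule, let's use differentials to find the Jacobian. $$\eqalign{ y &= \sin(z) \cr \cr dy &= w\circ dz \cr &= w\circ (A\,dx) \cr &= (w1^T\circ A)\,dx \cr \cr \frac{\partial y}{\partial x} &= w1^T\circ A \cr }$$ where $1$ is the vector of all ones. Entry-wise, $(w1^T\circ A)_{ij} = w_i A_{ij}$, so the result equals ${\rm Diag}(w)\,A \in \mathbb{R}^{n\times n}$, which has the dimensions you expected.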
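If you want to convince yourself numerically, the following is a minimal sketch that compares the formula $w1^T\circ A$ against central finite differences. It uses only the Python standard library. The function names (`forward`, `jacobian_analytic`, `jacobian_numeric`), the step size `h`, and the values of `A`, `x`, and `b` are illustrative assumptions and do not come from the question.

```python
import math

def forward(A, x, b):
    """y = sin(A x + b), with sin applied element-wise."""
    n = len(x)
    z = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    return [math.sin(zi) for zi in z]

def jacobian_analytic(A, x, b):
    """J[i][j] = cos(z_i) * A[i][j], i.e. (w 1^T) ∘ A with w = cos(z)."""
    n = len(x)
    z = [sum(A[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
    w = [math.cos(zi) for zi in z]
    return [[w[i] * A[i][j] for j in range(n)] for i in range(n)]

def jacobian_numeric(A, x, b, h=1e-6):
    """Central finite differences; column j approximates dy/dx_j."""
    n = len(x)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        y_plus = forward(A, x_plus, b)
        y_minus = forward(A, x_minus, b)
        for i in range(n):
            J[i][j] = (y_plus[i] - y_minus[i]) / (2 * h)
    return J

# Arbitrary illustrative values (assumed for this sketch, n = 3).
A = [[1.0, 2.0, -1.0],
     [0.5, -0.3, 0.8],
     [2.0, 0.1, 0.4]]
x = [0.3, -0.7, 1.1]
b = [0.1, 0.2, -0.4]

J_exact = jacobian_analytic(A, x, b)
J_approx = jacobian_numeric(A, x, b)

# Largest entry-wise disagreement between the two n-by-n matrices.
max_err = max(abs(J_exact[i][j] - J_approx[i][j])
              for i in range(3) for j in range(3))
print("shape:", len(J_exact), "x", len(J_exact[0]))
print("max abs difference:", max_err)
```

Both Jacobians are $n\times n$, and the difference should be at the level of finite-difference error. The point of the check is that row $i$ of $A$ is scaled by $\cos(z_i)$, which is exactly what the Hadamard product with $w1^T$ does.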