How is the gradient of a scalar with respect to a column vector different from the gradient of a scalar with respect to a row vector?
Is the gradient of a scalar with respect to a column vector a column vector or a row vector?
The gradient of a scalar-valued function with respect to a vector can be defined in more than one way. Whether it comes out as a row or a column depends on the layout convention you follow:
Numerator layout:
$$\frac {\partial y}{\partial \mathbf {x} } = \begin{bmatrix}{\frac {\partial y}{\partial x_{1}}}&{\frac {\partial y}{\partial x_{2}}}&\cdots &{\frac {\partial y}{\partial x_{n}}}\end{bmatrix}$$
Denominator layout:
$$ \frac {\partial y}{\partial \mathbf {x}} = \begin{bmatrix}{\frac {\partial y}{\partial x_{1}}}\\{\frac {\partial y}{\partial x_{2}}}\\\vdots \\{\frac {\partial y}{\partial x_{n}}}\\\end{bmatrix} $$
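Or mixed layout, where the derivative is written with respect to the transpose $\mathbf{x}'$ so that the row shape is explicit in the notation: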
$$\frac{\partial y}{\partial \mathbf{x}'} = \begin{bmatrix}{\frac {\partial y}{\partial x_{1}}}&{\frac {\partial y}{\partial x_{2}}}&\cdots &{\frac {\partial y}{\partial x_{n}}}\end{bmatrix}$$
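To make the difference concrete, here is a small numerical sketch. It is only an illustration under assumptions of my own: the function $y = \mathbf{x}^{\top} A \mathbf{x}$, the matrix `A`, and the helper names `f` and `partials` are not from the question. It estimates the partial derivatives once and then arranges the same numbers as a row (numerator layout) or as a column (denominator layout), which suggests that the two conventions differ only by a transpose.

```python
import numpy as np

def f(x, A):
    """Scalar-valued function y = x^T A x for a column vector x of shape (n, 1)."""
    return (x.T @ A @ x).item()

def partials(x, A, eps=1e-6):
    """Central-difference estimates of dy/dx_i, returned as a flat array of length n."""
    n = x.shape[0]
    g = np.zeros(n)
    for i in range(n):
        step = np.zeros_like(x)
        step[i, 0] = eps
        g[i] = (f(x + step, A) - f(x - step, A)) / (2 * eps)
    return g

# Hypothetical example data, chosen only for illustration.
A = np.array([[2.0, 1.0], [0.0, 3.0]])
x = np.array([[1.0], [2.0]])

g = partials(x, A)
grad_numerator = g.reshape(1, -1)    # numerator layout: 1 x n row vector
grad_denominator = g.reshape(-1, 1)  # denominator layout: n x 1 column vector, same shape as x

print(grad_numerator.shape)
print(grad_denominator.shape)
```

For this particular function the analytic gradient in denominator layout is $(A + A^{\top})\mathbf{x}$, so you can compare `grad_denominator` against `(A + A.T) @ x` as a sanity check. The entries are identical in both layouts; only the shape changes.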