Gradient and Hessian of $f(x,y) := a^T \left( x \odot \left[ \exp\left( \mu \ (y \ \oslash \ x) \right) - 1 \right] \right)$, w.r.t. $x$ and $y$

How do I find the gradient and Hessian of \begin{align} f(x,y) := a^T \left( x \odot \left[ \exp\left( \mu \ (y \ \oslash \ x) \right) - 1 \right] \right) \ , \end{align} where $a, x, y \in \mathbb{R}^n$, $1 \in \mathbb{R}^n$ is the all-ones vector, and $\mu \in \mathbb{R}$? Here $\odot$ and $\oslash$ denote elementwise multiplication and division, respectively.

Best answer

Let $$\eqalign{ z &= y \oslash x\\ dz &= (dy \odot x - y \odot dx) \ \oslash \left( x \odot x \right) \\ }$$ and

$$\eqalign{ f &= a^T \left( x \odot \left[ \exp(\mu z) - 1\right] \right)\\ &\equiv a : \left( x \odot \left[ \exp(\mu z) - 1\right] \right) \ , }$$ where the colon denotes the Frobenius inner product, $A:B = \left\langle A, B \right\rangle={\rm tr}(A^TB)$. For vectors this is just $a:b = a^Tb$, since the trace of a scalar is the scalar itself.

Find the differential and then gradient: $$\eqalign{ df &= \quad a: \left( dx \odot \left[ \exp(\mu z) - 1\right] \right) \\ & \quad + \ a: \left( x \odot \left[ \mu \exp(\mu z) \odot \ dz \right] \right)\\ &= \quad a: \left( dx \odot \left[ \exp(\mu z) - 1\right] \right) \\ & \quad + \ a: \left( x \odot \left[ \mu \exp(\mu z) \odot \ (dy \odot x - y \odot dx) \ \oslash \left( x \odot x \right) \right] \right)\\ }$$

To find $ \frac{\partial f}{\partial y}$, set $dx = 0$ $$\eqalign{ df &= a: \left( x \odot \left[ \mu \exp(\mu z) \odot \ (dy \odot x ) \ \oslash \left( x \odot x \right) \right] \right)\\ &={\color{red}\mu}a : \exp\left( \mu \ y \oslash x \right) \odot dy \\ &={\color{red}\mu}a \odot \exp\left( \mu \ y \oslash x \right) : dy }$$ then, $$\eqalign{ \frac{\partial f}{\partial y} &= {\color{red}\mu}a \odot \exp\left( \mu \ y \oslash x \right) \ . }$$
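A quick way to sanity-check this result is to compare it with central finite differences. The sketch below does that in NumPy. The names `f`, `grad_y`, `fd_grad_y`, the random test data, and the step size `h` are illustrative assumptions, and `x` is sampled away from zero so that $y \oslash x$ is well defined.

```python
import numpy as np

def f(a, x, y, mu):
    # f(x, y) = a^T ( x * (exp(mu * y / x) - 1) ), all operations elementwise
    return a @ (x * (np.exp(mu * y / x) - 1.0))

def grad_y(a, x, y, mu):
    # Closed form from the derivation: mu * a * exp(mu * y / x)
    return mu * a * np.exp(mu * y / x)

def fd_grad_y(a, x, y, mu, h=1e-6):
    # Central differences in y, one coordinate at a time (h is an assumed step)
    g = np.zeros_like(y)
    for i in range(y.size):
        e = np.zeros_like(y)
        e[i] = h
        g[i] = (f(a, x, y + e, mu) - f(a, x, y - e, mu)) / (2 * h)
    return g

# Arbitrary test data (assumption): x is kept away from zero
rng = np.random.default_rng(0)
n, mu = 5, 0.7
a = rng.normal(size=n)
x = rng.uniform(0.5, 2.0, size=n)
y = rng.normal(size=n)

# Largest absolute gap between the closed form and the numerical gradient
err_y = np.max(np.abs(grad_y(a, x, y, mu) - fd_grad_y(a, x, y, mu)))
print(err_y)
```

If the formula is right, the printed gap should be tiny compared with the gradient entries. Dropping the factor $\mu$ from `grad_y` makes the gap jump to the same order as the gradient itself.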

To find $ \frac{\partial f}{\partial x}$, set $dy = 0$ $$\eqalign{ df &= \quad a: \left( dx \odot \left[ \exp(\mu z) - 1\right] \right) \\ & \quad + \ a: \left( x \odot \left[ \mu \exp(\mu z) \odot \ (- y \odot dx) \ \oslash \left( x \odot x \right) \right] \right)\\ &= \quad a \odot \left( \exp(\mu \ y \oslash x) - 1\right) : dx \\ & \quad - \ \mu \ a \odot \left(y \oslash x \right) \odot \exp(\mu \ y \oslash x): dx \ , }$$

then $$\eqalign{ \frac{\partial f}{\partial x} &= a \odot \left( \exp(\mu \ y \oslash x) - 1\right) - \mu \ a \odot \left(y \oslash x \right) \odot \exp(\mu \ y \oslash x) \ . }$$
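The same kind of finite-difference check can be run on the $x$-gradient. Again this is only a sketch: `f`, `grad_x`, `fd_grad_x`, the test data, and the step `h` are assumed names and values, with `x` drawn from an interval that stays well away from zero.

```python
import numpy as np

def f(a, x, y, mu):
    # f(x, y) = a^T ( x * (exp(mu * y / x) - 1) ), all operations elementwise
    return a @ (x * (np.exp(mu * y / x) - 1.0))

def grad_x(a, x, y, mu):
    # Closed form from the derivation:
    # a * (exp(mu*y/x) - 1) - mu * a * (y/x) * exp(mu*y/x)
    w = np.exp(mu * y / x)
    return a * (w - 1.0) - mu * a * (y / x) * w

def fd_grad_x(a, x, y, mu, h=1e-6):
    # Central differences in x, one coordinate at a time (h is an assumed step)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(a, x + e, y, mu) - f(a, x - e, y, mu)) / (2 * h)
    return g

# Arbitrary test data (assumption): x stays in [0.5, 2.0], so x +/- h is nonzero
rng = np.random.default_rng(1)
n, mu = 5, 0.7
a = rng.normal(size=n)
x = rng.uniform(0.5, 2.0, size=n)
y = rng.normal(size=n)

# Largest absolute gap between the closed form and the numerical gradient
err_x = np.max(np.abs(grad_x(a, x, y, mu) - fd_grad_x(a, x, y, mu)))
print(err_x)
```

Because $f$ is a sum of terms that each involve a single index $i$, perturbing $x_i$ only changes the $i$-th term, which is why the gradient comes out as a purely elementwise expression.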

EDIT: Incorporated lynn's comment, shown in red.
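The question also asks for the Hessian, which the steps above do not reach. What follows is a sketch rather than part of the derivation above, assuming every $x_i \neq 0$ and using the shorthand $w := \exp(\mu \ y \oslash x)$ (introduced here only for brevity). Since $f = \sum_i a_i x_i \left( e^{\mu y_i / x_i} - 1 \right)$ is separable across indices, every Hessian block is diagonal, and differentiating the two gradients once more gives $$\eqalign{ \frac{\partial^2 f}{\partial y \, \partial y^T} &= \mu^2 \, {\rm Diag}\left( a \odot w \oslash x \right) \\ \frac{\partial^2 f}{\partial x \, \partial y^T} &= -\mu^2 \, {\rm Diag}\left( a \odot y \odot w \oslash \left( x \odot x \right) \right) \\ \frac{\partial^2 f}{\partial x \, \partial x^T} &= \mu^2 \, {\rm Diag}\left( a \odot y \odot y \odot w \oslash \left( x \odot x \odot x \right) \right) \ . }$$ In the last block the two first-order terms in $\mu$ cancel, which is why only the $\mu^2$ term survives.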