What are the partial derivatives of the function below?


I want to compute the gradient of the loss function below for one training example $(t,\mathcal{C}_t)$. $w_c$ and $w_t$ are vectors in $\mathbb{R}^d$. The $w_c$'s are not taken from the same matrix as $w_t$.

\begin{equation} L(t,\mathcal{C}_t) = \sum_{c \in \mathcal{C}_t } \log \big(1+e^{-w_c \cdot w_t}\big) + \sum_{c \in \mathcal{C}_t^-} \log \big(1+e^{w_c \cdot w_t}\big) \end{equation}

Here is what I have so far. The partial derivative of $\log \big(1+e^{-w_c \cdot w_t}\big)$ w.r.t. $w_c$ (i.e., holding $w_t$ constant) is:

\begin{equation} \frac{1}{1+e^{-w_c \cdot w_t}} \times \big(-w_t e^{-w_c \cdot w_t}\big) = \frac{-w_t}{e^{w_c \cdot w_t} +1} \end{equation}

And, similarly, its partial derivative w.r.t. $w_t$ (i.e., holding $w_c$ constant) is $\frac{-w_c}{e^{w_c \cdot w_t} +1}$.
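
The only change in a term of the second sum is the sign of the exponent, so the same chain-rule step, again holding $w_t$ constant, gives:

\begin{equation} \frac{\partial}{\partial w_c} \log \big(1+e^{w_c \cdot w_t}\big) = \frac{1}{1+e^{w_c \cdot w_t}} \times \big(w_t e^{w_c \cdot w_t}\big) = \frac{w_t}{e^{-w_c \cdot w_t} +1} \end{equation}

By the same symmetry as before, the partial derivative of this term w.r.t. $w_t$ is $\frac{w_c}{e^{-w_c \cdot w_t} +1}$.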

We proceed similarly for the second term. In the end, we have:

$\nabla L_{w_c} = \sum_{c \in \mathcal{C}_t } \frac{-w_t}{e^{w_c \cdot w_t} +1} + \sum_{c \in \mathcal{C}_t^-} \frac{w_t}{e^{-w_c \cdot w_t} +1}$

and:

$\nabla L_{w_t} = \sum_{c \in \mathcal{C}_t } \frac{-w_c}{e^{w_c \cdot w_t} +1} + \sum_{c \in \mathcal{C}_t^-} \frac{w_c}{e^{-w_c \cdot w_t} +1}$

Is that correct?

Best answer

Yes, your derivation is fine! I end up with the same expressions.
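
If you want an independent check, one option is to compare the formulas against central finite differences. The sketch below is only an illustration, not part of the original derivation: the helper names (`loss`, `grad_w_t`, `grad_w_c`, `numeric_grad`), the dimension `d = 4`, and the random test vectors are all assumptions of mine. It also reads the $w_c$ formula per vector, i.e. each individual $w_c$ receives only its own summand, since no other term of $L$ depends on it.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss(w_t, pos, neg):
    """Loss from the question: pos holds the w_c for C_t, neg those for C_t^-."""
    return (sum(math.log1p(math.exp(-dot(w_c, w_t))) for w_c in pos)
            + sum(math.log1p(math.exp(dot(w_c, w_t))) for w_c in neg))


def grad_w_t(w_t, pos, neg):
    """Analytic gradient w.r.t. w_t, using the expression from the question."""
    g = [0.0] * len(w_t)
    for w_c in pos:
        s = -1.0 / (math.exp(dot(w_c, w_t)) + 1.0)
        g = [gi + s * ci for gi, ci in zip(g, w_c)]
    for w_c in neg:
        s = 1.0 / (math.exp(-dot(w_c, w_t)) + 1.0)
        g = [gi + s * ci for gi, ci in zip(g, w_c)]
    return g


def grad_w_c(w_t, w_c, positive):
    """Analytic gradient w.r.t. one context vector (its own summand only)."""
    x = dot(w_c, w_t)
    s = -1.0 / (math.exp(x) + 1.0) if positive else 1.0 / (math.exp(-x) + 1.0)
    return [s * ti for ti in w_t]


def numeric_grad(f, v, eps=1e-6):
    """Central finite-difference gradient of a scalar function f at v."""
    g = []
    for i in range(len(v)):
        up = v[:i] + [v[i] + eps] + v[i + 1:]
        down = v[:i] + [v[i] - eps] + v[i + 1:]
        g.append((f(up) - f(down)) / (2 * eps))
    return g


def max_abs_diff(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


# Arbitrary test data (assumed, not from the question).
random.seed(0)
d = 4
w_t = [random.uniform(-1, 1) for _ in range(d)]
pos = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(2)]
neg = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(3)]

# Gradient w.r.t. w_t.
err_t = max_abs_diff(grad_w_t(w_t, pos, neg),
                     numeric_grad(lambda v: loss(v, pos, neg), w_t))

# Gradient w.r.t. the first positive context vector.
err_c = max_abs_diff(grad_w_c(w_t, pos[0], True),
                     numeric_grad(lambda v: loss(w_t, [v] + pos[1:], neg), pos[0]))

# Gradient w.r.t. the first negative context vector.
err_n = max_abs_diff(grad_w_c(w_t, neg[0], False),
                     numeric_grad(lambda v: loss(w_t, pos, [v] + neg[1:]), neg[0]))

print(err_t, err_c, err_n)
```

All three printed discrepancies should be tiny compared with the gradient entries themselves; a sign error in either sum would show up as a discrepancy of the same order as the gradient.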