Gradient of dual SVM formulation


A dual formulation of an SVM can be given as

$$D(a) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N}\sum_{m=1}^{N}a_na_mt_nt_mk(x_n, x_m)$$
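For concreteness, the objective can be evaluated numerically. A minimal sketch, assuming NumPy; the RBF kernel, the data, and all names (`rbf_kernel`, `dual_objective`, `X`, `t`, `a`) are illustrative, not part of the question:

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Gram matrix K[n, m] = exp(-gamma * ||x_n - x_m||^2)
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def dual_objective(a, t, K):
    # D(a) = sum_n a_n - (1/2) * sum_{n,m} a_n a_m t_n t_m k(x_n, x_m)
    return a.sum() - 0.5 * (a * t) @ K @ (a * t)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))            # 5 toy points in R^2
t = rng.choice([-1.0, 1.0], size=5)    # labels t_n in {-1, +1}
a = rng.uniform(0, 1, size=5)          # dual variables a_n
K = rbf_kernel(X)
print(dual_objective(a, t, K))
```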

I'm not sure about how to get the gradient $\nabla D(a)$.

I've already had a look at Differentiation of a double summation, from which I derived the gradient as $$\nabla_{a_p} D(a) = 1 - t_p \sum_{n=1}^N a_nt_nk(x_n, x_p)$$

Am I right? If so, it's not clear to me what $t_p$ stands for, since $p$ does not appear as a summation index anywhere.

EDIT:

My trivial problem was understanding that $\nabla_{a_p}$ is the $p$-th entry of my gradient vector. Also the matrix representation of the answer below helps a lot.

BEST ANSWER

From the link that you shared, changing the notation slightly to avoid a clash with your current question: if $$\alpha = \sum_{n=1}^N \sum_{m=1}^N b_{nm}x_nx_m$$

then we have $$\frac{\partial \alpha}{\partial x_p}=\sum_{m=1}^N b_{pm}x_m + \sum_{n=1}^N b_{np}x_n $$

For our context, we can set $b_{nm}=t_nt_mk(x_n, x_m)$, with the $a$'s playing the role of the $x$'s above:

\begin{align}\frac{\partial D(a)}{\partial a_p}&=1-\frac12 \left( \sum_{m=1}^N b_{pm}a_m + \sum_{n=1}^N b_{np}a_n \right)\\ &= 1-\frac12 \left( \sum_{m=1}^N t_pt_mk(x_p, x_m)a_m + \sum_{n=1}^N t_{n}t_pk(x_n, x_p)a_n \right) \\ &=1-t_p\sum_{n=1}^Nt_{n}k(x_n, x_p)a_n\end{align}

where the last step uses the symmetry of the kernel, $k(x_n, x_m)=k(x_m, x_n)$, so the two sums are equal.

$p$ is the index of the variable. $t_p$ is the $p$-th entry of the vector $t$. It is usually the label, which takes value $1$ or $-1$.
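The formula can be sanity-checked against finite differences of $D(a)$. A sketch, assuming NumPy; the linear kernel and the toy data are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))
t = rng.choice([-1.0, 1.0], size=6)
a = rng.uniform(0, 1, size=6)
K = X @ X.T  # linear kernel: k(x_n, x_m) = x_n . x_m

def D(a):
    # D(a) = sum_n a_n - (1/2) * sum_{n,m} a_n a_m t_n t_m k(x_n, x_m)
    return a.sum() - 0.5 * (a * t) @ K @ (a * t)

def grad_component(p):
    # dD/da_p = 1 - t_p * sum_n a_n t_n k(x_n, x_p)
    return 1.0 - t[p] * np.sum(a * t * K[:, p])

eps = 1e-6
for p in range(len(a)):
    e = np.zeros_like(a); e[p] = eps
    fd = (D(a + e) - D(a - e)) / (2 * eps)  # central difference
    assert abs(fd - grad_component(p)) < 1e-6
print("gradient formula matches finite differences")
```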

$$\nabla_a D(a) = e-Ka$$

where $e$ is the all-ones vector and $K$ is the matrix with entries $K_{ij} = t_it_jk(x_i, x_j)$.
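The matrix form can be checked against the component-wise formula. A sketch, assuming NumPy; the linear kernel and the data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 2))
t = rng.choice([-1.0, 1.0], size=4)
a = rng.uniform(0, 1, size=4)

G = X @ X.T                          # plain Gram matrix: G[n, m] = k(x_n, x_m)
K = (t[:, None] * t[None, :]) * G    # K_ij = t_i t_j k(x_i, x_j)
e = np.ones_like(a)

grad_matrix = e - K @ a              # matrix form: grad D(a) = e - K a
grad_comp = np.array([1.0 - t[p] * np.sum(a * t * G[:, p])
                      for p in range(4)])
assert np.allclose(grad_matrix, grad_comp)
print("matrix and component forms agree")
```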