Find the gradient of the following function


The function is:

$F(w)=\frac{1}{n}\sum^n_{i=1}E(y_i-w^Tx_i)+\frac{\theta}{2}\|w\|^2_2$ where $E(k)=\begin{cases}\frac{1}{2}k^2, & |k|<1\\ |k|-\frac{1}{2}, & |k|\geq 1\end{cases}$

What is the gradient of this function, where $w$ is the weight vector?

My work:

If $|k|<1:$

The rows of the gradient look like this: $\nabla F(w)_j=\frac{1}{n}\sum^n_{i=1}(y_i-w^Tx_i)\left(-\frac{\partial w^T}{\partial w_j}x_i\right)+\theta w_j$

On each row of the gradient I need to take the partial derivative with respect to $w_j$, I think, since $w$ is a vector. But how do I compute the derivative $\frac{\partial w^T}{\partial w_j}x_i$? I also don't understand how to take the derivative in the case where everything is inside an absolute value. Thanks in advance.

Best answer:

Note that $E$ is the Huber function. The Huber function, as defined here, is the Moreau envelope of the absolute value with smoothing parameter equal to $1$. The gradient of a Moreau envelope can be expressed as the prox of the convex conjugate function, which in turn can be written using the prox of the original function via Moreau's decomposition theorem. So we have

$$\nabla E(w) = \operatorname{prox}_{|\cdot|^*}(w) = w-\operatorname{prox}_{|\cdot|}(w) = w-\operatorname{soft}(w)$$

where $\operatorname{soft}(w)$ is the soft-thresholding operator with parameter $1$.
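The scalar identity $E'(k) = k - \operatorname{soft}(k)$ is easy to sanity-check numerically. Below is a minimal pure-Python sketch (the helper names `huber`, `soft`, and `huber_grad` are mine, not from the question) comparing the identity against a central finite difference:

```python
import math

def huber(k):
    # Huber function with parameter 1: the E(k) defined in the question
    return 0.5 * k * k if abs(k) < 1 else abs(k) - 0.5

def soft(k, t=1.0):
    # soft-thresholding with threshold t: prox of t*|.| evaluated at k
    return math.copysign(max(abs(k) - t, 0.0), k)

def huber_grad(k):
    # derivative via the Moreau identity: E'(k) = k - soft(k)
    return k - soft(k)

# check against a central finite difference at a few points
for k in [-2.5, -0.7, 0.0, 0.3, 1.8]:
    h = 1e-6
    fd = (huber(k + h) - huber(k - h)) / (2 * h)
    assert abs(huber_grad(k) - fd) < 1e-5
```

Note that $k - \operatorname{soft}(k)$ equals $k$ for $|k|<1$ and $\operatorname{sign}(k)$ for $|k|\geq 1$, which is exactly the piecewise derivative of $E$.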

Thus, applying the chain rule (the inner function $y_i - w^Tx_i$ has gradient $-x_i$ with respect to $w$),

$$\nabla F(w) = \nabla \left( \frac{1}{n}\sum^n_{i=1}E(y_i-w^Tx_i)+\frac{\theta}{2}\|w\|^2_2\right) = \frac{1}{n}\sum^n_{i=1}\nabla_w E(y_i-w^Tx_i)+\theta w\\ = -\frac{1}{n}\sum\limits_{i=1}^n\Big((y_i-w^Tx_i)-\operatorname{soft}(y_i-w^Tx_i)\Big)\,x_i + \theta w$$
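A quick way to validate a gradient derivation like this is a finite-difference check. The sketch below (pure Python; all helper names and the random test data are illustrative, not from the question) implements the full gradient using $E'(k)=k-\operatorname{soft}(k)$ together with the chain-rule factor $-x_i$, and compares each component to a central finite difference of $F$:

```python
import math, random

def soft(k):
    # soft-thresholding with threshold 1
    return math.copysign(max(abs(k) - 1.0, 0.0), k)

def huber(k):
    # Huber function E(k) with parameter 1
    return 0.5 * k * k if abs(k) < 1 else abs(k) - 0.5

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def F(w, X, y, theta):
    # objective: (1/n) sum E(y_i - w^T x_i) + (theta/2) ||w||^2
    n = len(y)
    data = sum(huber(y[i] - dot(w, X[i])) for i in range(n)) / n
    return data + 0.5 * theta * sum(wj * wj for wj in w)

def grad_F(w, X, y, theta):
    # gradient: -(1/n) sum (r_i - soft(r_i)) x_i + theta w,  r_i = y_i - w^T x_i
    n, d = len(y), len(w)
    g = [theta * wj for wj in w]
    for i in range(n):
        r = y[i] - dot(w, X[i])
        s = r - soft(r)          # E'(r_i)
        for j in range(d):
            g[j] -= s * X[i][j] / n
    return g

# finite-difference check on random data
random.seed(0)
n, d, theta = 5, 3, 0.1
X = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
y = [random.uniform(-2, 2) for _ in range(n)]
w = [random.uniform(-1, 1) for _ in range(d)]
g = grad_F(w, X, y, theta)
h = 1e-6
for j in range(d):
    wp = list(w); wp[j] += h
    wm = list(w); wm[j] -= h
    fd = (F(wp, X, y, theta) - F(wm, X, y, theta)) / (2 * h)
    assert abs(g[j] - fd) < 1e-4
```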