We have to calculate the gradient of:
$\sum_{i=1}^n \log\left(1+\exp(-y_i \mathbf{w}^T \mathbf{x}_i)\right) + \frac{1}{b} \sum_{j=1}^d w_j^4$
and to write down pseudocode for stochastic gradient descent on this function with respect to $\mathbf{w}$.
Here are my solutions; I would appreciate feedback on whether they are correct:
- Gradient:
$\nabla_{\mathbf{w}} = \sum_{i=1}^n \frac{-y_i \exp(-y_i \mathbf{w}^T \mathbf{x}_i)}{1+\exp(-y_i \mathbf{w}^T \mathbf{x}_i)}\, \mathbf{x}_i + \frac{4}{b}\left(w_1^3, \ldots, w_d^3\right)^T$

(The regularizer contributes $\frac{4}{b} w_j^3$ to the $j$-th component of the gradient, so it is a vector of componentwise cubes rather than a single scalar sum.)
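One way to sanity-check a derived gradient like this is to compare it against central finite differences on random data. Below is a minimal NumPy sketch; the function names, data shapes, and the choice $b = 10$ are my own for illustration, not part of the question.

```python
import numpy as np

def objective(w, X, y, b):
    # sum_i log(1 + exp(-y_i * w^T x_i)) + (1/b) * sum_j w_j^4
    margins = -y * (X @ w)
    return np.sum(np.log1p(np.exp(margins))) + np.sum(w**4) / b

def gradient(w, X, y, b):
    # Analytic gradient: sum_i -y_i x_i * sigma(-y_i w^T x_i) + (4/b) w^3 (componentwise)
    margins = -y * (X @ w)
    s = np.exp(margins) / (1.0 + np.exp(margins))
    return X.T @ (-y * s) + 4.0 * w**3 / b

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = rng.choice([-1.0, 1.0], size=20)
w = rng.normal(size=5)
b = 10.0

# Central finite-difference approximation of the gradient.
eps = 1e-6
num = np.array([
    (objective(w + eps * np.eye(5)[j], X, y, b)
     - objective(w - eps * np.eye(5)[j], X, y, b)) / (2 * eps)
    for j in range(5)
])
diff = np.max(np.abs(num - gradient(w, X, y, b)))
```

If the derivation is right, `diff` should be tiny (on the order of the finite-difference error).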
- Stochastic gradient descent pseudocode:
(1) Randomly initialize the weight vector $\mathbf{w}$ and choose a step size $\eta$.
(2) Repeat until convergence: pick a sample index $i \in \{1,\ldots,n\}$ (uniformly at random, or by cycling through the data) and update each component $w_l$, $l = 1,\ldots,d$, using the gradient of the $i$-th term only:
$w_l \leftarrow w_l - \eta \left( \frac{-y_i x_{il} \exp(-y_i \mathbf{w}^T \mathbf{x}_i)}{1+\exp(-y_i \mathbf{w}^T \mathbf{x}_i)} + \frac{4}{b} w_l^3 \right)$
Note that the exponent contains the full inner product $\mathbf{w}^T \mathbf{x}_i$, not the single product $w_l x_{il}$; only the factor $x_{il}$ and the regularizer term are component-specific.
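The pseudocode above can be sketched directly in NumPy, doing the per-component update as one vectorized step. This is my own illustrative implementation (the hyperparameters, the `sgd` name, and the stable sigmoid via `np.logaddexp` are assumptions, not from the question):

```python
import numpy as np

def sgd(X, y, b, eta=0.05, epochs=50, seed=0):
    """SGD for sum_i log(1+exp(-y_i w^T x_i)) + (1/b) sum_j w_j^4."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # (1) Random initialization of w.
    w = rng.normal(scale=0.01, size=d)
    for _ in range(epochs):
        # (2) Visit the samples in random order, one gradient term per step.
        for i in rng.permutation(n):
            m = y[i] * (X[i] @ w)
            # sigma(-m) = exp(-m)/(1+exp(-m)), computed stably via logaddexp.
            s = np.exp(-np.logaddexp(0.0, m))
            grad = -y[i] * s * X[i] + 4.0 * w**3 / b
            w -= eta * grad
    return w

# Hypothetical usage on synthetic linearly separable data.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y_demo = np.where(X_demo @ w_true >= 0, 1.0, -1.0)
w_hat = sgd(X_demo, y_demo, b=100.0)
accuracy = np.mean(np.sign(X_demo @ w_hat) == y_demo)
```

On data that is actually separable by a linear rule, the learned `w_hat` should classify most training points correctly.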