I was reading this paper
- Bell RM, Koren Y. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. Proc - IEEE Int Conf Data Mining, ICDM. 2007:43-52. doi:10.1109/ICDM.2007.90.
In equation (3) of the paper, the objective function reads
$obj=\sum\limits_{u}\sum\limits_{i}c_{ui}(p_{ui}-x^T_uy_i)^2+\lambda\left(\sum\limits_{u}\|x_u\|^2+\sum\limits_{i}\|y_i\|^2\right)$
where $c_{ui}$ and $p_{ui}$ are scalars, $x_u$ and $y_i$ are column vectors of the same dimension $f$, $u=1,2,\dots,m$ and $i=1,2,\dots,n$:
$x_u=[x_{u1},x_{u2},\dots,x_{uf}]^T$ and $y_i=[y_{i1},y_{i2},\dots,y_{if}]^T$
$\|x_u\|^2=\sum\limits_{k=1}^{f}x_{uk}^2$ and likewise
$\|y_i\|^2=\sum\limits_{k=1}^{f}y_{ik}^2$
Now, keeping the $y_i$ fixed, differentiate $obj$ with respect to $x_u$ and set the result to zero:
$\frac{\partial{obj}}{\partial{x_u}}=0$
implies
$\sum\limits_{i}2c_{ui}(p_{ui}-x^T_uy_i)(-y_i)+2\lambda x_u=0$ (only the terms belonging to this particular $u$ depend on $x_u$, so the outer sum over $u$ drops out), which implies that $\sum\limits_{i}\left[-c_{ui}p_{ui}y_i+c_{ui}(x^T_uy_i)y_i\right]+\lambda x_u=0$
How do I solve for $x_u$ from here? The term $c_{ui}(x^T_uy_i)y_i$ contains $x_u^T$, while ${\lambda}x_u$ contains $x_u$ itself, so I cannot simply factor $x_u$ out.
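One observation that may help here (a sketch of my own, not taken from the paper): $x_u^Ty_i$ is a scalar, so it equals $y_i^Tx_u$ and can be moved to the other side of $y_i$. The identity matrix $I$ below is $f\times f$, and only the terms involving this one $u$ are kept:

$$(x_u^Ty_i)\,y_i=y_i\,(y_i^Tx_u)=(y_iy_i^T)\,x_u$$

$$\Big(\sum\limits_{i}c_{ui}\,y_iy_i^T+\lambda I\Big)x_u=\sum\limits_{i}c_{ui}\,p_{ui}\,y_i$$

Written this way, the stationarity condition is a linear system in $x_u$, with an $f\times f$ matrix on the left and an $f$-vector on the right.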
Define $X=[x_1,x_2,\dots,x_m]^T$ (of size $m\times f$) and $Y=[y_1,y_2,\dots,y_n]^T$ (of size $n\times f$), and solve for $x_u$ in terms of $Y$.
Define $C^u$ to be the $n\times n$ diagonal matrix with $C^u_{ii}=c_{ui}$, and $p(u)=[p_{u1},p_{u2},\dots,p_{un}]^T$.
The solution the paper gives is $x_u=(Y^TC^uY+\lambda I)^{-1}Y^TC^up(u)$. Could anyone explain why?
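As a sanity check, the closed form can at least be tested numerically. The following is a minimal sketch using NumPy, assuming $Y$ is stored with rows $y_i^T$ (size $n\times f$). The function names `closed_form_x_u` and `half_gradient` and the random test data are made up for illustration and do not come from the paper.

```python
import numpy as np


def closed_form_x_u(Y, c_u, p_u, lam):
    """x_u = (Y^T C^u Y + lam I)^{-1} Y^T C^u p(u), with Y of shape (n, f)."""
    f = Y.shape[1]
    C_u = np.diag(c_u)                      # n x n diagonal matrix, C^u_ii = c_ui
    A = Y.T @ C_u @ Y + lam * np.eye(f)     # f x f
    b = Y.T @ C_u @ p_u                     # length f
    return np.linalg.solve(A, b)            # solve A x = b instead of inverting A


def half_gradient(x, Y, c_u, p_u, lam):
    """Half of d(obj)/d(x_u): sum_i c_ui (x^T y_i - p_ui) y_i + lam x."""
    n = Y.shape[0]
    g = lam * x
    for i in range(n):
        g = g + c_u[i] * (x @ Y[i] - p_u[i]) * Y[i]
    return g


# Hypothetical data for a single user u (all values are arbitrary).
rng = np.random.default_rng(0)
n, f, lam = 6, 3, 0.1
Y = rng.normal(size=(n, f))                  # rows are y_i^T
c_u = rng.uniform(1.0, 5.0, size=n)          # c_ui, i = 1..n
p_u = rng.integers(0, 2, size=n).astype(float)  # p_ui, i = 1..n

x_u = closed_form_x_u(Y, c_u, p_u, lam)
# If the closed form is a stationary point, this vector should be numerically zero.
print(np.allclose(half_gradient(x_u, Y, c_u, p_u, lam), 0.0))
```

The check only confirms that the formula satisfies the term-by-term gradient condition for one random instance; it is not a proof.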