solving a collaborative filtering problem

I was reading this paper

  1. Bell RM, Koren Y. Scalable collaborative filtering with jointly derived neighborhood interpolation weights. Proc - IEEE Int Conf Data Mining, ICDM. 2007:43-52. doi:10.1109/ICDM.2007.90.

In equation (3) of the paper, the objective function reads

$obj=\sum\limits_{u}\sum\limits_{i}c_{ui}(p_{ui}-x^T_uy_i)^2+\lambda\left(\sum\limits_{u}||x_u||^2+\sum\limits_{i}||y_i||^2\right)$

where $c_{ui}$ and $p_{ui}$ are scalars, $x_u$ and $y_i$ are column vectors of the same dimension $f$, $u=1,2,\dots,m$ and $i=1,2,\dots,n$.

$x_u=[x_{u1},x_{u2},\dots,x_{uf}]^T$ and $y_i=[y_{i1},y_{i2},\dots,y_{if}]^T$

$||x_u||^2=\sum\limits_{k=1}^{f}x_{uk}^2$ and similarly

$||y_i||^2=\sum\limits_{k=1}^{f}y_{ik}^2$

Now, keeping every $y_i$ fixed, differentiate $obj$ with respect to $x_u$:

$\frac{\partial{obj}}{\partial{x_u}}=0$

implies

$\sum\limits_{i}2c_{ui}(p_{ui}-x^T_uy_i)(-y_i)+2\lambda x_u=0$ (only the terms for this particular $u$ depend on $x_u$, so the sum over $u$ drops out), which implies that $\sum\limits_{i}\left[-c_{ui}p_{ui}y_i+c_{ui}(x^T_uy_i)y_i\right]+\lambda x_u=0$
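
A quick way to sanity-check a gradient like this is to compare it with finite differences. The sketch below does that for one fixed user $u$, using the fact that only the terms of $obj$ belonging to that $u$ depend on $x_u$. The function names (`objective_u`, `gradient_u`, `numeric_gradient`) and the random test values are made up for illustration and are not from the paper.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))

def objective_u(x, Y, c, p, lam):
    """Part of obj that depends on one user's vector x (all y_i held fixed)."""
    fit = sum(c_i * (p_i - dot(x, y_i)) ** 2 for y_i, c_i, p_i in zip(Y, c, p))
    return fit + lam * dot(x, x)

def gradient_u(x, Y, c, p, lam):
    """Analytic gradient: sum_i 2 c_i (x.y_i - p_i) y_i + 2 lam x."""
    g = [2.0 * lam * x_k for x_k in x]
    for y_i, c_i, p_i in zip(Y, c, p):
        r = 2.0 * c_i * (dot(x, y_i) - p_i)
        for k in range(len(x)):
            g[k] += r * y_i[k]
    return g

def numeric_gradient(x, Y, c, p, lam, h=1e-6):
    """Central finite differences, one coordinate of x at a time."""
    g = []
    for k in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[k] += h
        x_minus[k] -= h
        diff = objective_u(x_plus, Y, c, p, lam) - objective_u(x_minus, Y, c, p, lam)
        g.append(diff / (2.0 * h))
    return g

# Hypothetical small instance: f latent factors, n items, one user.
rng = random.Random(0)
f, n, lam = 3, 5, 0.1
Y = [[rng.uniform(-1.0, 1.0) for _ in range(f)] for _ in range(n)]
c = [rng.uniform(1.0, 5.0) for _ in range(n)]
p = [float(rng.randint(0, 1)) for _ in range(n)]
x = [rng.uniform(-1.0, 1.0) for _ in range(f)]

analytic = gradient_u(x, Y, c, p, lam)
numeric = numeric_gradient(x, Y, c, p, lam)
max_gap = max(abs(a - b) for a, b in zip(analytic, numeric))
print("largest difference between analytic and numeric gradient:", max_gap)
```

Because the objective is quadratic in $x_u$, the central difference should agree with the analytic gradient up to rounding error.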

How can I solve for $x_u$ from here? The two terms $c_{ui}(x^T_uy_i)y_i$ and ${\lambda}x_u$ involve $x_u^T$ and $x_u$ respectively, so I don't see how to collect them.

Define $X=[x_1,x_2,\dots,x_m]^T$ (of size $m\times f$) and $Y=[y_1,y_2,\dots,y_n]^T$ (of size $n\times f$), and solve for $x_u$ with $Y$ held fixed.

Define $C^u$ to be the $n\times n$ diagonal matrix with $C^u_{ii}=c_{ui}$, and $p(u)=[p_{u1},p_{u2},\dots,p_{un}]^T$.

The solution the paper gives is $x_u=(Y^TC^uY+\lambda I)^{-1}Y^TC^up(u)$. Could anyone explain why?
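
One possible reading, offered as a sketch rather than a definitive derivation: since $x_u^Ty_i$ is a scalar, $(x_u^Ty_i)y_i=y_i(y_i^Tx_u)=(y_iy_i^T)x_u$, so the stationarity condition for a fixed $u$ can be written as $\left(\sum\limits_{i}c_{ui}y_iy_i^T+\lambda I\right)x_u=\sum\limits_{i}c_{ui}p_{ui}y_i$. Assuming the rows of $Y$ are $y_i^T$ (so $Y$ is $n\times f$), the two sums are $Y^TC^uY$ and $Y^TC^up(u)$, which gives the quoted formula. The Python snippet below checks this numerically on a made-up instance: it builds both sums, solves the linear system, and confirms the gradient vanishes at the result. The helper names (`solve`, `closed_form_x`, `gradient_u`) and the test data are hypothetical.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(a_k * b_k for a_k, b_k in zip(a, b))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    M = [row[:] + [b_k] for row, b_k in zip(A, b)]  # augmented matrix
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for k in range(col, size + 1):
                M[r][k] -= factor * M[col][k]
    z = [0.0] * size
    for r in range(size - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, size))
        z[r] = (M[r][size] - s) / M[r][r]
    return z

def closed_form_x(Y, c, p, lam):
    """x_u = (Y^T C^u Y + lam I)^{-1} Y^T C^u p(u), with Y of shape n x f."""
    f = len(Y[0])
    # A accumulates sum_i c_i y_i y_i^T + lam I, i.e. Y^T C^u Y + lam I.
    A = [[lam if a == b else 0.0 for b in range(f)] for a in range(f)]
    # rhs accumulates sum_i c_i p_i y_i, i.e. Y^T C^u p(u).
    rhs = [0.0] * f
    for y_i, c_i, p_i in zip(Y, c, p):
        for a in range(f):
            rhs[a] += c_i * p_i * y_i[a]
            for b in range(f):
                A[a][b] += c_i * y_i[a] * y_i[b]
    return solve(A, rhs)

def gradient_u(x, Y, c, p, lam):
    """Gradient of the user-u part of obj: sum_i 2 c_i (x.y_i - p_i) y_i + 2 lam x."""
    g = [2.0 * lam * x_k for x_k in x]
    for y_i, c_i, p_i in zip(Y, c, p):
        r = 2.0 * c_i * (dot(x, y_i) - p_i)
        for k in range(len(x)):
            g[k] += r * y_i[k]
    return g

# Hypothetical small instance: f latent factors, n items, one user.
rng = random.Random(0)
f, n, lam = 3, 6, 0.1
Y = [[rng.uniform(-1.0, 1.0) for _ in range(f)] for _ in range(n)]
c = [rng.uniform(1.0, 5.0) for _ in range(n)]
p = [float(rng.randint(0, 1)) for _ in range(n)]

x_star = closed_form_x(Y, c, p, lam)
grad_inf = max(abs(g_k) for g_k in gradient_u(x_star, Y, c, p, lam))
print("x_u =", x_star)
print("largest gradient component at x_u:", grad_inf)
```

With $\lambda>0$ the matrix $Y^TC^uY+\lambda I$ is positive definite whenever the $c_{ui}$ are nonnegative, so the system has a unique solution and that solution is the minimizer over $x_u$.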