Partial derivative in gradient descent for social recommendations


In the paper entitled "Recommendations in Signed Social Networks" by Jiliang Tang, he suggests a model for capturing local and global information from signed social networks as follows:

$$\min_{U,V} \sum_{i=1}^n \sum_{j=1}^m g(W_{ij},w_i)\, \|R_{ij}-U_i^T V_j\|_2^2 +\alpha \left(\|U\|_F^2 + \|V\|_F^2\right) \\ + \beta \sum_{i=1}^n \max\left(0,\; \|U_i - \bar U_i^p \|_2^2 - \|U_i - \bar U_i^n \|_2^2 \right) \;\;\;\; (1)$$

where:

$$\bar U_i^p= \frac {\sum_{{u_j \in P_i}} S_{ij} U_j } {\sum_{{u_j \in P_i}} S_{ij} }$$

$$\bar U_i^n= \frac {\sum_{{u_j \in N_i}} S_{ij} U_j } {\sum_{{u_j \in N_i}} S_{ij} }$$
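To make these definitions concrete, here is a small numerical sketch (my own code, not from the paper; the function name `neighbor_average` and the toy matrices are hypothetical) computing the weighted averages $\bar U_i^p$ and $\bar U_i^n$:

```python
import numpy as np

def neighbor_average(U, S, neighbors, i):
    """Weighted average of user i's neighbor embeddings:
    sum_{u_j in set} S_ij U_j / sum_{u_j in set} S_ij.
    U: (n, d) user-factor matrix, S: (n, n) similarities,
    neighbors: indices of P_i (positive) or N_i (negative)."""
    w = S[i, neighbors]                  # weights S_ij for u_j in the set
    return (w @ U[neighbors]) / w.sum()  # weighted mean of the rows U_j

# toy example (hypothetical numbers)
U = np.array([[1., 0.], [0., 2.], [3., 1.]])
S = np.array([[0., 1., 3.], [1., 0., 0.], [3., 0., 0.]])
avg_p = neighbor_average(U, S, neighbors=[1, 2], i=0)  # -> array([2.25, 1.25])
```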

Because Eq. (1) is not jointly convex with respect to $U$ and $V$, and the max function makes it non-smooth, there is no nice closed-form solution. So he uses an indicator $M_i^k$ at the $k$-th iteration for $u_i$, defined as follows:

$$ M_i^k= \begin{cases} 1 & \text{if } \|U_i - \bar U_i^p \|_2^2 - \|U_i - \bar U_i^n \|_2^2 > 0 \\ 0 & \text{otherwise} \end{cases}$$
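In code, the indicator is a simple comparison of the two squared distances (again my own sketch, with hypothetical inputs):

```python
import numpy as np

def indicator(U_i, avg_p, avg_n):
    """M_i^k: 1 if user i's embedding is currently farther from the positive
    average than from the negative average (i.e. the hinge in Eq. (1) is
    active), else 0."""
    d_p = np.sum((U_i - avg_p) ** 2)  # ||U_i - U_i^p||^2
    d_n = np.sum((U_i - avg_n) ** 2)  # ||U_i - U_i^n||^2
    return 1.0 if d_p - d_n > 0 else 0.0

# farther from the positive average -> hinge active
indicator(np.array([0., 0.]), np.array([2., 0.]), np.array([1., 0.]))  # -> 1.0
```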

Then he uses $J$ to denote the objective function of Eq. (1) at the $k$-th iteration, as follows:

$$J= \sum_{i=1}^n \sum_{j=1}^m g(W_{ij},w_i)\, \|R_{ij}-U_i^T V_j\|_2^2 +\alpha \left(\|U\|_F^2 + \|V\|_F^2\right) \\ + \beta \sum_{i=1}^n M_i^k \left( \left\|U_i - \frac {\sum_{u_j \in P_i} S_{ij} U_j } {\sum_{u_j \in P_i} S_{ij} } \right\|_2^2 - \left\|U_i - \frac {\sum_{u_j \in N_i} S_{ij} U_j } {\sum_{u_j \in N_i} S_{ij} } \right\|_2^2 \right) \;\;\;\; (2)$$

He computes the derivatives of $J$ with respect to $U_i$ and $V_j$ as follows:

$$ \frac {\partial J}{ \partial U_i}= -2 \sum_{j} g(W_{ij},w_i)\, (R_{ij}-U_i^T V_j)\, V_j + 2 \alpha U_i \\ +2 \beta M_i^k (U_i - \bar U_i^p ) -2 \beta M_i^k (U_i - \bar U_i^n )\\ -2\beta \sum_{u_j \in P_i} M_j^k (U_j - \bar U_j^p ) \frac {1}{\sum_{u_j \in P_i} S_{ji}} \\ +2\beta \sum_{u_j \in N_i} M_j^k (U_j - \bar U_j^n ) \frac {1}{\sum_{u_j \in N_i} S_{ji}}$$
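One step that may help in reading the first line of this gradient (a standard chain-rule computation of my own, not quoted from the paper): differentiating a single fitting term with respect to $U_i$,

$$\frac{\partial}{\partial U_i}\, g(W_{ij},w_i)\, \|R_{ij}-U_i^T V_j\|_2^2 = 2\, g(W_{ij},w_i)\, (R_{ij}-U_i^T V_j)\, \frac{\partial (R_{ij}-U_i^T V_j)}{\partial U_i} = -2\, g(W_{ij},w_i)\, (R_{ij}-U_i^T V_j)\, V_j,$$

and similarly $\frac{\partial}{\partial U_i}\|U_i - \bar U_i^p\|_2^2 = 2(U_i - \bar U_i^p)$ when $\bar U_i^p$ is held fixed. The last two sums arise because $U_i$ also appears inside $\bar U_j^p$ and $\bar U_j^n$ for every other user $u_j$ who has $u_i$ among their neighbors.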

$$ \frac {\partial J}{ \partial V_j}= -2 \sum_{i} g(W_{ij},w_i)\, (R_{ij}-U_i^T V_j)\, U_i + 2 \alpha V_j $$
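A finite-difference check agrees with this $\partial J/\partial V_j$ formula, since the hinge term does not involve $V$. The sketch below uses random toy data; `G` stands in for the weights $g(W_{ij},w_i)$, and all sizes are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d, alpha = 4, 3, 2, 0.1
R = rng.standard_normal((n, m))   # ratings R_ij
G = rng.random((n, m))            # stands in for g(W_ij, w_i)
U = rng.standard_normal((n, d))   # user factors U_i (rows)
V = rng.standard_normal((m, d))   # item factors V_j (rows)

def f(V):
    # the parts of J that involve V: fitting term + alpha * ||V||_F^2
    return np.sum(G * (R - U @ V.T) ** 2) + alpha * np.sum(V ** 2)

j = 1
# analytic gradient: -2 sum_i G_ij (R_ij - U_i^T V_j) U_i + 2 alpha V_j
grad = -2 * np.sum((G[:, j] * (R[:, j] - U @ V[j]))[:, None] * U, axis=0) \
       + 2 * alpha * V[j]

# central finite differences, one coordinate of V_j at a time
eps, num = 1e-6, np.zeros(d)
for k in range(d):
    Vp, Vm = V.copy(), V.copy()
    Vp[j, k] += eps
    Vm[j, k] -= eps
    num[k] = (f(Vp) - f(Vm)) / (2 * eps)

assert np.allclose(grad, num, atol=1e-5)
```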

I do not understand how the partial derivative with respect to $U_i$ was derived. I would very much like to understand this if possible. Could someone show how the partial derivative is taken step by step, or link to a resource I could use to learn more? I apologize if I haven't used the correct terminology in my question.