Can anyone help me to derive the partial derivative of this optimization problem with respect to $U_u$?
$$ L(R,T,U,V) = \frac{1}{2}\sum_{u=1}^{N}\sum_{i=1}^M I_{u,i}^R(R_{u,i} - g(U_u^TV_i))^2 + \frac{\lambda_U}{2} \sum_{u=1}^{N}U_u^TU_u + \frac{\lambda_V}{2} \sum_{i=1}^{M}V_i^TV_i + \frac{\lambda_T}{2} \sum_{u=1}^{N}(U_u - \sum_{v \in N_u} T_{u,v}U_v)^T(U_u - \sum_{v \in N_u} T_{u,v}U_v)$$
I want to point out that this is not a homework, I am reading this paper, They have derived the partial derivative of this problem with respect to both $U_u$ and $V_i$, I have no Idea how they have done this, It would be really nice someone tell me how should I tackle such problems, how should I approach such functions.
$U_u$ and $V_i$ are $k$ dimensional vectors, $I_{u,i}^R$ is some identity function $\in \{0,1\}$, $N_u$ is set of neighbors of $u$ in a social network.
From the paper, they use $g(x)$ to denote the logistic function. The most useful form for its derivative is probably $$\eqalign{ \frac{dg}{dx} &= (g-g^2) \implies dg &= (g-g^2)\,dx \cr\cr }$$ Define a few matrix variables $$\eqalign{ X &= U^TV \implies dX = dU^TV + U^TdV \cr G &= g(X) \implies dG = (G-G\odot G)\odot dX \cr M &= I-T \cr J &= I\odot(G-R)\odot(G-G\odot G) \cr }$$ Note that $G$ comes from an element-wise evaluation of the logistic function. When dealing with element-wise functions of matrices, the elementwise/Hadamard product $(\odot)$ is necessary.
Next, the explicit summations can be eliminated in favor of the inner/Frobenius product, which is a convenient infix notation for the trace $$A:B={\rm tr}(A^TB)$$ Putting it all together, we can write the objective function in matrix form, and then find its differential and gradient $$\eqalign{ L &= \frac{1}{2}\Big[I\odot(G-R):(G-R) + \lambda_UU:U + \lambda_VV:V + \lambda_TMU:MU \Big] \cr\cr dL &= I\odot(G-R):dG + \lambda_UU:dU + \lambda_VV:dV + \lambda_TMU:M\,dU \cr &= I\odot(G-R):(G-G\odot G)\odot d(U^TV) + \lambda_UU:dU + \lambda_VV:dV + \lambda_TMU:M\,dU \cr &= J:(dU^TV+U^TdV) + \lambda_UU:dU + \lambda_VV:dV + \lambda_TM^TMU:dU \cr &= (UJ+\lambda_VV):dV + (VJ^T+\lambda_UU+\lambda_TM^TMU):dU \cr\cr \frac{\partial L}{\partial V} &= UJ+\lambda_VV,\,\,\,\,\,\,\,\frac{\partial L}{\partial U} = VJ^T+\lambda_UU+\lambda_TM^TMU \cr\cr }$$ If you're not used to working with the Hadamard and Frobenius products, this post may be hard to follow. So here is a summary of the rules for rearranging terms, which follow from the properties of the trace $$\eqalign{ A:BC &= B^TA:C \cr &= AC^T:B \cr &= A^T:(BC)^T \cr &= BC:A \cr\cr A\odot B &= B\odot A \cr\cr A\odot B:C &= A:B\odot C \cr }$$