Working on a project using
https://www.cs.toronto.edu/~amnih/papers/bpmf.pdf
the full Bayesian treatment of probabilistic matrix factorization. I need to get into the weeds of this work, and to do that I need a clearer understanding of the following element:
$N(R_{ij}| U_i^T V_j, \alpha^{-1})$
This element appears in equation 11 of the paper, as the likelihood leading to the conditional posterior $\mathcal{N}(U_i \mid \mu_i^*, [\Lambda_i^*]^{-1})$. In other words, I need to turn the univariate likelihood of $R_{ij}$ into a multivariate normal over the column vector $U_i$ in order to do full posterior inference.
So I begin (dropping the normalizing constant). Since $\alpha^{-1}$ is the variance, the exponent is
$e^{-\frac{\alpha}{2}(R_{ij}-U_i^TV_j)^2}$
Now I want to express this strictly in terms of the column vector $U_i$, and it is not clear to me how. I have tried expanding terms and collecting them into a quadratic form, and I have also tried inserting $AA^{-1}$ to massage the expression into a suitable shape. The derivation is not in the paper, but the authors state the full posterior mean and covariance in equations 12 and 13, which might provide intuition for what the multivariate normal over the likelihood should be.
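For what it's worth, here is how far my quadratic-form attempt gets (I am not certain every step is justified, in particular the jump from a sum of scalar squares to the matrix form):

```latex
% Collect the exponent of the likelihood over the observed entries
% (those with I_{ij} = 1), keeping only terms that depend on U_i:
-\frac{\alpha}{2}\sum_j I_{ij}\,(R_{ij}-U_i^{T}V_j)^2
  = -\frac{1}{2}\,U_i^{T}\Big(\alpha\sum_j I_{ij}\,V_j V_j^{T}\Big)U_i
    + U_i^{T}\Big(\alpha\sum_j I_{ij}\,V_j R_{ij}\Big) + \text{const}
% This is the exponent of an (unnormalized) multivariate Gaussian in U_i.
% Adding the prior exponent -\frac{1}{2}(U_i-\mu_U)^{T}\Lambda_U(U_i-\mu_U)
% and completing the square would seem to give exactly equations 12 and 13:
\Lambda_i^{*} = \Lambda_U + \alpha\sum_j I_{ij}\,V_j V_j^{T},
\qquad
\mu_i^{*} = [\Lambda_i^{*}]^{-1}\Big(\alpha\sum_j I_{ij}\,V_j R_{ij}
            + \Lambda_U\,\mu_U\Big)
```

If this is right, the move from univariate to multivariate happens because the product over $j$ turns the sum of scalar quadratics into one quadratic form in $U_i$.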
An aside question: in equation 13, the authors write the posterior mean as
$\mu_i^* = [\Lambda_i^*]^{-1}\left(\alpha \sum_j [V_jR_{ij}]^{I_{ij}} + \Lambda_U \mu_U\right)$
where $I$ is essentially an indicator matrix. Taken literally, when $I_{ij}=0$ we are raising a vector to the zeroth power, and that product is not even naturally defined. Strange notation. I have been treating it as an element-wise power, but even that seems odd. From my intuition about the posterior, I suspect the authors simply mean for the term to be zeroed out (i.e., dropped from the sum) whenever the indicator does not hit.
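To convince myself, I wrote a small numerical check of the "drop the term when $I_{ij}=0$" reading (my own sketch, not from the paper; the names `Lam_star`, `mu_star`, etc. are mine). If that reading is correct, the likelihood times the prior should equal $\mathcal{N}(U_i \mid \mu_i^*, [\Lambda_i^*]^{-1})$ up to a constant, so log-density *differences* between any two points should agree exactly:

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, alpha = 3, 5, 2.0          # latent dim, number of items, precision

V = rng.normal(size=(M, D))      # item factors V_j as rows
R = rng.normal(size=M)           # ratings R_ij for a single user i
I = rng.integers(0, 2, size=M)   # indicator: 1 if R_ij is observed

mu_U = rng.normal(size=D)
A = rng.normal(size=(D, D))
Lam_U = A @ A.T + D * np.eye(D)  # prior precision (symmetric positive definite)

# Reading [V_j R_ij]^{I_ij} as: include the term only when I_ij == 1
Lam_star = Lam_U + alpha * sum(I[j] * np.outer(V[j], V[j]) for j in range(M))
mu_star = np.linalg.solve(
    Lam_star,
    alpha * sum(I[j] * V[j] * R[j] for j in range(M)) + Lam_U @ mu_U,
)

def log_post(u):
    # unnormalized log posterior: Gaussian likelihood over observed entries + prior
    ll = -0.5 * alpha * sum(I[j] * (R[j] - u @ V[j]) ** 2 for j in range(M))
    lp = -0.5 * (u - mu_U) @ Lam_U @ (u - mu_U)
    return ll + lp

def log_gauss(u):
    # unnormalized log density of N(u | mu_star, Lam_star^{-1})
    return -0.5 * (u - mu_star) @ Lam_star @ (u - mu_star)

# constants cancel in differences, so these should match if the reading is right
u1, u2 = rng.normal(size=D), rng.normal(size=D)
print(np.isclose(log_post(u1) - log_post(u2), log_gauss(u1) - log_gauss(u2)))
# → True
```

The check passes for me, which is what makes me suspect the superscript is just shorthand for masking the sum, rather than any literal exponentiation.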
Anyhow, if any of you stars knows how to handle this, it would be greatly appreciated.