Sigmoid Function Involving Vectors When Performing Network Embedding


I'm currently trying to understand how the graph embedding algorithm described in "LINE: Large-scale Information Network Embedding" works. To generate an embedding that represents a large-scale network, the algorithm computes a probability that is later used in an objective function to be optimized. The probability is defined as:

$$p_1(v_i, v_j) = \frac{1}{1 + \exp\left(-\overrightarrow{u}_i^{\,T} \cdot \overrightarrow{u}_j\right)},$$

where $v_i$ is the $i^{th}$ vertex in the graph and $\overrightarrow{u}_i \in \mathbb{R}^d$ is the low-dimensional (of dimension $d$) vector representation of vertex $v_i$.
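As an illustrative sketch (not code from the paper), the probability $p_1$ can be computed with NumPy, treating each embedding as a length-$d$ array; the dimensionality and random initialization below are assumptions for demonstration:

```python
import numpy as np

d = 4  # embedding dimensionality (illustrative value)
rng = np.random.default_rng(0)

# randomly initialized embeddings for vertices v_i and v_j
u_i = rng.standard_normal(d)
u_j = rng.standard_normal(d)

def p1(u_i, u_j):
    """Sigmoid of the inner product u_i^T u_j (first-order proximity)."""
    return 1.0 / (1.0 + np.exp(-np.dot(u_i, u_j)))

print(p1(u_i, u_j))  # a single scalar in (0, 1)
```

Note that `np.dot` on two 1-D arrays computes the inner product directly and returns a scalar, so no explicit transpose is needed in code.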

From my understanding, both $\overrightarrow{u}_i$ and $\overrightarrow{u}_j$ are initially set to random values, and then an objective function involving the above-mentioned probability is minimized.

What I don't understand is how a dot product can be taken between a vector of dimensionality $d$ and the transpose of another vector of dimensionality $d$. Doesn't taking the transpose create a shape mismatch between the vectors, making them incompatible for a dot product?

For example, suppose you have two 1×n vectors. If you take the transpose of one, you get an n×1 vector.
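To make the shape question concrete, here is a small NumPy sketch of that example (the vectors are arbitrary placeholders): a 1×n row vector multiplied by an n×1 column vector is a valid matrix product that yields a 1×1 result, i.e. a single number, which is the conventional reading of $\overrightarrow{u}_i^{\,T} \cdot \overrightarrow{u}_j$ when embeddings are written as column vectors:

```python
import numpy as np

n = 3
a = np.ones((1, n))  # a 1×n row vector
b = np.ones((1, n))  # another 1×n row vector

# a @ b would raise an error: (1, n) @ (1, n) shapes don't align.
# a @ b.T is (1, n) @ (n, 1) -> a 1×1 matrix, i.e. effectively a scalar.
result = a @ b.T
print(result.shape)  # (1, 1)
print(result[0, 0])  # the inner product of a and b
```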