Given a pair of strings in vector form $(s_i,s_j)$, I can compute their cosine similarity as follows:
$cosine(s_i,s_j)=\frac{s_i^T s_j}{\|s_i\|\|s_j\|}$
Similarly, bilinear similarity is defined as:
$BilinearSimilarity(s_i,s_j) = s_i^TWs_j $
Intuitively, what property of the strings does the matrix $W$ capture? And why should I prefer bilinear similarity over cosine similarity?
Let's say your vectors $s_i \in \mathbb{R}^2$. Imagine that only the first component of each vector is meaningful and the second component is noise. When you compare two vectors $s_i$ and $s_j$, you should then compare only their first components. This can be captured through bilinear similarity using $W = \begin{bmatrix}1 & 0\\0&0\end{bmatrix}$, because $$s_i^T W s_j = s_{i1}W_{11}s_{j1} = s_{i1}s_{j1}.$$
Next, consider a case where similarity along the first coordinate should be weighted more heavily than similarity along the second coordinate. This is captured using $W = \begin{bmatrix}2 & 0\\0&1\end{bmatrix}$.
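Writing out the product for this weighted $W$ makes the effect explicit. This is a direct expansion in the same notation as above, with $s_{i2}$ and $s_{j2}$ denoting the second components:
$$s_i^T W s_j = 2\,s_{i1}s_{j1} + s_{i2}s_{j2}.$$
Agreement along the first coordinate therefore counts twice as much as agreement along the second.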
If you ignore the normalization, cosine similarity is just bilinear similarity with $W = \begin{bmatrix}1 & 0\\0&1\end{bmatrix}$.
In that sense, bilinear similarity is a generalization of (unnormalized) cosine similarity in which not all features are treated equally.
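To make the comparison concrete, here is a minimal NumPy sketch of the three choices of $W$ discussed above. The function names and the example vectors are my own, chosen purely for illustration, and in practice $W$ would typically be learned from data rather than written by hand.

```python
import numpy as np

def cosine_similarity(a, b):
    # Dot product divided by the product of the Euclidean norms.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def bilinear_similarity(a, b, W):
    # a^T W b; W decides how much each pair of features contributes.
    return float(a @ W @ b)

# Hypothetical vectors: they agree on the first component and
# disagree strongly on the (assumed noisy) second component.
s_i = np.array([1.0, 5.0])
s_j = np.array([1.0, -5.0])

W_first_only = np.array([[1.0, 0.0], [0.0, 0.0]])  # ignore the second feature
W_weighted = np.array([[2.0, 0.0], [0.0, 1.0]])    # first feature counts double
W_identity = np.eye(2)                             # plain dot product

first_only = bilinear_similarity(s_i, s_j, W_first_only)
weighted = bilinear_similarity(s_i, s_j, W_weighted)
identity = bilinear_similarity(s_i, s_j, W_identity)
cos = cosine_similarity(s_i, s_j)

print(first_only, weighted, identity, cos)
```

With these vectors, the identity matrix and cosine similarity are both dominated by the second component, while `W_first_only` sees only the agreement on the first component. Dividing the identity result by the product of the norms recovers the cosine value.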