Calculate pairwise cosine distance only returning the lower triangular matrix


I have a matrix where each row is a feature vector. I would like to find the pairwise cosine distance between all of these feature vectors. The cosine between all pairs of rows in a matrix can be calculated as

$\cos(M) = M' \cdot M'^T$

where $M'$ is the row-normalized matrix of $M$. This creates a symmetric matrix of pairwise cosine similarities (the cosine distance is one minus each entry), and it is pretty fast to calculate using the sparse matrix library in scipy. Is there an alternative method that calculates only the lower triangular part, rather than the full symmetric matrix? Hopefully this would roughly halve the computation time and save a bit of space.
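For reference, the full-matrix approach described above might look like the sketch below. The function name `cosine_similarity_lower` and the sample matrix are made up for illustration, and treating all-zero rows as having norm 1 is an assumption. Note that this still computes the whole product and only discards the upper triangle afterwards, so it saves storage but not computation.

```python
import numpy as np
from scipy import sparse


def cosine_similarity_lower(M):
    """Lower triangle (diagonal included) of the pairwise cosine similarity of rows."""
    M = sparse.csr_matrix(M, dtype=float)
    # L2 norm of every row; all-zero rows get norm 1 to avoid dividing by zero (assumption).
    norms = np.sqrt(np.asarray(M.multiply(M).sum(axis=1)).ravel())
    norms[norms == 0] = 1.0
    # Row-normalize: M' = diag(1 / norms) * M
    M_norm = sparse.diags(1.0 / norms) @ M
    # Full symmetric product M' * M'^T, then keep only the lower triangle.
    S = M_norm @ M_norm.T
    return sparse.tril(S, format="csr")


# Hypothetical 3 x 3 example matrix, one feature vector per row.
M = sparse.csr_matrix(np.array([[1.0, 0.0, 1.0],
                                [0.0, 2.0, 0.0],
                                [1.0, 1.0, 0.0]]))
L = cosine_similarity_lower(M)
print(L.toarray())
```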

Thanks


There is 1 best solution below


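Y = pdist(X, 'cosine') does that directly; see scipy.spatial.distance.pdist. It returns the cosine distance (one minus the cosine) for each pair of rows i < j as a flat "condensed" vector of length n(n-1)/2, so only one triangle is computed and stored.
(spatial.distance seems to be missing in the doc for scipy.spatial in v0.16.0, so it is hard to find :/ )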
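A minimal sketch of how this could be used follows. The sample array `X` and the helper `condensed_index` are made up for illustration. One caveat: as far as I know, pdist works on dense arrays, so a scipy sparse matrix would have to be converted first (for example with `.toarray()`), which may not be practical for very large inputs.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Hypothetical dense input, one feature vector per row.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [1.0, 1.0, 0.0]])
n = X.shape[0]

# Condensed distance vector: one entry per pair (i, j) with i < j.
Y = pdist(X, "cosine")


def condensed_index(i, j, n):
    """Position in the condensed vector of the pair (i, j), assuming i < j < n."""
    return n * i + j - ((i + 2) * (i + 1)) // 2


# Distance between rows 1 and 2, read straight from the condensed vector.
d_12 = Y[condensed_index(1, 2, n)]

# If a square layout is really needed, expand and keep the lower triangle.
D = squareform(Y)    # full symmetric matrix with a zero diagonal
lower = np.tril(D)   # upper triangle zeroed out
```

Working with `Y` directly keeps the memory saving; `squareform` allocates the full n x n matrix again, so it is only worth calling when a square layout is required.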