I am doing Information Retrieval using Cosine Similarity.
My data consists of binary vectors.
Since most of the references I have read use non-binary vectors (non-binary matrices), I am wondering whether it is wrong to use binary vectors with the cosine similarity function.
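Binary vectors work perfectly well with cosine similarity. In fact, they make the arithmetic much simpler: every entry is 0 or 1, so squaring an entry leaves it unchanged, and the magnitude of each vector is simply the square root of the sum of its entries (that is, the square root of the number of 1s). The dot product likewise reduces to a count of the positions where both vectors have a 1.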
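
To make that concrete, here is a minimal Python sketch of the simplified calculation. The function name `binary_cosine` and the sample vectors `doc` and `query` are made up for illustration, and returning 0.0 for an all-zero vector is just one possible convention (cosine similarity is undefined there), so treat this as a sketch rather than a reference implementation.

```python
import math

def binary_cosine(a, b):
    """Cosine similarity for two equal-length vectors of 0/1 integers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product: number of positions where both vectors have a 1.
    shared = sum(x & y for x, y in zip(a, b))
    # Squared magnitudes: for 0/1 entries these are just the counts of 1s.
    ones_a = sum(a)
    ones_b = sum(b)
    if ones_a == 0 or ones_b == 0:
        # Assumption: treat similarity with an all-zero vector as 0.0.
        return 0.0
    return shared / math.sqrt(ones_a * ones_b)

# Hypothetical example: 5 terms, 1 = term present, 0 = term absent.
doc = [1, 1, 0, 1, 0]
query = [1, 0, 0, 1, 1]
score = binary_cosine(doc, query)
print(score)
```

Here the two vectors share two terms and each contains three 1s, so the score works out to 2 / sqrt(3 * 3) = 2/3. This gives the same result as the general cosine formula; it only takes advantage of the 0/1 entries to skip the squaring.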