I'm trying to follow along with an online class that's using matrix vectorization for image comparison. Although I have the solution, I don't understand one of the matrix transformations. This is probably basic linear algebra, but I'm decades removed from my last math class. The problem is roughly:
Given a matrix A that is 500x7000 and a matrix B that is 5000x7000, we want to take the Euclidean distance of each row of A with respect to each row of B. The result should be a distance matrix D with shape 500x5000.
The point of the exercise is to learn how to do vectorized comparisons (using numpy), vs the obvious programmatic approach of looping over A, then looping over B to take their differences. The tricky part here is that A and B have a different number of rows.
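For reference, here is a minimal sketch of the looping approach I mean. The function name and the small stand-in shapes are my own choices for illustration, not from the class; the real matrices are 500x7000 and 5000x7000.

```python
import numpy as np

def pairwise_dist_loops(A, B):
    """Euclidean distance between every row of A and every row of B (explicit loops)."""
    D = np.zeros((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):          # loop over rows of A
        for j in range(B.shape[0]):      # loop over rows of B
            diff = A[i] - B[j]           # both rows have the same number of columns
            D[i, j] = np.sqrt(np.sum(diff ** 2))
    return D

# Small random stand-ins (assumed shapes, chosen only to keep the example fast)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 7))
B = rng.standard_normal((8, 7))

D = pairwise_dist_loops(A, B)
print(D.shape)  # (5, 8): one row per row of A, one column per row of B
```

Note that the differing row counts are not a problem here: each distance only ever compares one row of A with one row of B, and those always have the same length.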
We're given hints to expand the formula for Euclidean distance, i.e. $$\sqrt{\sum{(A-B)^2}}$$ This expands to $$\sqrt{\sum{A^2 + B^2 -2AB}}$$
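Written out for a single pair of rows, with $a_{ik}$ the entry in row $i$, column $k$ of A and $b_{jk}$ the entry in row $j$, column $k$ of B (this index notation is my own, not from the course), I believe the expansion reads:

$$D_{ij} = \sqrt{\sum_k (a_{ik} - b_{jk})^2} = \sqrt{\sum_k a_{ik}^2 + \sum_k b_{jk}^2 - 2\sum_k a_{ik} b_{jk}}$$

So for every pair $(i, j)$ the third term needs the sum over the shared column index $k$ of the products $a_{ik} b_{jk}$.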
The first two terms are manageable: element-wise squaring, then summing along each row. But for the third term, we need to multiply A by B, and since they have different row counts, some manipulation is needed. The solution is to use: $$-2A \cdot B^T$$
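To make the setup concrete, this is roughly how I put the three terms together in numpy. It is a sketch under my own assumptions: the function name, the small stand-in shapes, and the clamp against round-off are mine, not part of the given solution.

```python
import numpy as np

def pairwise_dist_vectorized(A, B):
    """Distances between all rows of A and all rows of B via the expanded formula."""
    a_sq = np.sum(A ** 2, axis=1)[:, np.newaxis]  # shape (m, 1): squared length of each row of A
    b_sq = np.sum(B ** 2, axis=1)[np.newaxis, :]  # shape (1, n): squared length of each row of B
    cross = A @ B.T                               # shape (m, n): the A . B^T term
    sq = a_sq + b_sq - 2.0 * cross                # broadcasting expands a_sq and b_sq to (m, n)
    return np.sqrt(np.maximum(sq, 0.0))           # clamp tiny negatives caused by round-off

# Small random stand-ins (assumed shapes; the real ones are 500x7000 and 5000x7000)
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 7))
B = rng.standard_normal((8, 7))

D = pairwise_dist_vectorized(A, B)

# Reference: subtract every row of B from every row of A directly, then take norms
D_ref = np.linalg.norm(A[:, np.newaxis, :] - B[np.newaxis, :, :], axis=2)
print(D.shape, np.allclose(D, D_ref))
```

The direct reference works for small inputs, but it builds an intermediate array of shape (m, n, columns), which is presumably why the exercise steers toward the expanded form.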
That is, the dot product of A with B transposed (maybe I got the formatting wrong here?). This works, and I get the expected answer. But I'm not satisfied that the math is "correct".
My question: if we are transposing B, doesn't that mean that the "wrong" numbers are multiplied with A? I understand that a transpose essentially flips the matrix across its diagonal, swapping rows and columns, so this seems like an artificial way to simply make the rows/cols line up conveniently. Obviously this works, but it seems wrong to me. Help me understand why this works, mathematically.
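One small experiment I can run to see which numbers actually get multiplied is below. The shapes and the indices `i`, `j` are arbitrary choices of mine. It compares a single entry of `A @ B.T` with a hand-written sum over matching columns of one row of A and one row of B.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))  # 3 rows, 4 columns
B = rng.standard_normal((5, 4))  # 5 rows, same 4 columns

P = A @ B.T                      # shape (3, 5)

# Pick one pair of rows and multiply matching columns by hand
i, j = 1, 4
manual = sum(A[i, k] * B[j, k] for k in range(A.shape[1]))

print(P.shape)                       # (3, 5)
print(np.isclose(P[i, j], manual))   # does entry (i, j) equal row i of A dotted with row j of B?
```

If the comparison holds for every pair, then each entry of the product pairs column k of a row of A with column k of a row of B, which would be the same pairing the looped version uses.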