I don't have much linear algebra background. I'm taking an online course on machine learning where the professor mentioned that the inner product of two one-hot vectors is zero (assuming the two one-hot vectors are encoding different things). He also said that the Euclidean distance between a pair of (different) one-hot vectors is the same as the distance between any other pair of (different) one-hot vectors.
I've looked around online a bit, but I'm sort of overwhelmed by the math used to describe the two concepts and I'm not sure I'm looking at the right math. Can somebody explain in simple terms why these things would be the case?
In case it's unclear, a one-hot vector is a vector of all zeros except for a single entry where there is a one, for example [0 0 0 1 0].
For a vector $x$, I'll write $x_i$ for the $i$th component, for instance if $x = (2,3,5)$, then $x_1 = 2$ and $x_3 = 5$. A quick review of Euclidean distance: the Euclidean distance between vectors $x=(x_1, \dotsc, x_n)$ and $y=(y_1, \dotsc, y_n)$ is given by $$d(x,y) = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}.$$ This is a generalisation of Pythagoras' theorem. Using the same notation, we can write the dot product $$x \cdot y = \sum_{i=1}^n x_i y_i.$$
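If it helps to see these formulae in action, here's a quick numerical check (a sketch using NumPy; the vectors are just made-up examples, not from your course):

```python
import numpy as np

x = np.array([2, 3, 5])
y = np.array([1, 0, 4])

# Euclidean distance: square root of the sum of squared component differences
d = np.sqrt(np.sum((x - y) ** 2))  # sqrt((2-1)^2 + (3-0)^2 + (5-4)^2) = sqrt(11)

# Dot product: sum of componentwise products
dot = np.dot(x, y)  # 2*1 + 3*0 + 5*4 = 22

print(d, dot)
```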
Now, think about one-hot vectors. In that case, $x_i = 0$ for nearly all of the possible $i$s, except for one, which is $1$. Using the formulae above, can you see why the given statements should hold?
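To spell out the hint numerically (again just a sketch with NumPy, using two arbitrary distinct one-hot vectors of length 5):

```python
import numpy as np

a = np.array([0, 0, 0, 1, 0])  # one-hot, "hot" in position 3
b = np.array([0, 1, 0, 0, 0])  # one-hot, "hot" in position 1

# Inner product: the 1s sit in different positions, so every term x_i * y_i
# has at least one factor equal to 0, and the sum is 0.
inner = np.dot(a, b)

# Euclidean distance: exactly two components differ, each by 1, so
# d = sqrt(1^2 + 1^2) = sqrt(2) -- the same for ANY pair of distinct one-hot vectors.
dist = np.sqrt(np.sum((a - b) ** 2))

print(inner, dist)
```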