I am trying to understand change-of-basis matrices and orthogonality. Specifically, I understand that vectors are abstract objects, and that we can only compute with them once we choose a basis. I was trying to figure out when the change-of-basis matrix will be orthogonal, and to get some intuition for this. From what I have read, it seems that the change-of-basis matrix is orthogonal if and only if both bases are themselves orthonormal. I am struggling to understand why this is.
The example I came up with was the vector space $\mathbb{R}^2$ with bases $$ \mathcal{B} = \left\{ \left( \begin{array}{c} 1 \\ 1 \end{array} \right), \left( \begin{array}{c} 1 \\ 0 \end{array}\right) \right\} $$ and $$ \mathcal{C} = \left\{ \left( \begin{array}{c} 1 \\ 0 \end{array} \right), \left( \begin{array}{c} 0 \\ 1 \end{array}\right) \right\} $$ Both are certainly bases for $\mathbb{R}^2$, but $\mathcal{C}$ is orthogonal while $\mathcal{B}$ is not. We have the change of basis formula $$ \left[ x \right]_{\mathcal{B}} = P \left[ x \right]_{\mathcal{C}}$$ where $ \left[ x \right]_{\mathcal{B}}$ and $\left[ x \right]_{\mathcal{C}}$ are the representations of some vector $x$ in the $\mathcal{B}$ and $\mathcal{C}$ bases (coordinate systems), and $P$ is the change-of-basis matrix $$ P = \left[ \begin{array}{cc} 0 & 1 \\ 1 & -1 \end{array}\right]$$ which is easily verified: its columns are the basis vectors in $\mathcal{C}$ expressed in the basis $\mathcal{B}$. However, it is not the case that $P^TP = I$.
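For concreteness, this computation can be checked numerically; here is a quick NumPy sketch (the variable names are my own, not standard):

```python
import numpy as np

# Basis vectors of B and C as the columns of two matrices.
B = np.array([[1.0, 1.0],
              [1.0, 0.0]])
C = np.array([[1.0, 0.0],
              [0.0, 1.0]])

# Change-of-basis matrix from C-coordinates to B-coordinates:
# its columns are the C basis vectors expressed in B, i.e. P = B^{-1} C.
P = np.linalg.solve(B, C)
print(P)           # [[0, 1], [1, -1]]

# Check the formula [x]_B = P [x]_C on a sample vector.
x = np.array([3.0, 5.0])
x_C = np.linalg.solve(C, x)   # coordinates of x in C (just x itself here)
x_B = np.linalg.solve(B, x)   # coordinates of x in B
assert np.allclose(P @ x_C, x_B)

# P is not orthogonal: P^T P is not the identity.
print(P.T @ P)     # [[1, -1], [-1, 2]]
```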
Can you explain why the vectors in $\mathcal{C}$ are orthogonal when represented in the basis $\mathcal{C}$, but not when represented in the basis $\mathcal{B}$? I understood that the vectors themselves do not change, merely their representation, and I don't see why their representation should affect their orthogonality. I am worried that I am fundamentally misunderstanding the notion of abstract vectors and basis representations.
As your intuition tells you, the orthogonality of the vectors shouldn’t change when their representation changes. What you haven’t taken into account, however, is that the coordinate-based formula for the inner product that defines orthogonality is itself basis-dependent. This is examined in this question and its answer and other places.
In general, a coordinate-based formula for the inner product $\langle x,y\rangle$ of two vectors in a real inner product space has the form $[x]_{\mathcal B}^TQ[y]_{\mathcal B}$. Here, $[x]_{\mathcal B}$ denotes the coordinates of $x$ relative to the ordered basis $\mathcal B$, and $Q$ is some symmetric positive-definite matrix. In the standard basis $\mathcal E$, the inner product that you’re using has the formula $[x]_{\mathcal E}^T[y]_{\mathcal E}$: it’s the dot product of $[x]_{\mathcal E}$ and $[y]_{\mathcal E}$. If $P$ is the change-of-basis matrix from $\mathcal B$ to $\mathcal E$, then relative to the latter basis we have $$\langle x,y\rangle = [x]_{\mathcal E}^T[y]_{\mathcal E} = \left(P[x]_{\mathcal B}\right)^T\left(P[y]_{\mathcal B}\right) = [x]_{\mathcal B}^T\left(P^TP\right)[y]_{\mathcal B}.$$ This reduces to a plain dot product of coordinates only when $P^TP=I$. Now, the columns of $P$ are the coordinates in $\mathcal E$ of the elements of $\mathcal B$, and the entries of $P^TP$ are the pairwise dot products of these coordinate tuples, so this tells us that the standard Euclidean scalar product of two vectors equals the dot product of the vectors’ coordinates iff the basis is orthonormal.
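To make this concrete with the bases from the question, here is a short NumPy sketch (my own, under the definitions in this answer) computing $Q = P^TP$ for the basis $\mathcal B$ and showing that the naive dot product of $\mathcal B$-coordinates fails, while the $Q$-weighted product recovers the true inner product:

```python
import numpy as np

# Basis B as columns in standard coordinates; this matrix is the
# change-of-basis matrix P from B-coordinates to the standard basis E.
P = np.array([[1.0, 1.0],
              [1.0, 0.0]])
Q = P.T @ P                   # Gram matrix of B: [[2, 1], [1, 1]]

x = np.array([1.0, 0.0])      # e1, in standard coordinates
y = np.array([0.0, 1.0])      # e2, in standard coordinates

x_B = np.linalg.solve(P, x)   # coordinates of e1 in B: (0, 1)
y_B = np.linalg.solve(P, y)   # coordinates of e2 in B: (1, -1)

# The naive dot product of the B-coordinates is NOT the inner product...
print(x_B @ y_B)              # -1.0, not 0
# ...but inserting Q recovers the standard inner product <e1, e2> = 0.
print(x_B @ Q @ y_B)          # 0.0
```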
Another way to view this is that the matrix $P$ represents an isomorphism $L:\mathbb R^n\to \mathbb R^n$. If we equip this space with the standard Euclidean inner product, then $L$ preserves that inner product precisely when $P$ is an orthogonal matrix.
The terminology used in describing these ideas can be a bit confusing. We speak of a basis being orthogonal or orthonormal, and those notions are relative to some arbitrary inner product. On the other hand, we call a matrix $A$ orthogonal when $A^TA=I$, and that condition is about the dot products of the columns of $A$, i.e., about a particular matrix product. This reflects a certain bias toward orthonormal bases: if the columns of $A$ are coordinates of vectors relative to some orthonormal basis, then $A^TA=I$ is equivalent to the vectors represented by its columns forming an orthonormal set. As you’ve discovered, though, if you assemble the coordinates of an orthonormal set of vectors into a matrix $A$, you don’t automatically have $A^TA=I$ unless those coordinates are expressed relative to an orthonormal basis. On the other hand, taking the matrix $Q$ from above, it is always the case that $A^TQA=I$ when the columns of $A$ are the $\mathcal B$-coordinates of a set of vectors that is orthonormal with respect to $\langle\cdot,\cdot\rangle$.
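As a numerical sketch of that last point (my own check, using the example bases from the question): take $A$ whose columns are the $\mathcal B$-coordinates of the standard basis vectors, which are orthonormal for the usual inner product. Then $A^TA \ne I$, but the weighted identity $A^TQA=I$ holds:

```python
import numpy as np

P = np.array([[1.0, 1.0],      # basis B as columns (standard coordinates)
              [1.0, 0.0]])
Q = P.T @ P                    # Gram matrix of B

# Columns of A: the B-coordinates of the standard basis e1, e2,
# an orthonormal set for the usual inner product.
A = np.linalg.solve(P, np.eye(2))

# A is not orthogonal in the usual matrix sense...
print(np.allclose(A.T @ A, np.eye(2)))      # False
# ...but it satisfies the Q-weighted analogue A^T Q A = I.
print(np.allclose(A.T @ Q @ A, np.eye(2)))  # True
```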