Change of basis and diagonalization


For my notation here I'm going to use $'$ to indicate a new basis. Let $U$ be a unitary operator that takes a basis vector $|i\rangle$ and transforms it to a new basis vector $|i'\rangle$.

So let's say I know $A$ in a basis $|i\rangle$ and I want to find $A'$ ($A$ in the new basis).

If $|V'\rangle$ is any vector in our new basis, we can find $|V\rangle$ in the old basis by taking the adjoint of $U$ and applying it to $|V'\rangle$. Thus $|V\rangle=U^{-1}|V'\rangle$ (since $U$ is unitary, $U^{\dagger}=U^{-1}$). Now that we have $|V\rangle$ in the original basis we can apply $A$ to it: $A|V\rangle=A U^{-1}|V'\rangle$. This gives a vector in the original basis, but we want a vector in our new basis, so we apply $U$ to it.

This leads to the conclusion that $A'=UAU^{-1}$.

Unfortunately, this seems to disagree with my quantum mechanics textbook, which claims that $A'=U^{-1}AU$. Admittedly, I'm solving the more general problem of change of basis, while the text (Shankar) treats only diagonalization. He states the result without proof; I wrote a short proof of it on my own, but it contradicts the derivation above, so I don't know how to reconcile the two.

I should also mention that Shankar assumes $A$ is Hermitian, but that should be irrelevant in the more general case of a change of basis.

So why is my answer different from his? It should be the same in both cases, right, since diagonalization is exactly this problem with the change of basis being the one to the eigenvector basis?

Best answer

There are a few things to note. Shankar is implementing an active transformation, which is an operator that takes in a basis vector and spits out a basis vector in the new basis. This is an operation that changes the vector, and so it cannot be interpreted as a change-of-basis operation, which takes the same vector and just represents it in the new basis (this is called a passive transformation). I think that you are mixing up active and passive transformations, and, frankly, I think maybe Shankar is too in this section of the book. Below I outline the passive transformation and then re-interpret things in terms of an active transformation.

Passive transformations: representing vectors in different bases

In my opinion, Dirac notation for vectors shouldn't be used to denote a column vector, and I think Shankar plays a bit fast and loose with the notation by equating a ket with a column vector. The ket represents the abstract vector, and the column vector is a representation of this vector in a basis. Another part of the problem is that, in my opinion, change-of-basis operations should not be interpreted as operators on the space, because a change of basis doesn't change the vector but rather the vector's representation.

Here's how I would go about things. Start with a basis $\lvert i\rangle$. We can construct a change-of-basis matrix by first inserting a do-nothing operator (i.e., a resolution of the identity) as $$ \lvert i \rangle =\sum_j \lvert j' \rangle \langle j' \vert i \rangle\,. $$ Forming the matrix $$ M_{p\leftarrow up} = \begin{bmatrix} \langle 1' \vert 1 \rangle & \langle 1' \vert 2 \rangle & \langle 1' \vert 3 \rangle & \cdots & \langle 1' \vert N \rangle\\ \langle 2' \vert 1 \rangle & \langle 2' \vert 2 \rangle & \langle 2' \vert 3 \rangle & \cdots & \langle 2' \vert N \rangle\\ \vdots & \vdots & \vdots & & \vdots\\ \langle N' \vert 1 \rangle & \langle N' \vert 2 \rangle & \langle N' \vert 3 \rangle & \cdots & \langle N' \vert N \rangle \end{bmatrix}\,, $$ we can see that if we act on the column vector $[1,0,\dots,0]^T$, we get exactly the first column of the matrix. This first column is exactly the representation of the vector $\lvert 1 \rangle$ in the primed basis. Thus, the original column vector $[1,0,\dots,0]^T$ must be the representation of $\lvert 1 \rangle$ in the un-primed basis. Hence, this matrix operates on vectors represented in the unprimed basis and makes vectors represented in the primed basis. This is why I've labeled it with subscripts as shown: $p$ stands for primed and $up$ stands for un-primed. Let's call it $M$ for short.$^1$
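A small NumPy sketch of the construction above (the particular two-dimensional bases and rotation angle are just illustrative choices):

```python
import numpy as np

# Two orthonormal bases of C^2: the standard (un-primed) basis and a
# rotated (primed) basis. These particular vectors are illustrative.
unprimed = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
theta = 0.3
primed = [np.array([np.cos(theta), np.sin(theta)], dtype=complex),
          np.array([-np.sin(theta), np.cos(theta)], dtype=complex)]

# M_{p<-up}: entry (i, j) is <i'|j>.  np.vdot conjugates its first argument,
# so np.vdot(a, b) is the inner product <a|b>.
M = np.array([[np.vdot(pi, uj) for uj in unprimed] for pi in primed])

# Acting on [1, 0]^T picks out the first column of M, which is the
# representation of |1> in the primed basis: the overlaps <i'|1>.
col = M @ np.array([1, 0], dtype=complex)
expected = np.array([np.vdot(pi, unprimed[0]) for pi in primed])
assert np.allclose(col, expected)

# A change-of-basis matrix between orthonormal bases is unitary.
assert np.allclose(M.conj().T @ M, np.eye(2))
```

The same code works for any pair of orthonormal bases; only the lists `unprimed` and `primed` need to change.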

Now, consider an operator $\hat{A}$. To represent it in the original basis, we insert two do-nothing operators on either side of $\hat{A}$, yielding $$ \hat{A} = \left(\sum_n \lvert n \rangle \langle n \rvert \right) \hat{A} \left(\sum_m \lvert m \rangle \langle m \rvert \right) = \sum_{nm} \langle n \rvert\hat{A}\lvert m \rangle \lvert n \rangle \langle m \rvert\,, $$ which allows us to recognize $\langle n \rvert\hat{A}\lvert m \rangle$ as the matrix elements of $\hat{A}$ in this basis. To effect the change of basis to the primed basis, we again insert two do-nothings, i.e. \begin{align} \hat{A} &= \sum_{nm} \langle n \rvert\hat{A}\lvert m \rangle \lvert n \rangle \langle m \rvert = \left(\sum_{i} \lvert i' \rangle \langle i' \rvert \right) \sum_{nm} \langle n \rvert\hat{A}\lvert m \rangle \lvert {n} \rangle \langle m \rvert \left(\sum_{j} \lvert j'\rangle \langle j' \rvert \right) \\ &= \sum_{ij} \lvert i' \rangle\langle j' \rvert \left(\sum_{nm} \langle i' \vert {n} \rangle \langle n \rvert\hat{A}\lvert m \rangle \langle m \vert j' \rangle \right)\,. \end{align} The sum inside the parentheses can be interpreted as the matrix multiplication of $$ M_{p\leftarrow up} A_{\textrm{in up basis}} M^{\dagger}_{p\leftarrow up} = M_{p\leftarrow up} A_{\textrm{in up basis}} M_{up\leftarrow p}\,, $$ which is then the matrix representation $A_{\textrm{in p basis}}$ of the operator $\hat{A}$ in the primed basis.
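The sandwich formula can be checked numerically by comparing it against the matrix elements $\langle i'\rvert\hat{A}\lvert j'\rangle$ computed directly (a sketch with an arbitrary random operator; the bases are the same illustrative ones as before):

```python
import numpy as np

rng = np.random.default_rng(0)
# An arbitrary operator, given by its matrix in the un-primed basis.
A_up = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))

# Illustrative un-primed and primed orthonormal bases of C^2.
unprimed = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
theta = 0.3
primed = [np.array([np.cos(theta), np.sin(theta)], dtype=complex),
          np.array([-np.sin(theta), np.cos(theta)], dtype=complex)]
M = np.array([[np.vdot(pi, uj) for uj in unprimed] for pi in primed])

# Matrix elements computed directly in the primed basis: <i'|A|j'>.
A_p_direct = np.array([[np.vdot(pi, A_up @ pj) for pj in primed]
                       for pi in primed])

# The sandwich M A M^dagger reproduces them.
assert np.allclose(M @ A_up @ M.conj().T, A_p_direct)
```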

Now, how does this relate to diagonalization? Suppose that the primed basis is the basis that diagonalizes the operator $\hat{A}$. Then, the matrix that takes a vector represented in the old basis and represents it in the basis that diagonalizes $\hat{A}$ is exactly $M_{p\leftarrow up}$, which, as above, has as its columns the representations of the old basis vectors in the new basis. This means that the matrix $M^{\dagger}_{p\leftarrow up} = M_{up\leftarrow p}$ has as its columns the eigenvectors of $\hat{A}$ represented in the old basis. It's a little hard to tell exactly what he means, but Shankar says that the columns of $U$ "contain the components of the eigenvectors." In other words, his $U$ is exactly the $M_{up\leftarrow p}$ that I've just defined$^2$, and in that case, the transformation that's being done is $$ M_{p\leftarrow up} A_{\textrm{in up basis}} M^{\dagger}_{p\leftarrow up} = U^{\dagger} A_{\textrm{in up basis}} U\,, $$ which is exactly what Shankar says it should be. And you can certainly check with examples that this is the correct matrix transformation that takes a linear operator represented in some basis and represents it in the basis that diagonalizes the linear operator.
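One such check (a NumPy sketch; the random Hermitian matrix is only an illustration): `np.linalg.eigh` returns the eigenvectors as the columns of a unitary matrix, i.e., exactly Shankar's $U = M_{up\leftarrow p}$, and $U^{\dagger} A U$ comes out diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = X + X.conj().T          # Hermitian operator in the un-primed basis

# Columns of U are the eigenvectors of A represented in the old basis,
# i.e., U = M_{up<-p} in the notation above.
eigvals, U = np.linalg.eigh(A)

# Shankar's transformation U^dagger A U diagonalizes A.
D = U.conj().T @ A @ U
assert np.allclose(D, np.diag(eigvals))
```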

$^1$ At this point, we might identify the matrix $M$ as $U$, but as I show later in this section, we must be careful about that identification!

$^2$ Except, crucially, my $U$ is a matrix, not an operator acting on the original (abstract) vector space.

Active transformations and diagonalization

The issue with the passive transformation picture is that you have to keep track of which basis is being used to represent which vector. The problem with the active transformation picture is that the operators and vectors change under the transformation, so it's harder to keep track of what's really going on. Nonetheless, let's try it. Consider the (active) operator $\hat{U}$, defined by $$ \hat{U} \lvert i\rangle = \lvert i' \rangle\,, $$ so that it is also true that $$ \hat{U}^{\dagger} \lvert i'\rangle = \lvert i \rangle\,. $$ The matrix representation of this operator in the un-primed basis can be derived via \begin{align} \hat{U} = \left(\sum_n \lvert n \rangle \langle n \rvert \right) \hat{U} \left(\sum_m \lvert m \rangle \langle m \rvert \right) = \sum_{nm} \lvert n \rangle \langle n \rvert \hat{U} \lvert m \rangle \langle m \rvert = \sum_{nm} \lvert n \rangle \langle n \vert m' \rangle \langle m \rvert \,. \end{align} The matrix elements of $\hat{U}$ in the old basis are therefore $\langle n \vert m' \rangle$, which means that the matrix representation of $\hat{U}$ is identical to $M_{p\leftarrow up}^{\dagger}$ (the conjugate transpose of the matrix written out above). In other words, it is the matrix whose columns are the new basis vectors represented in the old basis. (Again, this matches what Shankar says, as referenced above.) Note, however, that this is a different thing than above, because here we are representing an operator in one or the other basis, and above, we are making a change-of-basis matrix. They just happen to be the same (for good reason).
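This identification can also be verified numerically (a sketch reusing the illustrative bases from the first section):

```python
import numpy as np

# Illustrative un-primed and primed orthonormal bases of C^2.
unprimed = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
theta = 0.3
primed = [np.array([np.cos(theta), np.sin(theta)], dtype=complex),
          np.array([-np.sin(theta), np.cos(theta)], dtype=complex)]

# Change-of-basis matrix from the first section: (M)_{ij} = <i'|j>.
M = np.array([[np.vdot(pi, uj) for uj in unprimed] for pi in primed])

# Matrix of the active operator U-hat in the un-primed basis: <n|m'>.
U_mat = np.array([[np.vdot(un, pm) for pm in primed] for un in unprimed])

# It coincides with M^dagger: its columns are the new basis vectors
# written out in the old basis.
assert np.allclose(U_mat, M.conj().T)
```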

Now, suppose we have an operator $\hat{A}$ diagonalized by the primed basis, i.e., $\hat{A}\lvert m'\rangle = a_m\lvert m'\rangle$. What does the operator $\hat{U}^{\dagger}\hat{A}\hat{U}$ do to the state $\lvert m \rangle$? The following: $$ \hat{U}^{\dagger}\hat{A}\hat{U}\lvert m \rangle = \hat{U}^{\dagger}\hat{A}\lvert m' \rangle = \hat{U}^{\dagger}a_m\lvert m' \rangle = a_m\hat{U}^{\dagger}\lvert m' \rangle =a_m\lvert m \rangle\,. $$ In other words, if $\lvert m'\rangle$ is an eigenvector of $\hat{A}$ with eigenvalue $a_m$, then $\lvert m\rangle$ is an eigenvector of $\hat{U}^{\dagger}\hat{A}\hat{U}$ with eigenvalue $a_m$! This is the active-transformation version of the diagonalization procedure outlined above.
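A quick numerical check of this eigenvalue transfer (a sketch; working in the un-primed basis, with a random Hermitian matrix and with $\hat{U}$ chosen as the unitary whose columns are the eigenvectors, so that $\hat{U}\lvert m\rangle = \lvert m'\rangle$):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = X + X.conj().T                      # Hermitian, so unitarily diagonalizable
eigvals, V = np.linalg.eigh(A)          # columns of V: the eigenvectors |m'>

# Take U = V: it maps the standard basis vectors |m> onto the |m'>.
U = V
B = U.conj().T @ A @ U                  # the operator U^dagger A U

# |m> = U^dagger |m'> is an eigenvector of B with the same eigenvalue a_m.
for m in range(3):
    ket_m = np.eye(3, dtype=complex)[:, m]
    assert np.allclose(B @ ket_m, eigvals[m] * ket_m)
```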

Note in this case that $\hat{A}$ is acting on the primed vectors. The OP, however, starts with $\hat{A}$ acting on the unprimed vectors! This is exactly why the OP got the $\hat{U}$'s switched around. To really see what's going on, let's do the following.

Active transformations as a "change of basis"

Again, active transformations change vectors into other vectors; they don't change the basis (in the sense of representing the same vector in another basis). However, we can interpret the active transformation as a change of basis in the following way. Again starting from $$ \hat{U} \lvert i\rangle = \lvert i' \rangle\,, $$ and $$ \hat{U}^{\dagger} \lvert i'\rangle = \lvert i \rangle\,, $$ let's ask the following question: given an operator $\hat{A}$, can we find an operator $\hat{B}$ that acts on vectors expanded in the new basis in the "same way" as $\hat{A}$ acts on vectors expanded in the old basis? Let's make this concrete by jumping to the answer. Consider $$ \hat{A}\lvert v \rangle = \sum_n \hat{A}\lvert n\rangle \langle n | v \rangle $$ and $$ \hat{U}\hat{A}\hat{U}^{\dagger}\lvert v \rangle = \sum_n \hat{U}\hat{A}\hat{U}^{\dagger}\lvert n'\rangle \langle n' | v \rangle\,. $$ In both cases, we act on the same vector, but we have expanded it in different bases. Then, if $\hat{A}\lvert n\rangle = \sum_{i}A_{in} \lvert i\rangle$, we have $$ \hat{A}\lvert v \rangle = \sum_n \hat{A}\lvert n\rangle \langle n | v \rangle =\sum_n \sum_{i}A_{in} \lvert i\rangle \langle n | v \rangle $$ and $$ \hat{U}\hat{A}\hat{U}^{\dagger}\lvert v \rangle = \sum_n \hat{U}\hat{A}\hat{U}^{\dagger}\lvert n'\rangle \langle n' | v \rangle = \sum_n \hat{U}\hat{A}\lvert n\rangle \langle n' | v \rangle = \sum_n \hat{U}\sum_{i}A_{in} \lvert i\rangle \langle n' | v \rangle = \sum_n \sum_{i}A_{in} \lvert i'\rangle \langle n' | v \rangle \,. $$ We can see that there is a sense in which the two operators act "the same" on this vector, through the identical "matrix elements" $A_{in}$. Traced through carefully, this is the analogue of the change of basis in the passive-transformation picture. Note also that the transformation is indeed backwards! That is, we're doing $\hat{U}\hat{A}\hat{U}^{\dagger}$ instead of $\hat{U}^{\dagger}\hat{A}\hat{U}$.
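The claim that $\hat{U}\hat{A}\hat{U}^{\dagger}$ has the same "matrix elements" between primed vectors that $\hat{A}$ has between unprimed ones can be checked numerically (a sketch; the operator is an arbitrary random matrix and $\hat{U}$ an arbitrary random unitary, both purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # arbitrary operator

# A random unitary U defining |i'> = U|i>, via QR of a random complex matrix.
U, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
primed = [U[:, i] for i in range(3)]    # the columns of U are the |i'>

B = U @ A @ U.conj().T                  # the operator U A U^dagger

# <i'| B |n'> equals <i| A |n>: identical "matrix elements" A_{in}.
for i in range(3):
    for n in range(3):
        assert np.allclose(np.vdot(primed[i], B @ primed[n]), A[i, n])
```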

Postscript

In my opinion, the first section about passive transformations is the most important thing to take in here. I've gone through much more in this post than is necessary for doing quantum mechanical calculations. That said, there is a moral. To do quantum mechanics, you really need to distinguish in your head the abstract ket vector from its representation as a column vector in some assumed basis for the space. You also need to keep separate in your head an operator and its matrix representation in some basis, and in my opinion it is also really important to distinguish linear operators and change-of-basis transformations as different things, even though we can represent both as matrices.