Why does this trick to derive the formula for $[A^n,B]$ in terms of repeated commutators work so well?

Question

124 Views Asked by Bumbble Comm At 10 May 2026 - 10:10

It is a known result that, given generically noncommuting operators $A,B$, we have $$ A^n B=\sum_{k=0}^n \binom{n}{k} \operatorname{ad}^k(A)(B) A^{n-k},\tag A $$ where $\operatorname{ad}^k(A)(B)\equiv[\underbrace{A,[A,[\dots,[A}_k,B]\dots]] $.
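For readers who want to see (A) hold on concrete operators, here is a minimal numerical sketch. It assumes $A$ and $B$ are small integer matrices, so every comparison is exact; the helper names (`matmul`, `ad_power`, `rhs`) and the sample matrices are made up for illustration and are not part of the question.

```python
from math import comb

def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_power(X, p):
    """X raised to the non-negative integer power p."""
    size = len(X)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(p):
        result = matmul(result, X)
    return result

def ad_power(A, B, k):
    """ad^k(A)(B): the k-fold nested commutator [A,[A,[...,[A,B]...]]]."""
    D = B
    for _ in range(k):
        AD, DA = matmul(A, D), matmul(D, A)
        D = [[AD[i][j] - DA[i][j] for j in range(len(A))] for i in range(len(A))]
    return D

def rhs(A, B, n):
    """Right-hand side of (A): sum over k of binom(n,k) ad^k(A)(B) A^(n-k)."""
    size = len(A)
    total = [[0] * size for _ in range(size)]
    for k in range(n + 1):
        term = matmul(ad_power(A, B, k), mat_power(A, n - k))
        total = [[total[i][j] + comb(n, k) * term[i][j] for j in range(size)]
                 for i in range(size)]
    return total

# Arbitrary non-commuting sample matrices (an assumption of this sketch).
A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, 1], [1, 1, 0]]

# Compare A^n B with the right-hand side of (A) for n = 0, ..., 5.
checks = [matmul(mat_power(A, n), B) == rhs(A, B, n) for n in range(6)]
print(checks)
```

A check like this does not prove anything, of course, but it is a quick way to catch a sign or index slip when manipulating the formula.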

This can be proved, for example, by induction without too much work.

However, while trying to get a better understanding of this formula, I realised that there is a much easier way to derive it, at least on a formal, intuitive level.

The trick

Let $\hat{\mathcal S}$ and $\hat{\mathcal C}$ (standing for "shift" and "commute", respectively) denote operators that act on expressions of the form $A^k D^j A^\ell$ (denoting for simplicity $D^j\equiv\operatorname{ad}^j(A)(B)$) as follows:

\begin{align} \hat{\mathcal S} (A^k D^j A^\ell) &= A^{k-1} D^j A^{\ell+1}, \\ \hat{\mathcal C} (A^{k} D^{j} A^\ell) &= A^{k-1} D^{j+1} A^{\ell}. \end{align} In other words, $\hat{\mathcal S}$ "moves" the central $D$ block one step to the left, past the neighboring $A$ factor, while $\hat{\mathcal C}$ makes it "eat" that $A$ factor.

It is not hard to see that $\hat{\mathcal S}+\hat{\mathcal C}=\mathbb 1$, which is but another way to state the identity $A D^j = D^j A + D^{j+1}$, whose $j=1$ case reads $$A[A,B]=[A,B]A+[A,[A,B]].$$ Moreover, crucially, $\hat{\mathcal S}$ and $\hat{\mathcal C}$ commute. Because of this, I can write

$$A^n B=(\hat{\mathcal S}+\hat{\mathcal C})^n (A^n B)=\sum_{k=0}^n\binom{n}{k} \hat{\mathcal S}^{n-k} \hat{\mathcal C}^{k}(A^n B),$$ which immediately gives me (A) without any need for recursion or other tricks.
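As a sanity check of this expansion, here is the case $n=2$ worked out by hand, using only the definitions of $\hat{\mathcal S}$ and $\hat{\mathcal C}$ given above (with $D^0=B$):

$$ \begin{aligned} \hat{\mathcal S}^2(A^2 B) &= B A^2, \\ \hat{\mathcal S}\hat{\mathcal C}(A^2 B) &= \hat{\mathcal S}(A D^1) = D^1 A = [A,B]A, \\ \hat{\mathcal C}^2(A^2 B) &= D^2 = [A,[A,B]], \end{aligned} $$

so the binomial sum reads $$A^2 B = B A^2 + 2[A,B]A + [A,[A,B]].$$ Expanding the commutators on the right gives $BA^2 + (2ABA - 2BA^2) + (A^2B - 2ABA + BA^2) = A^2 B$, as expected.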

The question

Now, this is all fine and dandy, but it leaves me wondering why this kind of thing works. It looks like I am somehow bypassing the nuisance of having to deal with non-commuting operations by switching to a space of "superoperators", in which the same operation can be expressed in terms of commuting "superoperators".

I am not even sure how one could go about formalising these "superoperators" $\hat{\mathcal S},\hat{\mathcal C}$, as they seem to be objects acting on "strings of operators" more than on the elements of the operator algebra themselves.

Is there a way to formalise this way of handling the expressions? Is this a well-known method in this context (I had never seen it, but I am not well-versed in this kind of manipulation)?

**Bumbble Comm** · Accepted Answer

To fix some notation, suppose that the operators $A$ and $B$ belong to a vector space $V$ of operators that is closed under composition (an associative algebra), and that we are working with strings of $m$ operators. (For example, in $A^k D^j A^l$ we have $m = k + j + l + 1$, since $D^j$ expands into strings containing $j$ copies of $A$ and one $B$.) Rather than writing a product $A^k B^j$ as an element of $V$, we can instead write $$ A^{\otimes k} \otimes B^{\otimes j} = \underbrace{A \otimes \cdots \otimes A}_k \otimes \underbrace{B \otimes \cdots \otimes B}_j \in V^{\otimes m},$$ an element of the $m$th tensor power of $V$ (here with $m = k + j$). We have a linear multiplication map $\mu: V^{\otimes m} \to V$, which is just composition of operators, so for example $\mu(A^{\otimes k} \otimes B^{\otimes j}) = A^k B^j$. So the idea will be to define $\hat{\mathcal{S}}$ and $\hat{\mathcal{C}}$ as linear operators $V^{\otimes m} \to V^{\otimes m}$, check that they commute and add to give the identity, and then finally apply them to a particular tensor $A^{\otimes n} \otimes B$, which will give an identity much like the one you're after. Applying the multiplication $\mu$ will then give the exact identity.

Defining the operators is not too hard. We can take $\hat{\mathcal{S}}, \hat{\mathcal{C}} : V^{\otimes m} \to V^{\otimes m}$ to be defined by the formulas $$ \begin{aligned} \hat{\mathcal{S}}(v_1 \otimes v_2 \otimes \cdots \otimes v_m) &= v_2 \otimes \cdots \otimes v_m \otimes v_1 \\ \hat{\mathcal{C}}(v_1 \otimes v_2 \otimes \cdots \otimes v_m) &= v_1 \otimes v_2 \otimes \cdots \otimes v_m - v_2 \otimes \cdots \otimes v_m \otimes v_1 \end{aligned}$$ We then check that these formulas do the right thing. For example, we need to make sure that $\mu(\hat{\mathcal{C}}^k (A^{\otimes (m-1)} \otimes B)) = A^{m-k-1} D^k$ for $0 \le k \le m-1$, and so on.
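To illustrate the kind of check meant here, take $m = 3$ and apply the definition of $\hat{\mathcal{C}}$ once and twice to $A \otimes A \otimes B$:

$$ \begin{aligned} \hat{\mathcal{C}}(A \otimes A \otimes B) &= A \otimes A \otimes B - A \otimes B \otimes A, \\ \hat{\mathcal{C}}^2(A \otimes A \otimes B) &= A \otimes A \otimes B - 2\, A \otimes B \otimes A + B \otimes A \otimes A. \end{aligned} $$

Applying $\mu$ gives $A^2 B - ABA = A[A,B] = A D^1$ and $A^2 B - 2ABA + BA^2 = [A,[A,B]] = D^2$, which matches $A^{m-k-1} D^k$ for $k = 1$ and $k = 2$.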

With those definitions, it is easy to see that $\hat{\mathcal{C}} = \mathbb{1} - \hat{\mathcal{S}}$, and so they also commute, since $\hat{\mathcal{C}}\hat{\mathcal{S}} = \hat{\mathcal{S}}\hat{\mathcal{C}} = \hat{\mathcal{S}} - \hat{\mathcal{S}}^2$. So, taking $m = n + 1$, we can write $$ A^{\otimes n} \otimes B = (\hat{\mathcal{S}} + \hat{\mathcal{C}})^n (A^{\otimes n} \otimes B) = \sum_{k=0}^n \binom{n}{k} (\hat{\mathcal{S}}^{n-k} \hat{\mathcal{C}}^k) (A^{\otimes n} \otimes B).$$ Each term involves at most $n$ cyclic shifts, so $B$ never wraps around from the first slot back to the last, and $\mu(\hat{\mathcal{S}}^{n-k} \hat{\mathcal{C}}^k (A^{\otimes n} \otimes B)) = D^k A^{n-k}$. Finally, applying $\mu$ on both sides gives the formula you are after.
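The whole argument can also be checked mechanically. The sketch below is only an illustration under stated assumptions: tensors are restricted to the span of words in the two symbols `"A"` and `"B"` and stored as dictionaries from words to integer coefficients, and $\mu$ is evaluated on two arbitrary $2 \times 2$ integer matrices. All function names (`shift`, `commute`, `mu`, and so on) are invented for this example.

```python
from math import comb

def shift(t):
    """S: cyclic shift v1⊗v2⊗...⊗vm -> v2⊗...⊗vm⊗v1, extended linearly."""
    out = {}
    for word, c in t.items():
        rotated = word[1:] + word[:1]
        out[rotated] = out.get(rotated, 0) + c
    return out

def add_scaled(t, u, factor):
    """Return t + factor * u, dropping words whose coefficient is zero."""
    out = dict(t)
    for word, c in u.items():
        out[word] = out.get(word, 0) + factor * c
    return {w: c for w, c in out.items() if c != 0}

def commute(t):
    """C = 1 - S."""
    return add_scaled(t, shift(t), -1)

def repeat(op, t, times):
    """Apply the linear map op to t the given number of times."""
    for _ in range(times):
        t = op(t)
    return t

def matmul(X, Y):
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_power(X, p):
    result = [[int(i == j) for j in range(len(X))] for i in range(len(X))]
    for _ in range(p):
        result = matmul(result, X)
    return result

def mu(t, mats):
    """Multiplication map: turn each word into a matrix product and sum up."""
    size = len(mats["A"])
    total = [[0] * size for _ in range(size)]
    for word, c in t.items():
        prod = mat_power(mats["A"], 0)  # identity matrix
        for symbol in word:
            prod = matmul(prod, mats[symbol])
        total = [[total[i][j] + c * prod[i][j] for j in range(size)]
                 for i in range(size)]
    return total

def ad_power(A, B, k):
    """The nested commutator D^k = ad^k(A)(B), computed directly."""
    D = B
    for _ in range(k):
        AD, DA = matmul(A, D), matmul(D, A)
        D = [[AD[i][j] - DA[i][j] for j in range(len(A))] for i in range(len(A))]
    return D

n = 3
start = {("A",) * n + ("B",): 1}                      # the tensor A⊗A⊗A⊗B
mats = {"A": [[1, 2], [3, 4]], "B": [[0, 1], [1, 5]]}  # arbitrary sample matrices

total = {}
terms_match = []
for k in range(n + 1):
    term = repeat(shift, repeat(commute, start, k), n - k)  # S^(n-k) C^k (start)
    total = add_scaled(total, term, comb(n, k))
    expected = matmul(ad_power(mats["A"], mats["B"], k), mat_power(mats["A"], n - k))
    terms_match.append(mu(term, mats) == expected)          # mu(term) vs D^k A^(n-k)

tensor_identity_holds = (total == start)
print(tensor_identity_holds, terms_match)
```

The first check confirms the binomial identity already at the level of tensors, before any multiplication takes place; the second confirms that $\mu$ sends each term to $D^k A^{n-k}$ for the sample matrices chosen here.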
