I have a question about when one can group terms in a power series, specifically in a power series of a matrix. A teacher asked us to prove that if two matrices $A,B \in \mathbb{C}^{n\times n}$ commute, then $e^Ae^B=e^{A+B}$. One person in my class was able to do it, but there's a problem with his proof, which went like this: by the definition of the matrix exponential we have $$e^Ae^B=\left(\sum_{j=0}^{\infty}\frac{A^j}{j!}\right)\left(\sum_{i=0}^{\infty}\frac{B^i}{i!}\right)$$ When we expand the product and group terms to our convenience, we get $$ I+(A+B)+\left(\frac{A^2}{2!}+AB+\frac{B^2}{2!}\right)+\left(\frac{A^3}{3!}+\frac{A^2}{2!}B+A\frac{B^2}{2!}+\frac{B^3}{3!}\right)+\dots $$ which can be written as $$ \sum_{s=0}^{\infty}\sum_{t=0}^s\frac{A^{s-t}}{(s-t)!}\frac{B^t}{t!} $$
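(For what it's worth, this grouping is easy to sanity-check numerically. The sketch below is my own, with helper names `exp_series` and `double_sum` and each series truncated at 40 terms; it is only an illustration, not part of any proof.)

```python
import numpy as np

def exp_series(M, terms=40):
    # Partial sum of the exponential series: sum_{j < terms} M^j / j!
    S = np.zeros_like(M, dtype=float)
    term = np.eye(M.shape[0])
    for j in range(terms):
        S += term
        term = term @ M / (j + 1)  # next term M^{j+1}/(j+1)!
    return S

def double_sum(A, B, terms=40):
    # The regrouped series: sum_s sum_{t<=s} A^{s-t}/(s-t)! * B^t/t!
    n = A.shape[0]
    Ap, Bp = [np.eye(n)], [np.eye(n)]   # Ap[j] = A^j/j!, Bp[j] = B^j/j!
    for j in range(1, terms):
        Ap.append(Ap[-1] @ A / j)
        Bp.append(Bp[-1] @ B / j)
    S = np.zeros((n, n))
    for s in range(terms):
        for t in range(s + 1):
            S += Ap[s - t] @ Bp[t]
    return S

# Two commuting matrices: both are polynomials in the same matrix C.
C = np.array([[0.0, 1.0], [-1.0, 0.5]])
A = 2 * C + np.eye(2)
B = C @ C - 3 * np.eye(2)
assert np.allclose(A @ B, B @ A)

print(np.allclose(exp_series(A) @ exp_series(B), double_sum(A, B)))  # the grouping changes nothing
print(np.allclose(double_sum(A, B), exp_series(A + B)))              # and equals e^{A+B} since AB = BA
```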
Without going into much more detail, when starting out with $e^{A+B}$ we are able to reach the exact same expression as above. The two questions that arise at this point are:
- Why are we able to group the terms without violating any rules? An example where grouping does break the rules is the series $1-1+1-1+1-1+\ldots=(1-1)+(1-1)+\ldots=0$, which of course is not valid.
- Why is it that by reaching the same expression, we are able to conclude that the series are equal?
I would like to believe that the answer to both of these questions is that the power series always converge, but I am looking for a more rigorous justification that this is the reason.
Question 1: The example series you provided is not convergent, not even conditionally. Working with it as a value is nonsense. If you do start with a convergent series (possibly conditionally convergent), then you always have associativity, in the sense that you're free to add parentheses as you like: \begin{align*} &a_1 + a_2 + a_3 + a_4 + a_5 + a_6 + a_7 + a_8 + \ldots \\ = ~ &a_1 + (a_2 + a_3) + a_4 + (a_5 + a_6 + a_7 + a_8) + \ldots \end{align*} If your series is absolutely convergent, then you additionally get commutativity. Not only can you group the terms as you like, but you can change their order as you like too (conditional convergence is characterised by its spectacular failure of commutativity: a real conditionally convergent series can be rearranged to be non-convergent, or to converge to any given value).
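That failure of commutativity is easy to witness numerically. The sketch below (an illustration only) takes the alternating harmonic series, which converges conditionally to $\ln 2$, and applies the classical "one positive term, then two negative terms" rearrangement, which converges to $\tfrac{1}{2}\ln 2$ instead:

```python
import numpy as np

# Alternating harmonic series 1 - 1/2 + 1/3 - ... (conditionally convergent, sum ln 2)
terms = np.array([(-1) ** (n + 1) / n for n in range(1, 200001)])
print(terms.sum())  # close to ln 2 ~ 0.6931

# Rearrangement: one positive term followed by two negative terms
pos = terms[::2]    # 1, 1/3, 1/5, ...
neg = terms[1::2]   # -1/2, -1/4, ...
rearranged = []
for k in range(50000):
    rearranged.append(pos[k])
    rearranged.extend(neg[2 * k : 2 * k + 2])
print(sum(rearranged))  # close to (1/2) ln 2 ~ 0.3466 -- a different value!
```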
In order to talk about convergence for matrices at all (or in other vector spaces), whether conditional or absolute, one needs a norm, which is a "length" function. The definition of a norm is as follows:
A norm is a function $$\| \cdot \| : V \to [0, \infty),$$ where $V$ is a real or complex vector space, with the following properties:
- $\|v\| = 0$ if and only if $v = 0$ (positive definiteness);
- $\|\lambda v\| = |\lambda| \, \|v\|$ for all scalars $\lambda$ and all $v \in V$ (absolute homogeneity);
- $\|v + w\| \le \|v\| + \|w\|$ for all $v, w \in V$ (triangle inequality).
On a non-trivial real/complex vector space, there will always be infinitely many such norms. These norms give us a measure of distance between two points; in particular, the distance from $v$ to $w$ is defined to be $\|v - w\|$.
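As an illustration (a spot-check on random matrices, not a proof), two common matrix norms, the Frobenius norm and the max-entry norm, both satisfy the norm axioms:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(2, 3, 3))  # two random 3x3 real matrices

for name, norm in {"Frobenius": lambda M: np.linalg.norm(M, 'fro'),
                   "max-entry": lambda M: np.abs(M).max()}.items():
    assert norm(A) >= 0                                      # nonnegativity
    assert np.isclose(norm(-2.5 * A), 2.5 * norm(A))         # absolute homogeneity
    assert norm(A + B) <= norm(A) + norm(B) + 1e-12          # triangle inequality
    print(name, "satisfies the norm axioms on this sample")
```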
This distance function allows us to perform analysis. We can then sensibly define the notion of a limit. Since we already have a linear structure, we can sensibly define the notion of an infinite series, and talk about convergence (absolute or conditional). Without a norm, the expression $$\sum_{i=0}^\infty \frac{A^i}{i!}$$ is nonsense. There needs to be a sense in which the partial sums "become close" to something in the space. More generally, since we cannot manually add infinitely many vectors (e.g. matrices), there must be some definition to say what the sum comes to.
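To make "the partial sums become close" concrete, here is a small sketch (with an arbitrary test matrix of my choosing) showing that consecutive partial sums of $\sum_i A^i/i!$ get close very quickly, i.e. the sequence of partial sums is Cauchy; since the space of matrices is complete, the series therefore converges:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, 0.3]])  # arbitrary test matrix

partials = []
S = np.zeros_like(A)
term = np.eye(2)
for j in range(30):
    S = S + term            # partial sum of sum_i A^i / i!
    partials.append(S.copy())
    term = term @ A / (j + 1)

# Distances between consecutive partial sums shrink rapidly: the sequence is
# Cauchy, so in the (complete) space of matrices it converges.
gaps = [np.linalg.norm(partials[j + 1] - partials[j]) for j in range(len(partials) - 1)]
print(gaps[0], gaps[10], gaps[25])
```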
We can use norms to define absolute convergence too. We say $\sum v_n$ converges absolutely if the real series $\sum \|v_n\|$ converges. In finite dimensions, this always implies that $\sum v_n$ converges. In infinite dimensions, this is not always the case, and absolute convergence implying convergence is equivalent to the space being Cauchy-complete (i.e. all Cauchy sequences converge). The space of matrices is finite-dimensional, so this is not a problem.
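A sketch of the absolute convergence of the exponential series: with the Frobenius norm (which is submultiplicative, so $\|A^j\| \le \|A\|^j$), the series $\sum_j \|A^j/j!\|$ is dominated by the convergent scalar series $e^{\|A\|}$. The matrix here is again an arbitrary example:

```python
import numpy as np

A = np.array([[1.0, 2.0], [0.5, -1.0]])
nA = np.linalg.norm(A)  # Frobenius norm, which is submultiplicative

total, term = 0.0, np.eye(2)
for j in range(60):
    total += np.linalg.norm(term)  # accumulates sum_j ||A^j / j!||
    term = term @ A / (j + 1)

print(total, "<=", np.exp(nA))  # dominated by the scalar series e^{||A||}
```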
Another point, which I've danced around, is which norm do we use for matrices? If there are infinitely many norms to work with, and our sense of distance depends on the norm, then there's a real possibility that choosing different norms might affect convergence! That is, under one norm, a series might converge to $A$, under another the same series might converge to $B$, and under a third norm, the series might not converge at all! Again, this is a problem only in infinite-dimensional spaces. On a finite-dimensional space, any two norms are equivalent, meaning each is bounded by a constant multiple of the other, so the choice of norm doesn't affect convergence.
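For instance, for $3\times 3$ matrices the max-entry norm and the Frobenius norm satisfy $\|M\|_{\max} \le \|M\|_F \le 3\,\|M\|_{\max}$ (the constant $3$ being the matrix dimension), so they agree on which series converge. A quick spot-check on random samples:

```python
import numpy as np

rng = np.random.default_rng(1)
Ms = rng.normal(size=(1000, 3, 3))  # a batch of random 3x3 matrices

fro = np.array([np.linalg.norm(M, 'fro') for M in Ms])
mx  = np.array([np.abs(M).max() for M in Ms])

# Equivalence of norms: ||M||_max <= ||M||_F <= 3 * ||M||_max for 3x3 matrices
print((mx <= fro).all(), (fro <= 3 * mx).all())
```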
So, in conclusion: the matrix exponential series converges absolutely under any norm, since with a submultiplicative norm (e.g. the operator norm) we have $\sum_j \left\|\frac{A^j}{j!}\right\| \le \sum_j \frac{\|A\|^j}{j!} = e^{\|A\|} < \infty$, and absolute convergence is exactly what licenses the grouping and reordering of terms carried out above.
Question 2: "Reaching the same expression", in this case, means manipulating each convergent series (by valid steps) into an expression whose sum is the same matrix. Each of the original expressions represents an actual matrix, and they represent the same matrix. This, by definition, makes the series equal.
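Finally, a sketch showing that commutativity really is essential: for the standard non-commuting pair of nilpotent matrices below, the product $e^Ae^B$ and the regrouped series $e^{A+B}$ genuinely differ (`expm_series` is my own series-based helper):

```python
import numpy as np

def expm_series(M, terms=60):
    # Matrix exponential via its (absolutely convergent) power series
    S, term = np.zeros_like(M, dtype=float), np.eye(M.shape[0])
    for j in range(terms):
        S += term
        term = term @ M / (j + 1)
    return S

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
assert not np.allclose(A @ B, B @ A)  # AB != BA

# Regrouping into sum_s (A+B)^s / s! is NOT valid here: e^A e^B != e^{A+B}
print(np.allclose(expm_series(A) @ expm_series(B), expm_series(A + B)))
```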