Proof that the trace of a matrix is the sum of its eigenvalues

Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)

I have looked extensively for a proof on the internet, but all of them were too obscure. I would appreciate it if someone could lay out a simple proof of this important result. Thank you.

There are 7 solutions below.
---
Let $A$ be a square matrix (over an algebraically closed field, e.g. $\mathbb{C}$). It has a Jordan canonical form, i.e. there is an invertible matrix $P$ such that $PAP^{-1}$ is in Jordan form. Among other things, the Jordan form is upper triangular, hence it has its eigenvalues on its diagonal. It is therefore clear for a matrix in Jordan form that its trace equals the sum of its eigenvalues (with multiplicity). All that remains is to prove that similar matrices have the same trace and the same eigenvalues.
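Not a substitute for the proof, but the two facts used here (similarity preserves the trace and the spectrum) are easy to sanity-check numerically. A minimal NumPy sketch with a random matrix and a random, generically invertible $P$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))   # generically invertible
B = P @ A @ np.linalg.inv(P)      # B is similar to A

# Similar matrices have the same trace...
assert np.isclose(np.trace(A), np.trace(B))

# ...and the same eigenvalues (match each eigenvalue of A to one of B).
eig_B = np.linalg.eigvals(B)
for lam in np.linalg.eigvals(A):
    assert np.min(np.abs(eig_B - lam)) < 1e-8

# And the claimed identity itself: trace = sum of eigenvalues.
assert np.isclose(np.trace(A), np.linalg.eigvals(A).sum().real)
```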
---
Trace is preserved under similarity and every matrix is similar to a Jordan block matrix. Since the Jordan block matrix has its eigenvalues on the diagonal, its trace is the sum (with multiplicity) of its eigenvalues.
---
I'll try to show it another way. We know that if we have a monic polynomial $x^n+b_{n-1} x^{n-1} + \dots +b_1 x+ b_0$, then $-b_{n-1}$ is the sum of the roots of this polynomial (Vieta's formulas). In our case, the polynomial is $\det(tI-A)$ and we have $-b_{n-1}=\lambda_1+\lambda_2+\dots+\lambda_n$.
$\def\S{\mathcal{S}_n}$ Let $\S$ denote all the permutations of the set $\{1,2,\dots,n\}$. Then by definition $$ \det M = \sum_{\pi\in\S} m_{1,\pi(1)} m_{2,\pi(2)} \dots m_{n,\pi(n)} \operatorname{sgn}\pi, $$ where $\operatorname{sgn}\pi$ is either $+1$ or $-1$ and it is $+1$ for the identity permutation (we don't need to know more now).
Consider $M=tI-A$. To get the power $t^{n-1}$ from a permutation, we need this permutation to choose at least $n-1$ diagonal elements, i.e., to have $\pi(i)=i$ for at least $n-1$ values of $i$. However, once you know the value of a permutation on $n-1$ inputs, you know the last one as well. This means that to get the coefficient of $t^{n-1}$, we need to consider only the identity permutation.
So far we have shown that $b_{n-1}$ is the coefficient of $t^{n-1}$ in $(t-a_{1,1})(t-a_{2,2})\dots(t-a_{n,n})$ (the term of the sum above corresponding to the identity permutation), i.e. $b_{n-1}=-(a_{1,1}+a_{2,2}+\dots+a_{n,n})$. Therefore $-b_{n-1} = a_{1,1}+a_{2,2}+\dots+a_{n,n}=\operatorname{Tr}A$.
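This coefficient computation can be checked by expanding $\det(tI-A)$ symbolically. A small SymPy sketch for $n=3$, with generic symbolic entries $a_{ij}$:

```python
import sympy as sp

t = sp.symbols('t')
# A generic 3x3 matrix with symbolic entries a11, ..., a33.
A = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f'a{i+1}{j+1}'))
p = sp.expand((t * sp.eye(3) - A).det())   # characteristic polynomial det(tI - A)

b = p.coeff(t, 2)   # coefficient of t^{n-1}, here n = 3
# The coefficient of t^{n-1} is -(a11 + a22 + a33) = -Tr A.
assert sp.simplify(b + A.trace()) == 0
```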
---
Here is another proof. By definition, the characteristic polynomial of the $n\times n$ matrix $A=[a_{ij}]$ is given by $P_A(x)=\det(xI_n-A)$. Write $P_A(x)=x^n-b_1x^{n-1}+b_2x^{n-2}-\dots$. By Vieta's formulas the sum of the eigenvalues is $b_1$, so we have to prove that $b_1=\hbox{trace}(A)$.

Substituting $\frac{1}{x}$ for $x$, for every real non-zero $x$ we get $$\det\left(\frac{1}{x}\left(I_n-xA\right)\right)=\frac{1-b_1x+b_2x^{2}-\dots}{x^n},$$ or equivalently $$\det(I_n-xA)=1-b_1x+b_2x^2-\dots$$ for any non-zero real $x$. But then the left-side and right-side polynomials of the above equation coincide for all real $x$. A short explanation: if for two polynomials $f$ and $g$ we have $f(x)=g(x)$ for any non-zero $x$, then the polynomial $h(x):=f(x)-g(x)$ has infinitely many zeroes, and is thus the identically zero polynomial; it follows that $f(x)=g(x)$ for all $x$.

Let us now denote $$f(x):=\det(I_n-xA)$$ and $$g(x):=1-b_1x+b_2x^2-\dots.$$ We have seen that $f$ and $g$ are equal as polynomials, so $f'(x)=g'(x)$ for all $x$. Obviously, $g'(x)=-b_1+2b_2x-\dots$, therefore $g'(0)=-b_1$.
On the other hand, from $$f(x)=\left|\begin{array}{cccc} 1-a_{11}x&-a_{12}x&\dots&-a_{1n}x\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}x&-a_{n2}x&\dots&1-a_{nn}x\end{array}\right|$$ by the rule for differentiating determinants (differentiate one row at a time and sum) we get $$f'(x)=\left|\begin{array}{cccc} -a_{11}&-a_{12}&\dots&-a_{1n}\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}x&-a_{n2}x&\dots&1-a_{nn}x\end{array}\right|+\dots+\left|\begin{array}{cccc} 1-a_{11}x&-a_{12}x&\dots&-a_{1n}x\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}&-a_{n2}&\dots&-a_{nn}\end{array}\right|.$$ It follows that $$f'(0)=\left|\begin{array}{cccc} -a_{11}&-a_{12}&\dots&-a_{1n}\\ 0&1&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&1\end{array}\right|+\dots+\left|\begin{array}{cccc} 1&0&\dots&0\\ 0&1&\dots&0\\ \dots&\dots&\dots&\dots\\ -a_{n1}&-a_{n2}&\dots&-a_{nn}\end{array}\right|=$$ $$=-(a_{11}+\dots+a_{nn})=-\hbox{trace}(A).$$ Since we have $f'(0)=g'(0)$ we get $b_1=\hbox{trace}(A)$.
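The key identity $f'(0)=-\hbox{trace}(A)$ can also be verified symbolically. A short SymPy sketch for a generic $3\times 3$ matrix:

```python
import sympy as sp

x = sp.symbols('x')
# A generic 3x3 matrix with symbolic entries a11, ..., a33.
A = sp.Matrix(3, 3, lambda i, j: sp.Symbol(f'a{i+1}{j+1}'))
f = (sp.eye(3) - x * A).det()            # f(x) = det(I - xA)

fprime_at_0 = sp.diff(f, x).subs(x, 0)   # f'(0)
# Check that f'(0) = -trace(A).
assert sp.simplify(fprime_at_0 + A.trace()) == 0
```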
---
By the Schur decomposition, any matrix $A$ is unitarily similar to an upper triangular matrix $T$. Being similar, $A$ and $T$ have the same trace and the same eigenvalues. Moreover, the diagonal entries of $T$ are equal to its eigenvalues (since $T$ is triangular). The stated result follows by calculating the trace of $T$. See https://www.statlect.com/matrix-algebra/properties-of-eigenvalues-and-eigenvectors.
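This argument is easy to illustrate numerically with SciPy's `schur`; passing `output='complex'` yields a genuinely triangular $T$ (the real Schur form may contain $2\times 2$ blocks):

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
T, Z = schur(A, output='complex')   # A = Z T Z* with Z unitary

assert np.allclose(T, np.triu(T))                   # T is upper triangular
assert np.isclose(np.trace(A), np.sum(np.diag(T)))  # trace preserved by similarity
# Every eigenvalue of A appears on the diagonal of T:
for lam in np.linalg.eigvals(A):
    assert np.min(np.abs(np.diag(T) - lam)) < 1e-8
```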
---
Let $\mathbf{A}$ be a $k \times k$ symmetric matrix and $\mathbf{x}$ be a $k \times 1$ vector. Then
(a) $\mathbf{x'Ax}$ = tr($\mathbf{x'Ax}$) = tr($\mathbf{Axx'}$)
(b) tr($\mathbf{A}$) = $\Sigma_{i=1}^k \lambda_i$, where the $\lambda_i$ are the eigenvalues of $\mathbf{A}$.
For Part a, we note that $\mathbf{x'Ax}$ is a scalar, so $\mathbf{x'Ax}$ = tr($\mathbf{x'Ax}$). We know that tr($\mathbf{BC}$) = tr($\mathbf{CB}$) for any two matrices $\mathbf{B}$ and $\mathbf{C}$ of dimensions $m \times k$ and $k \times m$, respectively. This follows because $\mathbf{BC}$ has $\Sigma_{j=1}^k b_{ij}c_{ji}$ as its $i$th diagonal element, so tr($\mathbf{BC}$) = $\Sigma_{i=1}^m ( \Sigma_{j=1}^k b_{ij}c_{ji})$. Similarly, the $j$th diagonal element of $\mathbf{CB}$ is $\Sigma_{i=1}^m c_{ji}b_{ij}$, so tr($\mathbf{CB}$) = $\Sigma_{j=1}^k ( \Sigma_{i=1}^m c_{ji}b_{ij})$ = $\Sigma_{i=1}^m ( \Sigma_{j=1}^k b_{ij}c_{ji})$ = tr($\mathbf{BC}$).
Let $\mathbf{x'}$ be the matrix $\mathbf{B}$ with m = 1, and let $\mathbf{Ax}$ play the role of the matrix $\mathbf{C}$. Then tr($\mathbf{x'(Ax)}$) = tr($\mathbf{(Ax)x'}$), and the result follows.
Part b is proved by using the spectral decomposition to write $\mathbf{A=P' \Lambda P}$, where $\mathbf{PP'=I}$ and $\mathbf{\Lambda}$ is a diagonal matrix with entries $\lambda_1,\lambda_2,\dots,\lambda_k$. Therefore, tr($\mathbf{A}$) = tr($\mathbf{P' \Lambda P}$) = tr($\mathbf{\Lambda P P'}$) = tr($\mathbf{\Lambda}$) = $\lambda_1 + \lambda_2 + \dots + \lambda_k$.
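Part b is easy to check numerically for a random symmetric matrix; a minimal NumPy sketch using `eigvalsh` (the eigenvalue routine for symmetric/Hermitian matrices):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2                 # a random symmetric matrix
lam = np.linalg.eigvalsh(A)       # real eigenvalues of the symmetric matrix

# trace(A) equals the sum of the eigenvalues.
assert np.isclose(np.trace(A), lam.sum())
```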
---

These answers require way too much machinery. By definition, the characteristic polynomial of an $n\times n$ matrix $A$ is given by $$p(t) = \det(A-tI) = (-1)^n \big(t^n - (\text{tr} A) \,t^{n-1} + \dots + (-1)^n \det A\big)\,.$$ On the other hand, $p(t) = (-1)^n(t-\lambda_1)\dots (t-\lambda_n)$, where the $\lambda_j$ are the eigenvalues of $A$. So, comparing coefficients, we have $\text{tr}A = \lambda_1 + \dots + \lambda_n$.