Determine the coefficient of the polynomial det(I + xA)


Given an $n \times n$ matrix $A$ and its $n$ eigenvalues, how do I determine the coefficient of the term $x^2$ of the polynomial given by

$q(x) = \det(I_n + xA)$

There are 3 solutions below.

Best answer:

$$\det(I+ x A) = x^n \det \left(A + \frac 1 x I\right) = x^n\left(\lambda_1 + \frac 1 x\right)\left(\lambda_2 + \frac 1 x\right)\cdots \left(\lambda_n + \frac 1 x\right) = \prod_{i=1}^n\left(1 + x\lambda_i\right) $$

Expanding this product, the coefficient of $x^2$ is the sum over all pairs of eigenvalues, $\sum_{1 \le i < j \le n} \lambda_i \lambda_j$.
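This factorization can be spot-checked numerically. The following sketch (my addition, assuming NumPy is available) compares $\det(I + xA)$ against $\prod_i (1 + x\lambda_i)$ for a random matrix:

```python
# Sanity check: det(I + xA) should equal the product of (1 + x*lambda_i)
# over the eigenvalues lambda_i of A. Random A, arbitrary sample point x.
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
lams = np.linalg.eigvals(A)

x = 0.37
lhs = np.linalg.det(np.eye(n) + x * A)
rhs = np.prod(1 + x * lams)  # complex eigenvalues come in conjugate pairs,
                             # so the product is real up to rounding
print(abs(lhs - rhs.real))   # prints a tiny floating-point residual
```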

Another answer:

Translating a result that appears on the Wikipedia page for the Characteristic polynomial,

$$\det(I_n + x A ) = \sum_{k=0}^n x^k \text{tr}(\wedge^k A)$$

where $\text{tr}(\wedge^k A)$ is the trace of the $k^{\text{th}}$ exterior power of $A$, which can be evaluated explicitly as the determinant of the $k \times k$ matrix,

$$\text{tr}(\wedge^k A) = \frac{1}{k!} \left|\begin{matrix} \text{tr}A & k-1 & 0 & \cdots & 0\\ \text{tr}A^2 & \text{tr}A & k-2 & \cdots & 0\\ \vdots & \vdots & & \ddots & \vdots\\ \text{tr}A^{k-1} & \text{tr}A^{k-2} & \cdots & & 1 \\ \text{tr}A^k & \text{tr}A^{k-1} & \cdots & & \text{tr}A \end{matrix}\right| $$

In the special case of $k = 2$, the coefficient of $x^2$ in $\det(I_n + x A)$ is equal to

$$\text{tr}(\wedge^2 A) = \frac{1}{2!} \left|\begin{matrix} \text{tr}A & 1\\ \text{tr}A^2 & \text{tr}A\\ \end{matrix}\right| = \frac12 \left[(\text{tr}A)^2 - \text{tr}(A^2)\right]$$

In terms of the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $A$, this is equal to

$$\frac12 \left( \Big( \sum_{i=1}^n \lambda_i \Big)^2 - \sum_{i=1}^n \lambda_i^2 \right) =\sum_{1\le i < j \le n} \lambda_i\lambda_j$$

Similarly, the coefficient of $x^k$ in $\det(I_n + xA)$ has the following general form:

$$\sum_{1\le i_1 < i_2 < \cdots < i_k \le n} \lambda_{i_1}\lambda_{i_2} \cdots \lambda_{i_k},$$

i.e. the sum of all possible products of $k$ distinct eigenvalues.
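The trace formula for the $x^2$ coefficient can be verified numerically against the eigenvalue-pair sum; this sketch (my addition, assuming NumPy) checks $\frac12[(\text{tr}A)^2 - \text{tr}(A^2)] = \sum_{i<j}\lambda_i\lambda_j$ for a random matrix:

```python
# Compare the trace formula for the x^2 coefficient of det(I + xA)
# with the sum of products of eigenvalue pairs.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))

# Trace formula: (1/2)[(tr A)^2 - tr(A^2)]
via_traces = 0.5 * (np.trace(A) ** 2 - np.trace(A @ A))

# Same quantity as sum over pairs lambda_i * lambda_j, i < j
lams = np.linalg.eigvals(A)
via_eigs = sum(lams[i] * lams[j] for i, j in combinations(range(n), 2))

print(abs(via_traces - via_eigs.real))  # prints a tiny residual
```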

Another answer:

When this question was first posted, if I recall correctly, there was a discussion in the comment thread concerning whether or not it is necessary to additionally assume $A$ is diagonalizable in order to obtain (nice) formulas for the coefficients of $x^k$, $0 \le k \le n$, in the polynomial $q_n(x) = \det(I_n + xA)$. It appears that this issue has since been resolved, via the posted solutions, to the satisfaction of said commentators, who apparently then deleted the corresponding comments. Nevertheless . . .

Here's another way to get at it which works whether $A$ is diagonalizable or not; this solution, however, does make use of well-known general properties of determinants and the eigenvalues of matrices. It also relies on a formula relating products $\prod_1^n (1 + a_i)$ to the elementary symmetric functions $\sigma_k(a_1, a_2, \ldots, a_n)$:

$\prod_1^n (1 + a_i) = \sum_0^n \sigma_k(a_1, a_2, \ldots, a_n); \tag{1}$

here we take

$\sigma_0(a_1, a_2, \ldots, a_n) = 1, \tag{2}$

$\sigma_1(a_1, a_2, \ldots, a_n) = a_1 + a_2 + \ldots + a_n = \sum_1^n a_i, \tag{3}$

$\sigma_2(a_1, a_2, \ldots, a_n) = \sum_{1 \le i < j \le n} a_i a_j, \tag{4}$

$\sigma_3(a_1, a_2, \ldots , a_n) = \sum_{1 \le i < j < k \le n} a_i a_j a_k, \tag{5}$

and so forth, right on through to

$\sigma_n(a_1, a_2, \ldots, a_n) = a_1 a_2 \ldots a_n = \prod_1^n a_i; \tag{6}$

indeed, we may in general write

$\sigma_k(a_1, a_2, \ldots, a_n) = \sum_{1 \le i_1 < i_2 < \ldots < i_k \le n} a_{i_1} a_{i_2} \ldots a_{i_k}. \tag{7}$

(7) is taken (consistently) in the sense that $\sigma_k = 0$ for $k > n$, since the condition $1 \le i_1 < i_2 < \ldots < i_k \le n$ cannot then be satisfied; this and more may be found in the Wikipedia article Elementary symmetric polynomial.
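Definition (7) and identity (1) are easy to spot-check; this sketch (my addition) computes $\sigma_k$ directly from (7) via subsets and confirms $\prod_1^n (1 + a_i) = \sum_0^n \sigma_k$:

```python
# Elementary symmetric polynomials via (7), and a check of identity (1).
from itertools import combinations
from math import prod

def sigma(k, a):
    # sigma_k = sum over all k-element subsets of the product of entries;
    # for k = 0 this is the empty product 1, and for k > len(a) the sum
    # is empty, giving 0 -- exactly the convention stated in the text.
    return sum(prod(c) for c in combinations(a, k))

a = [2, 3, 5, 7]
lhs = prod(1 + ai for ai in a)                     # prod (1 + a_i)
rhs = sum(sigma(k, a) for k in range(len(a) + 1))  # sum of sigma_k
print(lhs, rhs)  # → 576 576
```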

I will present a proof of formula (1) below; first, let's see how it may be deployed to address the problem at hand. Suppose then that the $n$ eigenvalues of $A$ are denoted $\lambda_1$, $\lambda_2$, $\ldots$, $\lambda_n$; then the eigenvalues of $xA$ are precisely the $x\lambda_i$, $1 \le i \le n$, and those of $I + xA$ are precisely the $n$ quantities $1 + x\lambda_i$. These assertions follow readily from the basic definitions: for any eigenvalue $\mu$ of any square matrix $B$ we of course have

$Bw = \mu w \tag{8}$

for some vector $w \ne 0$; thus, for scalars $a, b$,

$(aB + bI)w = aBw + bw = (a\mu + b)w; \tag{9}$

$a\mu + b$ is thus an eigenvalue of $aB + bI$ corresponding to $w$. If $a \ne 0$, the situation can be reversed; indeed, (9) yields, by subtracting $bIw = bw$,

$aBw= a\mu w, \tag{10}$

and hence, dividing through by $a$, (8). Since the eigenvalues of $I + xA$ are as stated, and the determinant of a matrix is the product of its eigenvalues,

$\det(I + xA) = \prod_1^n (1 + x\lambda_i); \tag{11}$
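The eigenvalue-shift fact underlying (11), that the eigenvalues of $aB + bI$ are $a\mu + b$, can be checked numerically; this sketch (my addition, assuming NumPy) compares the two spectra for a random matrix:

```python
# Check that the eigenvalues of aB + bI are exactly a*mu + b,
# where mu ranges over the eigenvalues of B.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
a, b = 1.5, -0.7  # a > 0 so sorting order of the spectrum is preserved

mu = np.sort_complex(np.linalg.eigvals(B))
shifted = np.sort_complex(np.linalg.eigvals(a * B + b * np.eye(4)))
print(np.allclose(a * mu + b, shifted))  # → True
```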

we apply (1) to obtain

$\det(I + xA) = \sum_0^n \sigma_k(x\lambda_1, x\lambda_2, \ldots, x\lambda_n); \tag{12}$

the solution is driven to completion with the recognition that each $\sigma_k$ is in fact a homogeneous polynomial of degree $k$ in the variables $a_1, a_2, \dots, a_n$ (or if one prefers, the $\lambda_1, \lambda_2, \ldots, \lambda_n$); thus

$\sigma_k(x\lambda_1, x\lambda_2, \ldots, x\lambda_n) = x^k\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n); \tag{13}$

when (13) is substituted into (12) we find

$\det(I + xA) = \sum_0^n x^k \sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n); \tag{14}$

this shows that the coefficient of $x^k$ in $q(x) = \det(I + xA)$ is in fact $\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n)$; in particular, the coefficient of $x^2$ is

$\sigma_2(\lambda_1, \lambda_2, \ldots, \lambda_n) = \sum_{1 \le i < j \le n} \lambda_i \lambda_j. \tag{15}$

We have presented expressions for the coefficient of $x^k$ in terms of the eigenvalues of $A$; it is also possible, based upon what we have accomplished, to express these coefficients directly in terms of the entries of $A$ itself via its characteristic polynomial $\det(\lambda I - A)$, in the usual way by comparing the expanded determinant with the known factorization

$\det(\lambda I - A) = \prod_1^n (\lambda - \lambda_i), \tag{16}$

which holds since the $\lambda_i$ are the eigenvalues of $A$. It is well-known that

$\prod_1^n (\lambda - \lambda_i) = \sum_0^n (-1)^k \sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n) \lambda^{n - k}, \tag{17}$

so combining (16) and (17),

$\det(\lambda I - A) = \sum_0^n (-1)^k \sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n) \lambda^{n - k}, \tag{18}$

if the coefficients of $\lambda^{n - k}$ are compared on either side of (18), one may find expressions for the $\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n)$ directly in terms of the entries $a_{ij}$ of the matrix $A$; for example, we have

$\sigma_1(\lambda_1, \lambda_2, \ldots, \lambda_n) = \sum_1^n \lambda_i = \sum_{i=1}^n a_{ii} = \text{trace}(A), \tag{19}$

$\sigma_n(\lambda_1, \lambda_2, \ldots, \lambda_n) = \prod_1^n \lambda_i = \det(A); \tag{20}$

the remaining $\sigma_k(\lambda_1, \lambda_2, \ldots, \lambda_n)$, $2 \le k \le n$, also all have expressions in terms of the $a_{ij}$; they will not be presented here, though it is clear they may be explicitly obtained from equation (18).
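The two boundary cases (19) and (20), $\sigma_1 = \text{trace}(A)$ and $\prod_i \lambda_i = \det(A)$, are easy to confirm numerically; a sketch (my addition, assuming NumPy):

```python
# Verify sigma_1 = trace(A) and sigma_n = det(A) via the eigenvalues.
import numpy as np

rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))
lams = np.linalg.eigvals(A)

sigma1 = lams.sum()    # sum of eigenvalues
sigman = np.prod(lams) # product of eigenvalues
print(abs(sigma1.real - np.trace(A)) < 1e-8,
      abs(sigman.real - np.linalg.det(A)) < 1e-8)  # → True True
```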

Appendix: Derivation of Formula (1): Formula (1) holds in any commutative ring with unit $R$, and is easily proved by induction on the number of variables $n$. We have

$\prod_1^1 (1 + a_i) = 1 + a_1; \tag{21}$

$\prod_1^2 (1 + a_i) = (1 + a_1)(1 + a_2) = 1 + a_1 + a_2 + a_1 a_2 = \sum_0^2 \sigma_k(a_1, a_2); \tag{22}$

$\prod_1^3 (1 + a_i) = (1 + a_1)(1 + a_2)(1 + a_3)$ $= 1 + a_1 + a_2 + a_3 + a_1 a_2 + a_1 a_3 + a_2 a_3 + a_1 a_2 a_3= \sum_0^3 \sigma_k(a_1, a_2, a_3); \tag{23}$

(21)-(23) suggest that we might have

$\prod_1^n (1 + a_i) = \sum_0^n \sigma_k(a_1, a_2, \ldots, a_n), \tag{24}$

and indeed a simple induction based upon (21)-(23) shows that (24) is indeed the case; for assuming that

$\prod_1^m (1 + a_i) = \sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) \tag{25}$

for some positive $m \in \Bbb Z$, we have

$\prod_1^{m + 1} (1 + a_i) = (\prod_1^m (1 + a_i))(1 + a_{m + 1})$ $= (\sum_0^m \sigma_k(a_1, a_2, \ldots, a_m))(1 + a_{m + 1})$ $= \sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) + \sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) a_{m + 1}; \tag{26}$

furthermore, it is easy to see that

$\sigma_k(a_1, a_2, \ldots a_{m + 1}) = \sigma_k(a_1, a_2, \ldots a_m) + \sigma_{k - 1}(a_1, a_2, \ldots a_m)a_{m + 1}, \tag{27}$

since $\sigma_k(a_1, a_2, \ldots a_{m + 1})$ is the sum of all $k$-fold products of distinct $a_i$, $1 \le i \le m + 1$. If we inspect the summands on the right, we see that $\sigma_k(a_1, a_2, \ldots a_m)$ is the sum of the $k$-fold products of distinct $a_i$ with $1 \le i \le m$, whereas the term $\sigma_{k - 1}(a_1, a_2, \ldots a_m)a_{m + 1}$ consists precisely of all $k$-fold products with exactly one factor of $a_{m + 1}$; together, these two groups cover all terms occurring in $\sigma_k(a_1, a_2, \ldots, a_{m + 1})$. Thus, using (27) in (26),

$\sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) + \sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) a_{m + 1}$ $= \sum_0^m \sigma_k(a_1, a_2, \ldots, a_m) + \sum_1^{m + 1} \sigma_{k - 1}(a_1, a_2, \ldots, a_m) a_{m + 1}$ $= \sigma_0(a_1, a_2, \ldots, a_m) + \sum_1^m \sigma_k(a_1, a_2, \ldots, a_m)$ $+ \sum_1^m \sigma_{k - 1}(a_1, a_2, \ldots, a_m) a_{m + 1} + \sigma_m(a_1, a_2, \ldots, a_m) a_{m + 1}$ $= \sigma_0(a_1, a_2, \ldots, a_m) + \sum_1^m (\sigma_k(a_1, a_2, \ldots, a_m) + \sigma_{k - 1}(a_1, a_2, \ldots, a_m) a_{m + 1})$ $+ \sigma_m(a_1, a_2, \ldots, a_m) a_{m + 1}$ $= \sigma_0(a_1, a_2, \ldots, a_m) + \sum_1^m \sigma_k(a_1, a_2, \ldots, a_{m + 1}) + \sigma_{m + 1}(a_1, a_2, \ldots, a_{m + 1})$ $= \sum_0^{m + 1} \sigma_k(a_1, a_2, \ldots, a_{m + 1}) \tag{28}$

since $\sigma_0 (a_1, a_2, \ldots, a_m) = 1$ no matter what the value of $m$. Combining (26) with (28) completes the induction and shows that (24) holds for all integers $n \ge 1$. This completes the demonstration of formula (1). End of Appendix.
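The recurrence (27), which drives the induction above, can be spot-checked directly; this sketch (my addition) compares both sides of (27) for small integer inputs:

```python
# Check the recurrence (27):
#   sigma_k(a_1..a_{m+1}) = sigma_k(a_1..a_m) + sigma_{k-1}(a_1..a_m)*a_{m+1}
from itertools import combinations
from math import prod

def sigma(k, a):
    # sum over all k-element subsets of the product of entries
    # (empty product = 1 for k = 0, empty sum = 0 for k > len(a))
    return sum(prod(c) for c in combinations(a, k))

a = [2, 3, 5]  # a_1 .. a_m
a_next = 7     # a_{m+1}
for k in range(1, 5):
    lhs = sigma(k, a + [a_next])
    rhs = sigma(k, a) + sigma(k - 1, a) * a_next
    assert lhs == rhs
print("recurrence (27) holds")  # → recurrence (27) holds
```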