Why is $\max \left\{ x^TAx : x \in \mathbb{R}^n,\ x^Tx = 1 \right\}$ the largest real eigenvalue of $A$?


Let $A \in M_n(\mathbb{R})$ be symmetric. Why is $\max \left\{ x^TAx : x \in \mathbb{R}^n,\ x^Tx = 1 \right\}$ the largest real eigenvalue of $A$?

4 Answers


$A$ is symmetric, so it can be orthogonally diagonalized: there is an orthogonal matrix $P$ and a diagonal matrix $D$ containing the eigenvalues as diagonal entries with $A = P D P^T$. Let $y = P^T x$; then $\max \{x^T A x: x^Tx = 1\} = \max \{ y^T D y: y^T y = 1 \}$ (using the orthogonality of $P$, which gives $y^Ty = x^TPP^Tx = x^Tx$). Since $y^T D y = \sum_i d_{ii} y_i^2$ with $\sum_i y_i^2 = 1$, the maximum $\max \{ y^T D y: y^T y = 1 \}$ is the largest diagonal entry of $D$, which is the largest eigenvalue of $A$.
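
For a concrete sanity check, here is a minimal numpy sketch of this argument; the random symmetric test matrix, dimension, and tolerance are arbitrary choices, and numpy.linalg.eigh supplies the orthogonal diagonalization:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random symmetric test matrix A = (M + M^T) / 2.
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2

# Orthogonal diagonalization A = P D P^T; for symmetric input, eigh returns
# the eigenvalues in ascending order and an orthogonal matrix of eigenvectors.
eigvals, P = np.linalg.eigh(A)
lam_max = eigvals[-1]

# The quadratic form attains lam_max at the corresponding unit eigenvector...
v = P[:, -1]
print(v @ A @ v, lam_max)          # agree up to rounding

# ...and random unit vectors never exceed it.
X = rng.standard_normal((10_000, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)
print(np.einsum('ij,jk,ik->i', X, A, X).max() <= lam_max + 1e-12)  # True
```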


You can try to maximize the function $f:\mathbb{R}^n \to \mathbb{R}$ given by $f(\mathbf{x}) = \mathbf{x}^T\mathbf{A}\mathbf{x}$ using standard calculus and see what you get. Since we are trying to maximize $f$ subject to the condition $\mathbf{x}^T\mathbf{x} = 1$, using the Lagrange multiplier technique we need to find the critical points of $$g(\mathbf{x},\lambda) = \mathbf{x}^T\mathbf{A}\mathbf{x} - \lambda (\mathbf{x}^T\mathbf{x} - 1)$$ Differentiating with respect to $\mathbf{x}=[x_1\ x_2\ \cdots\ x_n]^T$ and setting the derivative equal to zero, $${{\partial g} \over {\partial \mathbf{x}}} = 0 \Leftrightarrow 2\mathbf{A}\mathbf{x} - 2\lambda \mathbf{x} = 0 \Leftrightarrow \mathbf{A}\mathbf{x} = \lambda \mathbf{x},$$ we see that $\mathbf{x}$ must be an eigenvector of $\mathbf{A}$ corresponding to an eigenvalue $\lambda$.

Plugging this information back into our function $f$, we get $f(\mathbf{x})={\mathbf{x}^T}\mathbf{A}\mathbf{x} = {\mathbf{x}^T}\lambda \mathbf{x} =\lambda {\mathbf{x}^T}\mathbf{x} = \lambda$, so in order to maximize $f$ we should take for $\lambda$ the largest eigenvalue of $\mathbf{A}$.

By ${{\partial g} \over {\partial \mathbf{x}}}$, I denote the vector of partial derivatives of $$g({x_1},{x_2},\ldots,{x_n},\lambda) = \sum\limits_{i,j = 1}^n {{a_{ij}}{x_i}{x_j}} - \lambda \left( {\sum\limits_{l = 1}^n {x_l^2} - 1} \right)$$ with respect to every component of $\mathbf{x}$. For example, the $k$-th component of ${{\partial g} \over {\partial \mathbf{x}}}$ is given by $${{\partial g} \over {\partial {x_k}}} = \sum\limits_{i = 1}^n {{a_{ik}}{x_i}} + \sum\limits_{j = 1}^n {{a_{kj}}{x_j}} -2 \lambda x_k= 2\sum\limits_{i = 1}^n {{a_{ki}}{x_i}} -2 \lambda x_k.$$ Here we use the fact that $\mathbf{A}$ is symmetric.
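
If you want to double-check that gradient formula, a finite-difference comparison works; the sketch below uses an arbitrary symmetric test matrix, test point, and multiplier value:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                  # symmetric test matrix
x = rng.standard_normal(n)         # arbitrary test point
lam = 0.7                          # arbitrary multiplier value

def g(x):
    # g(x, lambda) = x^T A x - lambda (x^T x - 1), with lambda held fixed
    return x @ A @ x - lam * (x @ x - 1)

# Analytic gradient from the derivation above: 2 A x - 2 lambda x.
grad_analytic = 2 * A @ x - 2 * lam * x

# Central finite differences, one coordinate at a time.
eps = 1e-6
grad_fd = np.array([(g(x + eps * e) - g(x - eps * e)) / (2 * eps)
                    for e in np.eye(n)])
print(np.allclose(grad_analytic, grad_fd, atol=1e-6))  # True
```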

Note that if the norm of $\mathbf{x}$ is not constrained, the function $f$ is unbounded above whenever $\mathbf{A}$ has a positive eigenvalue: just scale the corresponding eigenvector.
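
To watch the Lagrange-multiplier argument play out numerically, one option (only a sketch, not the only way) is to hand the constrained problem to scipy's SLSQP solver and compare the result with the largest eigenvalue; the random test matrix and starting point are arbitrary:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2

# Maximize x^T A x subject to x^T x = 1 by minimizing the negative.
# SLSQP converges to a constrained critical point, i.e. an eigenvector;
# from a generic starting point it lands on the maximizer in practice.
res = minimize(lambda x: -(x @ A @ x),
               x0=rng.standard_normal(6),
               method='SLSQP',
               constraints={'type': 'eq', 'fun': lambda x: x @ x - 1})

lam_max = np.linalg.eigh(A)[0][-1]
print(-res.fun, lam_max)           # agree to solver tolerance
```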

  • Let $f(x)=x^TAx$. Then $f$ is continuous on the compact set $C:=\{x\in\mathbb{R}^n:x^Tx=1\}$. By the Weierstrass extreme value theorem, $f$ attains its maximum on $C$.

  • Since $A$ is symmetric, $A$ has $n$ real eigenvalues $\lambda_1, \lambda_2,\ldots,\lambda_n$ (counted with multiplicity). Put $$ \lambda:=\max\{\lambda_1, \lambda_2,\ldots,\lambda_n\}. $$

  • Let $u$ be an eigenvector of the matrix $A$ for the eigenvalue $\lambda$, normalized so that $u^Tu=1$. Then $$ \max\{x^TAx:x^Tx=1\}\geq u^TAu=u^T(\lambda u)=\lambda u^Tu=\lambda. \quad \textbf{(1)} $$

  • Since $A$ is symmetric, it is orthogonally diagonalizable: there exists an orthogonal matrix $B\in M_n(\mathbb{R})$ (so $B^{-1}=B^T$) such that $$ \text{diag}(\lambda_1, \lambda_2,\ldots,\lambda_n)=BAB^{-1}. $$

  • Let $x$ be such that $x^Tx=1$. Then \begin{eqnarray*} x^TAx&=&x^T(B^{-1}BAB^{-1}B)x\\ &=&(x^TB^T)\text{diag}(\lambda_1, \lambda_2,\ldots,\lambda_n)(Bx)\\ &=&y^T\text{diag}(\lambda_1, \lambda_2,\ldots,\lambda_n)y \end{eqnarray*} where $y=Bx$.

  • Note that $y^Ty=(Bx)^TBx=x^TB^TBx=x^Tx=1$. Let $y=(y_1,y_2,\ldots,y_n)$. Then $$ 1=y^Ty=\sum_{i=1}^{n}y_i^2. $$ It follows that $$ x^TAx=y^T\text{diag}(\lambda_1, \lambda_2,\ldots,\lambda_n)y=\sum_{i=1}^{n}\lambda_iy_i^2\leq\lambda\sum_{i=1}^{n}y_i^2=\lambda.\quad \textbf{(2)} $$ Combining (1) and (2) we obtain $$ \max\{x^TAx:x^Tx=1\}=\max \{\lambda_1, \lambda_2,\ldots,\lambda_n\}. $$
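
The change of variables $y=Bx$ used above is easy to check numerically; in this sketch numpy.linalg.eigh provides the orthogonal $B$, and the test matrix and test vector are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2

# eigh gives A = V diag(eigvals) V^T with V orthogonal, so B = V^T
# satisfies B^{-1} = B^T and diag(lam_1, ..., lam_n) = B A B^{-1}.
eigvals, V = np.linalg.eigh(A)
B = V.T

x = rng.standard_normal(5)
x /= np.linalg.norm(x)             # enforce x^T x = 1
y = B @ x

print(np.isclose(y @ y, 1.0))                          # y stays on the sphere
print(np.isclose(x @ A @ x, np.sum(eigvals * y**2)))   # x^T A x = sum lam_i y_i^2
print(np.sum(eigvals * y**2) <= eigvals[-1] + 1e-12)   # ... <= lam, giving (2)
```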


So, here's an intuitive and somewhat geometric explanation of what is happening.

$1:$ For $\lambda$ to be an eigenvalue, we need to be able to find a nonzero vector $v$ such that $Av=\lambda v$.

$2:$ We observe that $v$ is an eigenvector iff $\frac{v}{\|v\|}$ is an eigenvector. So we reduce the search space to the unit sphere, i.e., vectors with norm $1$. Now we have to find $\lambda$ and $v$ such that $Av=\lambda v$ and $\|v\|=1$.

$3:$ Since we are talking about symmetric matrices, the existence of real eigenvalues is guaranteed by the spectral theorem. Our claim now is that the largest eigenvalue is given by $\max_{\|x\|=1} x^TAx$.

$4:$ Observe here that $y=Ax$ is a vector. It need not lie on the unit sphere, but we are projecting $y$ onto the unit vector $x$ by taking $x^Ty$.

$5:$ When is this maximized?

By the definition of the standard dot product, if $\theta$ is the angle between $x$ and $y$, then $x^Ty=\|x\|\,\|y\|\cos(\theta)=\|y\|\cos(\theta)$. For a given $x$, this is maximized when $\cos(\theta)=1$, i.e., when $y=Ax$ is a positive multiple of $x$, which is exactly the eigenvector condition. Since the existence of eigenvalues is guaranteed, this definitely occurs at eigenvectors, and among these the maximum is attained at the eigenvector corresponding to the largest eigenvalue.

As for vectors which are not eigenvectors, we can directly see that $x^Ty<\|y\|$. Now, because $A$ is diagonalizable and the eigenvectors $v_1,\ldots,v_n$ form an orthonormal basis, we can write $x=\alpha_1v_1 + \cdots + \alpha_nv_n$. Here we denote the largest eigenvalue by $\lambda_1$ and the corresponding eigenvector by $v_1$. So $Ax=\alpha_1\lambda_1v_1 + \cdots + \alpha_n\lambda_nv_n$, and by orthonormality $$x^T(Ax)=\sum_{i=1}^n \lambda_i\alpha_i^2\leq\lambda_1\sum_{i=1}^n\alpha_i^2=\lambda_1\|x\|^2=\lambda_1.$$ So vectors which are not eigenvectors do not exceed this value. (Bounding $\|Ax\|$ by $\lambda_1\|x\|$ directly would additionally require $\lambda_1=\max_i|\lambda_i|$, e.g. for positive semidefinite $A$, so the expansion above is the safer route.)
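
This eigenbasis expansion can also be verified numerically. The sketch below checks $x^T(Ax)=\sum_i\lambda_i\alpha_i^2\leq\lambda_1$ for a random unit vector and shows $\cos(\theta)=1$ at the top eigenvector (assuming, as in the picture above, that the largest eigenvalue is positive; the random test matrix is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
eigvals, V = np.linalg.eigh(A)     # columns of V: orthonormal eigenvectors

x = rng.standard_normal(5)
x /= np.linalg.norm(x)             # a generic unit vector
alpha = V.T @ x                    # coordinates of x in the eigenbasis

# x^T (Ax) = sum_i lam_i alpha_i^2, bounded above by the top eigenvalue.
print(np.isclose(x @ A @ x, np.sum(eigvals * alpha**2)))
print(np.sum(eigvals * alpha**2) <= eigvals[-1] + 1e-12)

# cos(theta) between a unit vector z and y = Az: strictly below 1 for the
# generic x, exactly 1 at the top eigenvector (assuming lam_max > 0).
for z in (x, V[:, -1]):
    y = A @ z
    print(z @ y / np.linalg.norm(y))
```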