What does it mean to minimize a matrix

In Gilbert Strang's 'Introduction to Applied Mathematics', in chapter 2.5 (Least Squares Estimation and the Kalman Filter), in proof 2I, he talks about 'minimizing the covariance matrix'. I can't determine what the criterion is for minimizing a matrix. Does anyone have insight to lend?

Best answer:

Strang’s style is quite informal and I don’t particularly care for it (though it apparently appeals to some people). In this case, a more formal presentation would have certainly spelled out what he meant. Looking over his book's proof of 2I, I would say that Strang most likely means minimization of the matrix $P$ in the spectral norm over all matrices $L$ satisfying $LA=I$. Here’s why I think so:

In Strang’s notation, $V$ is the covariance matrix, which is symmetric positive semi-definite. The matrix he says he wants to minimize has the form $P=LVL^T$, which (following some context-specific simplifications) he then breaks down as $P = L_0 VL_0^T + (L-L_0)V(L-L_0)^T$. $P$ is clearly a real symmetric (hence normal) positive semi-definite matrix, which implies that $$\left\| P \right\|_2 = \sigma _{\max } \left( P \right) = \lambda_{\max } \left( P \right) = \mathop {\max }\limits_{\left\| x \right\|_2 = 1} x^T Px.$$
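The chain of equalities above is easy to check numerically. Here is a small NumPy sketch (the matrix $P$ is a random symmetric PSD matrix of my own construction, not one from Strang's book) verifying that for such a $P$ the spectral norm, the largest eigenvalue, and the maximum unit-norm quadratic form all coincide:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive semi-definite matrix: P = B B^T.
B = rng.standard_normal((4, 4))
P = B @ B.T

# Spectral norm = largest singular value.
spec_norm = np.linalg.norm(P, 2)

# P is symmetric PSD, so its eigenvalues are real and nonnegative;
# eigvalsh returns them in ascending order.
lam_max = np.linalg.eigvalsh(P)[-1]

# max_{||x||_2 = 1} x^T P x is attained at the top eigenvector.
x = np.linalg.eigh(P)[1][:, -1]
quad = x @ P @ x

assert np.isclose(spec_norm, lam_max)
assert np.isclose(lam_max, quad)
```

The maximum of the quadratic form is attained at the eigenvector for $\lambda_{\max}$, which is why the last two quantities agree.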

Note that $L$ is the only free parameter in Strang’s expression for $P$. If we assume that Strang is minimizing the spectral norm of $P$ over all $L$ satisfying $LA=I$, then the problem is to find $L$ attaining $\mathop {\min }\limits_{\{ L|LA = I\} } \left\| P \right\|_2 = \mathop {\min }\limits_{\{ L|LA = I\} } \mathop {\max }\limits_{\left\| x \right\|_2 = 1} x^T Px$. That is, $$\mathop {\min }\limits_{\{ L|LA = I\} } \left\| P \right\|_2 = \mathop {\min }\limits_{\{ L|LA = I\} } \mathop {\max }\limits_{\left\| x \right\|_2 = 1} \left\{ {x^T L_0 VL_0^T x + x^T (L - L_0 )V(L - L_0 )^T x} \right\}$$

The second quadratic form, the only component involving $L$, is nonnegative for every $x$ (it is a quadratic form in the positive semi-definite matrix $(L-L_0)V(L-L_0)^T$), and this lower bound of $0$ is achieved by taking $L=L_0$. This gives $P = L_0 VL_0^T$, which is what Strang wanted to show.
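One can check this minimization numerically. In the sketch below (random $A$ and $V$ of my own choosing, with $L_0 = (A^TV^{-1}A)^{-1}A^TV^{-1}$ as in Strang's setting), any other feasible $L$ is written as $L_0 + M$ with $MA = 0$, and $P - P_0 = MVM^T$ comes out positive semi-definite, so $P_0 = L_0VL_0^T$ is minimal in the Loewner order and hence in spectral norm:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3

A = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))
V = B @ B.T + np.eye(m)           # symmetric positive definite covariance
Vinv = np.linalg.inv(V)

# L0 = (A^T V^-1 A)^-1 A^T V^-1, the weighted least-squares estimator.
L0 = np.linalg.solve(A.T @ Vinv @ A, A.T @ Vinv)
assert np.allclose(L0 @ A, np.eye(n))            # L0 is feasible: L0 A = I

# Any other feasible L = L0 + M with M A = 0; build M by projecting
# a random matrix onto the orthogonal complement of col(A).
C = rng.standard_normal((n, m))
M = C @ (np.eye(m) - A @ np.linalg.solve(A.T @ A, A.T))
assert np.allclose(M @ A, np.zeros((n, n)))
L = L0 + M

P0 = L0 @ V @ L0.T
P = L @ V @ L.T

# The cross terms vanish, so P - P0 = M V M^T is PSD:
# P0 is minimal in the Loewner order, hence also in spectral norm.
assert np.all(np.linalg.eigvalsh(P - P0) >= -1e-9)
assert np.linalg.norm(P0, 2) <= np.linalg.norm(P, 2) + 1e-9
```

Note that the argument actually gives something stronger than minimality in the spectral norm: $P_0 \preceq P$ for every feasible $L$, i.e. minimality in the positive semi-definite (Loewner) ordering, which is the usual sense in which a covariance matrix is "minimized" in Kalman filtering.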

As an aside, in looking over his proof, I noticed that Strang has a minor typo in one of the intermediate steps. He writes

$$(L - L_0 )VL_0 ^T = (L - L_0 )VV^{ - 1} A(A^T V^{ - 1} A)^{ - 1}$$

when he should have written

$$(L - L_0 )VL_0 ^T = (L - L_0 )VV^{ - 1} A\left\{ {(A^T V^{ - 1} A)^{ - 1} } \right\}^T.$$
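A quick numeric check of the corrected identity (again with random $A$ and $V$ of my own choosing) — note that because $A^TV^{-1}A$ is symmetric, its inverse equals its own transpose, which is why the missing transpose is only a formal typo and has no numerical consequence:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 2

A = rng.standard_normal((m, n))
B = rng.standard_normal((m, m))
V = B @ B.T + np.eye(m)           # symmetric positive definite
Vinv = np.linalg.inv(V)

# L0 = (A^T V^-1 A)^-1 A^T V^-1
G = A.T @ Vinv @ A                # symmetric, so inv(G).T == inv(G)
L0 = np.linalg.solve(G, A.T @ Vinv)

# V L0^T = V V^-1 A {(A^T V^-1 A)^-1}^T, the factor inside (L - L0) V L0^T.
lhs = V @ L0.T
rhs = V @ Vinv @ A @ np.linalg.inv(G).T
assert np.allclose(lhs, rhs)

# The symmetry that makes the typo harmless.
assert np.allclose(np.linalg.inv(G), np.linalg.inv(G).T)
```

And since $(L-L_0)A = LA - L_0A = I - I = 0$ for any feasible $L$, the whole cross term vanishes, as used in the proof above.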