Erratum in Boyd and Vanderberghe p. 30?

43 Views Asked by At

In Boyd and Vandenberghe's "Convex optimization," page $30$, it's stated that:

A related family of convex sets is the ellipsoids, which have the form $$\mathcal{E}=\{x\mid (x-x_c)^TP^{-1}(x-x_c)\leq 1\},$$ where $P=P^T\succ 0$, i.e., $P$ is symmetric and positive definite.

I understand that $P$ needs to be positive definite, but I fail to see that a symmetric $P$ with $P\succ 0$ is equivalent to positive definiteness. Also no mention is made of $P$ being non-singular, but then $P^{-1}$ is used in the definition. Finally, I believe that requiring the off-diagonal elements of $P$ to be strictly positive prevents the axes of the elipsoid to be parallel to the axes of $\Bbb R^n$, i.e., excludes some relevant ellipsoids from the definition.

The definition of an ellipsoid in Wikipedia only calls for $P$ to be symmetric and positive-definite and does not mention $P^{-1}$:

$$\mathcal{E}=\{x\mid (x-x_c)^TP(x-x_c)\leq 1\}.$$

Am I right to understand that the rank of $P$ will be equal to the number of non-zero length axes in the ellipsoid?

(I do not know why the curly braces in the definition of the ellipsoid are not rendered.)

1

There are 1 best solutions below

0
On BEST ANSWER

I believe you are confused by a notational issue. When Boyd and Vandenberghe introduce the $\succ$ notation at the beginning of section 1.6, they only say that "between symmetric matrices, it represents matrix inequality" and it is easy to misunderstand that as "componentwise inequality". It does not mean that; $A \succ 0$ is standard notation for "$A$ is positive definite", and $A \succ B$ is often used to mean $A-B$ is positive definite.

Just to reassure myself that Boyd and Vandenberghe follow this standard convention, I continued to section 2.2.5, where they write that they

use the notation $\mathbf S^n_{++}$ to denote the set of symmetric positive definite matrices: $$\mathbf S^n_{++} = \{X \in \mathbf S^n : X \succ 0\}.$$

I think this makes it clear that "$P = P^\mathsf T \succ 0$" in the definition of an ellipse simply notation equivalent to saying in words that "$P$ is symmetric and positive definite". (So the follow-up of "i.e., $P$ is symmetric and positive definite" is not meant to hide any argument, no matter how obvious - it is simply restating in words what was just said in symbols.)


Let me finish by answering the questions not related to notation.

One characterization of (symmetric) positive definite matrices is that all their eigenvalues are positive. In particular, if $P \succ 0$, then $P$ cannot be singular, because then it would have an eigenvalue of $0$, so it makes sense to talk about $P^{-1}$.

From there, if $P\mathbf x = \lambda \mathbf x$, then $P^{-1} \mathbf x = \lambda^{-1}\mathbf x$. Therefore the eigenvalues of $P^{-1}$ are just the reciprocals of the eigenvalues of $P$, and $P \succ 0 \iff P^{-1}\succ 0$.

Boyd and Vandenberghe assume that $P$ is positive definite and use $P^{-1}$ in the definition of the ellipse; meanwhile, Wikipedia assumes that $P$ is positive definite and uses $P$ in the definition of the ellipse. But as we see, if $P$ is positive definite, then so is $P^{-1}$, so the ellipses we get are the same in both cases! It's just that if an ellipse $\mathcal E$ is defined by $P$ from Boyd and Vandenberghe's perspective, then it is defined by $P^{-1}$ from Wikipedia's perspective.

Why would you do this? Well, if all you want to do is define an ellipse, then defining it as the set of points $\mathbf x$ with $\mathbf x^{\mathsf T}P \mathbf x \le 1$ is simpler. But if you read on in Wikipedia, you see that with this setup, the lengths of the semi-axes of the ellipse are given by the reciprocals of the square roots of the eigenvalues of $P$. If you use $P^{-1}$ in the definition, then the semiaxes are simply given by the square roots of the eigenvalues, which gives us a more intuitive relationship between $P$ and the ellipse.