I’m studying linear algebra, and for conic sections we studied that we can get the canonical form of a conic without having to explicitly calculate the transformations and applying them on the equation of the conic.
- Note: the way we do it is by using orthogonal invariants and their correlation with eigenvalues: if you’re familiar with this method you can skip the first section, which explains it.
METHOD FOR FINDING DIAGONAL/CANONICAL FORM OF CONIC WITH ORTHOGONAL INVARIANTS
Let $Γ$ be a given conic with an associated $3\times 3$ matrix $$\overline{A}= \begin{pmatrix}A & \underline{b}\\\ \underline{b}^t & c \end{pmatrix} $$ where $A$ is the $2\times 2$ matrix of the terms of degree 2, $\underline{b}$ is the $2\times 1$ vector of the linear terms and $c\in\mathbb{R}$ is the constant term. Its diagonal/canonical form $Γ’$ (except if it’s a parabola, but I will get to that later) will have an associated canonical matrix $$\overline{A’}= \begin{pmatrix}α & 0 & 0\\\ 0 & β & 0 \\\ 0 & 0 & γ\end{pmatrix} $$ where $α$, $β$, $γ$ are the eigenvalues of $\overline{A’}$ (in particular $α$ and $β$ are the eigenvalues of $A’$).
We know that we can diagonalize a conic just with rototranslations (which are isometries), and we also know that $tr(A)$, $det(A)$, $det\left(\overline{A}\right)$ don’t change if we apply a rototranslation to the conic $Γ$ associated to $\overline{A}$ (they’re called orthogonal invariants). Thus,
$$ \begin{cases} tr(A)=tr(A’) \\ det(A)=det(A’) \\ det(\overline{A})= det(\overline{A’}) \end{cases} $$
But since $tr(A’)=α+β$,$det(A’)=αβ$, $det\left(\overline{A’}\right)=αβγ$, then we have
$$\begin{cases} tr(A)=α+β \\ det(A)=αβ \\ det(\overline{A})=αβγ \\ \end{cases}$$
And we can calculate the eigenvalues (and thus the canonical form $Γ’$ of $Γ$) just by calculating the orthogonal invariants of $\overline{A}$ and solving the system above, without having to do all the long calculations that we would do by applying the translation and the rotation.
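The system above can be solved mechanically. Here is a minimal Python sketch, assuming the conic is written as $a_{11}x^2+2a_{12}xy+a_{22}y^2+2b_1x+2b_2y+c=0$ and that $det(A)\neq 0$ (so it is not a parabola). The function names and the sample conic are made up for illustration.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def central_conic_canonical(a11, a12, a22, b1, b2, c):
    """Return (alpha, beta, gamma) with alpha*x^2 + beta*y^2 + gamma = 0.

    Assumption: the conic is a11 x^2 + 2 a12 xy + a22 y^2 + 2 b1 x + 2 b2 y + c = 0.
    """
    tr = a11 + a22
    det_a = a11 * a22 - a12 ** 2
    if det_a == 0:
        raise ValueError("det(A) = 0: this system does not apply (parabola or degenerate)")
    det_bar = det3([[a11, a12, b1], [a12, a22, b2], [b1, b2, c]])
    # alpha and beta are the roots of t^2 - tr(A) t + det(A) = 0;
    # the discriminant equals (a11 - a22)^2 + 4 a12^2, so it is never negative.
    root = math.sqrt(tr * tr - 4 * det_a)
    alpha = (tr - root) / 2
    beta = (tr + root) / 2
    gamma = det_bar / det_a  # from det(A_bar) = alpha * beta * gamma
    return alpha, beta, gamma

# Sample conic (illustrative): 5x^2 + 4xy + 8y^2 - 36 = 0
print(central_conic_canonical(5, 2, 8, 0, 0, -36))  # (4.0, 9.0, -36.0)
```

For the sample conic this gives $4x^2+9y^2-36=0$, an ellipse.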
Q1
Let $$\overline{B’}= \begin{pmatrix}1 & 0 & 0\\\ 0 & 0 & -b \\\ 0 & -b & 0\end{pmatrix}$$ be the canonical form of a parabola, or, as an equation, $x^2-2by=0$. Note that $b$ is not an eigenvalue of either $B'$ or $\overline{B'}$, so it's not a Greek letter (eigenvalues are denoted with Greek letters in this question).
Is it correct to say that, given a general equation of a conic $\Gamma$ (with associated matrix $\overline{B}$ and quadratic part $B$) that we know is a parabola, we can find its canonical form by solving the system:
$$\begin{cases} tr(B)=α \\ det(B)=0 \\ det(\overline{B})=-b^2 \\ \end{cases}$$
METHOD FOR FINDING DIAGONAL/CANONICAL FORM OF QUADRIC WITH ORTHOGONAL INVARIANTS
- Note: as for conic sections, if you're familiar with orthogonal invariants, just skip this.
If, in a similar way to conics, $\overline{A}$ is the $4\times 4$ matrix associated to a quadric $Γ$ and $A$ is the $3\times 3$ matrix of its quadratic terms, then the orthogonal invariants are
- $tr(A)$
- $det(A)$
- $c$=the sum of the principal minors of order 2 of $A$
- $det(\overline{A})$
If the diagonal form $\overline{A’}$ (associated to $Γ’$, obtained with rototranslations from $Γ$) is $diag(α, β, γ, δ)$, then $$\begin{cases} tr(A) =α+β+γ \\ c = αβ + βγ + αγ\\ det(A) = αβγ \\ det(\overline{A})=αβγδ \end{cases} $$
But paraboloids and cylinders have canonical forms that contain linear terms, so we have to find some coefficients that are not eigenvalues.
Q2
For paraboloids, which should have canonical form $2bz=\alpha x^2+\beta y^2$:
$$\overline{A}= \begin{pmatrix}\alpha & 0 & 0 & 0 \\\ 0 & \beta & 0 & 0\\\ 0 & 0 & 0 & -b\\\ 0 & 0 & -b & 0\end{pmatrix} $$
the system to find the canonical form should be
$$\begin{cases} tr(A) =α+β \\ c = αβ\\ det(A) = 0 \\ det(\overline{A})=-b^2αβ \end{cases} $$
Is it correct?
Q3
For cylinders with a parabola as a cross section, which should have canonical form $2bz=\alpha x^2$, but we can divide by $b$ (since $b\neq 0$) and it becomes $2z=\alpha ' x^2$:
$$\overline{A}= \begin{pmatrix}\alpha ' & 0 & 0 & 0 \\\ 0 & 0 & 0 & 0\\\ 0 & 0 & 0 & -1\\\ 0 & 0 & -1 & 0\end{pmatrix} $$
the system to find the canonical form should be
$$\begin{cases} tr(A) =α' \\ c = 0\\ det(A) = 0 \\ det(\overline{A})=0 \end{cases} $$
Is it correct?
- Note that Q2 and Q3 are the analogues of Q1 for quadrics: I'm trying to apply a method similar to the one shown in Q1 to find the canonical form of paraboloids and of cylinders with a parabola as a cross section, without having to do all the long calculations that I would do by applying the transformations.
LAST DOUBT
I’m gonna take the paraboloid as an example.
Is it the same to take, as a canonical form, ($case (1)$) the equation $$2bz=\alpha x^2+\beta y^2 \iff \overline{A}= \begin{pmatrix}\alpha & 0 & 0 & 0 \\\ 0 & \beta & 0 & 0\\\ 0 & 0 & 0 & -b\\\ 0 & 0 & -b & 0\end{pmatrix} \iff \begin{cases} tr(A) =α+β \\ c = αβ\\ det(A) = 0 \\ det(\overline{A})=-b^2αβ \end{cases} (1) $$
and then dividing the equation by $b$ or ($case (2)$) using, as a canonical form, the equation $$ 2z=\alpha x^2+\beta y^2 \iff \overline{A}= \begin{pmatrix}\alpha & 0 & 0 & 0 \\\ 0 & \beta & 0 & 0\\\ 0 & 0 & 0 & -1\\\ 0 & 0 & -1 & 0\end{pmatrix} \iff \begin{cases} tr(A) =α+β \\ c = αβ\\ det(A) = 0 \\ det(\overline{A})=-αβ \end{cases} (2) $$.
Do both methods lead to the same result? If not, are they both correct canonical forms that simply differ, does the first still need to be transformed, or does the first lead to a wrong paraboloid?
And does this apply also, for example, for using the canonical form $αx^2+βy^2+γz^2=δ$ instead of $αx^2+βy^2+γz^2=1$ for ellipsoids?
In general your equations are correct. A general approach to this problem is the following.
Preliminary observations
When you change basis on an $n$-dimensional real vector space, there is an invertible matrix $P$ (the so called "change of basis matrix") which expresses the relation between "old" and new coordinates:
$$\mathbf x=P\mathbf{x'}$$
for $\mathbf x=(x_1,\dots,x_n)\in\mathbb R ^n$.
Explicitly,
$$ \begin{cases} x_1=p_{11}x_1'+\dots+p_{1n}x_n' \\ \vdots \\ x_n=p_{n1}x_1'+\dots+p_{nn}x_n' \end{cases} $$
This is called a linear change of coordinates. Notice that in this way the $n$-tuple $(0,\dots,0)$ corresponds to the same point with respect to both coordinate systems. In geometric terms, this change of coordinates keeps one point fixed. In order to allow translations as well, we take into account affine changes of coordinates, of the form
$$\mathbf x=P\mathbf{x'}+\mathbf v \qquad (*)$$
for $P$ an invertible $n\times n$ real matrix and $\mathbf v\in \mathbb R^n$.
Explicitly,
$$ \begin{cases} x_1=p_{11}x_1'+\dots+p_{1n}x_n'+v_1 \\ \vdots \\ x_n=p_{n1}x_1'+\dots+p_{nn}x_n' +v_n \end{cases} $$
Notice moreover that if we accept the convention of representing any point of coordinates $\mathbf x=(x_1,\dots,x_n)$ with the $(n+1)$-tuple $\overline{\mathbf{x}}=(x_1,\dots,x_n,1)$, then (*) can be written as
$$\overline{\mathbf{x}}=\overline{P} \overline{\mathbf{x}'} $$
where $\overline P=\begin{pmatrix} P & \mathbf v \\ \mathbf 0 & 1 \end{pmatrix}$
(if you compute the row-column product you will see the equivalence).
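The equivalence can also be checked numerically. The following is a small Python sketch with made-up sample values ($P$ a rotation by 90 degrees, $\mathbf v=(3,4)$, $\mathbf x'=(1,2)$); the function names are hypothetical.

```python
def affine_change(P, v, x_new):
    """Compute x = P x' + v directly."""
    n = len(v)
    return [sum(P[i][j] * x_new[j] for j in range(n)) + v[i] for i in range(n)]

def homogeneous_change(P, v, x_new):
    """Compute x_bar = P_bar x_bar' with P_bar = [[P, v], [0, 1]]."""
    n = len(v)
    P_bar = [list(P[i]) + [v[i]] for i in range(n)] + [[0] * n + [1]]
    x_bar_new = list(x_new) + [1]  # append the conventional last coordinate 1
    return [sum(P_bar[i][j] * x_bar_new[j] for j in range(n + 1))
            for i in range(n + 1)]

P = [[0, -1], [1, 0]]  # rotation by 90 degrees (sample value)
v = [3, 4]             # translation (sample value)
x_new = [1, 2]         # coordinates x' (sample value)

print(affine_change(P, v, x_new))       # [1, 5]
print(homogeneous_change(P, v, x_new))  # [1, 5, 1]
```

The first $n$ entries of the homogeneous result agree with the direct computation, and the last entry stays equal to 1.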
Affine orthogonal equivalence of quadrics
Consider a quadric in $\mathbb R^n$, defined by (with your notation):
$$\begin{pmatrix} x_1 & \dots & x_n & 1 \end{pmatrix} \begin{pmatrix} A & \mathbf b \\ \mathbf b^T & c \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \\ 1 \end{pmatrix} = 0 $$
In shorter notation: $\overline{\mathbf x}^T \overline A \overline{\mathbf x}=0$
After an affine change of coordinates of the form $\overline{\mathbf{x}}=\overline{P} \overline{\mathbf{x}}' $
the equation becomes
$$\overline{\mathbf x '}^T\overline{P}^T \overline A \overline P \overline{\mathbf x '}=0$$
i.e. the matrices $\overline A$ and $\overline A'=\overline{P}^T \overline A \overline P$ define the same quadric up to a suitable affine change of coordinates.
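As an illustration, here is a short Python sketch of the rule $\overline A'=\overline{P}^T \overline A \overline P$, applied to the sample conic $x^2+y^2-2x=0$ and the translation $x=x'+1$, $y=y'$. The helper names are hypothetical.

```python
def transpose(X):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    """Row-by-column product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def change_coordinates(A_bar, P_bar):
    """Matrix of the quadric in the new coordinates: P_bar^T A_bar P_bar."""
    return matmul(matmul(transpose(P_bar), A_bar), P_bar)

A_bar = [[1, 0, -1],
         [0, 1, 0],
         [-1, 0, 0]]   # x^2 + y^2 - 2x = 0
P_bar = [[1, 0, 1],
         [0, 1, 0],
         [0, 0, 1]]    # x = x' + 1, y = y' (pure translation)

print(change_coordinates(A_bar, P_bar))  # [[1, 0, 0], [0, 1, 0], [0, 0, -1]]
```

The result is the matrix of $x'^2+y'^2-1=0$, which is what completing the square in $x$ gives by hand.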
However, we are interested in some specific affine changes of coordinates, those that preserve metric properties ("rigid motions"). This is accomplished requiring the matrix $P$ to be not only invertible but also orthogonal.
Therefore we will consider equivalent the quadrics defined by $\overline A$ and $\overline{P}^T \overline A \overline P$ if $\overline P$ is of the form:
$$ \overline P= \begin{pmatrix} P & \mathbf v \\ 0 & 1 \end{pmatrix} $$
with $P$ an $n\times n$ orthogonal matrix (you could make further distinction in requiring $P$ special orthogonal... but the key facts are the same).
In particular, we look for canonical forms, i.e. a set of quadrics such that any other quadric is equivalent, in the sense specified above, to one of the canonical ones.
Let's see how to find them:
By the spectral theorem applied to the symmetric matrix $A$, we can choose an orthonormal basis (hence an orthogonal change of basis matrix $P$) such that $A'$ is diagonal (i.e. $A'=\mathrm{diag}(\lambda_1,\dots,\lambda_n)$). Moreover, completing the squares (I skip the details, but I can add them if you wish), we can cancel with suitable translations the linear terms associated to the coordinates such that $\lambda_i\neq 0$ (e.g. in $x^2+y^2-2x=0$ we write $(x-1)^2+y^2-1=0$...). If both the linear and constant terms disappear, we are done (type (i)). If only the linear terms disappear but not the constant one, we normalize the constant $c$ to 1 (type (ii)). If some linear term survives (consider e.g. $x^2-y=0$) and thus at least one $\lambda_i$ is zero, we can still manage to change the coordinates in such a way that there is only one linear term and no constant (type (iii)).
Therefore, for any symmetric $(n+1)\times (n+1)$ matrix $\overline A$ defining a quadric, there is an orthogonal affine change of coordinates such that the quadric is represented by $\overline A'$ which has one of the following forms
(type (i)) $\begin{pmatrix} \lambda_1 & & & \\ &\ddots & & \\ & & \lambda_n & \\ & & & 0 \end{pmatrix} $
(type (ii)) $\begin{pmatrix} \lambda_1 & & & \\ &\ddots & & \\ & & \lambda_n & \\ & & & 1 \end{pmatrix} $
(type (iii)) $\begin{pmatrix} \lambda_1 & & & &\\ &\ddots & & & \\ & & \lambda_{n-1} & & \\ & & & 0 & 1 \\ & & & 1 & 0 \end{pmatrix} $
with $\lambda_i\in\mathbb R$ possibly equal to 0.
In the case $n=2$ we have the usual canonical forms of conic curves: type (i) corresponds to degenerate conics, type (ii) to ellipses and hyperbolas, type (iii) to parabolas (assuming $\lambda_i\neq 0$ in the last two cases, otherwise they reduce to degenerate conics).
Orthogonal invariants
It turns out that some quantities that depend on the matrix $\overline A$ are invariant under orthogonal affine transformations. Therefore they can be used to compute the coefficients of the canonical form exactly as done in the question.
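In particular:
The characteristic polynomial of $A$ is invariant under orthogonal affine changes of coordinates.
Proof: The quadratic part transforms as $A'=P^TAP$. If $P$ is orthogonal, $P^T A P=P^{-1}AP$, thus $A'$ is similar to $A$ and they have the same characteristic polynomial.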
This implies that the coefficients of the characteristic polynomial coincide. In the case $n=2$ they are determinant and trace. For bigger values of $n$, the other coefficients are expressible as suitable functions of the principal minors (as done in the question).
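The determinant $\det(\overline A)$ is invariant under orthogonal affine changes of coordinates.
Proof: $\det(\overline{P}^T \overline A \overline P)=\det(\overline P ^T)\det(\overline A)\det(\overline P)=\det(\overline A)$ since $\det(\overline P^T)=\det(\overline P)=\det(P)=\pm 1$.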
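Both invariance statements can be checked on a concrete example. The Python sketch below uses exact rational arithmetic and made-up sample data: a conic matrix $\overline A$, a rotation $P$ built from the 3-4-5 triangle, and a translation $\mathbf v=(7,-2)$. The helper names are hypothetical.

```python
from fractions import Fraction as F

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def quadratic_block(M_bar):
    """Upper-left n x n block A of the (n+1) x (n+1) matrix A_bar."""
    return [row[:-1] for row in M_bar[:-1]]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A_bar = [[5, 2, 1],
         [2, 8, 2],
         [1, 2, -1]]                      # sample conic
P_bar = [[F(3, 5), F(-4, 5), 7],
         [F(4, 5), F(3, 5), -2],
         [0, 0, 1]]                       # orthogonal P plus translation v

A_bar_new = matmul(matmul(transpose(P_bar), A_bar), P_bar)

print(trace(quadratic_block(A_bar)), trace(quadratic_block(A_bar_new)))
print(det(quadratic_block(A_bar)), det(quadratic_block(A_bar_new)))
print(det(A_bar), det(A_bar_new))
```

Each printed pair should consist of two equal numbers, because trace and determinant of $A$ and the determinant of $\overline A$ are unchanged by the rigid motion.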
Therefore you can always use those invariants in order to find the canonical form of the quadric.
A possible way, if you already know that the quadric is non degenerate (i.e. $\det(\overline A)\neq 0$), is the following:
If $\det(A)=0$ (this is the case for parabolas and paraboloids), the canonical form is of type (iii), and the $n-1$ coefficients $\lambda_i$ can be found using $\det(\overline A)$ and the coefficients of the characteristic polynomial. For conics they are determinant and trace; for quadrics in three-dimensional space you also consider the sum of the principal $2\times 2$ minors (that is what you called $c$); and for bigger $n$ you compute the other coefficients in a similar fashion.
If $\det(A)\neq 0$, the canonical form is of type (ii), and the $n$ coefficients $\lambda_i$ can be found using the coefficients of characteristic polynomial and $\det(\overline A)$.
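As a worked illustration of the case $\det(A)=0$, take the conic $x^2+2xy+y^2+2x-2y=0$, an example chosen here only for illustration. I keep the coefficient $b$ of the question instead of normalizing it to 1, so that the matrix is never rescaled and the invariants can be compared directly.

$$\overline A=\begin{pmatrix}1 & 1 & 1\\ 1 & 1 & -1\\ 1 & -1 & 0\end{pmatrix},\qquad tr(A)=2,\quad \det(A)=0,\quad \det(\overline A)=-4$$

The matrix $\begin{pmatrix}\alpha & 0 & 0\\ 0 & 0 & -b\\ 0 & -b & 0\end{pmatrix}$ of $\alpha x'^2-2by'=0$ has trace of the quadratic part equal to $\alpha$ and determinant $-\alpha b^2$, so

$$\begin{cases}\alpha=2\\ -\alpha b^2=-4\end{cases}\quad\Longrightarrow\quad \alpha=2,\ b=\sqrt 2,\qquad 2x'^2-2\sqrt2\,y'=0 \iff x'^2=\sqrt 2\,y'$$

This agrees with the direct computation: the equation is $(x+y)^2+2(x-y)=0$, and with $u=(x+y)/\sqrt2$, $w=(x-y)/\sqrt2$ it becomes $2u^2+2\sqrt2\,w=0$, which is the same parabola up to the orientation of the second axis.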
The practical way is exactly the one you performed in your examples. However, you sometimes employed canonical forms that are not completely canonical, e.g. for the paraboloid, where your coefficient $b$ can be normalized to 1.
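For the paraboloid, the computation can be sketched in Python as follows. This is only a sketch under stated assumptions: it keeps $b$ explicit as in the question's form $2bz=\alpha x^2+\beta y^2$ (so the matrix is not rescaled), it assumes the quadric is already known to be a non-degenerate paraboloid, and the function name `paraboloid_canonical` is hypothetical.

```python
import math

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def paraboloid_canonical(A, b_vec, c0):
    """Return (alpha, beta, b) with 2*b*z = alpha*x^2 + beta*y^2.

    A is the 3x3 quadratic part, b_vec the linear part, c0 the constant term.
    Assumption: det(A) = 0 and det(A_bar) != 0 (a paraboloid).
    """
    A = [list(row) for row in A]
    A_bar = [A[i] + [b_vec[i]] for i in range(3)] + [list(b_vec) + [c0]]
    tr = A[0][0] + A[1][1] + A[2][2]
    # Sum of the principal minors of order 2 (called c in the question).
    minors = (A[0][0] * A[1][1] - A[0][1] * A[1][0]
              + A[1][1] * A[2][2] - A[1][2] * A[2][1]
              + A[0][0] * A[2][2] - A[0][2] * A[2][0])
    det_bar = det(A_bar)
    if det(A) != 0 or det_bar == 0 or minors == 0:
        raise ValueError("not a non-degenerate paraboloid")
    # alpha, beta are the roots of t^2 - tr(A) t + minors = 0.
    root = math.sqrt(max(tr * tr - 4 * minors, 0.0))
    alpha = (tr - root) / 2
    beta = (tr + root) / 2
    # det(A_bar) = -b^2 * alpha * beta, and alpha * beta = minors.
    b = math.sqrt(-det_bar / minors)
    return alpha, beta, b

# Sample input (already canonical): x^2 + 2y^2 - 6z = 0, i.e. 2*3*z = x^2 + 2y^2
print(paraboloid_canonical([[1, 0, 0], [0, 2, 0], [0, 0, 0]], [0, 0, -3], 0))
# (1.0, 2.0, 3.0)
```

The exact comparisons with zero are only reliable for integer or rational input; with floating-point coefficients a tolerance would be needed.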
For further reference, you can look at Berger, Geometry II, chapters 13 and 15.