Geometric intuition of concepts relating to an (almost) inner product


$\def\RR{\mathbb{R}}$

On page 2 of Gehring & Halmos' General Theory of Relativity for Mathematicians the following definitions are made:

Definition: an inner product $g$ is a function $V^2\to \RR$ such that

  1. $g$ is bilinear.
  2. $g$ is symmetric.
  3. $g$ is nondegenerate, meaning that for any non-zero $x$ there is a $y$ such that $g(x,y)\ne 0$.

*The third condition is weaker than the usual positive-definiteness needed to make $g$ a proper inner product.

Definition: letting $$S=\{W|W \text{ is a subspace of } V \text { and } g|_W \text{ is negative definite}\}$$ we define the index $I$ of $g$ as the integer $$I:=\max_{W\in S} \dim W.$$

Definition: a basis $B=\{e_1,\ldots,e_N\}$ of $V$, with dual basis $\{e^1,\ldots,e^N\}$, is called orthonormal (with respect to the inner product $g$) iff $$g = \sum_{a=1}^{N-I}e^a\otimes e^a - \sum_{a=N-I+1}^N e^a\otimes e^a$$ where the appropriate sum is zero if $I=0$ or $I=N$. Equivalently, we say $B$ is orthonormal iff $$\begin{aligned} g(e_a,e_b) & = 0 \text{ if } a \ne b\\ g(e_a,e_a) & = \begin{cases} 1 & \text{ if } 1\le a \le N-I\\ -1 & \text{ if } N-I+1 \le a \le N \end{cases} \end{aligned}$$
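To make the orthonormality condition concrete, here is a small numerical sketch; the choice $N=4$, $I=3$ (a Minkowski-type metric) is an illustrative assumption, not from the book:

```python
import numpy as np

# Illustrative sketch: a metric of index I = 3 on R^4 (Minkowski-type),
# written in an orthonormal basis as in the definition above.
N, I = 4, 3
M = np.diag([1.0] * (N - I) + [-1.0] * I)  # the matrix g(e_a, e_b)

def g(v, w, M=M):
    """Evaluate the bilinear form g(v, w) = v^T M w."""
    return v @ M @ w

# The standard basis of R^4 is orthonormal for this g:
E = np.eye(N)
for a in range(N):
    for b in range(N):
        expected = 0.0 if a != b else (1.0 if a < N - I else -1.0)
        assert g(E[a], E[b]) == expected
print("standard basis is g-orthonormal")
```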


What geometrical intuition may be given to the concepts just defined (inner product, index, and orthonormal basis)?

I understand that, since $g(x,y)$ may be negative, we may be interested in the 'largest' subspace of $V$ which makes $g$ negative definite, and that, when defining the analogue of an orthonormal basis, the best we can hope for is $g(e_a,e_a)=\pm 1$, yet I remain with little intuition as to what these notions can mean geometrically.


There are 3 best solutions below

---

Consider the following setting. Let $\mathcal{E}=(e_1,\dots,e_N)$ be a basis of $V$. Take vectors $v,w\in V$ with coefficients $$\overline{v}=\begin{pmatrix}v_1\\ \vdots\\v_N\end{pmatrix}\text{ and }\overline{w}=\begin{pmatrix}w_1\\ \vdots\\w_N\end{pmatrix}$$ in the basis $\mathcal{E}$, and define the matrix $M$ by its coefficients $m_{ij}:=g(e_i,e_j)$. Then you have the relation $$g(v,w)=\,^t\overline{v}\cdot M\cdot \overline{w}.$$ Since $M$ is symmetric, you know that if you choose your basis $\mathcal{E}$ correctly, $M$ will be diagonal. Since $g$ is nondegenerate, you can check that the diagonal coefficients are nonzero, and by rescaling the vectors of $\mathcal{E}$ you can make them $+1$ (say for $m_{ii}$, $1\leq i\leq N-I$) or $-1$ (for $m_{jj}$, $N-I+1\leq j\leq N$).

Then what form does $g$ take in this basis? Take a vector $x$ with coordinates $$\overline{x}=\begin{pmatrix}x_1\\ \vdots\\x_N\end{pmatrix}$$ in $\mathcal{E}$. Then: $$g(x,x)=\,^t\overline{x}\cdot M\cdot \overline{x}=\begin{pmatrix}x_1& \dots&x_N\end{pmatrix}\begin{pmatrix} \mathrm{Id}_{N-I}&0\\ 0&-\mathrm{Id}_{I} \end{pmatrix}\begin{pmatrix}x_1\\ \vdots\\x_N\end{pmatrix}=x_1^2+\dots+x_{N-I}^2-x_{N-I+1}^2-\dots-x_N^2.$$ Thus, geometrically speaking, the index is the number of directions in which the quadratic form $$x\mapsto g(x,x)$$ looks like minus the square function when computed through an orthonormal basis (in the other directions of this basis, it looks like plus the square function).

Now suppose $V$ is a normed space, take a function $f:V\to \mathbb{R}$ of class $C^2$, and consider $a\in V$.
Then you have the Taylor formula $$f(a+x)=f(a)+Df_a(x)+\tfrac{1}{2}D^2f_a(x,x)+o(\|x\|^2)$$ where $Df_a$ is the derivative of $f$ at $a$ and $D^2f_a$ is the second derivative of $f$ at $a$, namely the bilinear and symmetric form given by $D^2f_a(u,v)=\partial_v|_a(\partial_u f)$ where $$\partial_w|_bf:=\lim\limits_{t\to 0}\frac{f(b+tw)-f(b)}{t}$$ and $\partial_u f$ is the function $b\mapsto\partial_u|_bf$. If $a$ is a critical point (i.e. $Df_a=0$) and nondegenerate (i.e. $D^2f_a$ is nondegenerate as in your definition), then we know the local behavior of $f$ around $a$, since $D^2f_a$ is in some sense the best second-order approximation we can get: up to a change of coordinates and negligible terms, $f$ is near $a$ a sum of plus and minus square functions. This local description is one of the interests of studying such inner products: functions with only nondegenerate critical points are called Morse functions, and they are studied for the good properties this local description provides (you can even get rid of the negligible terms by means of the Morse lemma, for example).
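As a concrete instance of the Morse picture just described, here is a sketch with the saddle $f(x,y)=x^2-y^2$ (my choice of example), whose Hessian at the critical point $(0,0)$ is a nondegenerate symmetric form of index $1$:

```python
import numpy as np

# Illustrative sketch: the Hessian of f(x, y) = x^2 - y^2 at its critical
# point (0, 0) is a nondegenerate symmetric bilinear form of index 1,
# so the origin is a saddle point.
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])  # second partial derivatives of f at the origin

eigenvalues = np.linalg.eigvalsh(H)
index = int(np.sum(eigenvalues < 0))  # number of negative eigenvalues
print(index)  # 1: one "downhill" direction, one "uphill" direction
```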

---

Usually these are just called "nondegenerate symmetric bilinear forms". In the case of $\mathbb R$ the theory of symmetric bilinear forms is the same as the theory of quadratic forms (functions $f : V \to \mathbb R$ such that $f(\alpha v) = \alpha^2f(v)$ for scalar $\alpha$ and such that $(v,w) \mapsto f(v + w) - f(v) - f(w)$ is a bilinear form), and $V$ together with a nondegenerate quadratic form is often called a quadratic space.

The signature of a symmetric bilinear form $B$ is $(p,q,r)$ where $p + q + r = N$ and we may write any orthogonal basis as $$ \{e_1,\dotsc,e_p,f_1,\dotsc,f_q,h_1,\dotsc,h_r\},\quad B(e_i, e_i) > 0,\quad B(f_i, f_i) < 0,\quad B(h_i, h_i) = 0. $$ $B$ is nondegenerate iff $r = 0$, and the signature of $g$ in your notation is $(N-I, I, 0)$. However, here is a very important point: the restriction of $g$ to a subspace of $V$ can be degenerate, even if $g$ is nondegenerate.
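The warning above can be checked numerically. A minimal sketch, with $g=\operatorname{diag}(1,-1)$ as my choice of example: the restriction of this nondegenerate $g$ to the line spanned by a null vector is identically zero, hence degenerate.

```python
import numpy as np

# Illustrative sketch: a nondegenerate g can restrict to a degenerate form
# on a subspace.  Take g = diag(1, -1) on R^2 and the line spanned by the
# null vector u = (1, 1): g(u, u) = 0, so g restricted to span{u} vanishes
# identically, even though g itself is nondegenerate.
M = np.diag([1.0, -1.0])
u = np.array([1.0, 1.0])
print(u @ M @ u)  # 0.0
```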

The "index" as you define it is not all that important to my knowledge. There isn't much difference between $g$ and $-g$, so it can't be important. What is important is the interplay between the "positive" and "negative" "parts" of $g$ (speaking loosely). This manifests itself chiefly in the existence of (nonzero) isotropic vectors: vectors $v \ne 0$ such that $g(v,v) = 0$. For example, if $u, w$ are vectors such that $g(u, u) = -g(w, w) \ne 0$ then $u + w$ and $u - w$ are both isotropic. In fact, $g$ is positive or negative definite iff there are no nonzero isotropic vectors.
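The $u+w$, $u-w$ construction above can be sketched as follows (the Minkowski-type metric and the particular $u$, $w$ are my choices):

```python
import numpy as np

# Illustrative sketch: if g(u, u) = -g(w, w) != 0, then u + w and u - w
# are both isotropic.  Here g = diag(1, -1, -1, -1).
M = np.diag([1.0, -1.0, -1.0, -1.0])
g = lambda a, b: a @ M @ b

u = np.array([2.0, 0.0, 0.0, 0.0])   # g(u, u) = 4
w = np.array([0.0, 2.0, 0.0, 0.0])   # g(w, w) = -4
print(g(u + w, u + w), g(u - w, u - w))  # 0.0 0.0
```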

As for intuition, I think we study these sorts of $g$ precisely because the intuition is almost no different from the positive-definite case; really what the indefiniteness does is give us a richer set of objects to discuss at the price of our intuition cracking just a little bit. For instance,

  1. That $\{e_i\}_{i=1}^N$ is an orthogonal basis for $g$ means that each direction $e_i$ is "completely independent" from the other directions $e_j$ with $j \ne i$, and for any vector $v$ we can easily say that $g(v, e_i)$ is "how much $v$ lies in the direction $e_i$".
  2. An isometry is a function $f : V \to V$ such that $g(f(v), f(w)) = g(v, w)$. All isometries are necessarily linear and invertible (and hence are also called orthogonal transformations), and just like in the Euclidean case all isometries are compositions of at most $N$ reflections (this is the Cartan–Dieudonné theorem).
  3. We are not defining "reflection" in any strange way: a reflection $r$ is defined by an anisotropic vector $w$ via $$ r(v) = v - 2\frac{g(v,w)}{g(w,w)}w. $$ The "anisotropic" part is crucial; if $w$ were isotropic then not only would the above be undefined, but $w$ would not have a unique orthogonal (hyper)plane (except in some edge cases that are undesirable anyway).
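The reflection formula and its isometry property can be sketched numerically; the metric, the vector $w$, and the random test vectors below are my choices:

```python
import numpy as np

# Illustrative sketch: the reflection in the hyperplane g-orthogonal to an
# anisotropic vector w, r(v) = v - 2 g(v, w)/g(w, w) * w, is an isometry.
M = np.diag([1.0, -1.0, -1.0, -1.0])
g = lambda a, b: a @ M @ b

w = np.array([0.0, 1.0, 0.0, 0.0])          # anisotropic: g(w, w) = -1
r = lambda v: v - 2 * g(v, w) / g(w, w) * w

rng = np.random.default_rng(0)
v1, v2 = rng.normal(size=4), rng.normal(size=4)
assert np.isclose(g(r(v1), r(v2)), g(v1, v2))  # g is preserved
assert np.allclose(r(r(v1)), v1)               # r is an involution
print("reflection preserves g")
```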

A "rotation" is often defined to be a composition of two reflections, and is defined by two vectors via the above; call their span $P$. Consider the case of special relativity $$ g((t_1,x_1,y_1,z_1), (t_2,x_2,y_2,z_2)) = t_1t_2 - x_1x_2 - y_1y_2 - z_1z_2. $$ The restriction $g|_P$ is itself a symmetric bilinear form and so we can classify $P$ by the signature of this form:

  • When the signature is $(0, 2, 0)$ then the rotation is a spatial rotation, the kind you are most familiar with. This can be thought of as mixing two spatial dimensions.
  • When the signature is $(1, 1, 0)$ then the rotation is a Lorentz boost. This can be thought of as mixing a space and a time dimension.
  • We cannot have $(2, 0, 0)$ or $(0, 0, 2)$ because the signature of $g$ is $(1, 3, 0)$ (convince yourself of this!), and we cannot have $(1, 0, 1)$ or $(0, 1, 1)$ because the two vectors in the definition of $P$ are required to be anisotropic. The lack of the latter two means precisely that you cannot achieve light speed.
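A Lorentz boost of the special-relativity form above can be written down explicitly and checked to be an isometry of $g$; the rapidity value is an arbitrary choice of mine:

```python
import numpy as np

# Illustrative sketch: a Lorentz boost mixing the t and x directions is a
# "rotation" on a plane of signature (1, 1, 0), parametrized by a rapidity
# phi, and it preserves the special-relativity form g = diag(1, -1, -1, -1).
M = np.diag([1.0, -1.0, -1.0, -1.0])
phi = 0.7  # an arbitrary rapidity

B = np.eye(4)
B[0, 0] = B[1, 1] = np.cosh(phi)
B[0, 1] = B[1, 0] = np.sinh(phi)

# B is an isometry of g: B^T M B = M
assert np.allclose(B.T @ M @ B, M)
print("boost preserves the Minkowski form")
```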

Note that because we are discussing a vector space $V$ this is all relative to an origin, i.e. relative to a particular observer at a particular instant on their world line, and this "origin event" is unaffected by any of the above rotations.

Let's return to (1); this is perhaps more subtle than it seems. While "components" in this sense are well-defined for an orthogonal basis, they are not well-defined in general. Indeed,

  • Every anisotropic vector can be extended to an orthogonal basis, but no isotropic vector ever can. This fact is equivalent to the nondegeneracy of $g$. (Convince yourself of this!)

It is precisely in this sense that a vector $v$ has a component in any anisotropic direction but has no well-defined component in any isotropic direction. To be a little more concrete, note that the projection $$ v \mapsto \frac{g(v,w)}{g(w,w)}w $$ is ill-defined when $w$ is isotropic (though curiously $v \mapsto g(v,w)w$ is well-defined and is usually nontrivial).
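Here is a small sketch of the projection's failure in an isotropic direction, using $g=\operatorname{diag}(1,-1)$ and vectors of my choosing:

```python
import numpy as np

# Illustrative sketch: projection onto w, v -> g(v, w)/g(w, w) * w, works
# for anisotropic w but breaks down when w is isotropic (division by zero),
# while v -> g(v, w) * w remains well-defined.
M = np.diag([1.0, -1.0])
g = lambda a, b: a @ M @ b

w_aniso = np.array([1.0, 0.0])   # g(w, w) = 1: anisotropic
w_iso   = np.array([1.0, 1.0])   # g(w, w) = 0: isotropic
v = np.array([3.0, 5.0])

print(g(v, w_aniso) / g(w_aniso, w_aniso) * w_aniso)  # [3. 0.]
print(g(w_iso, w_iso))      # 0.0 -> the projection formula is undefined
print(g(v, w_iso) * w_iso)  # still well-defined: [-2. -2.]
```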

---

Maybe I can help in a non-rigorous way: an "almost" inner product is a mapping $\langle\cdot,\cdot\rangle:V\times V\to \Bbb K$, where $\Bbb K$ is the field of the vector space, for which there exists $v\in V$ with $v\neq \mathbf{0}$ and $|v|=0$; such a $v$ is a linear combination of the basis vectors of $V$, the most classic example being an orthonormal basis containing vectors of both positive and negative norm. These $v$'s form a non-trivial kernel for the inner product once one of its two arguments is fixed.

Vectors of this kind are called "null" vectors ($e_0$) and can be constructed as the sum of a positive-norm basis vector $e_+$ and a negative-norm basis vector $e_-$:

$$e_0=e_+ +e_- $$

The geometric intuition behind these null vectors is that they divide space into regions with different qualities, as used in Special Relativity. They can also be thought of as the eigenbasis of a hyperbolic rotation.
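Both claims, that $e_0 = e_+ + e_-$ is null and that null vectors are eigenvectors of a hyperbolic rotation, can be checked in a two-dimensional sketch (the metric $\operatorname{diag}(1,-1)$ and the rapidity $0.5$ are my choices):

```python
import numpy as np

# Illustrative sketch: with g = diag(1, -1) on R^2, the vector
# e_0 = e_+ + e_- has zero "norm" even though e_0 != 0, and it is an
# eigenvector of a hyperbolic rotation (a 2D boost).
M = np.diag([1.0, -1.0])
g = lambda a, b: a @ M @ b

e_plus  = np.array([1.0, 0.0])  # g(e_+, e_+) = +1
e_minus = np.array([0.0, 1.0])  # g(e_-, e_-) = -1
e_0 = e_plus + e_minus

print(g(e_0, e_0))  # 0.0

phi = 0.5  # an arbitrary rapidity
B = np.array([[np.cosh(phi), np.sinh(phi)],
              [np.sinh(phi), np.cosh(phi)]])  # hyperbolic rotation
assert np.allclose(B @ e_0, np.exp(phi) * e_0)  # e_0 is an eigenvector
```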

Now, for the rest of your question: the index $I$ is just a value that separates and orders the set of basis vectors of $V$. If $I=2$ and $\dim(V)=5$, then you have $2$ negative-norm vectors and $3$ positive-norm vectors; if $B=\{e_1,e_2,e_3,e_4,e_5\}$ is the basis, then $\{e_1,e_2,e_3\}$ have positive norm while $\{e_4,e_5\}$ have negative norm.