How do you mathematically define a random line or in general, a random subspace of a Euclidean space?


I'm looking for a definition of a random $k$-dimensional vector subspace of the Euclidean space $\mathbb{R}^d$, where $k$ and $d$ are fixed. I'm guessing one should start with a random basis, i.e. a set of linearly independent random vectors $\{X_1, \dots, X_k\} \subset \mathbb{R}^d,$ and then use some kind of quotienting argument. But this is where I'm having a bit of a problem: using the quotient construction.

EDIT I (after seeing the first four comments): So, on second thought, it seems to me that one can think of a random subspace as a random variable (i.e. a measurable function $X$) that takes values in the Grassmannian manifold $Gr(d,k)$ of $k$-dimensional subspaces of $\mathbb{R}^d$, that is, $X: \Omega \to Gr(d,k), 1 \le k \le d.$ So in particular, a random line would be a random variable/measurable function $X: \Omega \to Gr(d,1) = RP^{d-1}.$

EDIT II: As mentioned in some of the comments, I need some kind of uniformity to define such a random subspace. Perhaps this is basic, but I'm not sure I completely understand what problem I'd run into if I want $X$ to be non-uniform? I mean for a general Riemannian manifold, non-uniform random variables exist...

I'd appreciate a rigorous definition or a reference here.

P.S. If we can start with a random line for now to keep things simple, that'd also be great!


There are 2 best solutions below


Since there's still no answer, I'll give it a try. This might not be the most elegant method to construct what you want, but at least it's something. User WhatsUp's idea with the Haar measure on the Grassmannian as a homogeneous space is probably way more elegant, but I don't know enough about it to make it into a satisfactory answer.

I'll drop $\sigma$-algebras throughout. It's always either the Borel $\sigma$-algebra, since we're mostly talking about topological spaces, or the $\sigma$-algebra induced by a map.

Method 1: Span of Random Basis

We'll choose $k$ linearly independent unit vectors at random and take their span. So we'll take a $\tilde S_k:=\underbrace{S^{d-1}\times\dots\times S^{d-1}}_{k\textrm{ times}}$ valued random variable $X=(X_1,\dots,X_k)$. Then the map $\operatorname{span}(X)$ is a random variable with values in the set $\operatorname{Sub}(d)$ of subspaces of $\mathbb R^d$. We'll have to do some technical work to make the vectors linearly independent, so that $\operatorname{span}(X)$ only takes values in the set of $k$-dimensional subspaces, i.e., the Grassmannian $\operatorname{Gr}(d,k)$.

Obviously, $\operatorname{Gr}(d,k)\subseteq\operatorname{Sub}(d)$. Consider the map $\operatorname{span}:\tilde S_k\to\operatorname{Sub}(d)$. The preimage of $\operatorname{Gr}(d,k)$ under this map is exactly the set of linearly independent $k$-tuples in $\tilde S_k$. That's what we need. So we define:

$$S_k:=\operatorname{span}^{-1}(\operatorname{Gr}(d,k)).$$

This is the set of linearly independent $k$-tuples in $\tilde S_k$. Now we're ready to construct random $k$-dimensional subspaces:

Let $X=(X_1,\dots,X_k)$ be an $S_k$ valued random variable. Then $\operatorname{span}(X)$ is a $\operatorname{Gr}(d,k)$ valued random variable. In other words, a random $k$-dimensional subspace of $\mathbb R^d$.
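To make Method 1 concrete, here is a minimal sketch in plain Python. It assumes one particular choice that the construction above does not impose: the $X_i$ are drawn independently and uniformly from $S^{d-1}$ (by normalizing standard Gaussian vectors). Tuples that fall outside $S_k$ are rejected, which happens with probability zero for this choice when $k\le d$. The function names are hypothetical, and the subspace is returned as an orthonormal basis of $\operatorname{span}(X)$.

```python
import math
import random

def random_unit_vector(d, rng):
    """Uniform point on S^{d-1}: normalize a standard Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-12:  # guard against the (probability-zero) zero vector
            return [x / norm for x in v]

def orthonormalize(vectors, tol=1e-10):
    """Gram-Schmidt; returns None if the vectors are numerically linearly dependent."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < tol:
            return None  # the tuple is not in S_k
        basis.append([x / norm for x in w])
    return basis

def random_subspace(d, k, rng):
    """Orthonormal basis of span(X_1, ..., X_k) for a tuple X in S_k."""
    while True:
        X = [random_unit_vector(d, rng) for _ in range(k)]
        basis = orthonormalize(X)
        if basis is not None:  # X landed in S_k
            return basis

rng = random.Random(0)
basis = random_subspace(d=5, k=2, rng=rng)  # a random plane in R^5
```

With this i.i.d. uniform choice, the law of $(X_1,\dots,X_k)$ is invariant under rotations of $\mathbb R^d$, so the law of $\operatorname{span}(X)$ is rotation-invariant as well. Other $S_k$-valued random variables give non-uniform random subspaces.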

Method 2: Sum of Random Subspaces

We'll choose $k$ 1d subspaces (= lines through the origin) and take their sum. So we take an $\tilde L_k:=\underbrace{\mathbb R P^{d-1}\times\dots\times\mathbb R P^{d-1}}_{k\textrm{ times}}$ valued random variable $X=(X_1,\dots,X_k)$, and then $$\operatorname{sum}(X_1,\dots, X_k):=\sum\limits_{n=1}^k X_n=\operatorname{span}(X_1,\dots, X_k)$$ is a random variable with values in $\operatorname{Sub}(d)$. We take the same technical steps to ensure that $\operatorname{sum}$ only takes on values in $\operatorname{Gr}(d,k)$: define

$$L_k:=\operatorname{sum}^{-1}(\operatorname{Gr}(d,k)).$$

This is essentially the set of all $k$-tuples of 1d-subspaces whose sum is direct. Then take an $L_k$ valued random variable $X=(X_1,\dots, X_k)$ and consider the $\operatorname{Gr}(d,k)$ valued random variable $\operatorname{sum}(X)$. This is a random $k$-dimensional subspace of $\mathbb R^d$.
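A small sketch of Method 2, again in plain Python and under assumptions of my own: a point of $\mathbb R P^{d-1}$ is stored as the unit vector on the line whose first nonzero coordinate is positive, and the sum of the lines is stored as its orthogonal projection matrix. Neither representation is part of the construction above; they are just one way to check numerically that $\operatorname{sum}$ depends only on the lines and not on the vectors chosen to represent them.

```python
import math
import random

def canonical_line(v):
    """Represent the line R*v by its unit vector whose first nonzero entry is positive."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("the zero vector does not span a line")
    u = [x / norm for x in v]
    for x in u:
        if abs(x) > 1e-12:
            return u if x > 0 else [-y for y in u]
    raise ValueError("the zero vector does not span a line")

def projection_onto_sum(vectors, tol=1e-10):
    """Orthogonal projection onto the sum of the lines R*v; None if the sum is not direct."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < tol:
            return None  # the tuple of lines is not in L_k
        basis.append([x / norm for x in w])
    d = len(vectors[0])
    # P = sum over basis vectors b of the outer product b b^T
    return [[sum(b[i] * b[j] for b in basis) for j in range(d)] for i in range(d)]

rng = random.Random(1)
d, k = 4, 2
raw = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
lines = [canonical_line(v) for v in raw]

P = projection_onto_sum(lines)
# Different representatives of the same lines give the same subspace.
P_flipped = projection_onto_sum([[-3.0 * x for x in v] for v in raw])
```

Here `P` and `P_flipped` agree up to rounding, and the trace of `P` equals $k$, the dimension of the sum.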


I will extend my comment above.

What you need is a probability measure $\mu$ on the Grassmannian $Gr(d, k)$ (endowed with the Borel $\sigma$-algebra). It is then meaningful to talk about the probability of your random subspace lying in a given open subset of the Grassmannian.
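To connect this with the random-variable formulation in the question (the probability space $(\Omega,\mathcal F,\mathbb P)$ and the notation $\mathcal B$ for the Borel $\sigma$-algebra are introduced here only for illustration): a random subspace $X:\Omega\to Gr(d,k)$ determines such a measure as its law,

$$\mu(A)=\mathbb P(X\in A)=\mathbb P\bigl(X^{-1}(A)\bigr),\qquad A\in\mathcal B(Gr(d,k)).$$

Conversely, any Borel probability measure $\mu$ on $Gr(d,k)$ is the law of some random subspace, for instance of the identity map on the probability space $(Gr(d,k),\mathcal B(Gr(d,k)),\mu)$. So choosing a random subspace and choosing $\mu$ amount to the same thing.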

The natural choice is to view the Grassmannian as a quotient $\operatorname{GL}_d(\Bbb K)/P_{d - k, k}$, where $\Bbb K$ is your coefficient field (i.e. $\Bbb K = \Bbb R$ if you only consider real vector spaces), and $P_{d - k, k}$ is the parabolic subgroup consisting of all block upper-triangular matrices with block sizes $d - k$ and $k$ on the diagonal, i.e. $$P_{d - k, k} = \left\{\begin{pmatrix}A & B\\0 & D\end{pmatrix}:A\in\operatorname{GL}_{d - k}(\Bbb K), D \in \operatorname{GL}_k(\Bbb K)\right\}.$$

You would naturally want to define a $\operatorname{GL}_d(\Bbb K)$-invariant measure on this quotient. However, there is no such measure when $0 < k < d$, because $\operatorname{GL}_d(\Bbb K)$ is unimodular, while the parabolic subgroup $P_{d - k, k}$ isn't. (It is possible, in this case, to define a "twisted" version of Haar measure, but you will not be able to integrate it against a function on the Grassmannian, so it's not useful to us.)
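Here is a sketch of the computation behind that claim, with $\Delta$ denoting the modular function and $|\cdot|$ the absolute value on $\Bbb K$ (notation introduced here). The standard criterion is that a quotient $G/H$ of a locally compact group by a closed subgroup carries a nonzero $G$-invariant Radon measure if and only if $\Delta_G|_H=\Delta_H$. Conjugating the unipotent part of $P_{d-k,k}$ by $p=\begin{pmatrix}A & B\\0 & D\end{pmatrix}$ gives

$$p\begin{pmatrix}I & X\\0 & I\end{pmatrix}p^{-1}=\begin{pmatrix}I & AXD^{-1}\\0 & I\end{pmatrix},$$

and the linear map $X\mapsto AXD^{-1}$ on $(d-k)\times k$ matrices has determinant $(\det A)^{k}(\det D)^{-(d-k)}$. Hence, up to the choice of convention (which may replace the expression by its inverse),

$$\Delta_{P_{d-k,k}}(p)=|\det A|^{k}\,|\det D|^{-(d-k)},$$

which is not identically $1$ when $0<k<d$, whereas $\Delta_{\operatorname{GL}_d(\Bbb K)}\equiv 1$. For $d=2$, $k=1$ this is the familiar $|a/d|$ on upper-triangular matrices $\begin{pmatrix}a & b\\0 & d\end{pmatrix}$.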

However, in the case $\Bbb K = \Bbb R$, there is a way to overcome this. The point is that requiring it to be $\operatorname{GL}_d$-invariant is perhaps too strong and unnecessary. Instead, we can use the Iwasawa decomposition to rewrite the Grassmannian as $O_d(\Bbb R)/(O_{d - k}(\Bbb R) \times O_k(\Bbb R))$. Now every group is compact and hence unimodular, so the Haar measure of $O_d(\Bbb R)$ induces an invariant measure on this quotient. It is $O_d(\Bbb R)$-invariant, and normalizing it to total mass $1$ gives a probability measure.
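Spelled out, with notation introduced here for illustration: let $\nu$ be the Haar measure on $O_d(\Bbb R)$ normalized so that $\nu(O_d(\Bbb R))=1$, and let $V_0=\operatorname{span}(e_{d-k+1},\dots,e_d)$, whose stabilizer in $O_d(\Bbb R)$ is $O_{d-k}(\Bbb R)\times O_k(\Bbb R)$. The measure on the Grassmannian is the pushforward of $\nu$ under $g\mapsto gV_0$:

$$\mu(A)=\nu\bigl(\{g\in O_d(\Bbb R): gV_0\in A\}\bigr).$$

Invariance follows from the left invariance of $\nu$: for $h\in O_d(\Bbb R)$,

$$\mu(hA)=\nu\bigl(\{g: h^{-1}gV_0\in A\}\bigr)=\nu\bigl(h\cdot\{g': g'V_0\in A\}\bigr)=\nu\bigl(\{g': g'V_0\in A\}\bigr)=\mu(A).$$

Since $O_d(\Bbb R)$ is compact and acts transitively on $Gr(d,k)$, this $\mu$ is the unique $O_d(\Bbb R)$-invariant Borel probability measure, which is the usual meaning of a "uniformly distributed" random subspace.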

It should be easy to extend this method to other local fields, e.g. $\Bbb K = \Bbb C$ or $\Bbb Q_p$ or $\Bbb F_p((t))$.