Is the restriction of multivariate Gaussian PDF to one dimension a constant multiple of a one dimensional Gaussian PDF?


Let $X:=(X_1,\dots,X_k)$ be a $k$-dimensional multivariate Gaussian random vector with mean $\mu$ and covariance matrix $\Sigma$, which is positive definite. Consider its PDF

$$p_X(x):=\frac{1}{\sqrt{(2\pi)^k\det\Sigma}}\exp\Big(-\frac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\Big).$$

We know that $\forall v\in \mathbb{R}^k, \forall b \in \mathbb{R},$ the random variable $W:=\langle X,v\rangle+b$ is a Gaussian random variable. So this means that the translated projections of multivariate Gaussian random vectors are Gaussian random variables - this is clear.

My questions are:

What if instead of projections of $X$, we try to restrict the Gaussian PDF of $X$?

  1. Consider the function $\gamma: \mathbb{R}\to\mathbb{R}^k,\ \gamma(t):=ta+b$, where $a, b \in \mathbb{R}^k$ and $a\neq 0$, which represents a straight line not necessarily passing through the origin. Now let's restrict the PDF of $X$ to this line: consider $q:\mathbb{R}\to (0, \infty),\ q(t):=p_X(\gamma(t))=p_X(ta + b).$ Intuitively, looking at the graph of $q$, it seems to me that $q$ should be a constant multiple of a one-dimensional Gaussian PDF (thanks Kurt G. for his comment); see the sketch:

[sketch omitted: the surface of $p_X$ sliced along the line $\gamma$, giving a bell-shaped curve]

My calculations are as follows:

$-\frac{1}{2}(ta+b-\mu)^{T}\Sigma^{-1}(ta+b-\mu)=-\frac{1}{2}\left(t^2a^{T}\Sigma^{-1}a + 2ta^{T}\Sigma^{-1}(b-\mu) + (b-\mu)^{T}\Sigma^{-1}(b-\mu)\right) $

So, calling $P_2(t):=(ta+b-\mu)^{T}\Sigma^{-1}(ta+b-\mu)=t^2a^{T}\Sigma^{-1}a + 2ta^{T}\Sigma^{-1}(b-\mu) + (b-\mu)^{T}\Sigma^{-1}(b-\mu),$

we can always write $q(t)$ as $Ce^{-\frac{1}{2}P_2(t)}$, where $P_2(t)$ is a quadratic polynomial in $t$ whose coefficient of $t^2$ (namely $a^{T}\Sigma^{-1}a$) is positive because $\Sigma^{-1}$ is positive definite and $a\neq 0$, and $C>0$ is a constant. Thus $q(t)$ should be a constant multiple of a Gaussian PDF - correct me if I'm wrong? P.S. I'm not saying that this new PDF will have mean zero.

  2. If $q$ does represent a constant multiple of a Gaussian, say $G$, in one dimension, then is there a way to connect $G$ with $W=\langle X,v\rangle + c = X^{T}v + c$ for some $v\in \mathbb{R}^k, c\in \mathbb{R}?$ Put somewhat generally, is there a way we can connect the two concepts: (i) restricting the PDF of a Gaussian random vector and (ii) projecting and translating that vector?
  3. Motivated by question 1, I'm tempted to define a multivariate Gaussian as a random vector in $\mathbb{R}^k$ whose PDF, restricted to any straight line in $\mathbb{R}^k,$ is a constant multiple of some real-valued Gaussian PDF. Would this be a wrong alternative definition? In essence, I'm thinking that this theorem is true: let $f:\mathbb{R}^k\to [0,\infty)$ be a PDF such that $\forall a,b\in \mathbb{R}^k$ with $a\neq 0,$ the function $q(t):=f(ta+b)$ is proportional to some $e^{-\frac{1}{2}{P_2(t)}},$ where $P_2(t)$ is a quadratic in $t$ with positive coefficient of $t^2.$ Then $f$ must be a Gaussian PDF on $\mathbb{R}^k.$ Is this true?
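The completing-the-square computation in question 1 can be checked numerically. The sketch below is a hypothetical 3-D example (the particular `mu`, `Sigma`, `a`, `b` are made up random choices): it normalises the slice $q(t)=p_X(ta+b)$ on a grid and compares it with the Gaussian PDF of mean $-\beta/\alpha$ and variance $1/\alpha$, where $\alpha=a^{T}\Sigma^{-1}a$ and $\beta=a^{T}\Sigma^{-1}(b-\mu)$ are the coefficients read off from $P_2$:

```python
import numpy as np

# Hypothetical 3-D example: completing the square predicts that the
# normalised slice q(t) = p_X(ta + b) is the Gaussian pdf with mean
# -beta/alpha and variance 1/alpha, where alpha = a^T Sigma^{-1} a
# and beta = a^T Sigma^{-1} (b - mu).
rng = np.random.default_rng(0)
k = 3
mu = rng.normal(size=k)
M = rng.normal(size=(k, k))
Sigma = M @ M.T + k * np.eye(k)        # positive definite covariance
Sinv = np.linalg.inv(Sigma)
a = rng.normal(size=k)                 # direction of the line (a != 0)
b = rng.normal(size=k)                 # a point on the line

def p_X(x):
    z = x - mu
    return np.exp(-0.5 * z @ Sinv @ z) / np.sqrt((2 * np.pi) ** k * np.linalg.det(Sigma))

alpha = a @ Sinv @ a
beta = a @ Sinv @ (b - mu)
m, var = -beta / alpha, 1.0 / alpha    # predicted 1-D mean and variance

t = np.linspace(m - 8 * np.sqrt(var), m + 8 * np.sqrt(var), 4001)
dt = t[1] - t[0]
q = np.array([p_X(ti * a + b) for ti in t])
q /= q.sum() * dt                      # normalise the slice
gauss = np.exp(-0.5 * (t - m) ** 2 / var) / np.sqrt(2 * np.pi * var)
print(np.max(np.abs(q - gauss)))       # tiny: the slice is that Gaussian
```

The agreement is up to quadrature error only, so the normalised slice really is the predicted one-dimensional Gaussian.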

There are 3 best solutions below


Too long for a comment.

In my first point I meant $q(t_0-t)=q(\color{red}{t-t_0})\,.$ You need the exponent of $q=p\circ\gamma$ to be a quadratic function of $t-t_0\,.$ Take $$ p(x,y)=\frac{1}{2\pi\sqrt{1-\rho^2}}\exp\Big(-\frac{x^2-2xy\rho+y^2}{2(1-\rho^2)}\Big)\,,\quad\gamma(t)={t\choose 1}\,. $$ Plugging $\gamma(t)$ into $p$ turns the numerator in the exponential into the polynomial $$ -t^2+2t\rho-1\,. $$ If there were a $t_0$ such that this takes the form $-(t-t_0)^2$, then $t^2-2t\rho+1$ would have a double root, which is the case only for $\rho=\pm 1\,.$
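The example above can be verified numerically. The sketch below (plain NumPy, with $\rho=0.3$ an arbitrary choice) recovers the quadratic $t^2-2\rho t+1$ from $\log p(\gamma(t))$ and checks that its discriminant $4\rho^2-4$ is negative for $|\rho|<1$, so it has no real double root:

```python
import numpy as np

# With gamma(t) = (t, 1), the numerator of the exponent of p is
# -(t^2 - 2*rho*t + 1); it equals -(t - t0)^2 only if t^2 - 2*rho*t + 1
# has a double root, i.e. only if its discriminant 4*rho^2 - 4 vanishes.
rho = 0.3
def p(x, y):
    return np.exp(-(x**2 - 2*x*y*rho + y**2) / (2*(1 - rho**2))) \
           / (2*np.pi*np.sqrt(1 - rho**2))

t = np.linspace(-5, 5, 201)
# Recover the numerator polynomial from log p(gamma(t)).
numer = -2*(1 - rho**2) * (np.log(p(t, 1)) + np.log(2*np.pi*np.sqrt(1 - rho**2)))
coeffs = np.polyfit(t, numer, 2)       # expect [1, -2*rho, 1]
print(np.round(coeffs, 6))
disc = (2*rho)**2 - 4                  # < 0 for |rho| < 1: no double root
print(disc)
```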

  1. Yes: the exponent $\frac{1}{2}P_2(t)$ is a one-variable quadratic with positive leading coefficient, so $q(t)=Ce^{-\frac{1}{2}P_2(t)}$ is a constant multiple of a one-variable Gaussian PDF (whose mean is generally not zero).

  2. I think that your conjecture 2 can be proved correct most easily by a somewhat indirect method: looking at the Fourier transform of the 2D Gaussian and applying the slice/projection duality properties of the Fourier transform.

https://en.wikipedia.org/wiki/Projection-slice_theorem

(This works out nicely because of this principle: the Fourier transform (and its inverse) "map Gaussians to constant multiples of Gaussians".)

  3. To prove your conjecture 3 I would start by making use of various equivalent definitions of the multivariate normal: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#Equivalent_definitions

and then use the results of 1 and 2 above.
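The principle quoted in point 2 can be illustrated numerically: the sketch below approximates the continuous Fourier transform of the Gaussian $e^{-x^2/(2s^2)}$ with an FFT and compares it with the closed form $s\sqrt{2\pi}\,e^{-s^2\omega^2/2}$ (the width $s$ and grid sizes are arbitrary choices):

```python
import numpy as np

# The Fourier transform maps a Gaussian to a constant multiple of a
# Gaussian: approximate F(w) = integral of exp(-x^2/(2 s^2)) e^{-i w x} dx
# on a grid and compare with s*sqrt(2*pi)*exp(-s^2 w^2 / 2).
s = 1.3
n, L = 4096, 40.0
x = np.linspace(-L/2, L/2, n, endpoint=False)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2*s**2))

w = 2*np.pi*np.fft.fftfreq(n, d=dx)
# Continuous FT via FFT: correct for the grid starting at x[0], not 0.
F = np.fft.fft(f) * dx * np.exp(-1j*w*x[0])
exact = s*np.sqrt(2*np.pi)*np.exp(-(s**2)*w**2/2)
print(np.max(np.abs(F.real - exact)))  # tiny: FT of a Gaussian is Gaussian
```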


Yes, the 'slice' of the Gaussian pdf taken in this way, when normalised, is the pdf of a one-dimensional Gaussian.

The line $\{at + b : t \in \mathbb{R}\}$ may be represented as the set of points $x \in \mathbb{R}^k$ satisfying $C(x-b) = 0$, where $C \in \mathbb{R}^{(k-1)\times k}$ is a matrix whose rows are a basis for the ($(k-1)$-dimensional) orthogonal complement of the span of $a$.

Now what does the (normalised) slice of the pdf of $X \sim \mathcal{N}_k(\mu,\Sigma)$ along this line represent? This would be the pdf of the distribution of $X$ conditional on $C(X-b)=0$. This event has zero probability, but we make sense of it through Gaussian regression.

Since $X$ is multivariate Gaussian, the random vector $(X,Y) := (X,CX) \in \mathbb{R}^k \times \mathbb{R}^{k-1} \cong \mathbb{R}^{2k-1}$ is normally distributed with mean $(\mu,C\mu)$ and covariance matrix \begin{pmatrix} \Sigma & \Sigma C^T \\ C \Sigma & C\Sigma C^T \end{pmatrix}.

Gaussian regression says that $X \overset{d}{=} AY + Z$, where $A \in \mathbb{R}^{k \times (k-1)}$ is some constant matrix and $Z \sim \mathcal{N}_k(\eta,\Xi)$ is some Gaussian independent of $Y$. Some algebra allows us to determine these quantities in terms of $C$, $\mu$ and $\Sigma$: \begin{align} A &= \Sigma C^T ( C \Sigma C^T )^{-1} \\ \eta &= (I - \Sigma C^T (C\Sigma C^T)^{-1} C) \mu \\ \Xi &= (I - \Sigma C^T (C\Sigma C^T)^{-1} C) \Sigma \end{align} ($C\Sigma C^T$ is positive definite since $\Sigma$ is, hence it is invertible). So the conditional distribution of $X$ given $C(X-b)=0$ is that of $ACb + Z$, which is a normal distribution.

As a sanity check, one might want to make sure that the resulting distribution for $ACb+Z$ that we have obtained is independent of the choice of $C$. It is sufficient to check that $Q(C):=C^T(C \Sigma C^T)^{-1} C$ is independent of $C$. Different choices of $C$ are related to each other via $C' = RC$, where $R \in GL_{k-1}(\mathbb{R})$, and so it is not difficult to see that $Q(C)=Q(C')$.
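This invariance is easy to check numerically; the sketch below uses a hypothetical 4-D example with a random full-row-rank $C$ and a random invertible $R$:

```python
import numpy as np

# Check that Q(C) = C^T (C Sigma C^T)^{-1} C is invariant under C -> R C
# for invertible R, so the conditional law does not depend on which basis
# of the orthogonal complement we pick.
rng = np.random.default_rng(2)
k = 4
M = rng.normal(size=(k, k))
Sigma = M @ M.T + k * np.eye(k)                          # positive definite
C = rng.normal(size=(k - 1, k))                          # full row rank a.s.
R = rng.normal(size=(k - 1, k - 1)) + 3 * np.eye(k - 1)  # invertible a.s.

def Qmat(C):
    return C.T @ np.linalg.inv(C @ Sigma @ C.T) @ C

print(np.max(np.abs(Qmat(C) - Qmat(R @ C))))  # ~0: Q(C) = Q(RC)
```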

Now this is what we get when we 'keep' the slice in the space $\mathbb{R}^k$; its density, viewed as that of an $\mathbb{R}^k$-valued random variable, is degenerate. However, there is an affine map $P : \mathbb{R}^k \rightarrow \mathbb{R}$ which sends each point $x$ on $\gamma$ to the unique $t$ for which $at+b=x$. One such map: send $v \mapsto v-b$, apply a rotation taking $\gamma$ to the first coordinate axis, rescale appropriately, and project onto the first coordinate; such a map is not unique. After applying $P$ to $ACb+Z$, we get a real-valued random variable whose density is $q/||q||_1$.

Since we have obtained this random variable only through affine transformations and Gaussian regression, it is also Gaussian. Hence $q/||q||_1$ is the pdf of a Gaussian random variable.
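The whole argument admits a numerical sanity check. The sketch below is a hypothetical 3-D example; $P(x)=a^{T}(x-b)/\|a\|^{2}$ is one concrete choice of the affine map described above (it sends $ta+b$ to $t$). We build $C$, form $A$, $\eta$, $\Xi$ by Gaussian regression, push the conditional law through $P$, and compare the resulting Gaussian density with the normalised slice $q/||q||_1$:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 3
mu = rng.normal(size=k)
M = rng.normal(size=(k, k))
Sigma = M @ M.T + k * np.eye(k)         # positive definite
a = rng.normal(size=k)                  # direction of the line
b = rng.normal(size=k)                  # point on the line

# Rows of C: an orthonormal basis of span(a)^perp, from a complete QR.
Qfull, _ = np.linalg.qr(a.reshape(-1, 1), mode='complete')
C = Qfull[:, 1:].T                      # (k-1) x k, with C a = 0

# Gaussian regression: X = A Y + Z with Y = C X, Z ~ N(eta, Xi).
S = C @ Sigma @ C.T
A = Sigma @ C.T @ np.linalg.inv(S)
eta = (np.eye(k) - A @ C) @ mu
Xi = (np.eye(k) - A @ C) @ Sigma

# Conditional law of X given C(X - b) = 0, pushed through
# P(x) = a^T (x - b) / |a|^2, which maps ta + b to t.
mean_x = A @ (C @ b) + eta
na2 = a @ a
m_t = a @ (mean_x - b) / na2
v_t = a @ Xi @ a / na2**2

# Compare with the normalised slice q(t) = p_X(ta + b) / ||q||_1.
Sinv = np.linalg.inv(Sigma)
def p_X(x):
    z = x - mu
    return np.exp(-0.5 * z @ Sinv @ z) / np.sqrt((2*np.pi)**k * np.linalg.det(Sigma))

t = np.linspace(m_t - 8*np.sqrt(v_t), m_t + 8*np.sqrt(v_t), 4001)
dt = t[1] - t[0]
q = np.array([p_X(ti * a + b) for ti in t])
q /= q.sum() * dt                       # normalise the slice
gauss = np.exp(-0.5*(t - m_t)**2 / v_t) / np.sqrt(2*np.pi*v_t)
print(np.max(np.abs(q - gauss)))        # ~0: the two densities agree
```

The agreement (up to quadrature error) confirms that the normalised slice coincides with the Gaussian density obtained by conditioning and applying $P$.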