Proving that Mahalanobis norm is a norm indeed

4.4k Views Asked by At

While reading this thread I wanted to prove that Mahalanobis norm $\lVert{x-y}\rVert_m$ is a norm indeed. The norm is defined like so: $\lVert{x-y}\rVert_m=d(x-y,0)=d(x,y)=\sqrt{(x-y)^T S (x-y)}=\sqrt{(x-y,x-y)_m}$, where $(x,y)_m$ is the Mahalanobis inner product, and $(x,y)_m=x^T S y$. To proceed, I need to demonstrate the properties that every norm should satisfy, they are:

  1. $\lVert{a x}\rVert_m = |a| \lVert{x}\rVert_m$, for $a \in \mathbb{R}$
  2. $\lVert{x}\rVert_m = 0 \iff x=0$
  3. $0 \le \lVert{x}\rVert_m$
  4. $\lVert{x+y}\rVert_m \le \lVert{x}\rVert + \lVert{y}\rVert_m$

Although I was able to prove 1-3, I haven't been able to solve triangle inequality. Using $\lVert{x+y}\rVert_m^2 \le (\lVert{x}\rVert_m + \lVert{y}\rVert_m)^2$ I came up with a expression similar to Holder's inequality:

$$ y^T S x + x^T S y \le 2 {\sqrt{x^T S x}} {\sqrt{y^T S y}}, $$

From there I can argue that triangle's inequality is true because it became Holder's inequality, which is true too (I can cite a reference or dig deeper to prove Holder's inequality). But, how can I make the jump to Holder's inner product notation $|(x,y)_m|\le\|x\|\|y\|$ without saying that $\lVert{x}\rVert_m^2=(x,x)_m$? (the inner product is induced by the norm, but I haven't proved that yet, so I can't use it). Also, do I need to make any assumptions regarding the invertibility of the covariance matrix $S$ to ensure all this reasoning makes sense?

1

There are 1 best solutions below

1
On BEST ANSWER

Let $S$ be a symmetric $n\times n$ matrix. The Mahalanobis norm is defined like so: $$ \|x\|_S=\sqrt{x^TSx}. $$ Necessarily, for $\|\cdot \|_S$ to be a norm, $S$ has to be positive-definite. By spectral theorem for symmetric matrices (see here or here), there are a diagonal $n\times n$ matrix $\Lambda=\mathrm{diag}(\lambda_1,\lambda_2,\ldots,\lambda_n)$ and an orthogonal $n\times n$ matrix $Q$ (i.e. $ Q^TQ=I$), such that $Q^T=Q^{-1}$ and: $$ S=Q^T\Lambda Q. $$ By definition, the matrix $S$ is positive-definite. Then: \begin{align*} \lambda_1 > 0 \\ \lambda_2> 0 \\ \ldots\\ \lambda_n> 0. \end{align*} Let the matrix: $$ U=\mathrm{diag}(\sqrt{\lambda_1},\sqrt{\lambda_2},\ldots,\sqrt{\lambda_n}) \, Q, $$ note that: $$ S=U^TU. $$ Set $\tilde{x}=Ux$ and $\tilde{y}=Uy$. Let $\|v \|_E=\sqrt{v_1^2+v_2^2+\ldots+v_2^2}$ the usual euclidean norm. Then: \begin{align*} \| x \|_S= \|\tilde{x}\|_E && (\ast)\\ \| y \|_S= \|\tilde{y}\|_E && (\ast\ast)\\ \|x+y \|_S= \|\tilde{x}+\tilde{y}\|_E. &&(\ast\ast\ast)\\ \end{align*} By usual triangular inequality we have : $$ \|\tilde{x}+\tilde{y}\|_E \leq \|\tilde{x}\|_E +\|\tilde{y}\|_E. $$ By equalities $(\ast)$, $(\ast\ast)$ and $(\ast\ast\ast)$ we have: $$ \|x+y \|_S \leq \| x \|_S+\| y \|_S $$