Computing the "Mean Value" of a Point Sample From an Arbitrary Manifold


A friend of mine noticed that taking the "mean" of two points on the circle isn't as easy as just computing the arithmetic mean of their arguments: If one point has argument $-3.13$ radians and one point has argument $3.13$, the mean value of the arguments is zero, while it would be more intuitive to have the "mean" around (roughly) $3.14$. Of course, zero is also a suitable mean value for good reasons, but $3.14$ feels closer visually, because it minimizes the distances to the sample points. Also, note that means may not be unique, as two antipodal points have two visually appealing candidates for means.
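For the circle specifically, a common trick is to average the sample points as unit vectors in the plane and take the argument of the resulting sum. This "extrinsic" mean direction is not in general the same thing as the distance-minimizing mean discussed below, but it matches the intuition in the example (function name and hedging are mine, a sketch, not the algorithm from this post):

```python
import math

def circular_mean(angles):
    """Average the points as unit vectors in the plane and return
    the argument of the resulting sum (undefined if the sum is zero,
    e.g. for two exactly antipodal points)."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

# The two points from the example, with arguments -3.13 and 3.13:
print(circular_mean([-3.13, 3.13]))  # close to pi, not 0
```

Note how the antipodal case shows up here as a zero vector sum, mirroring the non-uniqueness mentioned above.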

Needless to say, this problem can be solved algorithmically, which I have done with an ad-hoc $O(n\log n)$ algorithm, but that is out of scope for this post (so I'll pull a Fermat and leave it as an exercise for the ambitious reader).

What I wonder instead is how one can generalize this notion of "taking means" to points sampled from arbitrary manifolds. By manifolds I mean Riemannian manifolds specifically, so that a local metric is available. If any other type of manifold would work just as well or better, feel free to comment. What I want, then, is a way to take a sample of points $\{p_1,\ldots,p_n\}$ on a manifold $M$ and compute a point $p \in M$ that minimizes the sum of the deviations between the sample and $p$, where the deviations are measured as (oriented?) geodesic distances along $M$. Or something of the sort, at least; sorry if this is a vague formulation.

Intuitively, I can imagine a lot of nonlinear optimization, optimal control theory and possibly calculus of variations coming into play, but I can only speculate at this point. Are there any references out there where I can read up on how to compute these "means"?

Best Answer

It turns out that there indeed exists such a notion of mean, see these slides by Fletcher for example: http://www2.imm.dtu.dk/projects/manifold/Pres/fletcher.pdf

To summarize the most important details (and avoid comments about having to do so), we can define the intrinsic mean (also known as the Fréchet or Karcher mean) as the point on the manifold minimizing the sum of the squared distances to the sample, or with a formula, for $M$ a manifold and $\mu$ the intrinsic mean: $$\mu = \arg\min _{p\in M} \sum_{i=1}^n d(p_i,p)^2,$$ where $d$ is geodesic distance along $M$ and $p,p_i$ follow the notation in the question. I suppose the main motivation for this definition is that we want a mean with similar properties to the usual arithmetic mean in Euclidean spaces, one of which is precisely this mean-square distance minimization property.

Should this mean point exist on the manifold, you can compute it with gradient descent, as follows:

  1. $\mu_0 := p_1$
  2. While not satisfied:
  3. $\ \ \ \ \Delta \mu := \frac{1}{n}\sum_{i=1}^n \text{Log}_{\mu_k}p_i$
  4. $\ \ \ \ \mu_{k+1} := \text{Exp}_{\mu_k}(\Delta \mu),$

where Exp and Log are the Riemannian exponential map and its (local) inverse.
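As a concrete sanity check, here is a minimal sketch of the iteration above on the unit sphere $S^2 \subset \mathbb{R}^3$, where Exp and Log have closed forms. The function names, tolerances, and iteration cap are my own choices, not from the slides:

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: the tangent vector at p whose
    geodesic reaches q (assumes q is not antipodal to p)."""
    cos_t = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(p)
    u = q - cos_t * p                    # component of q orthogonal to p
    return theta * u / np.linalg.norm(u)

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p in
    tangent direction v for arc length |v|."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def intrinsic_mean(points, iters=100):
    """Gradient descent from the answer:
    mu_{k+1} = Exp_mu(mean of Log_mu(p_i))."""
    mu = points[0]
    for _ in range(iters):
        delta = np.mean([sphere_log(mu, p) for p in points], axis=0)
        mu = sphere_exp(mu, delta)
        if np.linalg.norm(delta) < 1e-12:
            break
    return mu
```

For two points a quarter turn apart, e.g. $(1,0,0)$ and $(0,1,0)$, the iteration lands on the geodesic midpoint $(1,1,0)/\sqrt{2}$, as one would hope.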

We can also define the geometric median to be the point that minimizes the sum of distances, without squaring the distances in the sum: $$m = \arg\min _{p\in M} \sum_{i=1}^n d(p_i,p)$$

Gradient descent also works in this case, and may be performed as in the manifold version of the Weiszfeld algorithm:

  1. Choose starting point $m_0$
  2. While not satisfied:
  3. $\ \ \ \ v_k := \sum_{i\in I_k} \frac{\text{Log}_{m_k}(p_i)}{d(m_k,p_i)} \Big/ (\sum_{i\in I_k} d(m_k,p_i)^{-1})$
  4. $\ \ \ \ m_{k+1} := \text{Exp}_{m_k} (\alpha v_k)$

Above, $\alpha$ is the step size and $I_k=\{\, i \mid p_i\neq m_k \,\}$, i.e. we exclude the indices of points where a division by zero would otherwise occur.
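To make the iteration concrete, here is a minimal sketch on the circle (the setting of the original question), where $\text{Log}_m(p)$ is the signed angular difference wrapped to $(-\pi,\pi]$, $\text{Exp}_m(v)$ just adds and re-wraps, and $d(m,p)=|\text{Log}_m(p)|$. Function names, the starting point, and tolerances are my own choices:

```python
import math

def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def geometric_median_circle(angles, alpha=1.0, iters=500):
    """Weiszfeld iteration on the circle: Log_m(p) = wrap(p - m),
    Exp_m(v) = wrap(m + v), d(m, p) = |wrap(p - m)|."""
    m = wrap(angles[0] + 0.1)  # arbitrary starting point
    for _ in range(iters):
        num, den = 0.0, 0.0
        for a in angles:
            v = wrap(a - m)
            d = abs(v)
            if d < 1e-12:      # exclude samples coinciding with m_k
                continue
            num += v / d       # unit "direction" from m_k toward a
            den += 1.0 / d
        if den == 0.0:
            break
        v_k = num / den
        m = wrap(m + alpha * v_k)
        if abs(v_k) < 1e-12:
            break
    return m
```

For an odd number of points on a short arc this recovers the middle point, the circle analogue of the one-dimensional median.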

As for existence and uniqueness of the two notions defined above, both exist and are unique if either:

  1. the sectional curvatures of $M$ are bounded above by $\Delta>0$ and $\text{diam}(U)< \frac{\pi}{2\sqrt\Delta}$, or
  2. the sectional curvatures of $M$ are nonpositive,

where $U$ is a (geodesically) convex subset of $M$ assumed to contain the sample.

As for finding them, the Manifold version of the Weiszfeld Algorithm presented above converges to the theoretical median if the sectional curvatures of $M$ are nonnegative, the existence/uniqueness conditions hold, and $0<\alpha\leq 2$.

For further information, the slides linked to above contain a few references.