Showing that mean of vectors minimizes the sum of the squared distances.

Question

Showing that mean of vectors minimizes the sum of the squared distances.

4.1k Views Asked by Bumbble Comm At 31 Mar 2026 - 2:20

Let $S=\{x_1,...,x_n\}$ be a set of vectors in $\mathbb{R}^d$. Now we have to pick a vector $\mu$, such that the following expression is minimized:

$$ L(\mu)=\sum_{x\in S} ||x-\mu||_2^2. $$

I think that the best option to minimize $L(\mu)$, would be to use the mean of the vectors as $\mu$. However, I can't prove why this is so.

Here's a complete answer that I solved thanks to the help of David K.

$$ \begin{align*} \mu^*&= \arg\min_\mu \sum_{x\in S} ||x-\mu||_2^2 \\ &= \arg\min_\mu \sum_{x\in S} <x-\mu,x-\mu> \\ &= \arg\min_\mu \sum_{x\in S} \big(<x,x> -2<x,\mu>+<\mu,\mu>\big) \\ &= \arg\min_\mu n<\mu,\mu>-2\sum_{x\in S}<x,\mu> \\ &= \arg\min_\mu n<\mu,\mu>-2n\left<\frac{1}{n}\sum_{x\in S}x,\mu\right> \\ &= \arg\min_\mu <\mu,\mu>-2<\overline{x},\mu> \\ &= \arg\min_\mu <\mu,\mu>-2<\overline{x},\mu> + <\overline{x},\overline{x}> \\ &= \arg\min_\mu <\mu-\overline{x},\mu-\overline{x}> \\ &= \arg\min_\mu ||\mu-\overline{x}||_2^2. \end{align*} $$ The $||\cdot ||_2$ can never be smaller than $0$. Therefore, choosing $\mu=\overline{x}$ minimizes the expression, as $||\mu-\overline{x}||_2^2$ becomes $0$. Hence, $\mu^*=\overline{x}$.

Original Q&A

There are 2 best solutions below

Bumbble Comm On 01 May 2016 - 3:53

Since $S = \{x_1, ..., x_n\}$ is finite let $L(\mu) = \sum_{i = 1}^{n} ||x_i-\mu||_2^2$, which is a function from $\mathbb{R}^d$ to $\mathbb{R}$. In the following I will use upper indices to denote the respective components of the vectors involved, e.g. $x_1 = (x_1^1, x_1^2, ... x_1^d)$. Calculating the partial derivatives of $L(\mu)$ with respect to $\mu^k$ where $1 \leq k \leq d$ we get

$\frac{\partial L}{\partial \mu^k} = 2n\mu^k - 2\sum_{i=1}^{n} x_i^k$. Now, if we want this to vanish we get

$2n\mu^k - 2\sum_{i=1}^{n} x_i^k = 0 \Leftrightarrow \mu^k = 1/n \cdot \sum_{i=1}^{n} x_i^k$.

That is $\mu = \frac{1}{n} \begin{pmatrix} \sum_{i=1}^{n} x_i^1 \\...\\ \sum_{i=1}^{n} x_i^d \end{pmatrix} = \frac{1}{n} \sum_{j=1}^{n} x_j$.

Now you need to go on and prove this gives a global minima for $L$.

**Bumbble Comm** · Accepted Answer

Since the symbol $\mu$ is already in use in the question, let's write $\bar x$ to denote the mean of the vectors in $S$; that is, $$ \bar x = \frac1n \sum_{x\in S} x. $$

Then by the linearity of the inner product, \begin{align} \sum_{x\in S} \langle x, \mu\rangle &= \left\langle \sum_{x\in S} x, \mu\right\rangle \\ &= n \left\langle \frac1n\sum_{x\in S} x, \mu\right\rangle \\ &= n \left\langle \bar x, \mu\right\rangle. \\ \end{align}

With this, you can eliminate the individual $x$s from your last formula, leaving only $\mu$ and $\bar x$.

Now consider the quantity $\lVert \mu - \bar x\rVert_2^2$. That's something that clearly minimized when $\mu = \bar x$, since the norm $\lVert \cdot\rVert_2$ can never be less than zero. That is, $$ \arg\min_\mu \, \lVert \mu - \bar x\rVert_2^2 = \bar x \tag1 $$ So it would be really convenient if we could reduce your minimization problem to something that looks like the left-hand side of Equation $(1)$.

Now consider some of the techniques you already used in your first attempt. You know that $\lVert \mu - \bar x\rVert_2^2 = \lVert\mu\rVert_2^2 - 2 \langle\mu,\bar x\rangle + \lVert\bar x\rVert_2^2$, and you know you can add or subtract a constant from the value inside the $\arg\min$ without changing the $\mu$ that minimizes the value. Also notice that in your second attempt, you found that $$ \mu^* = \arg\min_\mu \, (- 2 \langle\bar x,\mu\rangle + \lVert\mu\rVert_2^2). $$

At this point, you're just a couple of steps away from showing that $\mu^* = \bar x$. (I'm trying not to spoil this too much, because it's so much fun when a problem resolves like this, especially when you get to make the final "aha!" step yourself.)

Showing that mean of vectors minimizes the sum of the squared distances.

There are 2 best solutions below

Related Questions in OPTIMIZATION

Related Questions in MEANS

Trending Questions

Popular # Hahtags

Popular Questions