I saw that there was a question with a very similar (if not identical) flavour to my question, but the answer was derived using derivatives, the method I already knew solved this problem.
I feel that it is "obvious" which value minimizes the sum of squared Euclidean distances from the points, i.e. which $z$ minimizes:
$$\sum^{k}_{i=1}\|x_i - z\|^2$$
I know the solution can be obtained with derivatives and that $z = \frac{\sum^k_{i=1} x_i}{k}$, but even before I tried solving it with derivatives, it seemed "obvious" that this was the case. Solving it with derivatives is the correct approach, but it seemed like overkill for such a simple problem.
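For what it's worth, the claim is easy to sanity-check numerically: scan candidate values of $z$ over a grid and see where the cost bottoms out (the sample values below are arbitrary, just for illustration):

```python
import numpy as np

# Arbitrary sample points; any values work.
x = np.array([1.0, 4.0, 7.0, 10.0])
m = x.mean()

def cost(z):
    """Sum of squared distances from the points to z."""
    return np.sum((x - z) ** 2)

# Scan a grid of candidates; the minimizer lands at (or next to) the mean.
candidates = np.linspace(x.min(), x.max(), 1001)
best = candidates[np.argmin([cost(z) for z in candidates])]
print(best, m)
```

The grid minimizer agrees with the sample mean up to the grid spacing, and $\operatorname{cost}(\bar x)$ is never beaten by any candidate.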
I was wondering if anyone had an intuitive argument for this solution. It just seems so obvious, and there is a way of doing it rigorously, but I was more interested in whether someone knows intuitively why that had to be the solution. Maybe there isn't one, but I am curious to know if someone has an alternative view of the problem/solution.
Thanks in advance!
Here are two ways of viewing it. The second may be (for some people?) more "intuitive":
First way: $$ \sum_{i=1}^k (x_i - z)^2 = \sum_{i=1}^k \Big((x_i - m)^2 + 2(x_i-m)(m-z) + (m-z)^2\Big). $$
In the sum of the middle term, $\displaystyle\sum_{i=1}^k 2(x_i-m)(m-z)$, the factor $2(m-z)$ does not depend on the index $i$, i.e. does not change as $i$ goes from $1$ to $k$, hence this sum is $\displaystyle 2(m-z)\sum_{i=1}^k (x_i-m)$.
That sum is $0$ if and only if $m=\bar x = (x_1+\cdots+x_k)/k$.
In the last term, $\displaystyle\sum_{i=1}^k (m-z)^2$, the whole expression $(m-z)^2$ does not change as $i$ goes from $1$ to $k$, so it's a sum of $k$ terms that are all equal; hence it is $k(m-z)^2$.
Therefore $$ \sum_{i=1}^k (x_i-z)^2 = k(\bar x - z)^2 + \sum_{i=1}^k (x_i-\bar x)^2. $$ Since $z$ appears only in the first term of this last expression, the value of $z$ that minimizes that is the value of $z$ that minimizes the first term.
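The decomposition above is a pure algebraic identity, so it is easy to verify numerically for an arbitrary sample and an arbitrary candidate $z$ (the values below are made up for the check):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # arbitrary sample
z = 2.5                  # arbitrary candidate value
k = len(x)
xbar = x.mean()

# Left side: the original objective at z.
lhs = np.sum((x - z) ** 2)
# Right side: k*(xbar - z)^2 plus the z-free residual sum of squares.
rhs = k * (xbar - z) ** 2 + np.sum((x - xbar) ** 2)

print(lhs, rhs)  # the two sides agree
```

Since only the $k(\bar x - z)^2$ term involves $z$, and it is nonnegative and vanishes at $z = \bar x$, the minimum is at the mean.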
That's one way to show that the least-squares estimate of the population mean is the sample mean.
Second way:
But now let's look at it geometrically: $$ \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} = \begin{bmatrix} \bar x \\ \vdots \\ \bar x \end{bmatrix} + \begin{bmatrix} x_1 - \bar x \\ \vdots \\ x_k - \bar x \end{bmatrix}. $$ The first term on the right is the orthogonal projection of the vector on the left onto the one-dimensional subspace of $\mathbb R^k$ spanned by $(1,\ldots,1)$, i.e. the set of vectors with all coordinates equal. The second term on the right is the orthogonal projection of the same vector onto the complementary $(k-1)$-dimensional subspace. The vector in a subspace that is nearest to a vector not in the subspace, in terms of Euclidean distance, is the orthogonal projection onto the subspace.
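This projection picture can also be checked directly with a small example (the sample vector is arbitrary): project onto the span of the all-ones vector using the standard formula $\operatorname{proj}_u(x) = \frac{\langle x, u\rangle}{\langle u, u\rangle}\,u$, and confirm that the projection is $\bar x\,(1,\ldots,1)$ and the residual is orthogonal to it.

```python
import numpy as np

x = np.array([2.0, 3.0, 7.0])
ones = np.ones_like(x)

# Orthogonal projection of x onto span{(1,...,1)}.
proj = (x @ ones) / (ones @ ones) * ones
resid = x - proj

print(proj)          # every coordinate equals the sample mean
print(proj @ resid)  # the two components are orthogonal (inner product 0)
```

So "the mean minimizes the sum of squared distances" is just "the orthogonal projection is the nearest point in the subspace," specialized to the line of constant vectors.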