Is there an analytical way to solve for $\mu$ to minimize $\sum{|x_i-\mu|^3}$?

83 Views Asked by At

If I want to find a value of $\mu$ such that $\sum{|x_i-\mu|^3}$ is minimized for $x_i \in \mathbb{R}$ with a total of $N$ values of $x_i$, I can calculate the first derivative of the sum as

$3 \sum{(x_i-\mu)^2 * \text{sign}(x_i-\mu)}$

which should be equal to 0 at the true minimum. If I were to represent this more completely, I'd have a unique term for each unique value of $x_i$.

I'm not even sure if there is a unique value of $\mu | X$ for all possible sets of $x_i \in X$.

Is there an analytic way of solving this, or do I need to use a computational solver (e.g., one based on Newton-Raphson)?

Similar Problems

When solving for the case of $\sum(|x_i-\mu|)$, the solution is simply the median (for an odd number of $x_i$) or any value between the two values about the median (for an even number of $x_i$), and for the case of $\sum((x_i-\mu)^2)$, it is simply the average of all $x_i$.

If I extended this to the case for $\sum((x_i-\mu)^4)$, my (expanded) derivative would be

$4\sum({x_i}^3) - 12\mu\sum({x_i}^2) + 12\mu^2\sum({x_i}) - 4N{\mu}^3$

which is a cubic equation that can be solved.

As the exponent goes to infinity, $\mu$ approaches the value that has the smallest maximum distance from any point. As it goes to 0, it takes the value of the points with the most common frequency (not necessarily a unique possible value of $\mu$, just like when solving for absolute value).

1

There are 1 best solutions below

0
On

Within each interval between two values you have fixed signs for the differences, so the absolute value function in the objective may be replaced by the appropriate sign at each data point. You then have a soluble quadratic equation for a zero derivative whose relevant root (check the cubic function value at each critical point to find which is the minimum) is either in the interval or not. If the minimum is in your chosen interval, you're good. Otherwise (including the possibility of finding no root at all) move to a different interval, in the direction where the cubic function you calculated is decreasing.

Let us say your data points are $2,4,5,7,9$. Assume, deliberately wrongly, that the minimum lies between $4$ and $5$ (since the mean $=2$-norm fit is $5.4$ and the average of just the min and max $=\infty$-norm fit, is $5.5$, by "extrapolation" we expect your target to be between $5.4$ and $5.5$). Then $x-\mu$ is negative for the $2$ and $4$ data points and positive for the others, and your objective function is accordingly

$-(2-\mu)^3-(4-\mu)^3+(5-\mu)^3+(7-\mu)^3+(9-\mu)^3$

$=-\mu^3+45\mu^2-405\mu+1125$

whose critical points are then given by

$-3\mu^2+90\mu-405=0, \mu=15\pm3\sqrt{10}$

Since the cubic has a negative leading coefficient, the smaller root corresponds to our minimum but it is greater than $5$ ($\approx5.513$). Our objective function is in fact monotonically decreasing all through the relevant interval $[4,5]$, so a higher interval is required.

Now, try the interval where we expect to find the minimum, between $5$ and $7$. This reverses the sign in the third cube of the objective function

$-(2-\mu)^3-(4-\mu)^3-(5-\mu)^3+(7-\mu)^3+(9-\mu)^3$

$=\mu^3+15\mu^2-255\mu+875$

with zero derivative condition

$3\mu^3+30\mu-255=0,\mu=-5\pm\sqrt{110}$

This time, with the cubic having a positive leading coefficient, the larger root corresponds to the minimum and it is approximately $5.488$, which lies in our new trial interval and moreover between the expected bounds of $5.4$ and $5.5$ set by the overall and max-min averages calculated earlier. Thus $-5+\sqrt{110}$ is the correct optimum.

The smart thing to do is to calculate the overall and max-min averages first, to bound your $3$-norm optimum. The bounds may in fact not be rigorous (I have not proven them so), but they will likely be good for most distributions you will likely encounter. This usually defines a single interval, or at worst a relatively small set of intervals, to provide an initial guess, thus minimizing the need to iterate.