Find minimum sum of distance to all points in a vector, using an accumulator?

59 Views Asked by At

This is question will be about the theorem behind an efficient algorithm, which has been used in LeetCode problem 3086. Let me start by exposing the problem in a friendly way:

You are a bus driver who will pick rockers from the Rock'n Rio festival and take them to a car parking lot, were each rocker will pick one vehicle and leave. They are hippies, so it doesn't matter which car they pick, as long as it's the closest one. But the parking lot is long and linear, with many free and occupied car spaces, and you don't want to make them walk too much.

enter image description here

Since you are good with numbers, you have been requested to do a calculation. Before dropping the $k$ rockers, you must announce the total distance they will have to walk to reach the $k$ closest cars.

For instance, we can represent the parking lot with a list of ones and zeroes, such as this: $(1,1,1,0,0,0,0,1,1)$. In such case, if you sided your bus with the leftmost car and dropped 5 rockers, then their walking effort would be 0+1+2+7+8 = 18. But I know you are smart, so you'll choose the median car, so that the number of cars picked to the left and right is the same, thus minimizing their total walk distance. In this case, by dropping them aside of the 3rd car, their total effort will be: 2+1+0+5+6 = 14.

At least you can check the parking lot and do some annotation in forehand, before the rock festival ends. But you'll only know how many rockers you'll be carrying when the festival is over. You are expected to drop somewhere between 40 to 100 rockers at once, and by then you'll have just a couple of seconds to calculate the number to be announced.

Your son, who is a computer science jerk, came across a solution and presented it to you in steps. First, from the vector $P$ containing the status of parking slots $1...n$, calculate the vector $D$, containing the distance of one of those cars to the first one in the parking lot:

Parking lot:
$P := (1,1,1,0,0,0,0,1,1)$

Distance of each car from the leftmost car:
$D := (0,1,2,7,8)$

Then calculate the vector $V$ with the accumulation of this distance. This vector is zero-indexed and starts with an extra zero.

$$V_i = \begin{cases} 0, & \text{if $i=0$} \\ V_{i-1} + D_i, & \text{if $i>0$} \end{cases} $$

For the example above:
$V := (0,0,1,3,10,18)$

So $V_n$ represents the accumulated distance of cars $1..n$ to the first car, while $V_0$ is always zero (and $V_1$ too by the way).

Now your son told you that anytime you had skipped $s$ cars, the shortest walk distance for the next $k$ cars could be calculated with ease, using just 4 values of $V$, regardless of how many $k$ rockers you were carrying. Such calculation requires the usage of the median indexes $m_1$ and $m_2$:

$$m_1 := s + \lfloor k/2 \rfloor$$ $$m_2 := s + \lceil k/2 \rceil$$ $$dist := V_s + V_{s+k} - V_{m_1} - V_{m_2}$$

For the above example were $V := (0,0,1,3,10,18)$, if you had $k:=3$ rockers and $s:=1$ starting point, then you'd calculate:

$$\begin{align} m_1 &= 1 + \lfloor 3/2 \rfloor\\ &= 2 \end{align}$$ $$\begin{align} m_2 &= 1 + \lceil 3/2 \rceil\\ &= 3 \end{align}$$ $$ \begin{align} dist &= V_1 + V_4 - V_2 - V_3\\ &= 0 + 10 - 1 - 3\\ &= 6 \end{align} $$

Your son also drew an illustration in excel to show you how it works:

enter image description here

While this other image was done to help to understand why the result is 17. The number 17 represents the amount of car spaces walked from the median car to the 3 adjacent cars at left and at right, covering a 7-car range.

enter image description here

Now your bosses and the other bus drivers are very impressed that it works, so they asked you to show them the mathematical theory behind those calculations. How could you explain it?

1

There are 1 best solutions below

0
On

Ok I'll try to answer my own question, but I's not quite complete, so please help me to improve on it.

  1. Given a vector $D$ containing the position of all cars in ascending order, the walk distance from car $x$ to car $y$ where $x < y$ is calculated as $D_y - D_x$. Hence, the distance function is: $$ f(x, y) = \begin{cases} D_y - D_x, & \text{if $x<y$} \\ 0, & \text{if $x=y$} \\ D_x - D_y, & \text{if $x>y$} \end{cases} $$

  2. And for the summed walk from car $a$ (as a fixed start point) to all adjacent cars up to car $b$, we can define another function $fr$ for that range of cars: $$ \begin{align} fr(\underline{a}, b) &= \sum_{i=a}^b f(a, i)\\ &= \sum_{i=a}^b \begin{cases} D_a - D_i, & \text{if $i<a$} \\ 0, & \text{if $i=a$} \\ D_i - D_a, & \text{if $i>a$}\\ \end{cases}\\ &= \begin{cases} \sum_{i=a}^b (D_a - D_i), & \text{if $a>b$} \\ 0, & \text{if $a=b$} \\ \sum_{i=a}^b (D_i - D_a), & \text{if $a<b$} \end{cases} \end{align} $$

  3. Now, if given a range of cars $\{D_a,\dotsc,D_b\} \in D$, we will calculate the total walk distance starting from one of those cars to each one of the others. First, we should split those cars in 3 groups $L, M, R$ following the same logic described above. We start by picking a start position $s$, which may or not coincide to be the a car index, which makes the group $M$ will have 0 to 1 cars. Then we allow $L$ to have the cars at left, indexed within $(a,am)$ and $R$ to have the cars at right, indexed within $(bm,b)$, where $D_{am} \le S \le D_{bm}$. With that, we can define $fr2$ function: $$ \begin{align} s &\in (D_a,D_b)\\ L &= \{D_a,\dotsc,D_{am}\}\\ M &= \{\} \space OR \space \{D_m\}\\ R &= \{D_{bm},\dotsc,D_b\}\\ L \cup M \cup R &= \{D_a,\dotsc,D_b\} = D\\ \end{align}$$ $$fr2(a, s, b) = \sum_{i=a}^{am} (s - D_i) + \sum_{i=bm}^b (D_i - s)$$ And if the number or cars in left and right are balanced, then we can simplify to: $$fr2(a, s, b) = \sum_{i=bm}^b D_i - \sum_{i=a}^{am} D_i \tag{if $|L| = |R|$}$$

  4. Given the constraints above, we can always make sure the left and right cars are balanced. And this will always minimize the total walk distance. While the left and right cars are balanced, walking $x$ units to left or right won't change the total walk distance since the delta change will be $x \times r - x \times l = 0$. This might be provable using limits, but for now here is just an intuitive demonstration:

    "Imagine we had each of those cars attracting the bus with equal force, as an effort to decrease the distance. The bus would then have $r$ cars pulling it to the right and $l$ cars pulling it to the left. The bus would move towards the side with greater amount of cars, $l$ or $r$. As the bus moves, $r$ and $l$ are re-balanced, until they get even. By the end, $S$ should have been shifted to the median car (in case it's an odd number of cars) or anywhere in between the two median cars (in case its an even number of cars).

  5. Also, if we pre-calculate another vector $V$ as the cumulative sum of $D$, then $V$ can be used to ease the calculation of the a walk distance. And allowing $V_0=0$ will also simplify the equations: $$V_i = \sum_{j=1}^i D_j$$ $$ \begin{align} \sum_{i=a}^b D_i &= \sum_{i=1}^b D_i - \sum_{i=1}^{a-1} D_i\\ &= V_b - V_{a-1} \end{align} $$ Which brings us to: $$\begin{align} fr2(a, s, b) &= \sum_{i=bm}^b D_i - \sum_{i=a}^{am} D_i \tag{if $|L| = |R|$}\\ &= (V_b-V_{bm-1}) - (V_{am} - V_{a-1})\\ &= V_b + V_{a-1} - V_{bm-1} - V_{am} \end{align}$$

  6. Finally, to choose suitable values of $am$ and $bm$, we can use this definition:

$$\begin{align} m = (b-a)/2 \tag{possibly not an integer}\\ am = \lfloor m \rfloor\\ bm = \lceil m \rceil\\ \end{align}$$