Using greedy algorithm to get the minimum total cost of adding a whole array

415 Views Asked by At

I meet a question about calculating adding all the elements of an array, each time we can only add two numbers, and the cost is the sum of these two numbers.

so my solution (maybe not correct) is to use greedy method, always add the smallest two numbers in the current array, and put this result back into the array, and repeat this step until there is only one number in array.

like:

given array = [1,2,3,4,5]

step1: 1 + 2 = 3, the current array is [3,3,4,5]

step2: 3 + 3 = 6, the current array is [4,5,6]

step3: 4 + 5 = 9, the current array is [6,9]

step4: 6 + 9 = 15.

the total cost is 3 + 6 + 9 + 15. I think this is the minimum cost.

I am not sure how to prove my solution, I just feel this algorithm is correct, but after using exchange argument (only proof method I learned) to prove it, I cannot find a right logic.

Can anyone give me any hints for this proof if my solution is correct?

1

There are 1 best solutions below

0
On

Suppose that initial numbers are $a_1 \le a_2 \le \cdots \le a_n$. Let consider an optimal summation order with the summation cost $$c = k_1 \cdot a_1 + k_2 \cdot a_2 + \cdots + k_n \cdot a_n$$ for some positive integer $k_1, k_2, \ldots, k_n$, where coefficient $k_i$ means that the number $a_i$ participates in $k_i$ summations as a number or a part of sum$^1$. If $k_i < k_j$ for some $i < j$ then $a_i = a_j$. Otherwise we could swap $a_i$ and $a_j$ in the initial order to get a better cost $$c' = c - k_i \cdot a_i - k_j \cdot a_j + k_i \cdot a_j + k_j \cdot a_i = c - (k_j - k_i) \cdot (a_j - a_i) < c.$$ That's why changing order to make $k_1 \ge k_2 \ge k_3 \ge \cdots \ge k_n$ we don't increase the cost. As soon as $a_1$ is summed up for the first time together with some other element $a_i$ (no matter alone or as a part of some sum) all other summations they meet together, so $k_1 \le k_i$. Therefore $k_1 = k_2$. Let $k_{n + 1} = 0$ and $t$ be such an index that $k_1 = k_t > k_{t + 1}$. Note that each number from $a_{[t]} = \{\,a_1, a_2, \ldots, a_t\,\}$ is summed up for the first time with other single number from $a_{[t]}$. Otherwise subelements of the other summand would have greater $k_i$. Besides that $t$ is even, this means that we can change corresponding first-time-summation matching to contain pair $\{\,a_1, a_2\,\}$. And yes, this pair can be the first step. So we have proven that at least one optimal summation order contains pair $\{\, a_1, a_2\,\}$ as the first couple. Since we proved this statement in general it is also true for numbers $a_1 + a_2, a_3, a_4, a_5, \ldots, a_n$ (if we place $a_1 + a_2$ in proper position) and so on. This means that your summation approach is optimal in sense of cost minimization.


$^1$ Essentially we can consider the whole process as a set union (with the cost equal to the sum of elements in the resulting set) rather than a summation.