How can I prove the optimality of the following greedy algorithm.

218 Views Asked by At

Assume that a matrix $\mathbf{R}$ of size $N\times N$ is given such that $r_{i,j} \ge r_{i,j+1}$ for any $i$ and $j$, where $r_{i,j}$ is the $i$th row and $j$th column element of $\mathbf{R}$.

I will solve the following problem: $$ \begin{array}{cl} \displaystyle \max_{K_i, i=1,\ldots,N} & \displaystyle f(K_1,\ldots,K_N)=\sum_{i=1}^N \sum_{j=1}^{K_i}r_{i,j}\\ \text{subject to} & \displaystyle \sum_{i=1}^N K_i = L, \end{array} $$ where $L$ is not greater than $N$.

Literally speaking, the above problem is to select $L$ items out of $N$ items, allowing duplicate selection, where the score obtained by selecting an item decreases as the number of times selected increases.


My greedy algorithm is as follows:

Given r[i,j] for all i and j.

for i=1:N
    K[i] = 1.
end

repeat
    i_star = argmax r[i,K_i] over i.
    K[i_star] = K[i_star] + 1.
until L = sum K[i] over all i

Obviously, the above greedy algorithm will give an optimal solution to the above problem. However, I want to prove the solution derived by the algorithm is optimal.


I am trying to prove it using mathematical induction, but my proof is not as rigid as I hope since I failed to prove the optimal substructure of the problem. The optimal substructure implies that an optimal solution when $L=k + 1$ contains an optimal solution when $L=k$.

How can I prove it mathematically rigorously?

1

There are 1 best solutions below

5
On BEST ANSWER

Let $i_1,\ldots,i_L$ be the choices made by the greedy algorithm, and consider any other solution $S$, which we think of as a multiset of indices in $[N]$. Let $i_1,\ldots,i_R$ be the maximal prefix such that $S$ contains $i_1,\ldots,i_R$. Suppose that $i_{R+1}$ appears $s$ times in $i_1,\ldots,i_R$ (and so, in $S$). Consider the solution obtained from $S$ by removing one of the elements $j$ from $S \setminus \{i_1,\ldots,i_R\}$ and replacing it by $i_{R+1}$. Suppose that $j$ appears $t$ times in $S$, and $t' \leq t-1$ times in $i_1,\ldots,i_R$. Then by doing the switch, the total value changes by $$r_{i_{R+1},s+1} - r_{j,t} \geq r_{i_{R+1},s+1} - r_{j,t'+1} \geq 0,$$ where the first inequality follows from $t'+1 \leq t$ and the nonincreasing of rows of $r$, and the second inequality follows from the definition of the greedy choice.