Prove correctness of simple greedy algorithm to find max

451 Views Asked by At

We have $2n$ values $x_1,x_2,x_3,\ldots,x_n$ and $y_1,y_2,y_3,\ldots,y_n$ such that the pair $(x_i,y_i)$ represents the location of a city $i$. Assume there is no straight line that goes through all the cities.

We also have the $n$ values $z_1,z_2,z_3,\ldots,z_n$. $z_i$ represents the population of the city $i$.

Our goal is to find three cities $a,b,c$ who are not a straight line (meaning, they form a triangle) with the largest population.

The question is as follows:

Propose a greedy algorithm that solves this problem. Prove the correctness of the algorithm, and find its time complexity.

What I did

The naive greedy algorithm that comes to mind is: pick the city with the largest population $a$, then pick the city with the second largest population $b$. Remove $a,b$ from the list of cities.

$(*)$Now from the remaining cities, choose $c'$ the city with the largest population. does $c'$ form a triangle with $ab$? If yes, then define $c=c'$ and stop. If no, then remove $c'$ from the list of cities and repeat step $(*)$

This is a greedy algorithm, but I'm not sure how to prove that it is correct. I thought maybe induction is the way to go, and it's easy to see that this algorithm is correct when no $3$ cities are on a straight line. It's also easy to prove when $n-1$ cities are on a straight line. But I can't seem to prove the general case.

3

There are 3 best solutions below

0
On BEST ANSWER

Let $(a, b,c)$ be the cities your algorithm finds. Let $(u,v,w)$ be an optimal solution (also ordered by decreasing population).

Claim. $z_a=z_u$.

Proof. Assume otherwise. Then $z_a>z_u\ge z_v\ge z_w$ and $(a,v,w)$ would be strictly better, which is absurd; hence $a,v,w$ are collinear. But then $(u,a,w)$ would be strictly better than $(u,v,w)$, qea. $_\square$

Claim. $z_b=z_v$.

Proof. Assume otherwise. If $z_v>z_b$ then we would not have picked $b$ unless $v=a$. But then $u\ne a$ and we would not have picked $b$ instead of $u$. Thus $z_v<z_b$. As above, we conclude that $u,b,w$ are collinear. But then $(u,b,v)$ would be strictly better than $(u,v,w)$, qea. $_\square$

Claim. $z_c=z_w$.

Proof. Assume otherwise. Then $z_c< z_w$ by optimality of $(u,v,w)$. Then $a,b,w$ are collinear or else we would not have picked $c$ instead of $w$. Then one of $u,v$ is not on the line determined by $a,b$. Hence $z_c\ge z_v\ge z_w$, qea. $_\square$

0
On

I managed to come up with an acceptable proof (at least in my eyes). Would highly appreciate feedback.

Let $A,B$ be cities and $S$ will denote the set of all cities.

Suppose for a second that our solution must include $A$ and $B$, and our only free choice is in the third city.

In this case, the best solution will be $A+B+C$ where $C$ is the city with highest population who closes a triangle with $A$ and $B$. Obviously we can't choose a city which does not close a triangle with $A$ and $B$, so from the cities we have left, we should choose the one with highest population.

This shows that if $A$ and $B$ are the cities with highest population, then our third choice should be $C$ as I described above and in the algorithm.

What I want to show to you, is that for the optimal solution, we must choose the two maximums (namely $A$ and $B$).

Proof by contradiction:

Suppose $A=max(S)$ and $B=max(S-A)$. Let $A',B' \in S$. Suppose that the first two cities we chose are $A'$ and $B'$.

This means we also need to choose $C' \in S$ such that $A',B',C'$ close a triangle.

Now: if $A$ does not close a triangle with $A',B'$, then $C'=A$ and we have $A+A'+B'$. Since $A=max(S)$.

if $A$ does close a triangle with $A',B'$, this means they are on the same straight line. Suppose $B' < A'$. There is no advantage in choosing $B'$ over $A$, since we assumed that $A$ is the maximum, and $AA'B'$ are on the same straight line, we should have chosen $AA'$, so overall our choice is $A+A'+C'$ where $AA'$ are on the same straight line.

So we must choose $A$ to get the optimal solution. Continueing on from this point in the same manner, we can show that we must choose $B$ also to get an optimal solution. We must choose $AB$ so from the first thing I said, we should also choose $C$ which proves the correctness of the algorithm.

0
On

Suppose we have a triple of cities $(a,b,c)$ that form a triangle. If $d$ is a city with greater population than $b$ and $c$, then the triples $(a,b,d)$ and $(a,c,d)$ both have a bigger population than the original. One of those triples must be a triangle, since otherwise $b$ and $c$ would lie on the line connecting $a$ and $d$, meaning $a,b,c$ would be collinear.

It follows that, in the optimal triangle $(a,b,c)$, there is no city $d$ which has greater population than $b$ and $c$. From this, you can conclude that the optimal triangle must have the two cities with the largest population, and this justifies your algorithm.