Find the cheapest arrangement of non-overlapping colored rectangles necessary to achieve a sequence of colors behind holes

104 Views Asked by At

Input

Let the input be a sequence of colors with any length at least 1. For example, (red, blue, red, green, blue). Each color is represented in the input as a string, not as any abstract notion of a color, so "green" is different from "#008000" is different from "the color between yellow and blue", because these are all different strings. Color is merely a convenient way to conceptualize this problem.

You can visualize the input as a sequence of holes outlined by those respective colors. Using the (red, blue, red, green, blue) example (color labels added for accessibility):

Example

Output

The output, then, would be visualized as a number of colored rectangular papers arranged behind the holes such that each hole is filled with the color it's outlined with. The papers' edges must not cross, but a paper may be placed above another if it is entirely within its bounds. Representing each hole as an asterisk and each paper as its color wrapped in parentheses with the holes it covers and any papers on top of it, here are some examples using the same input as the previous examples:

  1. (red * * *) (green *) (blue *) is NOT a valid solution because a red paper appears behind a blue hole.

    Example #1

  2. This (not representable) is NOT a valid solution because there are overlapping edges (marked by white Xs).

    Example #2

  3. (red * (blue *) *) (green *) (blue *) IS a valid solution, but not an optimal one (see later section).

    Example #3

Formalization (if it helps to understand the problem)

Each paper's configuration can be thought of as a 3-tuple: (its color string, an integer for where it starts, an integer for where it ends). Consider the position before the first hole to be 0, then 1 if it's between the first and second holes, then 2 if it's after the second, and so on.

The configuration of all the papers together can be thought of as a sequence of such tuples. The first element of the sequence is the paper you'd set down first (the bottommost paper), followed by the one you'd set down second, and so on.

What is optimal?

Each paper's cost is the number of characters in its color. For example, a "red" paper costs 3, a "blue" paper costs 4, and two "green" papers cost 2 * 5 = 10. An optimal solution is an arrangement of papers with the least total cost, and there may be multiple optimal solutions. The size of each paper has no effect on cost.

(If it helps to understand why cost depends on character counts, the motivation for this is rich text minification. I'm trying to minimize the number of characters required to encode the formatting of arbitrary rich text.)

Example #3 above has

1 red + 2 blue + 1 green

= 1 (3) + 2 (4) + 1 (5)

= 16.

Here are two examples of cheaper, optimal solutions (with cost 15) given the same input:

  1. 2 red + 1 blue + 1 green = 15

    Example #4

  2. 1 blue + 2 red + 1 green = 15

    Example #5

It isn't necessary to find every optimal solution for a given input; you only need one.

Obviously, an algorithm which comprehensively generates every possible output and compares each output's cost to find an optimal one would solve this problem. But not efficiently, as that takes exponential time (or worse) (relative to the number of holes and colors in the input). Your algorithm should ideally take polynomial time. If such an algorithm is not possible, then a polynomial-time algorithm which gets as close to the optimal output within what is possible would be acceptable instead.

Useful Considerations

Here's everything useful I've been able to come up with toward a solution (that I can remember as of writing this) so far.

  1. If a color appears only once in the input, an optimal solution can simply include a single paper on top that only covers its hole. So we might as well add the constraint that a color must appear in the input multiple times.

  2. If a color appears multiple times consecutively in the input, the problem is equivalent to as if that series of consecutive holes were only one hole. So we might as well add the constraint that the input won't contain multiple adjacent holes with the same color.

If I come up with anything else, I will edit this list.

1

There are 1 best solutions below

8
On BEST ANSWER

Because of the way the rectangles are allowed to overlap, this problem nicely decomposes into multiple smaller problems if you focus on the leftmost rectangle. Specifically, the leftmost rectangle will start before the first hole and will span across some number of other holes (say, $k > 0$ of them). That then decomposes the problem apart into two regions:

  • the first $k$ holes, which now have a “default” background color based on the color of the first rectangle, and
  • the remaining $n - k$ holes, which still have no background color.

Importantly, you’d want to use the optimal collection of rectangles in this second region, since if you didn’t you could improve the overall solution by replacing the rectangles covering the region with a better set of rectangles.

That leaves us to cover the first region, which currently has some background rectangle. Within that rectangle we have two options - we either don’t place any more rectangles down, which we can do if all the holes in the region have the same color, or there’s some leftmost rectangle we place on top. That rectangle will span from some start position to some end position, and is subject to the restriction that everything to the left of the start position must be a hole of the same color as the current background color (this is the leftmost rectangle placed on top of it) and it mustn’t cover everything (if it did, we shouldn’t use the current background rectangle). This again splits the range into some subregions: the region it covers, which uses its new background color, and the region to its right, which is covered with the existing background color. And as before, we’d like to optimally cover both regions.

There are only $O(cn^2)$ possible combinations of a range and a background color, where $c$ is the number of colors and $n$ is the number of holes. The above insights therefore suggest either using memoization or dynamic programming. Here’s a memoization-based outline of a solution:

  • To cover a range from $i$ to $j$, inclusive, which has no current background color, try all ways of placing a rectangle that starts at position $i$ and extends to some index $k \le j$ of each color that appears between $i$ and $j$, then optimally covering the holes in front of the rectangle (which now have some color behind them) and the holes not in front of the rectangle (which have no background color). Take the best option among all these options.
  • To cover a range from $i$ to $j$ with current color $c$, if all holes have color $c$ you’re done. Otherwise find the leftmost hole whose color isn’t $c$; suppose it’s at index $k$. Now, for each index $k < l \le j$, try placing a rectangle of each color spanning from $k$ to $l$, then optimally coloring the holes in that range and optimally coloring the remaining holes in the range from $k$ to $j$.