Maximize data size for copy/paste operations

166 Views Asked by At

Assuming it takes 2 seconds to copy and 1 second to paste, find the optimal order of copy/paste operations to maximize the amount of data for a given period of time. Start with a data size of 1. For example three paste operations would yield a data size of 4 in 3 seconds, while three copy/paste operations would yield a data size of 8 in 9 seconds.

I'd like to know if there's a a clean, mathematical way to approach this type of problem, without resorting to writing scripts.

I wrote a Python script to look at different sequences of operations where "ccc" means three copy/paste operations, "ppc" means two plain pastes, and one copy/paste, etc. I think that sequences like "pcpcpc..." are more efficient in general than those like "ppp...ccc...", but I haven't proven that.

def process( ops ):
    total_data = 1
    total_time = 0
    paste_time = 1
    copy_time = 2
    selection = total_data
    print "ops {}".format( ops )
    for item in ops:
        if item == "c":
            total_time += copy_time + paste_time
            selection = total_data
            total_data += selection
        elif item == "p":
            total_time += paste_time
            total_data += selection
        print "    data {:3}    time {:3}    selection {:3}".format(total_data, total_time, selection)
    print float(total_data)/total_time

Here are some examples

>>> process("ppppcccc")
    ops ppppcccc
        data   2    time   1    selection   1
        data   3    time   2    selection   1
        data   4    time   3    selection   1
        data   5    time   4    selection   1
        data  10    time   7    selection   5
        data  20    time  10    selection  10
        data  40    time  13    selection  20
        data  80    time  16    selection  40
    5.0

>>> process("pcpcpcpc")
    ops pcpcpcpc
        data   2    time   1    selection   1
        data   4    time   4    selection   2
        data   6    time   5    selection   2
        data  12    time   8    selection   6
        data  18    time   9    selection   6
        data  36    time  12    selection  18
        data  54    time  13    selection  18
        data 108    time  16    selection  54
    6.75
1

There are 1 best solutions below

0
On BEST ANSWER

We can divide any such sequence of operations into blocks of the form c, cp, cpp, cppp, and so on. (Except for the very beginning of the string, where we have a block of the form p, pp, ppp, pppp, and so on, costing us $2$ seconds less. But since this is a constant savings of $2$ seconds, we can ignore it, since it applies no matter what we do.)

A block of the form cpp...p with $k$ ps will multiply the length of the string by $k+2$ over $k+3$ seconds. The easiest thing to notice about these blocks is that we can restrict ourselves to just the four blocks c, cp, cpp, cppp, because:

  • The block cpppp multiplies the length of the string by $6$ over $7$ seconds, but we can accomplish the same thing with c followed by cp.
  • The block cppppp multiplies the length of the string by $7$ over $8$ seconds, but cp followed by cp multiply the length by $9$ over the same $8$ seconds, which is strictly better.
  • Similarly, cpppppp is strictly worse than cp followed by cpp, nad so on.

Now we should figure out how these four blocks compare. They all have a multiplicative effect on the length of the string, so it's easier to compare them by looking at the logarithm of the length instead:

  • c adds $\log 2$ to the log-length over $3$ seconds, increasing it by $\frac{\log 2}{3}$ per second.
  • cp increases the log-length by $\frac{\log 3}{4}$ per second.
  • cpp increases the log-length by $\frac{\log 4}{5}$ per second.
  • cppp increases the log-length by $\frac{\log 5}{6}$ per second.

We have $$\frac{\log 4}{5} > \frac{\log 3}{4} > \frac{\log 5}{6} > \frac{\log 2}{3}$$ so in general, the cpp block is the most cost-efficient. For example, this lets us conclude that pppcppcpp produces the largest data size possible in $13$ seconds: a total length of $64$.

The cpp block takes $5$ seconds, and the initial ppp block takes $3$ seconds, so when the total time $t$ is not of the form $5k+3$ for some $k$, it makes sense to use a few of the other blocks to make good use of the remaining seconds. We can check how best to do this by brute force, but it never makes sense to use more than $4$ of one of the other blocks: for example, $5$ cppp blocks could be replaced by $6$ cpp blocks to multiply the length by $4096$ instead of $3125$ in the same amount of time. So there's only finitely many cases to check, and then we have an answer for all time lengths $t$.