The task is to specify an algorithm which, given a bound $k \geq 1$ and a number $N$, finds a sequence consisting of numbers from $1,\ldots, k$ and operations from $+, -, *, /, \%, @$ -- where $/$ represents floored integer division, $\%$ represents modulo and $@$ is a unary copy operation -- that minimizes the cost of creating $N$. The cost of using a number is the number itself; the cost of using an operation is $1$. [For my purposes, it is enough if the algorithm works for $1\leq N \leq 256$.]
Examples (in postfix notation, evaluation is from left to right):
- the sequence $5,@,+,3,-$ creates 7 with cost 11: $$5,@,+,3,- \rightarrow 5,5,+,3,- \rightarrow 10,3,- \rightarrow 7$$
- if $k\geq 3$, $9$ can be created with minimum cost 5: $3, @, * \rightarrow 3, 3, * \rightarrow 9$
- if $k=1$, then creating $2$ has minimum cost 3, as using $2$ itself is not allowed: $1, 1, + \rightarrow 2$.
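For concreteness, here is a minimal evaluator for such postfix programs under this cost model (a sketch; the function name and token encoding are my own):

```python
def run(program):
    """Evaluate a postfix program; constants cost their own value,
    every operation (including @) costs 1."""
    stack, cost = [], 0
    for tok in program:
        if isinstance(tok, int):              # push a constant
            stack.append(tok)
            cost += tok
        elif tok == "@":                      # unary copy of the top value
            stack.append(stack[-1])
            cost += 1
        else:                                 # binary operation on top two
            b, a = stack.pop(), stack.pop()
            stack.append({"+": a + b, "-": a - b, "*": a * b,
                          "/": a // b, "%": a % b}[tok])
            cost += 1
    return stack[0], cost

print(run([5, "@", "+", 3, "-"]))  # (7, 11)
print(run([3, "@", "*"]))          # (9, 5)
print(run([1, 1, "+"]))            # (2, 3)
```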
What I've thought of: for prime numbers $p$, one could just look at $p \pm 1$ or $p\pm 2$ or $p \pm 3$ and try to create $p$ using addition/subtraction. For composite numbers $c$, try creating $c$ using its prime factors and compare this cost with the cost of creating it from nearby composite numbers using addition/subtraction. One would also try to use $@$ to reduce the cost of creating the factors themselves.
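A rough sketch of that factor/offset idea (my own code; it produces an upper bound, not the true optimum, and the cutoff of 5 and the offsets 1...3 are arbitrary choices):

```python
from functools import lru_cache

K = 5  # largest constant we allow ourselves to push directly (my choice)

@lru_cache(maxsize=None)
def est(n):
    """Upper bound on the cost of creating n: push it directly, square a
    factor via @,* when n is a perfect square, combine a factor pair, or
    build a slightly smaller number and add a small offset. Only downward
    offsets are tried, so the recursion is well-founded."""
    if n <= K:
        return n
    best = float("inf")
    for a in range(2, int(n ** 0.5) + 1):
        if n % a == 0:
            b = n // a
            if a == b:                         # n = a^2: use  a, @, *
                best = min(best, est(a) + 2)
            else:                              # n = a*b: one extra * op
                best = min(best, est(a) + est(b) + 1)
    for d in (1, 2, 3):                        # n = (n-d) + d: push d, one +
        best = min(best, est(n - d) + d + 1)
    return best

print(est(9), est(49))  # 5 and 10; est(9) matches 3,@,* but est is only a bound
```

Of course, this is only a heuristic upper bound; it misses @-based constructions entirely, which is why I'm looking for a principled algorithm.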
(Background: I'm trying to create "large numbers" with few operations in the esolang Piet.)
An optimal search algorithm such as branch and bound will find the lowest-cost ways of achieving these numbers. All you have to do is cast it as a formal search problem. For example, you could use:
- States: tuples of numbers (the current contents of the stack), with the empty tuple as the start state.
- Actions: push a constant $c \in \{1,\ldots,k\}$ (cost $c$), apply $@$ to a nonempty stack (cost 1), or apply a binary operation to the top two stack entries (cost 1).
- Goal: the single-element tuple $(N)$.
To streamline the search, you can also incorporate an extended list (a record of states you have already expanded, so no state is expanded twice) and use an admissible heuristic (an optimistic estimate of how many operations you have left before you reach the goal state) in computing the cost of a path.
One admissible heuristic for this problem is abs(len(stack) - 1): if the stack currently holds $n$ values, the quickest you could possibly reduce it to a single value is by performing $n-1$ binary operations, each of cost 1.
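Concretely, here is a minimal sketch of such a search (my illustration, not the script linked in the edit below; `cap` is an arbitrary bound on intermediate values to keep the state space finite):

```python
import heapq
import itertools

def search(N, k, cap=10**6):
    """Branch-and-bound (A*) over stack states. A state is a tuple of the
    values currently on the stack; the heuristic len(stack) - 1 counts the
    binary operations still needed to reduce the stack to a single value."""
    tie = itertools.count()               # tiebreaker: never compare stacks
    frontier = [(0, 0, next(tie), (), [])]
    extended = {}                         # extended list: state -> best cost
    while frontier:
        _, cost, _, stack, prog = heapq.heappop(frontier)
        if stack == (N,):
            return prog, cost
        if extended.get(stack, float("inf")) <= cost:
            continue
        extended[stack] = cost
        moves = [(stack + (c,), c, c) for c in range(1, k + 1)]   # pushes
        if stack:                                                 # copy (@)
            moves.append((stack + (stack[-1],), 1, "@"))
        if len(stack) >= 2:                                       # binary ops
            a, b = stack[-2], stack[-1]
            results = {"+": a + b, "-": a - b, "*": a * b}
            if b != 0:
                results["/"], results["%"] = a // b, a % b        # floored
            for op, v in results.items():
                moves.append((stack[:-2] + (v,), 1, op))
        for nxt, step, tok in moves:
            if abs(nxt[-1]) <= cap and cost + step < extended.get(nxt, float("inf")):
                h = len(nxt) - 1    # admissible: ops left to reach length 1
                heapq.heappush(frontier, (cost + step + h, cost + step,
                                          next(tie), nxt, prog + [tok]))
    return None

print(search(7, 5))   # e.g. ([3, '@', '+', 1, '+'], 7)
```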
I wrote a Python script to generate such answers using branch-and-bound search. It's not very fast, but it produces optimal-cost computations like the following, where the bound is $k=20$. (Note: these results use + and @ a lot, since I picked a certain tiebreaking order when choosing between operations. Presumably there are other solutions with equally low cost that are more visually interesting.)
It is very interesting to note that certain patterns keep recurring. In fact, if you view this problem as attempting to find the shortest path between various tuples, you can use the dynamic programming principle to significantly speed up the algorithm.
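For illustration, a sketch of that idea (mine, and not tuned for speed): a single Dijkstra pass from the empty stack settles each tuple at its cheapest cost exactly once, yielding optimal costs for all targets simultaneously. Here `cap` and `max_len` are guesses at how large useful intermediate values and stacks can get.

```python
import heapq

def all_costs(k=5, limit=256, cap=10**6, max_len=8):
    """Dijkstra over stack tuples, starting from the empty stack. By the
    dynamic programming principle, each tuple is settled at its cheapest
    cost exactly once, so one pass yields optimal costs for every target.
    cap and max_len bound values and stack depth to tame the state space."""
    dist, done = {(): 0}, {}
    pq = [(0, ())]
    while pq and len(done) < limit:
        cost, stack = heapq.heappop(pq)
        if cost > dist.get(stack, float("inf")):
            continue                                  # stale heap entry
        if len(stack) == 1 and 1 <= stack[0] <= limit:
            done.setdefault(stack[0], cost)           # first visit is optimal
        moves = [(stack + (c,), c) for c in range(1, k + 1)]       # pushes
        if stack:
            moves.append((stack + (stack[-1],), 1))                # copy (@)
        if len(stack) >= 2:
            a, b = stack[-2], stack[-1]
            vals = [a + b, a - b, a * b] + ([a // b, a % b] if b else [])
            moves += [(stack[:-2] + (v,), 1) for v in vals]        # binary ops
        for nxt, step in moves:
            if (len(nxt) <= max_len and abs(nxt[-1]) <= cap
                    and cost + step < dist.get(nxt, float("inf"))):
                dist[nxt] = cost + step
                heapq.heappush(pq, (cost + step, nxt))
    return done

costs = all_costs()
print(max(costs.values()))   # the results above top out at cost 15 (n = 177)
```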
Note that the highest-cost number is 177, at a cost of 15. Also note that all of the programs use constants no larger than 7 (for example, 49 is generated as $7,@,*$), even though they were permitted to use constants up to 20.
With some logical reasoning, you can discover the following exciting result:
In our particular case, when we generate all the numbers between 1...256 using constants in the range 1...20, we get programs that cost no more than 15 and use constants no larger than 7. Because any constant larger than 20 would by itself cost more than 15, no such constant can appear in a cheaper program; hence these programs, which use only constants 1...7, are the optimal programs when our constants are restricted to the range [1...7] (restricting the constant set can only increase costs), and they remain optimal even when we extend our constants to any larger set (!).
Edit: Python code https://pastebin.com/9kiLL3x5
Edit #2: I re-ran the search, this time preferring, as a tiebreaker, programs that use smaller numbers. I realized that all of the numbers between 1...256 can be generated using only numbers in the range 1...4, and that these programs are (almost!) optimal in cost for any larger range.
This optimization is possible partly because we can always make same-cost substitutions that rewrite larger constants in terms of smaller ones, for example $6 \mapsto \langle 2, 3, * \rangle$ (cost $2+3+1=6$) and $7 \mapsto \langle 3, @, +, 1, + \rangle$ (cost $3+1+1+1+1=7$).
And so we can eliminate all previous instances of 5, 6, 7 without affecting the path cost, the only exception being the programs that used the value 5: its cheapest replacement with smaller constants, $\langle 2, 3, + \rangle$, costs 6 rather than 5.
Some notation: if $P$ is a program, let $|P|$ denote its cost.
We have the following two fantastic results:
Theorem: Every $n$ can be generated using a program which costs at most $n$ and which uses only the constants 1...5.
Corollary: For any $n$, there is an optimal program which computes $n$ and uses only the constants 1...5. This program is absolutely the cheapest possible: every other program which produces $n$, no matter which constants it uses, costs at least as much.
Let's first show that if the theorem is true, the corollary follows. Let $P_n$ denote an optimal-cost program for generating $n$ under the restriction that $P_n$ can only use the constants 1...5, and let $Q_n$ denote an optimal-cost program for generating $n$ when there is no restriction on the constants. Suppose $Q_n$ uses a constant $k>5$. Then we can make a new program $Q^\prime_n$ which is like $Q_n$ except that every instance of $k$ has been replaced with the program $P_k$ (which, by definition, uses only the constants 1...5). According to the theorem, $|P_k| \leq k$; hence the substitution $k\mapsto P_k$ cannot increase the program's cost. By repeating this procedure until no constants $k>5$ remain, we arrive at a program $Q_n^*$ which costs no more than $Q_n$ but uses only constants in the range 1...5. By optimality, $|P_n| \leq |Q_n^*| \leq |Q_n| \leq |P_n|$, so $|P_n| = |Q_n|$: $P_n$ is the cheapest program for generating $n$ when only the constants 1...5 are allowed, and also when any additional constants are allowed. Because diminishing the set of constants can also only increase costs, it follows that $P_n$ is the cheapest program for generating $n$ regardless of which constants are allowed.
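To illustrate the substitution step: the unrestricted program $\langle 9, 9, * \rangle$ computes $81$ at cost $9 + 9 + 1 = 19$; replacing each constant $9$ with $P_9 = \langle 3, @, * \rangle$ (cost $5 \leq 9$) gives $\langle 3, @, *, 3, @, *, * \rangle$, which still computes $81$ but costs only $5 + 5 + 1 = 11$.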
So the theorem implies the corollary. Now let's prove the theorem.
Proof of theorem. As before, let $P_n$ denote the optimal program for generating $n$ subject to the restriction that it only use constants 1...5.
We'll prove that this theorem holds for all $n$ by dividing the integers into intervals and proving the theorem on each interval: let $A = [1\ldots 33]$ and, for $i \geq 0$, let $B_i = [34\cdot 2^i \ldots 256\cdot 2^i]$. (Consecutive $B_i$ overlap, and together with $A$ they cover all the positive integers.)
First, the theorem holds in region $A$, as our computer-provided evidence showed.
Next we'll prove, by induction, a doubly-strong result about how fast the program costs grow in the $B_i$; this will establish the theorem a fortiori:

Claim: for every $n$ in $B_i$ (for any $i$), we have $|P_n| < \frac{n}{2} - 1$.
Base case: Evidently the claim holds in $B_0$: $B_0$ consists of the numbers between 34 and 256, and our computer-provided evidence shows that $|P_n| \leq 15 < \frac{n}{2} - 1$ there (since $n \geq 34$ gives $\frac{n}{2} - 1 \geq 16$).
Induction step: Suppose the claim holds in all $B_i$ $(i=0\ldots N)$, and let $n$ be an integer in $B_{N+1}$. Note that $n$ is large enough that we can always write $n = u + v$ for two integers $u$ and $v$ not in $A$: take $u = \lfloor n/2 \rfloor$ and $v = \lceil n/2 \rceil$, both of which lie in $B_N$. By the induction hypothesis, the claim then holds for $u$ and $v$.
Let's consider the following (possibly suboptimal) program for constructing $n$:
$$\widetilde P_n \equiv \langle P_u, P_v, +\rangle$$
This program first puts $u$ in memory using $P_u$, then puts $v$ in memory using $P_v$, then adds the results together, yielding $n = u+v$.
Considering its components, this program uses only the constants 1...5. By the induction hypothesis, its cost is strictly less than $(\frac{u}{2}-1) + (\frac{v}{2}-1) + 1 = \frac{n}{2} - 1$. This shows that every $n$ in $B_{N+1}$ can be generated using a program that uses only the constants 1...5 and costs less than $\frac{n}{2}-1$, completing the induction step.
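(As a concrete instance: $n = 300$ lies in $B_1$; writing $300 = 150 + 150$ with $150 \in B_0$, the program $\langle P_{150}, P_{150}, + \rangle$ costs less than $(\frac{150}{2}-1) + (\frac{150}{2}-1) + 1 = 149 = \frac{300}{2}-1$.)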
This completes the proof of the theorem; we conclude that every $n$ can be generated using a program which costs at most $n$ and which uses only the constants 1...5 □.
One final comment:
Now I am trying to establish bounded relationships between the length of a program, the cost of a program, and the size of the number it produces.
As for length, I note that every program must start with a number, and that after the first number, every subsequent increment in stack length (by @ or another number) must be paired with a decrement in stack length (by +, *, etc.). Hence every number-generating program has odd length, and the set of valid programs is generated by the context-free grammar:
$$\begin{align*} S & \mapsto K \,|\, K T \\ K & \mapsto 1 \,|\, 2 \,|\, 3 \,|\, 4 \,|\, 5 \\ T & \mapsto PQ \,|\, PTQ \,|\, TT \\ P & \mapsto K \,|\, @\\ Q & \mapsto + \,|\, - \,|\, \times \,|\, \div \,|\, \% \\ \end{align*}$$
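Equivalently (a small sketch of mine), well-formedness can be checked by tracking stack depth instead of parsing the grammar:

```python
def is_program(tokens, k=5):
    """Well-formedness check via stack depth, equivalent to membership in
    the grammar above: constants and @ raise the depth by 1, binary ops
    lower it by 1, the depth may never be too small for the operation, and
    a complete program ends at depth exactly 1. Since every token changes
    the depth by +/-1 and a program goes from depth 0 to depth 1, valid
    programs necessarily have odd length."""
    depth = 0
    for t in tokens:
        if isinstance(t, int) and 1 <= t <= k:
            depth += 1
        elif t == "@" and depth >= 1:
            depth += 1
        elif t in ("+", "-", "*", "/", "%") and depth >= 2:
            depth -= 1
        else:
            return False
    return depth == 1

assert is_program([5, "@", "+", 3, "-"])   # the cost-11 example above
assert not is_program([5, "+", 5])         # '+' with only one operand
```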
So intuitively, it seems that if you specify an (odd) program length $n=2k+1$, the largest number you can generate using an optimal program of length $n$ is $$\underbrace{5, @, *, @, *, \ldots, @, *}_{k\ \text{pairs of}\ @,\,*} = 5^{(2^k)}$$ with a corresponding cost of $5 + 2k = n + 4$.
This does not mean that the most expensive optimal program of length $n$ costs $n+4$; rather the highest-result optimal program of length $n$ costs $n+4$.
As for how much an optimal program of length $n=2k+1$ can cost, we know that it must cost at least $n$ (every token costs at least 1) and certainly no more than the obviously suboptimal program $[\underbrace{5,\cdots, 5}_{k+1}, \underbrace{+, \cdots, +}_k]$, whose cost is $5(k+1) + k = 6k+5 = 3n+2$. (It's suboptimal because it doesn't use @.)
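A tiny snippet (mine) tabulating the repeated-squaring value against both cost bounds:

```python
# 5 followed by k copies of "@ *": length n = 2k+1, value 5^(2^k),
# cost 5 + 2k = n + 4, versus the naive all-5s upper bound 3n + 2.
for k in range(5):
    n = 2 * k + 1
    print(f"n = {n:2d}: value 5^(2^{k}) = {5 ** (2 ** k):>12}, "
          f"cost {5 + 2 * k} = n + 4, naive bound {3 * n + 2} = 3n + 2")
```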