Making numbers cheaply

234 Views Asked by At

The task is to specify an algorithm which, given a bound $k \geq 1$ and a number $N$, finds a sequence consisting of numbers from $1,\ldots, k$ and operations from $+, -, *, /, \%, @ $ -- where $/$ represents floored integer division , $\%$ represents modulo and $@$ is a unary copy operation -- that minimizes the cost of creating $N$. The cost of using a number is the number itself and the cost of using an operation is $1$. [For my purposes, it is enough if the algorithm works fine only for $1\leq N \leq 256$.]

Examples (in postfix notation, evaluation is from left to right):

  • the sequence $5,@,+,3,-$ creates 7 with cost 11: $$5,@,+,3,- \rightarrow 5,5,+,3,- → 10,3,- \rightarrow 7$$
  • if $k\geq 3$, $9$ can be created with minimum cost 5: $3, @, * \rightarrow 3, 3, * \rightarrow 9$
  • if $k=1$, then creating $2$ has a minimum cost 3 as using $2$ itself is not allowed: $1, 1, +$ → $2$.

What I've thought of: for prime numbers $p$, one could just look at $p \pm 1$ or $p\pm 2$ or $p \pm 3$ and try to create $p$ using addition/subtraction. For composite numbers $c$, try creating $c$ using its prime factors and compare this cost with the cost of creating it from nearby composite numbers using addition/subtraction. One would also try to use $@$ to reduce the cost of creating the factors themselves.

(Background: I'm trying to create "large numbers" with few operations in the esolang Piet.)

1

There are 1 best solutions below

16
On BEST ANSWER

An optimal search algorithm such as branch and bound will find the lowest-cost ways of achieving these numbers. All you have to do is cast it as a formal search problem. For example, you could use:

  • The states are sequences of numbers that are stored in memory. The starting state is an empty sequence []. The goal state is the singleton [x], where x is the number you're trying to compute.
  • The operators are the operations such as +, -, @, /, ... which transform one state into another state. For example, if you have a state (2,3,4) and apply the operator +, you get the state (2,7).
  • A path leading to a particular state would be a sequence of operators that carry you from the empty state to the particular state.
  • The cost of a path is the total cost of each of the operators used.

To streamline the search, you can also incorporate an extended list and use an admissible heuristic (an optimistic estimate of how many operations you have left before you reach the goal state) in computing the cost of a path.

One admissible heuristic for this problem is abs( len(sequence) - 1 ), since if you have a sequence of length $n$, the quickest you could reduce it to a sequence of length 1 is by performing $|n-1|$ operations.


I wrote a Python script to generate such answers using branch-and-bound search. It's not very fast, but it produces optimal cost computations like the following, where here the limit is 20. (Note: These results use + and @ a lot, since I picked a certain tiebreaking order when choosing between operations. Presumably there are other solutions with equally low cost that are more visually interesting.)

It is very interesting to note that certain patterns keep reoccuring. In fact, if you view this problem as attempting to find the shortest path between various tuples, you can use the dynamic programming principle to significantly speed up the algorithm.

== OPTIMAL-COST COMPUTATIONS FOR LIMIT 20 ==
1 1
2 2
3 3
4 4
5 5
6 3,@,+
7 7
8 4,@,+
9 3,@,*
10 5,@,+
11 3,@,*,2,+
12 3,@,+,@,+
13 3,@,+,@,+,1,+
14 4,@,*,2,-
15 4,@,*,1,-
16 4,@,*
17 4,@,*,1,+
18 3,@,*,@,+
19 3,@,*,@,+,1,+
20 4,@,1,+,*
21 3,@,@,+,1,+,*
22 3,@,*,2,+,@,+
23 5,@,*,2,-
24 5,@,*,1,-
25 5,@,*
26 5,@,*,1,+
27 3,@,@,*,*
28 3,@,@,*,*,1,+
29 3,@,@,*,*,2,+
30 5,@,1,+,*
31 4,@,*,@,+,1,-
32 4,@,*,@,+
33 4,@,*,@,+,1,+
34 3,@,+,@,*,2,-
35 3,@,+,@,*,1,-
36 3,@,+,@,*
37 3,@,+,@,*,1,+
38 3,@,+,@,*,2,+
39 3,@,@,+,@,*,+
40 3,@,*,@,*,2,/
41 3,@,+,@,1,+,*,1,-
42 3,@,+,@,1,+,*
43 3,@,+,@,1,+,*,1,+
44 3,@,+,@,1,+,*,2,+
45 3,@,@,2,+,*,*
46 3,@,1,+,@,*,*,2,-
47 3,@,1,+,@,*,*,1,-
48 3,@,1,+,@,*,*
49 7,@,*
50 5,@,*,@,+
51 5,@,*,@,+,1,+
52 5,@,*,1,+,@,+
53 3,@,@,*,*,@,+,1,-
54 3,@,@,*,*,@,+
55 3,@,@,*,*,@,+,1,+
56 4,@,+,@,1,-,*
57 3,@,@,*,@,+,1,+,*
58 3,@,@,*,*,2,+,@,+
59 4,@,@,*,1,-,*,1,-
60 4,@,@,*,1,-,*
61 4,@,+,@,*,3,-
62 4,@,+,@,*,2,-
63 4,@,+,@,*,1,-
64 4,@,+,@,*
65 4,@,+,@,*,1,+
66 2,@,@,+,@,+,@,*,+
67 3,@,1,+,@,+,@,*,+
68 4,@,@,+,@,*,+
69 4,@,@,+,@,*,+,1,+
70 3,@,+,@,*,1,-,@,+
71 3,@,+,@,*,@,+,1,-
72 3,@,+,@,*,@,+
73 3,@,+,@,*,@,+,1,+
74 3,@,+,@,*,1,+,@,+
75 3,@,2,+,@,*,*
76 3,@,+,@,*,2,+,@,+
77 3,@,*,@,*,4,-
78 3,@,*,@,*,3,-
79 3,@,*,@,*,2,-
80 3,@,*,@,*,1,-
81 3,@,*,@,*
82 3,@,*,@,*,1,+
83 3,@,*,@,*,2,+
84 3,@,@,*,@,*,+
85 3,@,@,*,@,*,+,1,+
86 3,@,@,*,@,*,+,2,+
87 3,@,@,@,*,@,*,+,+
88 3,@,*,@,1,+,*,2,-
89 3,@,*,@,1,+,*,1,-
90 3,@,*,@,1,+,*
91 3,@,*,@,1,+,*,1,+
92 3,@,*,@,1,+,*,2,+
93 3,@,@,*,@,1,+,*,+
94 3,@,1,+,@,*,*,1,-,@,+
95 3,@,1,+,@,*,*,@,+,1,-
96 3,@,1,+,@,*,*,@,+
97 5,@,+,@,*,3,-
98 7,@,*,@,+
99 3,@,*,@,2,+,*
100 5,@,+,@,*
101 5,@,+,@,*,1,+
102 5,@,+,@,*,2,+
103 3,@,@,*,1,+,@,*,+
104 4,@,1,+,@,+,@,*,+
105 5,@,@,+,@,*,+
106 3,@,@,+,@,*,*,2,-
107 3,@,@,+,@,*,*,1,-
108 3,@,@,+,@,*,*
109 3,@,@,+,@,*,*,1,+
110 5,@,+,@,1,+,*
111 3,@,@,+,@,*,1,+,*
112 4,@,+,@,1,-,*,@,+
113 3,@,@,*,1,+,@,1,+,*,+
114 3,@,@,+,@,*,2,+,*
115 5,@,@,*,2,-,*
116 4,@,@,1,+,@,*,+,*
117 3,@,@,@,+,@,*,+,*
118 3,@,@,@,+,@,*,+,*,1,+
119 3,@,*,2,+,@,*,2,-
120 5,@,@,*,1,-,*
121 3,@,*,2,+,@,*
122 3,@,*,2,+,@,*,1,+
123 5,@,@,*,*,2,-
124 5,@,@,*,*,1,-
125 5,@,@,*,*
126 5,@,@,*,*,1,+
127 5,@,@,*,*,2,+
128 4,@,+,@,*,@,+
129 4,@,+,@,*,@,+,1,+
130 5,@,@,*,1,+,*
131 3,@,+,@,+,@,1,-,*,1,-
132 3,@,+,@,+,@,1,-,*
133 3,@,+,@,+,@,1,-,*,1,+
134 3,@,1,+,@,+,@,*,+,@,+
135 5,@,@,*,2,+,*
136 4,@,@,+,@,*,+,@,+
137 3,@,*,@,1,-,@,*,@,+,+
138 3,@,@,*,-,@,@,+,@,*,%
139 1,3,@,+,@,@,+,@,*,-,-
140 3,@,+,@,*,1,-,@,+,@,+
141 3,@,+,@,+,@,*,3,-
142 3,@,+,@,+,@,*,2,-
143 3,@,+,@,+,@,*,1,-
144 3,@,+,@,+,@,*
145 3,@,+,@,+,@,*,1,+
146 3,@,+,@,+,@,*,2,+
147 3,@,@,+,1,+,@,*,*
148 4,@,3,*,@,*,+
149 3,@,+,@,1,-,@,*,*,1,-
150 3,@,+,@,1,-,@,*,*
151 3,@,+,@,1,-,@,*,*,1,+
152 3,@,*,@,@,+,1,-,*,1,-
153 3,@,*,@,@,+,1,-,*
154 3,@,*,@,@,+,1,-,*,1,+
155 3,@,+,@,+,@,1,+,*,1,-
156 3,@,+,@,+,@,1,+,*
157 3,@,+,@,+,@,1,+,*,1,+
158 3,@,*,@,*,2,-,@,+
159 3,@,*,@,*,@,+,3,-
160 3,@,*,@,*,1,-,@,+
161 3,@,*,@,*,@,+,1,-
162 3,@,*,@,*,@,+
163 3,@,*,@,*,@,+,1,+
164 3,@,*,@,*,1,+,@,+
165 3,@,@,*,@,*,@,+,+
166 3,@,*,@,*,2,+,@,+
167 3,@,@,*,@,*,+,@,+,1,-
168 3,@,@,*,@,*,+,@,+
169 3,@,+,@,+,1,+,@,*
170 3,@,+,@,+,1,+,@,*,1,+
171 3,@,*,@,@,+,1,+,*
172 3,@,*,@,@,+,1,+,*,1,+
173 3,@,*,@,@,*,1,+,@,+,+
174 3,@,@,@,*,@,*,+,+,@,+
175 5,@,@,2,+,*,*
176 3,@,*,@,1,+,*,2,-,@,+
177 3,@,*,@,1,+,*,@,+,3,-
178 3,@,*,@,1,+,*,1,-,@,+
179 3,@,*,@,1,+,*,@,+,1,-
180 3,@,*,@,1,+,*,@,+
181 3,@,*,@,1,+,*,@,+,1,+
182 2,@,@,+,@,*,-,@,1,+,*
183 3,@,@,*,@,1,+,*,@,+,+
184 3,@,*,@,1,+,*,2,+,@,+
185 5,@,1,+,@,*,1,+,*
186 3,@,@,*,@,1,+,*,+,@,+
187 5,@,@,*,*,@,2,/,+
188 3,@,@,*,@,2,-,*,*,1,-
189 3,@,@,*,@,2,-,*,*
190 5,@,+,@,@,+,1,-,*
191 3,@,1,+,@,+,@,*,*,1,-
192 3,@,1,+,@,+,@,*,*
193 3,@,1,+,@,+,@,*,*,1,+
194 2,@,@,+,@,*,-,@,*,2,-
195 2,@,@,+,@,*,-,@,*,1,-
196 2,@,@,+,@,*,-,@,*
197 2,@,@,+,@,*,-,@,*,1,+
198 3,@,*,@,2,+,*,@,+
199 5,@,+,@,*,@,+,1,-
200 5,@,+,@,*,@,+
201 5,@,+,@,*,@,+,1,+
202 5,@,+,@,*,1,+,@,+
203 7,@,@,+,@,*,+
204 3,@,+,@,@,*,2,-,*
205 5,@,@,+,@,*,@,+,+
206 3,@,@,*,1,+,@,*,+,@,+
207 4,@,*,@,3,-,*,1,-
208 4,@,*,@,3,-,*
209 3,@,+,@,@,*,1,-,*,1,-
210 3,@,+,@,@,*,1,-,*
211 3,@,+,@,@,*,1,-,*,1,+
212 4,@,*,@,2,-,@,*,+
213 3,@,+,@,@,*,*,3,-
214 3,@,+,@,@,*,*,2,-
215 3,@,+,@,@,*,*,1,-
216 3,@,+,@,@,*,*
217 3,@,+,@,@,*,*,1,+
218 3,@,+,@,@,*,*,2,+
219 3,@,@,+,@,@,*,*,+
220 5,@,+,@,1,+,*,@,+
221 3,@,+,@,@,*,1,+,*,1,-
222 3,@,+,@,@,*,1,+,*
223 4,@,*,1,-,@,*,2,-
224 4,@,*,@,2,-,*
225 4,@,*,1,-,@,*
226 4,@,*,1,-,@,*,1,+
227 2,@,@,+,@,*,1,-,@,*,+
228 3,@,+,@,@,*,2,+,*
229 4,@,@,*,1,-,@,*,+
230 5,@,3,*,@,*,+
231 3,@,@,*,@,*,4,-,*
232 4,@,@,1,+,@,*,+,*,@,+
233 4,@,+,@,@,+,1,-,@,*,+
234 3,@,@,@,+,@,*,+,*,@,+
235 1,3,@,@,@,*,@,*,-,*,-
236 3,@,@,*,@,*,2,-,*,1,-
237 3,@,@,*,@,*,2,-,*
238 4,@,*,@,1,-,*,2,-
239 4,@,*,@,1,-,*,1,-
240 4,@,*,@,1,-,*
241 3,@,@,*,@,*,*,2,-
242 3,@,@,*,@,*,*,1,-
243 3,@,@,*,@,*,*
244 3,@,@,*,@,*,*,1,+
245 3,@,@,*,@,*,*,2,+
246 3,@,@,*,@,*,1,+,*
247 3,@,@,*,@,*,1,+,*,1,+
248 5,@,@,*,*,1,-,@,+
249 3,@,@,*,@,*,2,+,*
250 5,@,@,*,*,@,+
251 5,@,@,*,*,@,+,1,+
252 3,@,+,@,@,1,+,*,*
253 4,@,*,@,*,3,-
254 4,@,*,@,*,2,-
255 4,@,*,@,*,1,-
256 4,@,*,@,*

Note that the highest-cost number is 177, at a cost of 15. Also note that all of the programs use constants at 7 or lower (for example, 49 is generated using 7@*), even though they were permitted to use constants up to 20.

With some logical reasoning, you can discover the following exciting result:

Result. These are the optimal-cost programs for generating the numbers 1...256 when the set of constants comes from [1..7] and even if we extend this set to include any larger set of constants (!).

Here's the proof if you're interested. First, including expensive constants doesn't change the optimal program. Suppose that, using a certain set of constants, you can generate a number $n$ using an optimally-cheap program that costs $c$. Then that program is still optimally cheap if you extend your set of constants to include all numbers $c$ or greater. After all, your old program is still valid, and using any of the new constants costs more than the entire old program. So the new constants won't get used, and your old program is still optimal.

Second, you can leave out large constants you didn't use. If your set of constants is some range $1\ldots k$ but the biggest constant used by your program is some number $k^\prime$, then the program is still optimally cheap if you limit the constants to a maximum of $\ell$, where $\ell$ is any number in the range between $k^\prime$ and $k$.

Third, if your program costs less than the largest constant in your range, then even constants in your range are "large, expensive constants". In such a case, we can combine the first two results: Suppose you have an optimal program for generating a number $n$ using constants up to $k$, and the cost of that program is $c$, and the biggest constant used by your program is $k^\prime$, and moreover (!) $c < k$. Then because of the first result, the program is still optimally-costed if you extend the original set of constants $[1,k]$ to include $[c,\infty)$ (or any subset thereof). Because of the assumption that $c<k$, this means that the program is still optimally costed if the set of constants is the range $[1,\ell]$ for any $\ell \geq k$. Because of the second result, the program is still optimally costed if the set of constants is in the range $[1,\ell^\prime]$ for any $\ell^\prime \geq k^\prime$.

In our particular case, when we generate all the numbers between 1...256 using constants in the range 1...20, we yield programs that cost no more than 15 and use constants only up to 7. Because 15 < 20, these are the optimal programs when our constants are in the range [1...7], and even when we extend our constants to any larger set (!).


Edit: Python code https://pastebin.com/9kiLL3x5


Edit #2: I re-ran the search, this time preferring, as a tiebreaker, programs that use smaller numbers. I realized that all of the numbers between 1...256 can be generated using only numbers in the range 1..4, and that these programs are (almost!) optimally-costed for any greater range.

This optimization is possible partly because we can always make the following same-cost substitutions:

  • 5 = 1,2,@,+,+ (almost same-cost)
  • 6 = 3,@,+
  • 7 = 1,3,@,+,+

And so we can eliminate all previous instances of 5,6,7 without affecting the path cost, the only exception being for the programs that used the value 5.

1       1
2       2
3       3
4       2,@,+
5       1,2,@,+,+
6       3,@,+
7       1,3,@,+,+
8       2,@,@,+,*
9       3,@,*
10      1,3,@,*,+
11      2,@,1,+,@,*,+
12      3,@,1,+,*
13      1,3,@,1,+,*,+
14      2,@,+,@,*,2,-
15      2,@,+,@,*,1,-
16      2,@,+,@,*
17      1,2,@,+,@,*,+
18      3,@,@,+,*
19      1,3,@,@,+,*,+
20      2,@,+,@,1,+,*
21      3,@,@,1,+,+,*
22      2,@,@,1,+,@,*,+,*
23      2,@,+,@,2,+,*,1,-
24      2,@,+,@,2,+,*
25      1,2,@,+,+,@,*
26      3,@,@,*,*,1,-
27      3,@,@,*,*
28      1,3,@,@,*,*,+
29      2,@,1,+,@,@,*,*,+
30      3,@,@,*,1,+,*
31      2,@,+,@,*,@,1,-,+
32      2,@,@,+,@,*,*
33      1,2,@,@,+,@,*,*,+
34      2,@,@,+,@,*,1,+,*
35      3,@,+,@,*,1,-
36      3,@,+,@,*
37      1,3,@,+,@,*,+
38      2,@,@,1,+,*,@,*,+
39      3,@,@,+,@,*,+
40      2,@,@,+,@,1,+,*,*
41      3,@,+,@,@,*,1,-,+
42      3,@,+,@,1,+,*
43      1,3,@,+,@,1,+,*,+
44      2,@,@,@,1,+,@,*,+,*,*
45      3,@,@,2,+,*,*
46      2,@,+,@,*,@,1,-,@,+,+
47      3,@,1,+,@,*,*,1,-
48      3,@,1,+,@,*,*
49      1,3,@,+,+,@,*
50      2,@,@,1,+,+,@,*,*
51      3,@,@,*,@,1,-,+,*
52      3,@,@,1,+,+,@,*,+
53      3,@,@,*,*,@,1,-,+
54      3,@,@,@,+,*,*
55      1,3,@,@,@,+,*,*,+
56      2,@,@,+,*,@,1,-,*
57      3,@,@,1,+,@,*,+,*
58      2,@,@,@,+,*,@,@,*,-,-
59      2,@,+,@,@,*,1,-,*,1,-
60      2,@,+,@,@,*,1,-,*
61      1,2,@,+,@,@,@,*,*,-,-
62      2,@,+,@,@,*,*,2,-
63      2,@,+,@,@,*,*,1,-
64      2,@,+,@,@,*,*
65      1,2,@,+,@,@,*,*,+
66      2,@,@,+,@,@,*,*,+
67      3,@,1,+,@,@,*,*,+
68      2,@,+,@,@,*,1,+,*
69      1,2,@,+,@,@,*,1,+,*,+
70      3,@,+,@,*,1,-,@,+
71      3,@,+,@,*,@,1,-,+
72      3,@,+,@,@,+,*
73      1,3,@,*,@,@,*,-,-
74      1,3,@,+,@,*,+,@,+
75      3,@,2,+,@,*,*
76      2,@,@,@,1,+,*,@,*,+,*
77      3,@,*,@,*,2,@,+,-
78      3,@,@,@,*,*,1,-,*
79      3,@,*,@,*,2,-
80      3,@,*,@,*,1,-
81      3,@,*,@,*
82      1,3,@,*,@,*,+
83      2,@,1,+,@,*,@,*,+
84      3,@,@,*,@,*,+
85      1,3,@,@,*,@,*,+,+
86      2,@,1,+,@,@,*,@,*,+,+
87      3,@,@,@,*,@,*,+,+
88      2,@,@,+,*,@,3,+,*
89      3,@,*,@,@,*,1,-,+
90      3,@,*,@,1,+,*
91      1,3,@,*,@,1,+,*,+
92      2,@,1,+,@,*,@,1,+,*,+
93      3,@,@,*,@,1,+,*,+
94      1,3,@,@,*,@,1,+,*,+,+
95      2,@,+,@,@,2,+,*,*,1,-
96      2,@,+,@,@,2,+,*,*
97      1,2,@,+,@,@,2,+,*,*,+
98      1,3,@,+,+,@,@,+,*
99      3,@,*,@,2,+,*
100     1,3,@,*,+,@,*
101     1,@,3,@,*,+,@,*,+
102     2,@,@,+,@,1,+,@,*,*,+
103     3,@,@,*,1,+,@,*,+
104     2,@,+,@,1,+,@,*,1,+,*
105     3,@,@,+,@,*,1,-,*
106     3,@,@,+,@,*,*,2,-
107     3,@,@,+,@,*,*,1,-
108     3,@,@,+,@,*,*
109     1,3,@,@,+,@,*,*,+
110     1,3,@,*,+,@,1,+,*
111     3,@,@,+,@,*,1,+,*
112     2,@,@,@,+,*,@,1,-,*,*
113     3,@,@,*,1,+,@,1,+,*,+
114     3,@,@,+,@,*,2,+,*
115     1,2,@,+,+,@,@,*,2,-,*
116     2,@,+,@,@,1,+,@,*,+,*
117     3,@,@,@,+,@,*,+,*
118     1,3,@,@,@,+,@,*,+,*,+
119     2,@,1,+,@,*,+,@,*,2,-
120     2,@,@,+,@,@,*,1,-,*,*
121     2,@,1,+,@,*,+,@,*
122     1,2,@,1,+,@,*,+,@,*,+
123     2,@,@,1,+,@,*,+,@,*,+
124     2,@,+,@,@,*,@,1,-,+,*
125     1,2,@,+,+,@,@,*,*
126     3,@,@,+,@,1,+,*,*
127     2,@,@,1,+,+,@,@,*,*,+
128     2,@,@,+,@,@,*,*,*
129     1,2,@,@,+,@,@,*,*,*,+
130     2,@,@,+,@,@,*,*,1,+,*
131     3,@,1,+,@,@,@,+,*,*,+
132     3,@,1,+,*,@,1,-,*
133     1,3,@,1,+,*,@,@,*,-,-
134     3,@,1,+,@,@,*,*,+,@,+
135     3,@,@,@,2,+,*,*,*
136     2,@,@,+,@,@,*,1,+,*,*
137     3,@,*,@,1,-,@,@,+,*,+
138     3,@,@,*,-,@,@,+,@,*,%
139     1,3,@,+,@,@,+,@,*,-,-
140     2,@,+,@,1,+,@,2,+,*,*
141     3,@,@,+,@,@,+,@,*,-,-
142     3,@,1,+,*,@,*,2,-
143     3,@,1,+,*,@,*,1,-
144     3,@,1,+,*,@,*
145     1,3,@,1,+,*,@,*,+
146     2,@,@,+,@,@,*,-,@,*,+
147     3,@,@,1,+,+,@,*,*
148     2,@,+,@,@,@,*,-,@,*,+
149     3,@,+,@,@,+,@,*,1,-,+
150     3,@,+,@,1,-,@,*,*
151     1,3,@,+,@,1,-,@,*,*,+
152     3,@,*,@,@,1,-,+,*,1,-
153     3,@,*,@,@,1,-,+,*
154     1,3,@,*,@,@,@,+,*,-,-
155     3,@,1,+,*,@,@,*,1,-,+
156     3,@,1,+,*,@,1,+,*
157     1,3,@,@,*,@,*,-,@,+,-
158     3,@,*,@,*,2,-,@,+
159     3,@,@,@,*,@,*,-,@,+,-
160     3,@,*,@,*,1,-,@,+
161     3,@,*,@,*,@,1,-,+
162     3,@,*,@,@,+,*
163     1,3,@,*,@,@,+,*,+
164     1,3,@,*,@,*,+,@,+
165     3,@,@,*,@,@,+,*,+
166     2,@,@,1,+,@,*,@,*,+,*
167     3,@,@,*,@,*,1,+,@,+,+
168     3,@,@,*,@,*,+,@,+
169     3,@,1,+,@,*,-,@,*
170     1,3,@,1,+,@,*,-,@,*,+
171     3,@,*,@,@,1,+,+,*
172     1,3,@,*,@,@,1,+,+,*,+
173     3,@,*,@,@,*,1,+,@,+,+
174     3,@,@,*,@,@,1,+,+,*,+
175     1,2,@,+,+,@,@,2,+,*,*
176     2,@,+,@,@,@,@,1,-,+,+,*,*
177     2,@,@,1,+,+,@,@,2,+,*,*,+
178     3,@,*,@,@,*,1,-,+,@,+
179     3,@,*,@,1,+,*,@,1,-,+
180     3,@,+,@,@,1,-,*,*
181     1,3,@,+,@,@,@,*,-,*,-
182     2,@,@,+,@,*,-,@,1,+,*
183     3,@,@,+,@,@,@,*,-,*,-
184     2,@,@,1,+,@,*,@,1,+,*,+,*
185     1,2,@,+,+,@,1,+,@,*,1,+,*
186     3,@,+,@,@,@,@,*,-,*,-
187     1,2,@,+,+,@,@,*,*,@,2,/,+
188     2,@,+,@,@,*,@,@,1,-,+,+,*
189     3,@,@,*,@,2,-,*,*
190     1,3,@,*,+,@,@,1,-,+,*
191     3,@,1,+,@,@,*,*,*,1,-
192     3,@,1,+,@,@,*,*,*
193     1,3,@,1,+,@,@,*,*,*,+
194     2,@,@,+,@,*,-,@,*,2,-
195     2,@,@,+,@,*,-,@,*,1,-
196     2,@,@,+,@,*,-,@,*
197     1,2,@,@,+,@,*,-,@,*,+
198     2,@,@,@,+,@,*,-,@,*,+
199     3,@,@,1,+,+,@,+,@,*,+
200     1,3,@,*,+,@,@,+,*
201     1,@,3,@,*,+,@,@,+,*,+
202     3,@,+,@,1,+,@,+,@,*,+
203     3,@,@,*,1,+,@,@,+,*,+
204     3,@,+,@,@,*,2,-,*
205     1,2,@,+,+,@,@,1,+,@,*,+,*
206     3,@,@,*,1,+,@,*,+,@,+
207     3,@,@,+,@,@,*,2,-,*,+
208     2,@,+,@,*,@,3,-,*
209     3,@,*,@,1,+,@,@,+,*,+
210     3,@,+,@,@,*,1,-,*
211     1,3,@,+,@,@,@,*,*,-,-
212     2,@,+,@,*,@,2,-,@,*,+
213     3,@,@,+,@,@,@,*,*,-,-
214     3,@,+,@,@,*,*,2,-
215     3,@,+,@,@,*,*,1,-
216     3,@,+,@,@,*,*
217     1,3,@,+,@,@,*,*,+
218     2,@,@,1,+,*,@,@,*,*,+
219     3,@,@,+,@,@,*,*,+
220     2,@,+,@,2,+,@,@,*,*,+
221     3,@,+,@,@,@,*,*,1,-,+
222     3,@,+,@,@,*,1,+,*
223     2,@,+,@,*,@,2,-,*,1,-
224     2,@,+,@,*,@,2,-,*
225     1,2,@,+,@,*,-,@,*
226     1,@,2,@,+,@,*,-,@,*,+
227     2,@,@,+,@,*,1,-,@,*,+
228     3,@,@,2,+,*,@,*,+
229     2,@,+,@,@,*,1,-,@,*,+
230     1,2,@,+,@,@,*,1,-,@,*,+,+
231     2,@,@,+,@,@,*,1,-,@,*,+,+
232     2,@,@,+,@,@,1,+,@,*,+,*,*
233     2,@,+,@,@,@,*,1,-,@,*,+,+
234     3,@,@,@,@,*,*,1,-,*,*
235     1,3,@,@,@,*,@,*,-,*,-
236     2,@,1,+,@,@,@,*,@,*,-,*,-
237     3,@,@,*,@,*,2,-,*
238     2,@,+,@,*,@,1,-,*,2,-
239     2,@,+,@,*,@,1,-,*,1,-
240     2,@,+,@,*,@,1,-,*
241     1,2,@,+,@,*,@,@,*,-,-
242     3,@,@,*,@,*,*,1,-
243     3,@,@,*,@,*,*
244     1,3,@,@,*,@,*,*,+
245     2,@,1,+,@,@,*,@,*,*,+
246     3,@,@,*,@,*,1,+,*
247     1,3,@,@,*,@,*,1,+,*,+
248     2,@,+,@,@,@,*,*,2,-,*
249     3,@,@,*,@,*,2,+,*
250     2,@,@,1,+,+,@,@,*,*,*
251     3,@,@,@,*,@,*,+,*,1,-
252     3,@,@,@,*,@,*,+,*
253     1,2,@,+,@,@,*,@,*,-,-
254     2,@,+,@,*,@,*,2,-
255     2,@,+,@,*,@,*,1,-
256     2,@,+,@,*,@,*


Some notation: if $P$ is a program, let $|P|$ denote its cost.

We have the following two fantastic results:

  1. Theorem: Every $n$ can be generated using a program which costs at most n and which uses only the constants 1...5.

  2. Corollary: For any $n$, there is an optimal program which computes $n$ uses only the constants 1...5. This program is absolutely the cheapest possible: every other program which produces $n$, no matter what constants used, costs at least as much.

Let's first show that if the theorem is true, the corollary follows. Let $P_n$ denote an optimal-cost program for generating $n$ under the restriction that $P_n$ can only use the constants 1...5, and let $Q_n$ denote an optimal-cost program for generating $n$ when there is no restriction on the constants. Suppose $Q_n$ uses a constant $k>5$. Then we can make a new program $Q^\prime_n$ which is like $Q_n$ except all instances of $k$ have been replaced with the program $P_k$ (which, by definition, uses only the numbers 1...5). According to the theorem, $|P_k| \leq k$; hence this substitution $k\mapsto P_k$ can't increase the program's cost. Hence by repeating this procedure until no more constants $k>5$ remain, we arrive at a program $Q_n^*$ which costs no more than $Q_n$ but which only uses constants in the range 1...5. By optimality, it follows that $|P_n| = |Q_n|$, so $P_n$ is the cheapest program for generating $n$ when only the constants 1...5 are allowed, and also when any additional constants are allowed. Because diminishing the set of constants can also only increase costs, it follows that $P_n$ is the cheapest program for generating $n$ regardless of which constants are allowed.

So the theorem implies the corollary. Now let's prove the theorem.

Proof of theorem. As before, let $P_n$ denote the optimal program for generating $n$ subject to the restriction that it only use constants 1...5.

We'll prove that this theorem holds for all $n$ by dividing the integers into particular intervals and proving the theorem in each interval.

  1. Define $A\equiv [1\ldots 34]$
  2. Define $B_0 \equiv [34 \ldots 256]$.
  3. Define $B_N \equiv [256 \cdot 2^{N-1} \ldots 256 \cdot 2^{N}]$ for all $N\geq 1$.

First, the theorem holds in region $A$, as our computer-provided evidence showed.

Next we'll prove, by induction, a doubly-strong result about how fast the programs grow in $B_i$; this will establish the theorem a fortiori :

Hypothesis(N) : If $n$ is in $B_N$, then not only is $|P_n| < n$, but moreover $|P_n| \lneqq \frac{n}{2}-1$.

Base case: Evidently the theorem holds in $B_0$; $B_0$ consists of numbers between 34 and 256, and our computer-provided evidence shows that $|P_n| \leq 15 \lneqq n/2 - 1$.

Induction step: If the theorem holds in all $B_i$ $(i=0\ldots N)$, let $n$ be an integer in $B_{N+1}$. Note that $n$ is large enough that we can always write $n = u + v$ for two integers $u$ and $v$ not in $A=[1...34]$. Hence $u$ and $v$ must be in some of the $B_i$. But by the induction hypothesis, the theorem then holds for $u$ and $v$.

Let's consider the following (possibly suboptimal) program for constructing $n$:

$$\widetilde P_n \equiv \langle P_u, P_v, +\rangle$$

This program first puts $u$ in memory using $P_u$, then puts $v$ in memory using $P_v$, then adds the results together, yielding $n = u+v$.

Considering its components, this program uses only the constants 1...5. By the induction hypothesis, the cost of this program is strictly less than $(\frac{u}{2}-1) + (\frac{v}{2}-1) + 1$. Hence the cost of this program is strictly less than $\frac{n}{2} - 1$. This shows that every $n$ in $B_{N+1}$ can be generated using a program that uses only the constants 1...5 and costs at most $\frac{n}{2}-1$, establishing the induction hypothesis.

This completes the proof of the theorem; we conclude that every $n$ can be generated using a program which costs at most $n$ and which uses only the constants 1...5 □.

One final comment:

Comment: The range 1...5 is not arbitrary; in fact, it's the smallest range $1...k$ for which the theorem holds. Indeed, if you shrink the range to anything smaller like 1...4, then it's no longer true that every $n$ can be generated using a program which costs at most $n$; indeed, in that case there is no program which generates 5 more cheaply than [1,2,@,+,+], which costs 6.


Now I am trying to establish bounded relationships between the length of a program, the cost of a program, and the size of the number it produces.

As for length, I note that every program must start with a number. I also note that after the first number, every subsequent increment in stack length (using @ or another number) must be paired with a decrement in stack length (using +, *, etc.). Hence every number-generating program has odd length, and is recognized by the context-free language:

$$\begin{align*} S & \mapsto K \,|\, K T \\ K & \mapsto 1 \,|\, 2 \,|\, 3 \,|\, 4 \,|\, 5 \\ T & \mapsto PQ \,|\, PTQ \,|\, TT \\ P & \mapsto K \,|\, @\\ Q & \mapsto + \,|\, - \,|\, \times \,|\, \div \,|\, \% \\ \end{align*}$$

So intuitively, it seems like if you specify a(n odd) program length $n=2k+1$, the largest number you can generate using an optimal program of length $n$ is $$5 @ * @ * \ldots @ * = 5^{(2^k)}$$ with a corresponding cost of $5 + 2k = (n + 4)$.

This does not mean that the most expensive optimal program of length $n$ costs $n+4$; rather the highest-result optimal program of length $n$ costs $n+4$.

As for how much an optimal program of length $n=2k+1$ can cost, we know that it must be at least $n$ (when each operation costs 1) and certainly no more than the obviously suboptimal program $[\underbrace{5,\cdots, 5}_{k+1}, \underbrace{+, \cdots, +}_k] = 5(k+1) + k = 6k+5 = 3n+2$. (It's suboptimal because it doesn't use @.)