Probability that random moves in the game 2048 will win

37.5k Views Asked by At

I have recently played the game 2048, created by Gabriele Cirulli, which is fun. I suggest trying if you have not. But my brother posed this question to me about the game:

If he were to write a script that made random moves in the game 2048, what is the probability that it would win the game?

Combinatorics is not my area, so I did not even attempt to answer this, knowing this seems like a difficult question to answer. But I thought someone here might have a good idea.

Also, since we are not concerned with time, just with winning, we can assume that every random move actually results in a tile moving.

Addendum

While the answers below shed light on the problem, only BoZenKhaa came close to providing a probability, even if it was an upper bound. So I would like to modify the question to:

Can we find decent upper and lower bounds for this probability?

8

There are 8 best solutions below

6
On

The mere fact that the game is difficult to win regardless suggests that it's highly unlikely to win using random moves. I'm still not totally clear on how the game is played, but the chances of winning through random moves is likely extremely low (as in, so low you would never expect to observe it within a reasonable time--like the age of the universe).

6
On

Instead of trying to get an exact answer, let me give you a very rough intuition based estimate based on few observations about the game and a related question on SO:

  • Before you make it to the 2048 tile, you will need to have at least 10 tiles of different values on the board: $2,4,8,16,32,64,128,256,512$ and $1024$.
  • You will have to make at least $520$ moves to get to the 2048 tile (each time you make a move, the sum of tiles on the board increases by at most 4).
  • This is the awesome post on SO concerning the same game: https://stackoverflow.com/questions/22342854/what-is-the-optimal-algorithm-for-the-game-2048 . It is noteworthy that the best algorithm mentioned in one of the answers has claimed success rate of around 90%, i.e. it does not get to win every time.
  • In the abovementioned post, it is suggested that a good winning strategy is to select one side, say the top one, and then try not to ever move your highest number away from that side. It is also suggested that if you have to move the high numbers away from this side of choice, it can be hard to save the day and still win.

Now for the sake of giving a pseudo-rudimentary estimate, let us entertain the idea that the last bullet point is right about the winning strategy and that this strategy covers most of the winning strategies.

Next, imagine, that our Random AlgoriThm (RAT) made it to the stage where half the board is covered with different numbers, meaning there are 8 different numbers on board $2,4,8,16,32,64,128, 256$. This means we are on the move number at most around $256 = \frac{1}{2}{\sum_1^{8} 2^k}$.

Also, our RAT miraculously made it this far and managed to keep it's high numbers by the top side of the board, as by the last bullet point. For the final assumption, assume that if the RAT presses the bottom arrow, it will always loose the game (because it is so random, it will not be able to salvage the situation.)

Now, the chance of our RAT winning after the move 256 is for sure smaller than the chance of RAT not pressing the bottom arrow. There is $3/4$ chance of the RAT not pressing the down arrow on every move, and there are at least 256 moves to be made before the RAT can get the 2048 tile. Thus the chance of the RAT winning in our simplified scenario is smaller $\frac{3}{4}^{256} \leq \frac{1}{2^{32}}$.

$P=\frac{1}{2^{32}}$ makes for a rather rare occurencce. As by N.Owad's comment, this chance is MUCH smaller than picking one specific second since the beginning of the universe. This should give you some intuition as to how unlikely a random win in this game actually is.

Disclaimer: I do not pretend that P is a bound of any sorts for the actual probability of random win due to the nature of simplifications made. It just tries to illustrates a number which which is likely larger than a chance of randomly winning.

10
On

I implemented a simulation of 2048, because I wanted to analyze different strategies.

Unsurprisingly, the result is that moving at random is a really bad strategy. enter image description here Above you can see the scores of $1000000$ random games (edit: updated after bugfix, thanks to misof). The score is defined as the sum of all numbers generated by merge. It can be viewed as a measure of how far you make it in the game. For a win you need a score of at least $16384$. You can see that most games end in a region below $2000$, that is they generate at most a 128-tile and loose subsequently. The heap on the right at $2500$ represents those games that manage to generate a 256 tile - those games are rather rare. No game made it to the 1024-tile.

Upon request, here is the plot for highest number on a tile: enter image description here When it comes to "dumb strategies", you get better results cycling moves deterministically: move up, right, up, left and repeat. This improves the expected highest number by one tile.

You can do your own experiments using the code here and here.

1
On

Unfortunately I don't have the time to do this right now, but I still want to through this idea of how to tackel this (very hard) problem out there.

This method somewhat resembles what is often used in statistical physics. Cf as well to ergodic theory and the likes (we would have to prove ergodicity.... but well...).

For every field store an array of probabilities that this particular field is in the state 0 (unoccupied), 1 (has the number $2^1$), 2 (number $2^2$), ..., 11 (winning field $2^{11} = 2048$).

Let $P(i,j,x)$ denote the chance that the field with coordinates $(i,j)$ is in state $0\le x \le 11$.

Each move (up, left, right, down) has a clearly defined effect that can be expressed as a set of rules for example $$P'(i,j,x) = \underbrace{P(i,j,x)}_{\text{already was in this state}} - \underbrace{P(i,j,x) * (P(i+1,j,x)+P(i+1,j,x))}_{\text{leaves this state due to a join or move to the right}} + \underbrace{P(i,j,x-1)*P(i-1,j,x-1)}_{\text{joines this state due to join from the left}} + \dots $$ where $\dots$ stands for the more lengthy terms due to moving tiles (eg. somewhere to the right is an empty tile and left of me was the value $x$ in the last step).

The insertion of random numbers will simply decrease the likelihood that each field is unoccupied and respectively increase the state's $x=1$ and $x=2$ chances $$ P'(i,j,1) =P(i,j,1) + P(i,j,0)/16 \\ P'(i,j,2)=P(i,j,2) + P(i,j,0)/16 \\ P'(i,j,0)=P(i,j,0)*\frac{14}{16}\,. $$

Any move and the random insertion step are alternated on our current state (the vectors of probabilities). The likelihood of loosing in the current step is equal to the likelihood of all fields being occupied before the random insertions step. The likelihood of winning is the likelihood of any field being in state $x=11$ after the chosen move.*

Clearly the probabilities should be correlated. Assuming that they are not is somewhat equal to the "molecular" chaos that is assumed in statistical physics / ergodic theory. But, assuming that we are indeed ergodic with this description of the model, we can get the chance of winning the game after $n$ predefined steps (and not loosing it before) by iterating this $n$ times. This way one could compare different strategies easily, but would still have to test several random chains of moves to get a decent average. (We only implicitly averaged over all possible positions of the inserted $2$ and $4$ fields)


(*) Note that we have to remove any winning states from our vector of possibilities before each random insertion. Clearly we did not win yet if we are still playing. (Also this is necessary to have any chance at being ergodic)

1
On

(Would post this as a comment, not a separate answer, but, alas, as a new user I don't have the reputation. Blame the spambots :) Also, note that the content of this answer is partially a repost of my post on Quora a few days ago.)

The answer to the question being asked: As already suggested in other answers, the probability of winning the game by making random moves is essentially zero.

Some more details:

The stats in the answer by @benh are incorrect. There is at least one bug I know of in his code, and it causes the stats he posted to be lower than they should. The bug I see is in how he detects the end of the game. In his simulations he ended the game as soon as the board becomes full. However, in 2048 the game is only over once that happens and no two adjacent tiles share the same number.

Below are the results of my simulations with a proper terminating condition.

Largest tile stats for 5,000,000 games, each move chosen uniformly at random:

  8:      78
 16:   13300
 32:  338969
 64: 1872573
128: 2382743
256:  391386
512:     951

Expected value of the largest tile at the end of the game: ~107.32

link to my code

0
On

I have been running random simulations on my computer, and it won't reach 1024, let alone 2048, as benh showed. So I eliminated one of the keys, and it reached 1024 in ~5-6 million games. I am now trying to see how many moves it takes for it to reach different numbers in different sized grids. Turns out even 5x5 isn't much better, as it doesn't reach even 2048 easily.

0
On

I wrote JS AIs with build-in UI statistic reports. Check http://js-2048-ai.sourceforge.net/2048.html in your browser (I recommend V8 or SpiderMonkey JS engine based browser like Chrome of Firefox).

For 50000 simulation runs of blind random:

2^9 0.0201 %
2^8 7.93 %
2^7 47 %
2^6 37.7 %
2^5 7.07 %
2^4 0.251 %

With shifted probability of random moves (left - 1 weight, right - 16 weight, up - 4 weight, down - 8 weight) this result better:

2^9 0.221 %
2^8 15.2 %
2^7 50.1 %
2^6 29.9 %
2^5 4.46 %
2^4 0.132 %

I don't run Monte-Carlo simulation to find better coefficient proportion, above values from manual experiments...

Another dumb blind strategy - moving by cycling direction. In clock order you get better results then if use random blind strategy:

2^10  0.00394 %
2^9   6.25 %
2^8   50.4 %
2^7   37.2 %
2^6   5.98 %
2^5   0.177 %
0
On

$3.3\times10^{-32}$

This result has been achieved using 5 runs of branching simulation in the following configuration: Run 16777216 simultaneous game simulations. Whenever less than half of them remain make a clone of every remaining simulation. Weigh the result of every simulation relative to what "generation" it end in, so simulations ending before the first branch will count for $1/16777216$. After one branch it is $1/33554432$, after two $1/67108864$ and so forth.

This method ensures that every original simulation has the same weight in the final result, while the simulation keeps running long past the point where standard simulations would have stopped.

It should be obvious that given enough runs of such a simulation, the average result will converge towards the answer. The less obvious part is that it does so much faster than a standard simulation. This is only possible due to nature of the game and the question asked, the following are crucial properties of the question:

  • We are searching for a very low probability.
  • The variation in probability for each node of the simulation to succeed is limited, i.e. the game has no "super states" where the probability of success is abnormally large relative to the average game state at the same level of progress.
  • There is an easily identifiable failure state that can be used for pruning the collection of simulations.

With these constraints in mind, the game and question seem perfect for the solution method. While there is significant variation in probability of success between nodes, it is not so great that a few nodes can dominate in a collection of several million.

So far the only means for estimating the precision of the result that I have is comparing the results of different runs. The results of 5 runs:

$$3.29\times10^{-32}$$ $$3.43\times10^{-32}$$ $$3.33\times10^{-32}$$ $$3.30\times10^{-32}$$ $$3.31\times10^{-32}$$

I'll try to get some source code online tomorrow.