Maze with a randomly moving obstacle

338 Views Asked by At

There is a 8*8 maze, and a player starting at (0, 0) wants to arrive at (7, 7) in minimal moves.

For each round (a round may contain several moves), the obstacle will be randomly set at a grid it has not been set before, also, it won't be set at either (0, 0) or (7, 7).

After the obstacle is set (although the player has no idea where it is in this round), the player has to give a probability distribution of his move in this round, e.g. up: 0.5, down: 0.2, left: 0.2, right: 0.1, then his move will be generated depending on the distribution. If he bumps into the obstacle or gets out of the maze (e.g. move right when he is at (7, 0)), he has to stay at the current grid, and another move will be generated. The process goes on until he moves to a feasible grid.

When a round ends, the player will know where the obstacle was in the last round. To be clear, supposed the player is in the nth round, then he will know the history of the obstacle from the first round to the (n-1)th round.

My question is, given the history of the obstacle and the current position of the player, is there a strategy to minimized expected moves from a probabilistic perspective?

EDIT

I've wrote a simulation program, and tried several strategies with it. Define a strategy consisting of four real values, representing the probability of moving up, down, left, and right, respectively.The naive one [0.25, 0.25, 0.25, 0.25] got an average of 73 moves.

Another strategy is defined below:

  1. if up and right are both safe, return [0.5, 0, 0, 0.5]

  2. if up is safe, return [1, 0, 0, 0]; if right is safe, return [0, 0, 0, 1]

  3. if at the right most line, return [0.5, 0, 0.5, 0]; if at the up most line, return [0, 0.5, 0, 0.5]

  4. else, return [0.5, 0, 0.5, 0]

In the end, the second strategy need 18.4573 moves on average. (10000 trials)

It is a huge success when comparing with the naive one, but I am struggling with getting closer to the theoretically minimum of 14 moves. I appreciate any advice or suggestions.

Confirmation

  • There is a $(X,Y)=(0:N-1,0:N-1) \in \mathbb{R}^2, N=8$ maze

    Yes

  • There is no inner walls or paths at the maze (Then it is not a maze??),

    Yes, I called it a maze only because there is a single randomly appearing obstacle.

  • The positions cannot get outside the domain, hence evidently the movement have to be properly constrained,

    Yes, if the intended move generated by the probability distribution will make the player get out of the maze, the count of moves will increment by one, and the player has to stay at the current position until an feasible move is generated.

  • There is an changing obstacle at an unknown position $(x^o,y^o)$, for every time $t$,

    Yes

  • The obstacle position at $t=n-1$ is known at $t=n$,

    Yes

  • For every time $n$, the obstacle change their coordinates according an uniform distribution on $(X,Y)$. Hence, the obstacle can "jump" to their new position,

    Yes

  • The obstacle don't take a previous coordinate. Hence the uniform distribution is over $(X,Y)-(X^o,Y^o)$, $(X^o,Y^o)={(x^o(t),y^o(t),t=1...n-1)}$. Hence the obstacle takes its last move at $t=N^2$ and then it disappears (??),

    The obstacle won't be set on the stating point and the ending point, so the player will lose the game if he can not reach the exit in $N^2 - 2$ rounds

  • The player position $(x(t),y(t))$ one square (up, down, left or right) per time,

    Yes

  • The player cannot pass, hence make no move at all (??),

    I am not sure what you are referring.

  • If the obstacle hits the player by a player move or an obstacle change, the player loses,

    If the obstacle hits the player by a player move, as I stated in my problem, If he bumps into the obstacle or gets out of the maze (e.g. move right when he is at (7, 0)), he has to stay at the current grid, and another move will be generated. The process goes on until he moves to a feasible grid. In short, the player loses the game iff he can not reach the exit in $N^2 - 2$ rounds. If the obstacle hits the player by an obstacle change, we assume that the player is strong enough to lift the obstacle, so he can safely stay with the obstacle in the same grid.

  • If the player reach the exit, the player wins.

    Yes

1

There are 1 best solutions below

9
On

Lets consider the following assumptions, which you should confirm for proceeding further:

  • There is a $(X,Y)=(0:N-1,0:N-1) \in \mathbb{R}^2, N=8$ maze,
  • There is no inner walls or paths at the maze (Then it is not a maze??),
  • The positions cannot get outside the domain, hence evidently the movement have to be properly constrained,
  • There is an changing obstacle at an unknown position $(x^o,y^o)$, for every time $t$,
  • The obstacle position at $t=n-1$ is known at $t=n$,
  • For every time $n$, the obstacle change their coordinates according an uniform distribution on $(X,Y)$. Hence, the obstacle can "jump" to their new position,
  • The obstacle don't take a previous coordinate. Hence the uniform distribution is over $(X,Y)-(X^o,Y^o)$, $(X^o,Y^o)={(x^o(t),y^o(t),t=1...n-1)}$. Hence the obstacle takes its last move at $t=N^2$ and then it disappears (??),
  • The player position $(x(t),y(t))$ one square (up, down, left or right) per time,
  • The player cannot pass, hence make no move at all (??),
  • If the obstacle hits the player by a player move or an obstacle change, the player loses,
  • If the player reach the exit, the player wins.

Please confirm.

If the obstacle moves are indeed uniform, the optimal strategy is to move towards the closest position in the set $(X^o,Y^o)$, or better, to the closest position made by two joined squares taken by the obstacle (for example (1,1) and (1,2) if they happen), and stay moving inside there, until the obstacle disappears, hence ensuring victory, taking in average $N^2+4=68$ moves

If you are presenting this problem onto a contest, for winning against the fastest strategy, then the probability of being catched and move onto a closest position used by the obstacle $(X^o,Y^o)$ must equal your risk appetite to win the contest, making the algorithm faster but sub-optimal.

The strategy taking the major risk (willingly or not) will win, hence the fastest, riskiest strategy is just move straight to the end, taking $2(N-1)$=14 moves.