Blackjack problem, prediction problem, and an issue with usable ace


I have an issue with the way a usable ace should be interpreted in the Blackjack problem, and I would appreciate your input.

Game overview:

  • The game is played between a player and a dealer.
  • The objective is to outscore the dealer (via sum of dealt cards) without going bust (exceeding 21).
  • All face cards count as 10, and the ace can count either as 1 or as 11.
  • If the player holds an ace that he could count as 11 without going bust, then the ace is said to be usable.
  • Cards are dealt from an infinite deck (i.e., with replacement).

How it's played:

The game begins with two cards dealt to both dealer and player. One of the dealer’s cards is face up and the other is face down. If the player has 21 immediately (an ace and a 10-card), it is called a natural. He then wins unless the dealer also has a natural, in which case the game is a draw. If the player does not have a natural, then he can request additional cards, one by one (hits), until he either stops (sticks) or exceeds 21 (goes bust). If he goes bust, he loses; if he sticks, then it becomes the dealer’s turn. The dealer hits or sticks according to a fixed strategy without choice: he sticks on any sum of 17 or greater, and hits otherwise. If the dealer goes bust, then the player wins; otherwise, the outcome—win, lose, or draw—is determined by whose final sum is closer to 21.
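As a minimal sketch of the dealer's fixed strategy described above (not a definitive implementation), the dealer's turn could be simulated as follows. The names `DECK`, `hand_total`, and `play_dealer` are hypothetical, and the sketch assumes the dealer counts an ace as 11 whenever that does not bust, which the rules above do not state explicitly.

```python
import random

# Infinite deck: values 1 (ace) through 10, with 10 listed four times
# (10, J, Q, K). Drawing with replacement keeps the odds constant.
DECK = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def hand_total(cards):
    """Best total of a hand; aces are stored as 1.

    Assumption: one ace is counted as 11 whenever that does not bust.
    """
    total = sum(cards)
    if 1 in cards and total + 10 <= 21:
        return total + 10
    return total


def play_dealer(cards, rng):
    """Hit until the total is 17 or more, then stick; return the final total."""
    cards = list(cards)  # do not mutate the caller's list
    while hand_total(cards) < 17:
        cards.append(rng.choice(DECK))
    return hand_total(cards)


rng = random.Random(0)
final_total = play_dealer([10, 6], rng)  # dealer must hit on 16
print(final_total, "bust" if final_total > 21 else "stands")
```

A dealer total above 21 means the dealer went bust and the player wins, provided the player has not gone bust first.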

Using first-visit MC prediction, I would like to evaluate the policy that sticks if the player’s sum is 20 or 21, and otherwise hits. The player makes decisions on the basis of three state variables: his current sum (12–21), the dealer’s one showing card (ace–10), and whether or not he holds a usable ace. This makes for a total of 200 states.
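To make the setup concrete, here is a rough sketch of the state space, the policy, and the first-visit averaging step. The names `STATES`, `policy`, and `first_visit_mc` are hypothetical, and the sketch assumes an undiscounted return with a single terminal reward of +1, 0, or -1 for a win, draw, or loss, which is not stated above.

```python
from collections import defaultdict
from itertools import product

# State = (player_sum, dealer_showing, usable_ace); the ace is encoded as 1.
STATES = list(product(range(12, 22), range(1, 11), (False, True)))
print(len(STATES))  # 10 sums * 10 dealer cards * 2 ace flags = 200


def policy(state):
    """Stick on 20 or 21, otherwise hit."""
    player_sum, _, _ = state
    return "stick" if player_sum >= 20 else "hit"


def first_visit_mc(episodes):
    """Estimate state values by averaging first-visit returns.

    Each episode is (states_in_order, final_reward). Assumption: no
    discounting and no intermediate rewards, so the return following
    every state in an episode equals final_reward.
    """
    returns_sum = defaultdict(float)
    visit_count = defaultdict(int)
    for states, final_reward in episodes:
        for state in set(states):  # count each state once per episode
            returns_sum[state] += final_reward
            visit_count[state] += 1
    return {s: returns_sum[s] / visit_count[s] for s in returns_sum}
```

For example, `first_visit_mc([([(13, 2, True), (14, 2, True)], 1), ([(14, 2, True)], -1)])` averages one win and one loss for the state `(14, 2, True)`, giving an estimate of 0.0 for it.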

I am approximating the state-value function, but I am unsure how to interpret the usable ace. Consider an episode in which the two cards dealt to the player are an ace and a 2. The player's total is 13 and he has a usable ace. Following the policy, the player should hit. The third card dealt is another ace. The player's total is now 14, because the second ace can only count as 1. Now the question is: should I treat this state as one with a usable ace or as one with no usable ace?

Answer:


A hand consisting of a deuce and two aces is a (soft) $14$ with a usable ace because it contains an ace that can be counted as $11$ without busting. In an infinite deck setting, it is equivalent to ace-three.
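One way to check this mechanically is to count every ace as 1 first and then ask whether adding 10 (that is, promoting a single ace to 11) keeps the total at 21 or below. The sketch below does this; `evaluate_hand` is a hypothetical helper name and aces are assumed to be encoded as 1.

```python
def evaluate_hand(cards):
    """Return (total, usable_ace) for a list of card values; an ace is 1."""
    hard_total = sum(cards)  # every ace counted as 1
    # At most one ace can count as 11, since two would already total 22.
    usable = 1 in cards and hard_total + 10 <= 21
    return (hard_total + 10 if usable else hard_total), usable


print(evaluate_hand([1, 2]))         # (13, True)
print(evaluate_hand([1, 2, 1]))      # (14, True)
print(evaluate_hand([1, 3]))         # (14, True)
print(evaluate_hand([1, 2, 1, 10]))  # (14, False)
```

The usable-ace flag describes the hand as a whole, not any particular ace, so the state after drawing the second ace is (14, usable ace). The flag only turns off once counting an ace as 11 would bust the hand, as in the last line.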