Blackjack#
This environment is part of the Toy Text environments. Please read that page first for general information.
Action Space: Discrete(2)
Observation Space: Tuple(Discrete(32), Discrete(11), Discrete(2))
Import: gym.make('Blackjack-v1')
Blackjack is a card game where the goal is to beat the dealer by obtaining cards whose sum is closer to 21 than the dealer's cards, without going over 21.
Description#
Card Values:
Face cards (Jack, Queen, King) have a point value of 10.
Aces can either count as 11 (called a ‘usable ace’) or 1.
Numerical cards (2-10) have a value equal to their number.
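The card values above can be sketched as a small hand-scoring helper. This is a minimal illustration rather than the environment's source: the names `usable_ace` and `sum_hand` are chosen for this example, and cards are assumed to be stored as integers with an ace stored as 1 and every face card as 10.

```python
def usable_ace(hand):
    """Return True if the hand holds an ace that can count as 11 without busting."""
    return 1 in hand and sum(hand) + 10 <= 21


def sum_hand(hand):
    """Return the hand total, counting one ace as 11 when that does not bust."""
    if usable_ace(hand):
        return sum(hand) + 10
    return sum(hand)


print(sum_hand([1, 7]))     # ace counted as 11 -> 18
print(sum_hand([1, 7, 9]))  # ace counted as 1 -> 17
print(sum_hand([10, 10]))   # two ten-valued cards -> 20
```

At most one ace is ever counted as 11, since two aces at 11 would already exceed 21.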
This game is played with an infinite deck (that is, cards are drawn with replacement). The game starts with the dealer having one face-up and one face-down card, while the player has two face-up cards.
The player can request additional cards (hit, action=1) until they decide to stop (stick, action=0) or exceed 21 (bust, immediate loss). After the player sticks, the dealer reveals their face-down card and draws until their sum is 17 or greater. If the dealer goes bust, the player wins. If neither the player nor the dealer busts, the outcome (win, lose, draw) is decided by whose sum is closer to 21.
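The flow of a round described above can be sketched without the environment itself. The following is a simplified simulation under stated assumptions, not the library's implementation: `DECK`, `hand_total`, `play_round`, and the `stick_on_17` policy are hypothetical names, and encoding the outcome as +1, 0, or -1 is a convention chosen for this example.

```python
import random

# Infinite deck: every draw is independent. Ace = 1, face cards = 10.
DECK = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def hand_total(hand):
    """Total of a hand, counting one ace as 11 when that does not bust."""
    total = sum(hand)
    return total + 10 if 1 in hand and total + 10 <= 21 else total


def play_round(policy, rng):
    """Play one round; return +1 (win), 0 (draw) or -1 (loss) for the player."""
    player = [rng.choice(DECK), rng.choice(DECK)]
    dealer = [rng.choice(DECK), rng.choice(DECK)]  # dealer[0] is the face-up card

    # Player's turn: hit (1) until the policy sticks (0) or the hand busts.
    while policy(hand_total(player), dealer[0]) == 1:
        player.append(rng.choice(DECK))
        if hand_total(player) > 21:
            return -1  # bust: immediate loss

    # Dealer's turn: draw until the total is 17 or greater.
    while hand_total(dealer) < 17:
        dealer.append(rng.choice(DECK))
    if hand_total(dealer) > 21:
        return 1  # dealer busts, player wins

    # Neither side busted: the total closer to 21 wins.
    p, d = hand_total(player), hand_total(dealer)
    return (p > d) - (p < d)


def stick_on_17(player_total, dealer_card):
    """Hypothetical policy: hit below 17, otherwise stick."""
    return 1 if player_total < 17 else 0


rng = random.Random(0)
results = [play_round(stick_on_17, rng) for _ in range(1000)]
print(sum(results) / len(results))  # average outcome per round
```

Any callable that maps the player's total and the dealer's showing card to 0 or 1 can be passed as `policy`.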
Action Space#
There are two actions: stick (0), and hit (1).
Observation Space#
The observation consists of a 3-tuple containing: the player's current sum, the value of the dealer's one showing card (1-10, where 1 is an ace), and whether the player holds a usable ace (0 or 1).
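As a rough illustration of how such a tuple could be assembled from the two hands, consider the sketch below. The function name `get_obs` and the list-of-integers hand representation (ace stored as 1, face-up dealer card first) are assumptions made for this example.

```python
def get_obs(player, dealer):
    """Build (player sum, dealer's showing card, usable ace flag) from two hands.

    Cards are integers with an ace stored as 1; dealer[0] is the face-up card.
    """
    total = sum(player)
    usable = 1 in player and total + 10 <= 21
    if usable:
        total += 10  # count one ace as 11
    return (total, dealer[0], int(usable))


print(get_obs([1, 6], [10, 4]))  # (17, 10, 1)
print(get_obs([10, 5], [1, 9]))  # (15, 1, 0)
```

The dealer's face-down card never appears in the tuple, so the player has to act on partial information.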
This environment corresponds to the version of the blackjack problem described in Example 5.1 in Reinforcement Learning: An Introduction by Sutton and Barto (http://incompleteideas.net/book/thebook2nd.html).
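Rewards#
win game: +1
lose game: -1
draw game: 0
win game with natural blackjack: +1.5 (if natural is True), +1 (if natural is False)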
Arguments#
gym.make('Blackjack-v1', natural=False, sab=False)
natural=False: Whether to give an additional reward for starting with a natural blackjack, i.e. starting with an ace and a ten-valued card (sum is 21).
sab=False: Whether to follow the exact rules outlined in the book by Sutton and Barto. If sab is True, the keyword argument natural is ignored. Under these rules, if the player achieves a natural blackjack and the dealer does not, the player wins (i.e. gets a reward of +1). The reverse rule does not apply. If both the player and the dealer get a natural, it is a draw (i.e. reward 0).
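The interaction between the two arguments can be sketched as a post-processing step on the outcome of a round. This is an illustrative reading of the rules, not the environment's code: `is_natural` and `final_reward` are hypothetical names, the outcome is assumed to be encoded as +1 (win), 0 (draw), or -1 (loss), and the bonus value of 1.5 for natural=True is treated as an assumption.

```python
def is_natural(hand):
    """A natural blackjack: exactly two cards, an ace (1) and a ten-valued card."""
    return sorted(hand) == [1, 10]


def final_reward(outcome, player, dealer, natural=False, sab=False):
    """Adjust the outcome of a round (+1 win, 0 draw, -1 loss) for naturals."""
    if sab and is_natural(player) and not is_natural(dealer):
        return 1.0  # sab rules: the player's natural wins outright
    if not sab and natural and is_natural(player) and outcome == 1:
        return 1.5  # assumed bonus for winning with a natural
    return float(outcome)


# Dealer reaches 21 with three cards: a draw by sum, but a win under sab rules.
print(final_reward(0, [1, 10], [7, 7, 7], sab=True))        # 1.0
# Winning with a natural when natural=True earns the assumed bonus.
print(final_reward(1, [1, 10], [10, 9], natural=True))      # 1.5
# With sab=True the natural argument is ignored.
print(final_reward(1, [1, 10], [10, 9], natural=True, sab=True))  # 1.0
```

Checking `sab` first reflects the statement that `natural` is ignored whenever `sab` is True.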
Version History#
v0: Initial version release (1.0.0)