colibri17/sette_mezzo

Playing "sette e mezzo" with reinforcement learning.
Summary

This repository implements an algorithm to play Sette e mezzo optimally. The algorithm uses policy iteration to identify the best choice in each game state. The two possible choices are hit, meaning to draw the next card, and stick, meaning to stop drawing. The best strategy is computed under the assumption that the opponent keeps drawing until a certain score (set to 5 by default and configurable by the user) is reached, regardless of the player's sum.

Rules

Sette e mezzo is played with a 40-card deck, a standard deck with the eights, nines, and tens removed. Cards ace through seven are worth their pip value (1 through 7), while face cards are worth 0.5 points each.
At the very beginning of the match, both players draw a card.
Then, the first player repeatedly draws cards from the deck until he either reaches a score greater than 7.5 (in which case he is busted and immediately loses the game) or sticks, settling on a score given by the sum of the values of his cards, which is not revealed to the opponent.
Then, the opponent repeats the process, deciding at each step whether to draw a card or stick. In this variant of the game, he sticks as soon as he reaches a score greater than or equal to a certain limit (set to 5 by default and configurable by the user).
At the end of the game, the players compare their scores. The player wins when his score is greater than the opponent's and loses if his score is smaller. If the scores are equal, the game is a tie.
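
For reference, the card values and deck composition described above can be written down in a few lines of Python. This is a minimal sketch: CARD_VALUES, DECK, and score are illustrative names and are not taken from the repository's source code.

from collections import Counter

# Card values: ace through seven count at pip value, face cards count 0.5 each.
CARD_VALUES = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7,
               'jack': 0.5, 'queen': 0.5, 'king': 0.5}

# The 40-card deck: four copies of each of the ten ranks above.
DECK = Counter({card: 4 for card in CARD_VALUES})

def score(cards):
    """Sum of the card values; a score above 7.5 means the hand is busted."""
    return sum(CARD_VALUES[c] for c in cards)

assert sum(DECK.values()) == 40
assert score([7, 'king']) == 7.5   # "sette e mezzo", the best possible hand
assert score([6, 2]) > 7.5         # busted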

Solving the game

The algorithm used to solve the game is based on reinforcement learning. Specifically, policy iteration is implemented.
Policy iteration consists of two steps:

  • Policy evaluation, which identifies the value of each state based on the current policy. This step is performed using dynamic programming: every state is evaluated with the Bellman equation, which combines the action taken, the value of the next state, the immediate reward, and the discount factor. The reward gained by the player differs depending on the player's choice:
    • If the player hits, the next states are generated by simulating the draw of the next card.
      The reward is -1 if the player is busted, and 0 otherwise.
    • If the player sticks, the combinations of draws that the opponent can make are generated, and for each of them the probability of occurring is computed. The combinations are also classified into three categories: those causing the player to win the game, those causing the players to tie, and those causing the opponent to win the game. Finally, the reward is computed from the sum of the probabilities of the draws causing the player to win or tie the game, rescaled to the range [-1, 1].
  • Policy improvement, which greedily updates the current policy towards the best next action. According to the policy improvement theorem, this strictly improves the current policy. These two steps are repeated iteratively until convergence, as sketched below.
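
As a rough illustration of these two steps, here is a minimal Python sketch of the policy iteration loop under the reward scheme described above. All names (evaluate, improve, policy_iteration, hit_transitions, stick_reward) are illustrative assumptions and do not correspond to the repository's actual API.

GAMMA = 1.0  # discount factor

def evaluate(policy, states, hit_transitions, stick_reward, tol=1e-6):
    """Policy evaluation: solve the Bellman equation by iterative sweeps."""
    value = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if policy[s] == 'hit':
                # Expectation over the next card: reward is -1 when the draw
                # busts the player and 0 otherwise, plus the discounted value
                # of the resulting state.
                v = sum(p * (r + GAMMA * value.get(s2, 0.0))
                        for p, r, s2 in hit_transitions(s))
            else:
                # Sticking is terminal: the reward is derived from the
                # probability of winning or tying, rescaled to [-1, 1].
                v = stick_reward(s)
            delta = max(delta, abs(v - value[s]))
            value[s] = v
        if delta < tol:
            return value

def improve(value, states, hit_transitions, stick_reward):
    """Policy improvement: act greedily with respect to the current values."""
    policy = {}
    for s in states:
        hit_value = sum(p * (r + GAMMA * value.get(s2, 0.0))
                        for p, r, s2 in hit_transitions(s))
        policy[s] = 'hit' if hit_value > stick_reward(s) else 'stick'
    return policy

def policy_iteration(states, hit_transitions, stick_reward):
    """Alternate evaluation and improvement until the policy is stable."""
    policy = {s: 'hit' for s in states}
    while True:
        value = evaluate(policy, states, hit_transitions, stick_reward)
        new_policy = improve(value, states, hit_transitions, stick_reward)
        if new_policy == policy:
            return policy, value
        policy = new_policy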

The game proceeds as follows:

  • The user is asked for the player's and the opponent's cards.
    Example: assume the player's card is 2 and the opponent's card is 6.
  • All the possible states are generated, namely the combinations of all the allowed card sequences that can be drawn. This is done by first generating all the possible card combinations of a given length and then filtering out the ones that are not allowed (a sketch of this step follows the list). Specifically:
    • If a combination does not match the cards available in the deck, it is filtered out.
    • If a combination scores more than 7.5, it is filtered out.
    • If a combination does not start with the initial card drawn by the player, it is filtered out.
  • Reinforcement learning is then applied. As described above, the policy iteration algorithm iteratively cycles through the policy evaluation and policy improvement steps.
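
For illustration, the state generation and filtering step can be sketched as follows. This is a minimal sketch under simplifying assumptions: generate_states and max_len are hypothetical names, and the repository's actual implementation may differ.

from collections import Counter
from itertools import product

# Repeated from the sketch in the Rules section, for self-containment.
CARD_VALUES = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7,
               'jack': 0.5, 'queen': 0.5, 'king': 0.5}
DECK = Counter({card: 4 for card in CARD_VALUES})

def generate_states(initial_card, deck, max_len=4):
    """Card sequences starting with the player's initial card that respect the
    deck counts and never score above 7.5 (max_len only bounds the sequence
    length for this sketch)."""
    allowed = []
    ranks = list(deck)
    for length in range(1, max_len + 1):
        for rest in product(ranks, repeat=length - 1):
            sequence = (initial_card,) + rest
            counts = Counter(sequence)
            # Drop sequences that use more copies of a card than the deck holds.
            if any(counts[c] > deck[c] for c in counts):
                continue
            # Drop sequences that bust (score above 7.5).
            if sum(CARD_VALUES[c] for c in sequence) > 7.5:
                continue
            allowed.append(sequence)
    return allowed

# Example from above: player card 2, opponent card 6. The sequences already
# include the player's 2, so only the opponent's 6 is removed from the deck.
remaining = DECK.copy()
remaining[6] -= 1
player_states = generate_states(2, remaining)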

Installing

The repository can be installed using pip. It is recommended to create a virtual environment in which to execute the scripts. For example, using virtualenv:

# Clone the repository
git clone https://github.com/colibri17/sette_mezzo
cd sette_mezzo
# Recommended: Install pip dependencies and run under virtualenv.
sudo apt-get install virtualenv python3-virtualenv
virtualenv -p python3 venv
source venv/bin/activate
# Install requirements
pip install -r requirements.txt

Executing

Once the repository is installed, you can activate the virtual environment and execute the program:

source venv/bin/activate
python3 src/main.py
