Université de Technologie de Compiègne - IA02
This project involved modeling two board games, Dodo and Gopher, designed by Mark Steere, and creating a program capable of playing them intelligently.
Pylint scores are somewhat inaccurate due to module imports. To avoid spurious import errors, we disabled that check in .pylintrc.
The major issue we had to solve was adjusting our programs to be efficient enough within a reasonable time:
- Reduce algorithm complexity (using Python libraries when possible)
- Find the actual best action
- Never exceed the time limit
To test our programs, we hosted the gnd server on a personal virtual machine, so that two players could connect and play.
The chosen algorithm for playing was Negamax (a variant of Minimax).
Other implemented and tested algorithms:
- Minimax
- Alpha-Beta
- MCTS
We added alpha-beta pruning to reduce computation time and a cache to avoid re-evaluating the same grid twice.
Depths are chosen automatically using another hashtable, so the depth is easily adaptable. After a certain number of played moves, the depth is increased in order to find a winning path. Disadvantage: it is really hard to know when to switch to a deeper value; game length can vary a lot, and if we switch too early our program will time out.
def __negamax_depth(self) -> int:
    """Depth used by negamax, chosen from the grid size."""
    if self.__size <= 3:
        # small grids: effectively unlimited depth, the whole game can be searched
        return 100000
    if self.__size == 6 and len(self.__played) > 11:
        # late game on size 6: reset the cache once and search deeper
        if self.__neg_update:
            print("updated")
            self.__negamax_cache = {}
            self.__neg_update = False
        return 9
    depths: dict[int, int] = {4: 1000, 5: 12, 6: 7, 7: 7, 8: 6, 9: 5, 10: 4}
    return depths.get(self.__size, 3)
Disadvantage: with these depths, the calculation takes a lot of time and might hit the 150 s time limit.
The winning strategy for player 1 on odd-sized grids is implemented in O(1) time with a simple direction calculation:
if (
    length > 1
    and self.__starting == self.__current_player
    and self.__size % 2 == 1
):
    length -= 1
    next_cell: Cell = self.get_direction()
    if next_cell in self.__legits:
        return next_cell
The calculation is done in self.get_direction() and is indeed very basic.
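Since the body of get_direction() is not reproduced in this report, the following is only a hypothetical illustration of what such a constant-time step could look like; the line-extension rule, the name extend_line, and its played argument are assumptions for the sketch, not necessarily the rule we actually use:

Cell = tuple[int, int]

def extend_line(played: list[Cell]) -> Cell:
    """Hypothetical sketch: step one more cell along the line formed by the last two stones."""
    (q1, r1), (q2, r2) = played[-2], played[-1]
    # constant time: add the previous step vector to the last played cell
    return (q2 + (q2 - q1), r2 + (r2 - r1))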
The first move for player 1 is fixed depending on the grid size: (0, 0) on odd sizes or (0, size - 1) on even sizes.
if self.__firstmove and self.__size % 2 == 1:
    return (0, 0)
if self.__firstmove and self.__size % 2 == 0:
    return (0, self.__size - 1)
Negamax is a simplified version of Minimax where the same function is used for both players by inverting the score. This makes the implementation easier and more efficient.
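As an illustration only, here is a minimal sketch of negamax with alpha-beta pruning and an evaluation cache. The state interface (key, is_final, evaluate, legal_moves, play) is an assumption for the sketch and does not match our class exactly; evaluate() is assumed to score the grid from the point of view of the player to move.

from typing import Any

def negamax(state: Any, depth: int, alpha: float, beta: float,
            cache: dict[Any, float]) -> float:
    """Value of `state` from the point of view of the player to move."""
    if depth == 0 or state.is_final():
        key = state.key()                # hashable representation of the grid
        if key not in cache:             # cache: never evaluate the same grid twice
            cache[key] = state.evaluate()
        return cache[key]
    best = float("-inf")
    for move in state.legal_moves():
        # same function for both players: the child's value is simply negated
        score = -negamax(state.play(move), depth - 1, -beta, -alpha, cache)
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:                # alpha-beta cut-off
            break
    return best

At the root, the chosen action is simply the legal move whose negated child value is the highest.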
- On a grid of size 4, the Negamax algorithm can solve the game from the start (infinite depth).
- On a grid of size 6 with a fixed depth of 9 against a random opponent (random strategy), the win rate over 200 iterations is 100%.
- ./test_gopher: test our program
- ./gopher/gopher_game.py: main file where strategies are implemented
- ./gopher/game.py: defines how the class will respond to the server
To change the strategy, you can edit the following line in ./gopher/game.py:
action: ActionGopher = game.strategy_negamax()
The other strategies we implemented are stored in unusedcode.txt. Since we settled on our program's algorithm relatively early, we have not kept these strategies up to date and they might not work with the latest version.
The chosen algorithm for playing was Monte Carlo (MC).
Other implemented and tested algorithms:
- Negamax
- We chose to keep an algorithm like MC or MCTS because it gave us the best results. From observations over a large number of iterations, simulation algorithms handle a large number of possible moves better than depth-first search algorithms (a minimal sketch of the MC idea is given after this list). The explanation might be that the number of simulations can be fixed to a specific count or time budget, whereas a depth-first search expands node after node.
- Our idea was to combine the MC algorithm for the beginning of the game and then switch to Negamax after a certain number of moves. This would have allowed us to find an optimal solution without costing too much in performance. After implementation, the results were good but not as good as with plain MC at ~8000 iterations.
- Win rate against a random opponent with the MC algorithm (~8000 iterations): 100%.
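As an illustration, a minimal sketch of the flat Monte Carlo idea follows: for each legal move, play random games to the end and keep the move with the best win rate. The state interface (legal_moves, play, is_final, winner) and the function name are assumptions for the sketch, not our exact code.

import random
from typing import Any

def strategy_mc(state: Any, player: int, iterations: int = 8000) -> Any:
    """Flat Monte Carlo: random playouts for each legal move, keep the best win rate."""
    moves = state.legal_moves()                       # assumes at least one legal move
    playouts_per_move = max(1, iterations // len(moves))
    best_move, best_rate = moves[0], -1.0
    for move in moves:
        wins = 0
        for _ in range(playouts_per_move):
            game = state.play(move)
            # random playout until the game ends
            while not game.is_final():
                game = game.play(random.choice(game.legal_moves()))
            if game.winner() == player:
                wins += 1
        rate = wins / playouts_per_move
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move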
- ./test_dodo: test our program
- ./dodo/dodo_game.py: main file where strategies are implemented
- ./dodo/game.py: defines how the class will respond to the server
To change the strategy, you can edit one of the following lines in ./dodo/game.py:
action: ActionDodo = game.strategy_mc(8000)
action: ActionDodo = game.strategy_negamax()
The iteration count is fixed in ./dodo/game.py; unlike the Negamax depth, it is not stored in a hashtable and adjusted depending on the grid size!
- ./main.py: connect to the server to play
- ./utils/gndclient.py: client which allows us to connect to a server
- ./utils/utilitary.py: utility functions and data
We also tried to exploit symmetries, but it was far too time-consuming, so we removed it on purpose. All the algorithms to compute them are still present in ./utils/utilitary.py.
To run our project, you need to:
- Run the server yourself (it is present in server/)
- Run the project, for instance with python main.py 1 test test
- Modify the server URL in main.py:
parser.add_argument("-s", "--server-url", default="http://server.url")
To test locally, there is another file named test_gopher.py (resp. test_dodo.py) that you can use.
This project was very instructive and fun, and we competed with each other to find the best solution. The competition made us really invested in this project and we took it very seriously. We tried to go beyond the course content to find other algorithms. It was really good fun to test our algorithms by playing against friends.
We also thought about training a model using PyTorch, however:
- the model would have been trained on a single grid size at a time
- the model would have taken some time to learn how to play
- the model would have been efficient against random actions but maybe not against real players
An efficient way to do it would have been to train a model to pick the best move to explore first inside an MC or MCTS algorithm.
These algorithms work pretty well. During the tournament, we lost to one person on Gopher (warmup) and timed out once, which feels very frustrating. On Dodo we didn't lose except to one person (whom we unfortunately played against twice...).
As a conclusion, we are pretty proud of our work, and we learnt a lot about game solving algorithms.
Credits:
[email protected]
[email protected]