This repo contains both a full implementation of the AlphaZero algorithm 🤖 and a detailed PDF report.
This project applies AlphaZero to the game of Othello (Reversi) ⚫⚪.
It combines reinforcement learning and deep learning techniques to create a self-learning game-playing agent.
- 🚀 AlphaZero Algorithm – Self-play, learning from scratch, and no human input needed 🎯🧠
- 🧩 Neural Network Architecture – CNNs and dense layers to predict moves and game outcomes 🖥️🔮
- 🌲 Monte Carlo Tree Search (MCTS) – How the agent searches and improves over time 🌳🕵️‍♀️
- 🔬 Training & Experiments – 6×6 and 8×8 board configurations, performance stats, and comparisons 📈🧪
- 🧠 Conclusions & Learnings – Key takeaways and ideas for improvements ✨📘
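
As a rough illustration of the MCTS item above, here is a minimal Python sketch of the PUCT selection rule that AlphaZero-style search typically uses to decide which move to explore next. The function names, the `c_puct` default, and the plain-list inputs are assumptions made for this example; they are not taken from this repo's code, which may differ in details such as how unvisited nodes are handled.

```python
import math


def puct_scores(priors, visit_counts, q_values, c_puct=1.0):
    """Return the PUCT score of every move at a single search node.

    priors        -- move probabilities predicted by the policy head
    visit_counts  -- how often each move has been explored so far
    q_values      -- mean value estimate of each move
    c_puct        -- exploration constant (hypothetical default)
    """
    total_visits = sum(visit_counts)
    scores = []
    for p, n, q in zip(priors, visit_counts, q_values):
        # Exploration bonus: large for moves with a high prior and few visits.
        exploration = c_puct * p * math.sqrt(total_visits) / (1 + n)
        scores.append(q + exploration)
    return scores


def select_move(priors, visit_counts, q_values, c_puct=1.0):
    """Pick the index of the move with the highest PUCT score."""
    scores = puct_scores(priors, visit_counts, q_values, c_puct)
    return max(range(len(scores)), key=scores.__getitem__)


# Move 0 has been visited 10 times; moves 1 and 2 are still unexplored.
best = select_move(
    priors=[0.5, 0.3, 0.2],
    visit_counts=[10, 0, 0],
    q_values=[0.1, 0.0, 0.0],
)
print(best)
```

In this example the search prefers the unexplored move 1 over the already-visited move 0, because its exploration bonus outweighs move 0's small value estimate.
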
The code is a fork of https://github.com/suragnair/alpha-zero-general
The improvements in the code result in a 10x training speedup. As a result, the model can also be trained on lower-end hardware; I used a GTX 1050 Ti to obtain the results shown in the report. Additional improvements come from a different training strategy, which accelerated policy learning compared to traditional methods.