Applying RL to play tic-tac-toe
This work builds upon this source here
note: this is no longer an install-and-play product because it has been extended. In retrospect, I should have created a branch :)
- Added player 2 policy
- Improved human vs computer
- Improved logging
- Performed action space analysis
- Add the ability to train on top of previous policies
- Make training on top of previous policies more user-friendly
- Pass command-line arguments
- Extend to 4x4? -- this endeavour has revealed many problems
- Implement other variants?
note: this is based on policy 582023_19237
note: information is extracted from ./analysis_3by3/games.json
This position was played 1549 times!
p1: x p2: o
x|o|x
-+-+-
o|x|x
-+-+-
o|x|o
This position was played 1490 times!
p1: x p2: o
x|x|o
-+-+-
o|x|x
-+-+-
x|o|o
This position was played 1483 times!
p1: x p2: o
x|o|x
-+-+-
x|x|o
-+-+-
o|x|o
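
A minimal sketch of how a tally like the one above could be produced. It assumes each final position is already available as a flat 9-character string read row by row (for example `xoxoxxoxo`). That is an assumption for illustration only: the actual layout of ./analysis_3by3/games.json is not shown here, so the sketch does not read the file. The helper names `render` and `most_played` are hypothetical.

```python
from collections import Counter


def render(board):
    """Render a 9-character board string as a 3x3 grid.

    Assumes the cells are listed row by row (hypothetical encoding).
    """
    rows = ["|".join(board[i:i + 3]) for i in range(0, 9, 3)]
    return "\n-+-+-\n".join(rows)


def most_played(final_positions, top=3):
    """Return the `top` most frequent positions as (board, count) pairs."""
    return Counter(final_positions).most_common(top)


# Toy data standing in for the recorded games (not real results).
games = ["xoxoxxoxo", "xoxoxxoxo", "xxooxxxoo"]

for board, count in most_played(games):
    print(f"This position was played {count} times!")
    print("p1: x p2: o")
    print(render(board))
```

`Counter.most_common` sorts by count in descending order, so the most frequently reached positions are printed first.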