This is the implementation of a reinforcement learning agent I developed that learns to play Reversi. It is based on UCT (Upper Confidence bounds applied to Trees), a variant of Monte Carlo Tree Search that uses the UCB algorithm for the K-armed bandit problem to decide which move to explore at each node of the search tree. To run the agent, call the function uct in MATLAB or Octave.
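To illustrate the bandit rule that UCT applies at each tree node, here is a minimal sketch of UCB1 selection. It is not taken from this repository: the function name ucb1_select, its arguments, and the exploration constant c are assumptions made for the example, and the actual uct function may organize this step differently. Save it as ucb1_select.m to try it.

```matlab
% Hypothetical helper (not part of this repository): UCB1 selection over
% the children of a single tree node.
%   wins   - row vector, total reward accumulated by each child
%   visits - row vector, number of times each child has been tried
%   c      - exploration constant (assumed; sqrt(2) is the usual UCB1 value)
function idx = ucb1_select(wins, visits, c)
  % Try every child once before comparing scores, which also avoids
  % dividing by zero below.
  untried = find(visits == 0, 1);
  if ~isempty(untried)
    idx = untried;
    return;
  end

  total = sum(visits);                      % visits to the parent node
  mean_reward = wins ./ visits;             % exploitation term
  bonus = c * sqrt(log(total) ./ visits);   % exploration term, larger for rarely tried children
  [~, idx] = max(mean_reward + bonus);      % child with the highest upper confidence bound
end
```

For example, ucb1_select([6 3 1], [10 5 2], sqrt(2)) returns 3: the third child has the lowest average reward but also the fewest visits, so its exploration bonus dominates. With c set to 0 the rule becomes purely greedy and picks the child with the best average reward.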
If you have any questions regarding this implementation or the agent, feel free to contact me at [email protected]