AlphaZero is regarded as the most successful algorithm for the board game Go. It has also achieved excellent results in Japanese chess (shogi) and classical chess. Because classic games are already well-researched domains, we decided to investigate an implementation for the board game Risk. Risk depends heavily on dice rolls, so comparing different game-playing agents requires simulating a large number of games. This is the main reason we put great emphasis on the speed of the implementation. To meet this requirement, we chose the C++ programming language and the TensorFlow framework. Our implementation is based on articles published by DeepMind and on the open-source project alpha-zero-general. We extended it with multi-threaded execution and with the generation of training samples during model comparison. We trained several neural networks with different configurations. We first trained the networks on games played by programmed players, which gave good results. With the self-play approach we achieved only average results, because we did not have a sufficient number of games. The entire implementation is available online.