Implementacija AlphaZero za igro Risk (AlphaZero implementation for game Risk)

Gašperlin, Jan

Gašperlin, Jan (Author), Sadikov, Aleksander (Mentor), Možina, Martin (Co-mentor)

PDF - Presentation file (1.74 MB)
MD5: 618F16050A5B206C2B306ECB775765DD

Abstract

AlphaZero is considered the most successful algorithm for the board game Go. It has also achieved excellent results in shogi (Japanese chess) and classical chess. Because classical games are well-researched domains, we chose to investigate an implementation for the game Risk. A large part of the game depends on the randomness of dice rolls, so a large number of games must be played to compare the performance of game-playing agents reliably. This is the main reason for the strong emphasis on implementation speed. To meet these requirements, we chose the C++ programming language and the TensorFlow framework. Our implementation is based on the papers published by DeepMind and on the open-source project alpha-zero-general. We extended it with multithreaded execution and with the generation of training examples during model comparison. We trained several neural networks with different settings. Initially, we trained the networks on games played by programmed players, which gave good results. Training through self-play gave only average results because we did not have a sufficient number of games. The entire implementation is available online.

Language:	Slovenian
Keywords:	AlphaZero, Artificial intelligence, MCTS, Risk, C++, TensorFlow
Work type:	Master's thesis/paper
Typology:	2.09 - Master's Thesis
Organization:	FRI - Faculty of Computer and Information Science
Year:	2020
PID:	20.500.12556/RUL-122430
COBISS.SI-ID:	42171907
Publication date in RUL:	10.12.2020
Views:	1190
Downloads:	158

Secondary language

Abstract:
Language:	English
Title:	AlphaZero implementation for game Risk
AlphaZero is regarded as the most successful algorithm for the board game Go. It has also achieved excellent results in shogi (Japanese chess) and classical chess. Classical games are well-researched domains, which is why we decided to study an implementation for the board game Risk. The game depends heavily on the randomness of dice, so comparing different game-playing agents requires simulating a large number of games. This is the main reason for the strong emphasis on the speed of the implementation. To meet these requirements, we chose the C++ programming language and the TensorFlow framework. Our implementation is based on articles published by DeepMind and on the open-source solution alpha-zero-general. We extended the implementation with multithreaded execution and with the generation of training samples during model comparison. We trained several neural networks with different configurations. At first we trained the neural networks on games played by programmed players, where we achieved good results. With the self-play approach we achieved average results, since we did not have a sufficient number of games. The entire implementation is available online.
Keywords:	AlphaZero, Artificial intelligence, MCTS, Risk, C++, TensorFlow
