Uporaba drevesnega preiskovanja Monte Carlo in strojnega učenja za učenje hevristične funkcije (Using Monte Carlo tree search and machine learning to learn a heuristic function)

FRLIC, KARIN

FRLIC, KARIN (Author), Sadikov, Aleksander (Mentor)

PDF - Presentation file (669.40 KB)
MD5: 0D99AF792B82A77F62229D403ACAF2EE

Abstract

The minimax algorithm is one of the most widely used algorithms for playing two-player games. It relies on a heuristic function that estimates how beneficial reaching a given game state is for a particular player. In this thesis we attempt to create such a function for the game of Hex automatically, using various supervised machine learning models. The training examples are obtained from a large number of games simulated by Monte Carlo tree search (MCTS). We find that a player that selects moves using minimax with α-β pruning and the learned function is weaker than a player that uses MCTS alone. However, we also find that a player combining the strengths of both of these players plays better than MCTS.
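To make the role of the heuristic function concrete, the following is a minimal sketch of depth-limited minimax with α-β pruning in which the evaluation function is passed in as a parameter. It is not the thesis's implementation: the toy stone-taking game stands in for Hex, and the names `alphabeta`, `best_move`, `legal_moves` and `placeholder_heuristic` are hypothetical. In the setting the abstract describes, the placeholder would be replaced by a model trained on positions generated by MCTS.

```python
from typing import Callable, List, Optional

# Assumption: a toy game stands in for Hex. A pile of stones is shared,
# each player removes 1-3 stones per turn, and whoever takes the last
# stone wins. A state is simply the number of stones left.
State = int
Heuristic = Callable[[State, bool], float]


def legal_moves(state: State) -> List[int]:
    """Return the numbers of stones that may be removed from this pile."""
    return [m for m in (1, 2, 3) if m <= state]


def alphabeta(state: State, depth: int, alpha: float, beta: float,
              maximizing: bool, heuristic: Heuristic) -> float:
    """Depth-limited minimax with alpha-beta pruning.

    All values are from the maximizing player's point of view.
    """
    if state == 0:
        # The player who just moved took the last stone and won.
        return -1.0 if maximizing else 1.0
    if depth == 0:
        # Search horizon reached: fall back on the heuristic estimate.
        return heuristic(state, maximizing)
    if maximizing:
        value = float("-inf")
        for m in legal_moves(state):
            value = max(value, alphabeta(state - m, depth - 1,
                                         alpha, beta, False, heuristic))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will avoid this branch
        return value
    value = float("inf")
    for m in legal_moves(state):
        value = min(value, alphabeta(state - m, depth - 1,
                                     alpha, beta, True, heuristic))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player will avoid this branch
    return value


def best_move(state: State, depth: int, heuristic: Heuristic) -> Optional[int]:
    """Pick the move with the highest minimax value for the player to move."""
    best, best_value = None, float("-inf")
    for m in legal_moves(state):
        v = alphabeta(state - m, depth - 1, float("-inf"), float("inf"),
                      False, heuristic)
        if v > best_value:
            best, best_value = m, v
    return best


def placeholder_heuristic(state: State, maximizing: bool) -> float:
    """Hypothetical stand-in for a learned model: a neutral estimate."""
    return 0.0
```

Calling `best_move(5, 6, placeholder_heuristic)` searches the whole toy game tree, so the heuristic is never consulted. With a smaller `depth` the search stops early and the quality of play depends on the heuristic, which is the situation a learned evaluation function is meant to address.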

Language:	Slovenian
Keywords:	Monte Carlo tree search, supervised machine learning, minimax algorithm, heuristic evaluation function, alpha-beta pruning, the game of Hex
Work type:	Bachelor thesis/paper
Organization:	FRI - Faculty of Computer and Information Science
Year:	2019
PID:	20.500.12556/RUL-106123
Publication date in RUL:	30.01.2019

Secondary language

Language:	English
Title:	Using Monte Carlo tree search and machine learning to learn a heuristic function
Abstract:	The minimax algorithm is one of the most widely used algorithms for playing two-player games. It uses a heuristic function that estimates the benefit of reaching a given game state for each player. In this bachelor thesis we attempt to construct such a function automatically for the game of Hex. Different supervised machine learning models are trained on learning samples generated by MCTS simulations. We find that the player that uses minimax with α-β pruning and the learnt function performs worse than the player that uses pure MCTS. However, a player combining the advantages of both players achieves better results than MCTS.
Keywords:	Monte Carlo tree search, supervised machine learning, minimax algorithm, heuristic evaluation function, alpha-beta pruning, the game of Hex
