Uporaba drevesnega preiskovanja Monte Carlo in strojnega učenja za učenje hevristične funkcije
FRLIC, KARIN (Author), Sadikov, Aleksander (Mentor)

PDF - Presentation file (669,40 KB)
MD5: 0D99AF792B82A77F62229D403ACAF2EE

Abstract
Algoritem minimaks je eden najbolj razširjenih algoritmov za igranje iger med dvema igralcema. Pri tem se uporablja hevristična funkcija, ki ocenjuje, kako koristno je doseči neko stanje v igri za posameznega igralca. V diplomskem delu poskusimo tako funkcijo za igranje igre Hex ustvariti avtomatsko z uporabo različnih modelov nadzorovanega strojnega učenja. Učne primere za strojno učenje pridobimo s številnimi odigranimi igrami, ki jih simulira MCTS. Ugotovimo, da je igralec, ki za izbiro potez uporablja algoritem minimaks z α-β in naučeno funkcijo, slabši od igralca, ki igra samo z MCTS. Odkrijemo pa, da igralec, ki združi prednosti obeh omenjenih igralcev, igra bolje od MCTS.

Language:Slovenian
Keywords:drevesno preiskovanje Monte Carlo, nadzorovano strojno učenje, algoritem minimaks, hevristična ocenjevalna funkcija, rezanje alfabeta, igra Hex
Work type:Bachelor thesis/paper (mb11)
Organization:FRI - Faculty of computer and information science
Year:2019
Views:364
Downloads:196

Secondary language

Language:English
Title:Using Monte Carlo tree search and machine learning to learn a heuristic function
Abstract:
The minimax algorithm is one of the most widely used algorithms for playing two-player games. It relies on a heuristic function that estimates how beneficial reaching a given game state is for a given player. In this bachelor thesis we attempt to construct such a function for the game of Hex automatically, using several supervised machine learning models. The training examples are obtained from a large number of games simulated by Monte Carlo tree search (MCTS). We find that the player that selects moves using minimax with α-β pruning and the learned function performs worse than the player that uses MCTS alone. However, a player that combines the strengths of both players plays better than MCTS.
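
As an illustration of the technique named in the abstract, the following is a minimal sketch of depth-limited minimax with α-β pruning that calls a heuristic function at the search horizon. It is not taken from the thesis: the game is a toy subtraction game rather than Hex, the `heuristic` function is a hand-written stand-in for a learned model, and all names (`State`, `legal_moves`, `apply_move`, `alphabeta`, `best_move`) are hypothetical.

```python
import math
from typing import Callable, List, Tuple

# Toy game (an assumption, NOT Hex): players alternately remove 1-3 stones;
# whoever takes the last stone wins. A state is (stones_left, player_to_move),
# where player_to_move is +1 (maximizer) or -1 (minimizer).
State = Tuple[int, int]


def legal_moves(state: State) -> List[int]:
    stones, _ = state
    return [k for k in (1, 2, 3) if k <= stones]


def apply_move(state: State, move: int) -> State:
    stones, player = state
    return (stones - move, -player)


def is_terminal(state: State) -> bool:
    return state[0] == 0


def terminal_value(state: State) -> float:
    # The side to move has no stones left, so the previous mover won.
    # Values are always from the maximizer's (+1) point of view.
    return float(-state[1])


def heuristic(state: State) -> float:
    """Hypothetical stand-in for a learned evaluation function.

    In this toy game the side to move loses exactly when the number of
    stones is a multiple of 4; a trained model would replace this rule.
    """
    stones, player = state
    return -0.5 * player if stones % 4 == 0 else 0.5 * player


def alphabeta(state: State, depth: int, alpha: float, beta: float,
              evaluate: Callable[[State], float]) -> float:
    """Depth-limited minimax value of `state` with alpha-beta pruning."""
    if is_terminal(state):
        return terminal_value(state)
    if depth == 0:
        return evaluate(state)  # heuristic estimate at the search horizon
    _, player = state
    if player == 1:  # maximizing player
        value = -math.inf
        for move in legal_moves(state):
            child = apply_move(state, move)
            value = max(value, alphabeta(child, depth - 1, alpha, beta, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizer will never allow this branch
        return value
    value = math.inf  # minimizing player
    for move in legal_moves(state):
        child = apply_move(state, move)
        value = min(value, alphabeta(child, depth - 1, alpha, beta, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizer will never allow this branch
    return value


def best_move(state: State, depth: int,
              evaluate: Callable[[State], float]) -> int:
    """Pick the move whose subtree has the best value for the side to move."""
    _, player = state
    scored = [
        (alphabeta(apply_move(state, m), depth - 1, -math.inf, math.inf, evaluate), m)
        for m in legal_moves(state)
    ]
    return max(scored)[1] if player == 1 else min(scored)[1]
```

For example, `best_move((5, 1), depth=2, evaluate=heuristic)` selects the move that removes one stone, leaving the opponent with four. Passing the evaluation function as a parameter keeps the search independent of how the function was obtained, so a hand-written rule and a trained model can be swapped without changing the search code.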

Keywords:Monte Carlo tree search, supervised machine learning, minimax algorithm, heuristic evaluation function, alpha-beta pruning, the game of Hex
