izpis_h1_title_alt

Uporaba drevesnega preiskovanja Monte Carlo in strojnega učenja za učenje hevristične funkcije
ID FRLIC, KARIN (Author), ID Sadikov, Aleksander (Mentor) More about this mentor... This link opens in a new window

.pdfPDF - Presentation file, Download (669,40 KB)
MD5: 0D99AF792B82A77F62229D403ACAF2EE

Abstract
Algoritem minimaks je eden najbolj razširjenih algoritmov za igranje iger med dvema igralcema. Pri tem se uporablja hevristična funkcija, ki ocenjuje, kako koristno je doseči neko stanje v igri za posameznega igralca. V diplomskem delu poskusimo tako funkcijo za igranje igre Hex ustvariti avtomatsko z uporabo različnih modelov nadzorovanega strojnega učenja. Učne primere za strojno učenje pridobimo s številnimi odigranimi igrami, ki jih simulira MCTS. Ugotovimo, da je igralec, ki za izbiro potez uporablja algoritem minimaks z α-β in naučeno funkcijo, slabši od igralca, ki igra samo z MCTS. Odkrijemo pa, da igralec, ki združi prednosti obeh omenjenih igralcev, igra bolje od MCTS.

Language:Slovenian
Keywords:drevesno preiskovanje Monte Carlo, nadzorovano strojno učenje, algoritem minimaks, hevristična ocenjevalna funkcija, rezanje alfabeta, igra Hex
Work type:Bachelor thesis/paper
Organization:FRI - Faculty of Computer and Information Science
Year:2019
PID:20.500.12556/RUL-106123 This link opens in a new window
Publication date in RUL:30.01.2019
Views:1141
Downloads:267
Metadata:XML DC-XML DC-RDF
:
Copy citation
Share:Bookmark and Share

Secondary language

Language:English
Title:Using Monte Carlo tree search and machine learning to learn a heuristic function
Abstract:
Minimax algorithm is one of the most widely used algorithms for playing two-player games. It uses a heuristic function that estimates the benefits of reaching a given game state for both players. In this bachelor thesis we attempt to automatically construct that kind of a function for the game of Hex. Different models of supervised machine learning are trained on learning samples, generated by simulations of MCTS. As a result, the player that uses minimax with α-β and the learnt function performs worse than the player that uses pure MCTS. However, the player combining advantages of both players achieves better results than MCTS.

Keywords:Monte Carlo tree search, supervised machine learning, minimax algorithm, heuristic evaluation function, alpha-beta pruning, the game of Hex

Similar documents

Similar works from RUL:
Similar works from other Slovenian collections:

Back