Drevesno preiskovanje Monte Carlo s konvolucijsko nevronsko mrežo za igranje Gomoku

Chen, Qichao

Chen, Qichao (Author), Šter, Branko (Mentor)

PDF - Presentation file, Download (765,85 KB)
MD5: 73A4A63577205934DD8598844E552659

Abstract

The goal of this thesis was to build an intelligent agent for the game of Gomoku using Monte Carlo Tree Search (MCTS) and a neural network. We followed the approach of the Alpha Zero agent, whose algorithm combines Monte Carlo Tree Search with a convolutional neural network. Like AlphaGo Zero, our agent learned without any prior knowledge of Gomoku beyond the rules of the game, and it learned through self-play. After 1500 self-play games it defeated a computer player that used MCTS alone. In evaluation against human players it achieved satisfactory results: it is difficult for a human to beat. In particular, the agent is very good at blocking and at recognizing the typical threats that human players use to win.

Language:	Slovenian
Keywords:	Monte Carlo tree search, convolutional neural network, AlphaGo Zero, Alpha Zero, Gomoku
Work type:	Bachelor thesis/paper
Organization:	FRI - Faculty of Computer and Information Science
Year:	2019
PID:	20.500.12556/RUL-110566
COBISS.SI-ID:	1538373059
Publication date in RUL:	17.09.2019
Views:	1327
Downloads:	181

Secondary language

Abstract:
Language:	English
Title:	Monte Carlo Tree Search with a convolutional neural network for playing Gomoku
The goal of the thesis was to build an intelligent agent for the game of Gomoku using Monte Carlo Tree Search (MCTS) and deep neural networks. We used the Alpha Zero approach, which combines Monte Carlo Tree Search with a convolutional neural network. Just like Alpha Zero, our agent was trained solely through self-play, without any human knowledge of the game; it was given only the rules. After 1500 games of self-play it defeated a computer player built with pure MCTS. It also achieved satisfactory results against human players, who find it hard to beat. In particular, the agent can identify the typical threats that human players use to win.
Keywords:	Monte Carlo tree search, convolutional neural network, AlphaGo Zero, Alpha Zero, Gomoku
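
The abstract describes combining MCTS with a convolutional network in the Alpha Zero style. As a rough illustration of how a network's move probabilities can steer the tree search, the sketch below shows a PUCT-style child selection step. It is not taken from the thesis: the names `Node`, `select_child` and `c_puct`, the constant 1.5, and the `+ 1` under the square root (a common implementation variant that lets the priors decide before any visits exist) are all assumptions made for this example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One edge/state in the search tree (hypothetical structure)."""
    prior: float              # P(s, a): move probability from the network's policy output
    visit_count: int = 0      # N(s, a): how often this move was explored
    value_sum: float = 0.0    # W(s, a): sum of backed-up values for this move
    children: dict = field(default_factory=dict)  # move -> Node

    def q_value(self) -> float:
        # Mean value Q(s, a); assumed to be from the viewpoint of the
        # player choosing at the parent. Unvisited moves default to 0.
        return self.value_sum / self.visit_count if self.visit_count else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (move, child) pair with the highest Q + U score."""
    total_visits = sum(child.visit_count for child in node.children.values())

    def score(child: Node) -> float:
        # Exploration bonus U: large for moves the network likes (high prior)
        # and that have been visited rarely so far.
        exploration = (
            c_puct * child.prior * math.sqrt(total_visits + 1) / (1 + child.visit_count)
        )
        return child.q_value() + exploration

    return max(node.children.items(), key=lambda item: score(item[1]))


# Usage: three candidate moves on a Gomoku board, none visited yet.
root = Node(prior=1.0)
root.children = {
    (7, 7): Node(prior=0.6),
    (7, 8): Node(prior=0.3),
    (0, 0): Node(prior=0.1),
}
best_move, _ = select_child(root)
```

With no visits recorded, the move with the highest prior is chosen first; as visit counts and backed-up values accumulate, the mean value term increasingly dominates the choice.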
