Adapting AlphaZero for Three-Player Hexagonal Chess
Vasiljević, Jan (Author), Bajec, Marko (Mentor), Pirker, Johanna (Co-mentor), Sadikov, Aleksander (Co-mentor)

PDF - Presentation file (4.34 MB)
MD5: 39728A0621961FEA29F8783E12D29551

Abstract
This thesis adapts the AlphaZero framework for Three-Way Chess, a three-player variant defined by hexagonal geometry and complex coalition dynamics. To address the lack of software frameworks, a high-performance training ecosystem was developed for resource-constrained hardware. A transformer-based architecture incorporating relative positional embeddings was introduced to capture the board's unique spatial relationships. Methodological validation in Three-player Hex demonstrated that canonical input representations and geometric embeddings significantly enhance learning efficiency. In Three-Way Chess, the agent autonomously discovered advanced tactics but initially adopted a passive survivalist strategy to avoid drawing aggression. Refining the objective with material incentives corrected this behaviour, resulting in competitive performance against human opponents. These findings suggest that the efficacy of self-play in non-zero-sum multiplayer environments depends on the underlying game structure and may require additional fine-tuning.

Language: English
Keywords: multiplayer chess, AlphaZero, transformer, deep reinforcement learning, non-zero-sum games, game theory
Work type: Master's thesis/paper
Typology: 2.09 - Master's Thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2025
PID: 20.500.12556/RUL-177428
COBISS.SI-ID: 263051523
Publication date in RUL: 23.12.2025

Secondary language

Language: Slovenian
Title: Prilagoditev AlphaZero za trostranski heksagonalni šah (Adapting AlphaZero for Three-Way Hexagonal Chess)
Abstract (translated from Slovenian):
This master's thesis addresses the adaptation of the AlphaZero framework to a three-player variant of hexagonal chess defined by complex alliance dynamics. Because open-source solutions were lacking, a high-performance framework was developed for training on resource-constrained hardware. To model the specific geometry of the hexagonal board effectively, a transformer-based architecture incorporating relative positional embeddings was introduced. Methodological validation on three-player Hex showed that canonical input representations and geometric embeddings substantially improve learning efficiency. In three-player chess, the agent independently discovered advanced tactical motifs but initially learned a passive style of play to avoid becoming a target of attacks. Adjusting the reward scheme with material incentives improved this behaviour and led to competitive play against human opponents. The findings indicate that the effectiveness of self-play learning in non-zero-sum multiplayer environments depends on the underlying game structure and may require additional adjustments.

Keywords (translated from Slovenian): multiplayer chess, AlphaZero, transformer, deep reinforcement learning, non-zero-sum games, theory
