Details

Spodbujevano učenje avtonomnih agentov v videoigri
Ciglar, Miha (Author), Sadikov, Aleksander (Mentor)

PDF - Presentation file (1,13 MB)
MD5: 6BA6E9AB4347DEB58F36281872DDF90B

Abstract
In this thesis we study the use of reinforcement learning to train agents in a 2D game that extends the well-known game Flappy Bird with items, an inventory, projectiles, and enemies. For training we choose the proximal policy optimization (PPO) algorithm, present its theoretical foundations, and justify its choice for dynamic, rapidly changing environments. The practical part covers the preparation of the training environment, observations and actions, the reward system, the neural network, and hyperparameter settings. Training proceeds gradually in phases, with increasing difficulty for the agent. The results confirm the success of the approach, showing stable learning and sensible behaviour, and highlight the challenges of applying reinforcement learning in fast, dynamic games.

Language:Slovenian
Keywords:reinforcement learning, proximal policy optimization (PPO), autonomous agents, artificial intelligence in video games
Work type:Bachelor thesis/paper
Typology:2.11 - Undergraduate Thesis
Organization:FRI - Faculty of Computer and Information Science
Year:2025
PID:20.500.12556/RUL-173314
COBISS.SI-ID:253380611
Publication date in RUL:15.09.2025
Views:182
Downloads:41

Secondary language

Language:English
Title:Reinforcement learning for autonomous agents in a video game
Abstract:
This thesis investigates reinforcement learning for training autonomous agents in a custom 2D side-scrolling game that extends Flappy Bird with items, an inventory system, bullets, and enemy agents. We employ proximal policy optimization (PPO), reviewing its theoretical foundations and justifying its selection for dynamic, fast-paced gameplay environments. The practical part documents environment modelling, observation and action design, reward shaping, network architecture, and hyperparameter choices, together with a curriculum that incrementally introduces more demanding tasks for the player agent. Experiments demonstrate overall stable learning and meaningful in-game behaviour while highlighting the challenges of continual, multi-objective training in fast real-time games.

Keywords:reinforcement learning, proximal policy optimization (PPO), autonomous agents, video game AI
