
Posplošitev problema vozička s palico na zahtevnejše domene
Svete, Anej (Author), Bratko, Ivan (Mentor)

PDF - Presentation file (1.09 MB)
MD5: 71F85B83B864CFE9AC0FFD7C1523BE20

Abstract
The cart-pole problem is often used to test the performance of controllers. Several variants exist and pose a range of challenges, but their simplicity means they fail to capture some real-world situations. In this thesis we therefore propose two extensions: one combines the problem with the mountain car problem, and the other is a task in which the cart must pass under an obstacle. We derive the equations of the system dynamics and present an implementation. We describe the challenges the extensions raise and review related work, which we draw on to define and solve the problems. Since the related work does not offer a standard approach, we combine existing ideas into a system that also provides suitable starting points for further work. On two known variants of the problem and on our extensions, we train an agent with deep reinforcement learning, then describe and interpret the results. These show that the extensions are considerably more demanding than the standard problems and contain challenges that reinforcement learning struggles with. While learning on the known variants is fast and stable, on the extensions it takes considerably longer and is less reliable. Because the task requires risky behaviour from the agent, the agent often learns strategies that do not accomplish the task but offer a safer way of balancing the pole. Nevertheless, we find expected and intelligent patterns in the learned behaviour that make the agent effective. In addition to defining the domains and describing how they are solved, the thesis offers new challenges and opportunities for further work.

Language: Slovenian
Keywords: artificial intelligence, machine learning, reinforcement learning, deep learning, control theory, test problem, cart pole
Work type: Bachelor thesis/paper
Typology: 2.11 - Undergraduate Thesis
Organization: FRI - Faculty of Computer and Information Science; FMF - Faculty of Mathematics and Physics
Year: 2020
PID: 20.500.12556/RUL-118155
COBISS.SI-ID: 32853763
Publication date in RUL: 24.08.2020
Views: 1189
Downloads: 293

Secondary language

Language: English
Title: Generalization of the cart pole problem to more difficult domains
Abstract:
The cart-pole problem is often used to test the performance of controllers. Multiple variants exist and present a series of challenges, but their simplicity fails to capture some realistic settings. In this thesis we therefore propose two extensions of the problem: one merges the original problem with the mountain car problem, and the other is an environment in which the cart must move under an obstacle. We derive the equations of the cart-pole dynamics on an uneven surface and present our implementation. We also describe the challenges the extensions introduce and survey similar work. Since the literature does not offer a standard approach, we combine existing ideas into a system that offers well-defined starting points for further work. We train an agent on two known variants of the problem and on our two extensions using deep reinforcement learning, and present the experimental results. These show that the extensions are considerably more demanding and expose challenges that are difficult for reinforcement learning. While learning on the standard tasks is relatively fast and stable, it is much slower and less reliable on the extensions. The risky behaviour the task demands often leads to strategies that do not accomplish the task but offer a safer way of balancing the pole. Nevertheless, we find anticipated and intelligent patterns in the agent's behaviour. The work also opens up many possibilities for further work.

Keywords: artificial intelligence, machine learning, reinforcement learning, deep learning, control theory, benchmark problem, cart pole
