Globoko spodbujevalno učenje robotskih strategij
Udir, Ana (Author), Mihelj, Matjaž (Mentor), Podobnik, Janez (Co-mentor)

PDF - Presentation file (4.32 MB)
MD5: 0C5CCF26BD7618B67E72558684661C85

Abstract
This master's thesis covers the development of a teaching aid for explaining and learning reinforcement learning, intended for future use at the Faculty of Electrical Engineering, University of Ljubljana. Reinforcement learning is a subfield of artificial intelligence that has become increasingly popular in recent years in many areas, including robotics. We therefore decided to introduce future generations of robotics students to its basic concepts, approaches and algorithms, and to show through examples how generally reinforcement learning can be applied. The first part of the thesis explains the theory of reinforcement learning. It presents concepts such as agent, policy, action, environment, reward and state. This is followed by a description of policy-based methods, value-based methods and actor-critic methods. The reader is introduced to the basic reinforcement learning algorithms and how they work. The second part of the thesis is the practical, simulation-based part. We selected some of the most representative reinforcement learning algorithms (SARSA, Q-learning) and implemented them on classic reinforcement learning problems. We demonstrated how well the individual algorithms perform and compared them with one another. We found that the simpler algorithms become inadequate as problem difficulty increases. We therefore used more advanced, existing algorithms from the Stable Baselines3 library. By comparing results on the same examples, we demonstrated the differences in how the algorithms behave. The most extensive part of the thesis was the implementation of reinforcement learning for robotic applications. Using models built in a simulation environment, we successfully taught the Franka Emika Panda robotic manipulator various tasks. We began with the simple task of moving to a target. After this succeeded, we upgraded the system so that the robot played simulated air hockey against an opponent. The final application was a shot on goal, that is, pushing a randomly placed object into a target. We produced an instructive and widely applicable study aid for demonstrating how different reinforcement learning algorithms work in a simulation environment. In addition, we showed that reinforcement learning algorithms can be implemented successfully for real robotic applications. The ability to use a model trained in simulation for applications on robotic manipulators opens up unlimited possibilities for further development.

Language: Slovenian
Keywords: reinforcement learning, reward, Q-value, neural network, robot
Work type: Master's thesis/paper
Organization: FE - Faculty of Electrical Engineering
Year: 2022
PID: 20.500.12556/RUL-140003
COBISS.SI-ID: 121228035
Publication date in RUL: 09.09.2022

Secondary language

Language: English
Title: Deep reinforcement learning of robotic strategies
Abstract:
This thesis covers the development of an educational tool for explaining and demonstrating reinforcement learning at the Faculty of Electrical Engineering, University of Ljubljana. Reinforcement learning is a subfield of artificial intelligence that has become increasingly popular in recent years in many areas, including robotics. We therefore decided to familiarise future generations of students with the basic concepts, approaches and algorithms of reinforcement learning. We also wanted to demonstrate, through different examples, how widely reinforcement learning can be applied. The first part of the thesis gives a theoretical explanation of reinforcement learning. We define concepts such as agent, policy, action, environment, reward and state. Next, policy-based methods, value-based methods and actor-critic methods are described. Lastly, we present some of the basic reinforcement learning algorithms and how they operate. The second part of the thesis describes the implementation of reinforcement learning on examples in simulation. We chose some of the most representative reinforcement learning algorithms (SARSA, Q-learning) and implemented them to solve several classic reinforcement learning tasks. We compared the different approaches and presented the results. As the difficulty of the tasks increased, the basic algorithms were no longer sufficient. We therefore turned to more advanced reinforcement learning algorithms, using the existing implementations in the Stable Baselines3 library. We demonstrated the differences between the approaches by comparing their results on the same task. The most demanding part of the thesis was the implementation of reinforcement learning for different robotic applications. We successfully carried out various tasks on the Franka Emika Panda robotic manipulator using models created in simulation. We started with the most basic task of moving the end effector of the robotic manipulator to a randomly generated target. Next, we upgraded the system so that it could play simulated air hockey against an opponent. The last and most challenging task was to teach the robot to score a goal using a real ball. Additionally, the robot was able to push differently shaped objects towards the goal. Within the scope of this thesis, we successfully developed an educational tool for demonstrating different reinforcement learning algorithms in various simulation environments. Furthermore, we validated that reinforcement learning can be implemented for different robotic tasks. The ability to use models trained in simulation on real robots gives future users a wide spectrum of possible uses.
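
The abstract names SARSA and Q-learning as the representative tabular algorithms. As a rough illustration of how the two update rules differ, the following minimal sketch trains both on a hypothetical five-state chain. It is not taken from the thesis: the environment, the hyperparameters and all function names (`step`, `choose`, `train`, `greedy_policy`) are assumptions made for this example only.

```python
import random

N_STATES = 5                 # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)             # 0 = move left, 1 = move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # assumed learning rate, discount, exploration


def step(state, action):
    """Deterministic chain: reward 1.0 only when the last state is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose(q, state, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(method, episodes=500, seed=0):
    """Tabular training; method is "sarsa" or anything else for Q-learning."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        action = choose(q, state, rng)
        while not done:
            nxt, reward, done = step(state, action)
            if method == "sarsa":
                # On-policy: bootstrap from the action the agent will really take.
                nxt_action = choose(q, nxt, rng)
                bootstrap = q[nxt][nxt_action]
            else:
                # Off-policy (Q-learning): bootstrap from the greedy action value.
                bootstrap = max(q[nxt])
            if done:
                bootstrap = 0.0  # no future return after the terminal state
            target = reward + GAMMA * bootstrap
            q[state][action] += ALPHA * (target - q[state][action])
            if method != "sarsa":
                # Q-learning picks its behaviour action after the update.
                nxt_action = choose(q, nxt, rng)
            state, action = nxt, nxt_action
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `train("sarsa")` and `train("q_learning")` returns one Q-table each, and `greedy_policy` reads off the learned action per state. The only difference between the two branches is the bootstrap term: SARSA uses the value of the action actually selected next, while Q-learning uses the maximum over actions.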

Keywords: reinforcement learning, reward, Q-value, neural network, robot
