This thesis presents the development of an educational tool for explaining and demonstrating reinforcement learning at the Faculty of Electrical Engineering, University of Ljubljana. Reinforcement learning is a subfield of artificial intelligence that has become increasingly popular in recent years in many areas, including robotics. We therefore set out to familiarise future generations of students with the basic concepts, approaches and algorithms of reinforcement learning. We also wanted to demonstrate the wide applicability of reinforcement learning through a range of examples.
The first part of the thesis gives a theoretical explanation of reinforcement learning. We define the core concepts: agent, policy, action, environment, reward and state. Next, we describe policy-based, value-based and actor-critic methods. Lastly, we present some of the basic reinforcement learning algorithms and how they operate.
The second part of the thesis describes the implementation of reinforcement learning on simulated examples. We chose the most representative reinforcement learning algorithms (SARSA, Q-learning) and implemented them to solve some of the classic artificial intelligence tasks. We compared the different approaches and presented the results. As the difficulty of the tasks increased, the basic algorithms were no longer sufficient, so we needed more advanced reinforcement learning algorithms. For these we used the existing implementations in the Stable Baselines3 library. We aimed to demonstrate the differences between the approaches by comparing their results on the same task.
The most demanding part of the thesis was the implementation of reinforcement learning for different robotic applications. We successfully carried out various tasks on the Franka Emika Panda robotic manipulator using models created in simulation. We started with the most basic task of moving the end effector of the manipulator to a randomly generated target. Next, we upgraded the system so that it could play simulated air hockey against an opponent. The last and most challenging task was to teach the robot to score a goal with a real ball. Additionally, the robot was able to push differently shaped objects towards the goal.
Within the scope of this thesis, we successfully developed an educational tool for demonstrating different reinforcement learning algorithms in various simulation environments. Furthermore, we confirmed that reinforcement learning can be implemented for different robotic tasks. The ability to use simulation-trained models on real robots gives future users a wide spectrum of possible uses.