This master's thesis presents the development of a software framework for learning robot strategies with deep reinforcement learning. In robotics today, the autonomy of robots is in most cases limited by classical control methods, while the goal is to bring robots closer to human efficiency and adaptability, especially in complex and dynamic environments. This has opened a new field of research in which reinforcement learning is proving to be a promising approach. In our work, we first explored the use of reinforcement learning for non-prehensile manipulation of objects, specifically by pushing them. In the second part, we extended our software framework with a simulation environment for learning robot strategies with deep reinforcement learning on a model of the Franka Emika Panda robot, with the aim of transferring the learned strategies to a real robot.
The first part covered the review and selection of suitable open-source tools for applying deep reinforcement learning algorithms. We selected the Stable-Baselines3 (SB3) library of implemented deep reinforcement learning algorithms, the PyBox2D and MuJoCo libraries for building the physics simulations, and the Gymnasium API to connect the selected tools. We then built the physics simulation environments in the selected libraries and trained the pushing tasks in simulation with deep reinforcement learning. At the end of the first part, we transferred the learned models to a real robot and tested the performance of the whole system.
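To illustrate how these tools fit together, the following is a minimal sketch of a custom Gymnasium environment wrapping a physics simulation and trained with an SB3 algorithm. The environment name, observation/action layout, and reward below are hypothetical placeholders, not the thesis's actual pushing task.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import DQN


class PushingEnv(gym.Env):
    """Toy stand-in for a PyBox2D/MuJoCo pushing simulation (hypothetical)."""

    def __init__(self):
        # Observation: planar object and goal positions (assumed layout).
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        # Four discrete push directions, since DQN requires a discrete action space.
        self.action_space = spaces.Discrete(4)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(-1.0, 1.0, size=4).astype(np.float32)
        return self.state, {}

    def step(self, action):
        # A real implementation would advance the physics engine here.
        directions = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], dtype=np.float32)
        self.state[:2] = np.clip(self.state[:2] + 0.05 * directions[action], -1.0, 1.0)
        dist = float(np.linalg.norm(self.state[:2] - self.state[2:]))
        terminated = dist < 0.05  # object close enough to the goal
        reward = 1.0 if terminated else -dist
        return self.state, reward, terminated, False, {}


model = DQN("MlpPolicy", PushingEnv(), verbose=0)
model.learn(total_timesteps=10_000)
```

The same wiring pattern applies regardless of the physics backend: the Gymnasium interface isolates the learning algorithm from whether PyBox2D or MuJoCo steps the simulation.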
In the second part, we created a simulation environment with a high-quality model of the Franka Emika Panda robot. In this environment, we used deep reinforcement learning to teach the robot to pick up an object while avoiding an obstacle. We also implemented velocity control of the robot in its joint coordinates. We then transferred the learned model from simulation to the real robot and tested its performance on the real system.
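A hedged sketch of this setup is shown below: TQC (from the sb3_contrib package) acting on continuous joint-velocity commands. The 7-DoF dimensions match the Panda, but the environment, its reward, and the approximate joint limits are illustrative assumptions, not the thesis's actual MuJoCo scene.

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from sb3_contrib import TQC


class PandaVelocityEnv(gym.Env):
    """Placeholder for a MuJoCo Panda scene with joint-velocity control (hypothetical)."""

    def __init__(self):
        # Observation: 7 joint angles plus 7 joint velocities (assumed layout).
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(14,), dtype=np.float32)
        # Action: a velocity command for each of the 7 joints.
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(7,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.q = np.zeros(7, dtype=np.float32)   # joint angles
        self.dq = np.zeros(7, dtype=np.float32)  # joint velocities
        return np.concatenate([self.q, self.dq]), {}

    def step(self, action):
        # Simple velocity integration; a real environment would step MuJoCo instead.
        self.dq = action.astype(np.float32)
        self.q = np.clip(self.q + 0.01 * self.dq, -2.9, 2.9)  # rough Panda joint range
        obs = np.concatenate([self.q, self.dq])
        # Hypothetical reward: stay near a target joint configuration.
        reward = -float(np.linalg.norm(self.q - 0.5))
        return obs, reward, False, False, {}


model = TQC("MlpPolicy", PandaVelocityEnv(), verbose=0)
model.learn(total_timesteps=10_000)
```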
Finally, we present the results of comparing the selected deep reinforcement learning algorithms, DQN (Deep Q-Network) and TQC (Truncated Quantile Critics); of comparing the physics simulation environments PyBox2D and MuJoCo; the success rates of the learned models on the pushing tasks, both in simulation and on the real robot; and the results of the object-picking task with our robot model. We also discuss the obtained results and the challenges encountered during the work, and propose possible solutions and directions for future work.