<?xml version="1.0"?>
<metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/"><dc:title>Motion prediction in human-robot collaboration using deep recurrent neural networks</dc:title><dc:creator>MAVSAR, MATIJA (Author)
	</dc:creator><dc:creator>Ude, Aleš (Mentor)
	</dc:creator><dc:subject>deep learning</dc:subject><dc:subject>recurrent neural networks</dc:subject><dc:subject>motion prediction</dc:subject><dc:subject>human-robot collaboration</dc:subject><dc:subject>adaptive robot control</dc:subject><dc:subject>motion recognition</dc:subject><dc:subject>dynamic movement primitives</dc:subject><dc:description>Collaboration between humans and robots has become increasingly popular in the last decade, since it enables complex tasks to be performed while relieving human workers of stressful and demanding labor. Safe and efficient human-robot cooperation requires an effective system for supervision and control of the collaborative workspace. A number of deep neural network architectures have been developed that are suitable for analysis and prediction of dynamic processes, which often occur in collaborative environments. Furthermore, data augmentation methods, such as simulation, data randomization and synthetic data generation, can further improve the performance of motion prediction systems by incorporating diverse information into the training data. This dissertation proposes several approaches to robot and human motion prediction in collaborative tasks, for which various neural network architectures and data augmentation methods are designed and tested.

The first part of the dissertation focuses on optimizing task fluency by designing a collaboration supervision system that comprises automatic motion detection and motion classification, where the latter is used to categorize observed human motions during human-robot collaboration (HRC) tasks. First, a recurrent neural network system for human motion classification from RGB-D videos is compared to a system that makes predictions based on input marker positions, showing that classification accuracy is comparable for both input types. Second, a transformer network, an architecture that has recently gained popularity, is employed and compared to a custom recurrent network, as well as to an adapted existing architecture used for action recognition in HRC. The proposed networks outperform the existing model, while the use of one-dimensional convolutional and pooling layers further increases the accuracy of motion classification. Third-order dynamic movement primitives (DMPs) are proposed for describing robot tasks and for enabling smooth robot motion adaptation in real time whenever the motion classification networks compute new predictions of the observed motion. The developed methods are implemented in a real collaborative use case and result in more fluent and safer task sharing between a human and a robot.

In the second part, the use of generative adversarial networks (GANs) to supplement real data for motion classification tasks is explored. A GAN-based training methodology is introduced that uses a recurrent architecture to generate synthetic videos of robot and human motion during a collaborative task. The architecture is trained in a semi-supervised manner: the output classification networks predict one of the possible labels for the observed motion, while the recurrent generator networks produce synthetic RGB videos that are leveraged in the training process. Results show that using synthetic data during semi-supervised training increases the accuracy and generalization capability of the trained motion classification models.

In the final part of the dissertation, recurrent architectures for motion prediction during object handover tasks are presented. First, an end-to-end recurrent neural network for predicting robot motion during a robot-to-robot object handover task is developed. The network processes input color-depth (RGB-D) videos of a giver robot passing an object to a receiving robot and outputs either the desired trajectory of the receiving robot or the predicted trajectory of the giver robot. This enables adaptive control of the receiving robot, which can start moving towards the predicted exchange location as soon as the giver begins its motion, resulting in a more fluent and dynamic handover process. Techniques are proposed for automatically generating highly randomized simulated robot motion videos and for augmenting data to increase the size of the training dataset, showing that mixing real and simulated data improves the accuracy of motion prediction. Second, a system for predicting the object handover location from input human hand trajectories is designed and implemented in a real human-to-robot handover experiment using a humanoid robot.

The methodologies developed in the dissertation have been tested and shown to achieve accurate motion prediction during cooperative tasks, enabling dynamic collaboration between humans and robots.</dc:description><dc:date>2024</dc:date><dc:date>2024-09-25 10:50:01</dc:date><dc:type>Doctoral dissertation</dc:type><dc:identifier>162577</dc:identifier><dc:identifier>VisID: 59888</dc:identifier><dc:identifier>COBISS_ID: 219480323</dc:identifier><dc:language>sl</dc:language></metadata>
