In this diploma thesis, a wearable prototype for the detection of hand gestures is implemented and evaluated. The prototype runs on the OAK-D embedded device of the DepthAI platform, which is capable of efficient image capture and image processing using various computer vision operations, including deep neural networks. Using a sequence of neural networks and intermediate operations, the prototype determines the position of the hand from a first-person (egocentric) viewpoint, tracks the hand, and, based on the temporal course of the hand position, determines the gesture. Almost all of this runs on the embedded device itself, which relieves the host system and enables low detection latency. For practical testing, music player control is implemented. For this purpose, a dataset of gestures was collected which, despite its limited scope, allows the system to learn to recognize the different gestures reliably. The system is experimentally evaluated on a test set, on which it achieves the desired accuracy. It also performs well in a real-world scenario, in which test users controlled music playback in real time.
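To make the on-device setup concrete, the following is a minimal sketch of a DepthAI pipeline (v2 Python API) in which the camera feed is wired directly into a neural network running on the OAK-D, so that only compact inference results cross to the host. The blob path, input resolution, and stream name are illustrative assumptions, not the thesis's actual models or pipeline; the prototype itself chains several networks and intermediate operations.

    # Minimal sketch of an on-device DepthAI pipeline (depthai v2 API).
    # Blob path and input size are illustrative assumptions, not the
    # thesis's actual models.
    import depthai as dai

    pipeline = dai.Pipeline()

    # Color camera node: frames are produced on the OAK-D itself.
    cam = pipeline.create(dai.node.ColorCamera)
    cam.setPreviewSize(128, 128)           # match the detector's input size
    cam.setInterleaved(False)

    # Neural network node: inference also runs on the device.
    nn = pipeline.create(dai.node.NeuralNetwork)
    nn.setBlobPath("palm_detection.blob")  # hypothetical model file
    cam.preview.link(nn.input)

    # Stream only the network output back to the host.
    xout = pipeline.create(dai.node.XLinkOut)
    xout.setStreamName("detections")
    nn.out.link(xout.input)

    with dai.Device(pipeline) as device:
        q = device.getOutputQueue("detections", maxSize=4, blocking=False)
        while True:
            result = q.get()               # host receives results, not frames
            print(result.getFirstLayerFp16()[:8])

Because camera and inference nodes live on the device and are linked directly, full frames never have to be transferred to the host, which is what keeps detection latency low in this kind of architecture.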