In this master’s thesis, we investigate the use of the Fast Point Transformer (FPT) segmentation architecture for recognizing chess pieces in 3D point clouds. The problem we address is the automated recognition and localization of small, visually similar objects in space using a depth camera.
In our approach, we mounted a RealSense D435 camera on an HC10 robotic arm to capture the position and color of points, which were used to train two segmentation models based on the FPT architecture. The process includes automatic data acquisition and annotation, which makes model training efficient, as well as applying the model to localize the chess pieces.
The results of our work indicate that the FPT method is effective in recognizing and localizing chess pieces with low error. In our experiment, we trained two separate models. The first model, which distinguishes between background and chess piece points, correctly labeled 80.81% of chess piece points, while the second, which classifies points among individual pieces, correctly labeled 97.35% of points on average. Each model was trained, validated, and tested on 3,545 distinct point cloud images, totaling 9.15 GB. The maximum measured positioning error during robotic manipulation was 1 cm, which was small enough for reliable handling by our Zimmer gripper. The method is suitable for industrial use, offering flexibility when switching between different object types.