Space recognition is a computer vision problem with many practical applications. Advances in the field of mobile robotics will most likely increase the need for efficient and accurate scene recognition systems. Recently, room classification methods have reached high classification accuracy by using convolutional neural networks trained on large datasets, but most of these methods rely on holistic classification, which assigns a single label to the whole image. Their disadvantage becomes apparent when they are presented with an image that shows multiple places. In this thesis we present a method that addresses this disadvantage by using semantic segmentation. We focus on recognizing the 8 most common indoor place categories. We adapted and improved an existing dataset to suit the problem and used it to build and train three convolutional neural networks with different numbers of fully-connected layers. We evaluated their segmentation accuracy with the mean intersection-over-union measure and their detection accuracy with the F-measure, then compared the results with those of an existing holistic classification network that achieves state-of-the-art results on image-level classification. We also give a qualitative analysis of the trained networks' results. The results show that our method outperforms the current state-of-the-art method by almost 40\% on the task of place localization and by 20\% on the task of place recognition.