This master's thesis describes the training and testing of a convolutional neural network (CNN) for binary image classification on a selected dataset. The main goal of the thesis is to present and promote the Siemens Industrial Edge system. To that end, we addressed the task of distinguishing between two visually similar classes, chihuahuas and muffins, and of passing the classification decision to an industrial Cartesian manipulator that sorts image tiles in real time.
The proposed solution consists of two parts: image segmentation, in which the images of muffins and chihuahuas are separated from the background, and classification with a convolutional neural network. We developed five different CNN architectures that combine convolutional and pooling layers with appropriate activation functions. The models were trained with the Adam optimizer, which in our experiments gave stable and efficient convergence. The selected model was then integrated into the Siemens Industrial Edge environment using the AI Software Development Kit. The implementation involved the Vision Connector, AI Inference Server, Flow Creator, and SIMATIC S7 Connector applications. The classification result is sent to a programmable logic controller, which controls the Cartesian manipulator and thereby automatically sorts each image into its recognized class.
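As a minimal illustration of how stacked convolutional and pooling layers shrink an image before the final binary decision, the following Python sketch propagates the spatial size through a sequence of convolution and pooling blocks. The 128-pixel input size, 3x3 kernels with padding 1, 2x2 max pooling, the 0.5 decision threshold, and the choice of "muffin" as the positive class are assumptions made for illustration only; they are not taken from the thesis, and all function names are hypothetical.

```python
def conv2d_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a square convolution (standard formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Spatial output size of a square pooling layer without padding."""
    return (size - window) // stride + 1


def feature_map_sizes(input_size, num_blocks):
    """Sizes after each conv + pool block, starting with the input size.

    Assumes every block is one convolution followed by one pooling layer
    (an illustrative layout, not the architecture used in the thesis).
    """
    sizes = [input_size]
    size = input_size
    for _ in range(num_blocks):
        size = pool_out(conv2d_out(size))
        sizes.append(size)
    return sizes


def predict_label(probability, threshold=0.5):
    """Map a sigmoid output to a class name (positive class is assumed)."""
    return "muffin" if probability >= threshold else "chihuahua"


if __name__ == "__main__":
    # Four blocks, matching the number of convolutional layers
    # reported for the final model; the input size is hypothetical.
    print(feature_map_sizes(128, 4))
    print(predict_label(0.93))
```

Under these assumptions each block halves the spatial resolution, so the number of blocks directly trades off feature-map size against network depth.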
The main objectives of the master's thesis are to develop a CNN for binary classification on a selected dataset, to analyze the network's performance experimentally, and to investigate how various parameters and architectural adjustments affect that performance. An additional goal is to integrate and test the developed system on the Siemens Industrial Edge platform.
The final model, which has four convolutional layers, achieves an accuracy of just under 99% on the test image set, while the total latency from image capture to mechanical picking remains below 200 ms. All objectives set for the project were achieved.