Convolutional neural networks have proved successful in a variety of image processing and computer vision tasks. The aim of this diploma thesis was to create a convolutional neural network that localizes regions of interest (ROIs) in an image. The localization is expressed as a heatmap whose colour scheme, ranging from cold to warm colours, indicates the importance of individual image regions. The neural network was built on the pretrained VGG16 model, whose architecture was modified to meet the specific demands of the chosen task. The heatmap provides a saliency value for each pixel of the given image. These values were discretized into multiple levels, and each level was assigned a different quality factor Q, so that the more important levels were encoded at a higher bitrate and the less important levels at a lower bitrate; the resulting image was then encoded with the standard JPEG algorithm. The reconstructed images were compared with images that had been compressed and decoded using standard JPEG compression at Q factors of 30, 50 and 70. The quality of the reconstructed images was evaluated objectively using the mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index (SSIM) and multi-scale structural similarity index (MS-SSIM). The results showed that, at the same or even smaller file size, the heatmap-based compression method yields a lower MSE and a higher PSNR for the reconstructed images than standard JPEG. The SSIM and MS-SSIM values were higher as well, which indicates that, by metrics designed to reflect human perception, the visual quality of the reconstructed images is better than that of standard JPEG images.
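The discretization and evaluation steps described above can be illustrated with a minimal sketch. It assumes saliency values normalized to [0, 1], a linear mapping from level to quality factor, and a Q range of 30 to 90; the number of levels, the Q range, and all function names are hypothetical choices for illustration and are not taken from the thesis. The per-level JPEG encoding itself is not shown.

```python
import math

def discretize(saliency, n_levels):
    """Map saliency values in [0, 1] to integer levels 0..n_levels-1."""
    return [[min(int(s * n_levels), n_levels - 1) for s in row]
            for row in saliency]

def level_to_q(level, n_levels, q_min=30, q_max=90):
    """Assign a JPEG quality factor to a level (assumed linear mapping).

    Higher levels (more salient regions) get a higher Q, i.e. a higher bitrate.
    """
    if n_levels == 1:
        return q_max
    return round(q_min + (q_max - q_min) * level / (n_levels - 1))

def mse(a, b):
    """Mean squared error between two equally sized 2-D pixel arrays."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    count = sum(len(row) for row in a)
    return total / count

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    m = mse(a, b)
    return math.inf if m == 0 else 10 * math.log10(peak ** 2 / m)
```

With four levels, for example, `discretize` splits the saliency range into quarters and `level_to_q` assigns each quarter one of four evenly spaced quality factors; `mse` and `psnr` can then be applied to the original and reconstructed pixel arrays.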