The goal of super-resolution is to obtain a high-resolution (HR) image from a low-resolution (LR) image. Super-resolution is used in several areas, for example to improve image quality for object detection, face recognition in surveillance footage, medical imaging, astronomical imaging, and forensics. It remains a difficult, open problem in computer vision because it is inherently ill-posed: rather than a single solution, many HR images explain a given LR image equally well. The problem becomes more severe as the scale factor increases. In addition, output quality is difficult to assess, because numerical metrics do not fully correspond to human perception.
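The ill-posedness can be illustrated with a minimal sketch. It assumes a simple degradation model, 2x2 average pooling, which is chosen only for illustration and is not necessarily the degradation used in this thesis. The function name `downsample_2x` and the toy pixel values are hypothetical.

```python
def downsample_2x(img):
    """Average each non-overlapping 2x2 block (an assumed, simplified degradation)."""
    height, width = len(img), len(img[0])
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
            for c in range(0, width, 2)
        ]
        for r in range(0, height, 2)
    ]


# Two different toy HR images (4x4 grayscale, hypothetical values).
hr_a = [
    [0, 100, 20, 20],
    [100, 0, 20, 20],
    [10, 10, 0, 40],
    [10, 10, 40, 0],
]
hr_b = [
    [50, 50, 20, 20],
    [50, 50, 20, 20],
    [10, 10, 20, 20],
    [10, 10, 20, 20],
]

lr_a = downsample_2x(hr_a)
lr_b = downsample_2x(hr_b)

# The HR images differ, yet they produce the same LR image.
print(hr_a != hr_b)  # True
print(lr_a == lr_b)  # True
```

Under this assumed model, each LR pixel constrains only the mean of 4 HR pixels at scale factor 2, and of 16 HR pixels at scale factor 4, which is one way to see why the ambiguity grows with the scale factor.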
The most advanced models in this field learn from pairs of LR and HR images. Because such learning depends on the characteristics of the training data, existing models are not equally successful on all types of images and consequently exhibit bias. In this thesis, we analyze the bias of five state-of-the-art super-resolution models, using various metrics from the literature to measure their performance. In addition to the bias analysis, we examine how super-resolution affects the performance of a face detector on very small facial images enhanced with super-resolution models. Finally, we study face recognition performance on super-resolved images.
Our experimental results show that, under the selected no-reference metrics, there are mostly no large differences between the attributes we analyze. Larger differences between some attributes appear under reference metrics. Differences by gender are minor, with better results for males. By race, the most consistent results are obtained on images of Asian people, and the widest range of results on images of Black people. The best results are obtained for the 30-49 age group and the worst for people aged 90 or older. In the analysis of view angle, we find that the models perform better the more the face deviates from the frontal position. The largest differences occur in cases of facial occlusion. The weakest results occur when a face image combines several attributes that each lead to worse performance when examined individually.
The results also show that super-resolution does not improve the detection of low-resolution faces. For face recognition, the super-resolution networks that achieve the best metric scores also produce images that yield higher recognition performance. However, recognition on hallucinated images does not come close to the results observed with HR-HR image comparisons. Face recognition performance on hallucinated real images is worse than on subsampled images, and it also depends on the type of surveillance camera used for image capture.