In this thesis, we examine the impact of label noise on the recognition of areas in satellite images. Acquiring labels in this field is challenging, as many labels are obtained from sources that are not aligned with the image data, and spatial deviations and misclassifications of specific regions also occur. We discuss several established machine learning methods and test them on the different types of noise that can be present in satellite image labels, with an in-depth focus on deep learning methods that achieve satisfactory results in computer vision. These methods are already robust to label noise to some degree. Additionally, we test the DivideMix framework, which is specifically designed for learning from noisy data. The impact of noise is evaluated experimentally on the real-world problem of determining the actual use of agricultural and forest land in the Republic of Slovenia. The results show that deep learning methods are robust to low and medium levels of label noise, and that the DivideMix framework can improve results when the level of label noise is high. Classical machine learning methods also proved to be very robust.