In the last five years, neural machine translation systems have become the primary choice among translation tools. Research shows that they produce higher-quality translations, which is why they are rapidly replacing older statistical systems. While they appear to have reached human parity for some language pairs, neural machine translation involving Slovenian remains relatively under-researched. In addition to a classical quality assessment, this thesis addresses the lexical richness of machine translations and their possible macro-level differences from human translations and from statistical machine translation. To uncover these differences, we use quantitative methods to analyse lexical density and lexical diversity, first in human translations of a literary, a technical and a culinary text, and then in their machine translations produced by two neural systems and one statistical system. With regard to lexical density, our findings show that one of the neural translators (Google Translate) comes closest to the human translation and generally outperforms the other two systems in translation quality. With regard to lexical richness, the quantitative results indicate that machine translations exhibit greater vocabulary variation than human translations. However, our qualitative analysis shows that results obtained through quantitative methods, particularly those concerning lexical variety, are not always reliable; the findings should therefore be verified or replicated with more precise and targeted methods.