In this thesis, we explored text simplification in Slovene using large language models, with the goal of developing models that can effectively simplify Slovene texts. We machine-translated existing English training datasets into Slovene and fine-tuned models such as SloT5, mT5, and mBART on the resulting data. We then conducted a quantitative and qualitative analysis of the results using the BLEU, SARI, BERTScore, and LaBSE similarity metrics. The results showed that the models can successfully simplify texts: they retain key information and meaningfully simplify both the structure and the language. Nevertheless, the models often reproduce the original sentences with little or no modification.