To increase the accessibility and variety of easy-to-read texts in Slovenian, which involve stylistic and linguistic adaptations, we created a prototype of a system that automatically simplifies texts. This is the first system for automatically converting Slovenian sentences and texts into a simpler form. We prepared a dataset for Slovenian that contains aligned simple and complex sentences and can be used for further development of text simplification models for Slovenian. We used the Slovene T5 model, which was pretrained on other tasks. The approach thus relies on transfer learning with a deep neural network that has an encoder-decoder architecture. To find good hyperparameter values and to evaluate the performance of the system, we used the automatic measures ROUGE and BERTScore. The scores are high and indicate good performance. The system generates single-clause or simple multi-clause sentences and does not use adverbs or special symbols. From the point of view of syntactic simplicity the system is successful, but we assessed its success in more detail through human evaluation. For this we used a questionnaire that could also be used in further studies to check the comprehensibility and meaningfulness of automatically generated sentences. The questionnaire showed that the model was not successful in generating comprehensible paragraphs: most reviewers found them almost or completely unintelligible. We also investigated comprehensibility criteria for automatically generated texts and found that the important criteria are conciseness, linguistic correctness, lexical simplicity, syntactic simplicity, coherence and summary relevance. Our system performed best in syntactic simplicity and lexical simplicity, and worst in summary relevance, coherence and conciseness. The system is partly useful as an aid to human simplifiers and could potentially be combined with summarization to provide simpler vocabulary and simpler syntactic structure.
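To illustrate the kind of overlap measure that ROUGE represents, the following is a minimal sketch of a ROUGE-1 F1 score computed over unigrams. It is not the evaluation code used for the system: the function name `rouge1_f1`, the lowercasing, and the whitespace tokenization are assumptions made for illustration, and a real evaluation would normally rely on an established ROUGE implementation.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Illustrative ROUGE-1 F1: unigram overlap between two sentences.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace, which is a simplification of real ROUGE tokenization.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    generated = "the cat sat"
    reference = "the cat sat on the mat"
    print(f"ROUGE-1 F1: {rouge1_f1(generated, reference):.3f}")
```

A measure of this kind rewards word overlap with a reference simplification but says nothing about whether the output is coherent, which is consistent with the gap between the high automatic scores and the human evaluation reported above.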