The thesis presents the development of a grammar error handling system, which was divided into three sub-problems: error detection, recognition, and correction. We addressed these problems using large language models based on the transformer architecture. Specifically, we used the SloBERTa model, the Slovenian version of the BERT model, to detect and recognize grammatical errors. Additionally, we used the SloT5 model, the Slovenian version of the T5 model, to correct grammatical errors. The models were trained and evaluated on the Slovene corpora of grammar corrections Šolar and Lektor. We also used the Slovene morphological lexicon Sloleks and the Classla-Stanza tagging tool. To evaluate the performance of the models, we used several metrics. The detection and recognition models achieved the F-score of 88% and 14%, respectively. The correction model achieved the GLEU score of 50%.
|