The aim of this thesis is to add rules for comma usage to the LanguageTool program. Using the Lektor corpus, we examined which comma rules cause the most problems in written Slovene. Based on these results, we analyzed the rules for comma usage before the conjunctions »and«, »or« and »that« in Slovenian Orthography 2001. After completing the analysis, we tried to implement comma placement rules for the open-source program LanguageTool, which can be used as a stand-alone desktop application, as a web interface, or within the open-source office suites LibreOffice and OpenOffice. Some of the rules were implemented successfully. Implementing all of the rules would require a part-of-speech tagger, which is not yet part of LanguageTool for Slovene. We evaluated the rules, taking their accuracy, applicability and user experience into account.