The goal of this thesis is to create a sentiment dictionary for the Slovenian language that can be used in lexical methods for automatic sentiment analysis.
We start from a sentiment dictionary for the English language, translate it semi-automatically into Slovenian, and curate its content. We evaluate the translated dictionary in automated lexical sentiment analysis on a corpus of 5000 manually annotated Slovenian news articles gathered from the main Slovenian news portals. The results are compared with those of an alternative method in which, instead of translating the sentiment dictionary, the documents are translated into English and lexical sentiment analysis is performed on the translations.
This thesis is organized as follows. First, the concept of and motivation for automated sentiment analysis are introduced. Next, the techniques for sentiment analysis are outlined, with emphasis on the importance of sentiment dictionaries in automated sentiment analysis. The main part of the thesis is Chapter 4, which describes and explains in detail the process of creating the Slovenian sentiment dictionary. Finally, the manual article annotation process is described and the experimental evaluation of the two alternative methods is presented.
The practical part of this thesis produced two resources: a Slovenian sentiment dictionary and a manually annotated corpus of 5000 Slovenian news articles.
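
To make the notion of a lexical method concrete, the following minimal sketch shows how a sentiment dictionary could be used to score a document. It is an illustration under stated assumptions, not the procedure used in the thesis: the four-word lexicon, the function names, and the rule of summing word polarities and classifying by the sign of the sum are all hypothetical choices made for this example.

```python
import re

# Hypothetical toy lexicon (not the thesis dictionary):
# word -> polarity, +1 for positive and -1 for negative.
TOY_LEXICON = {
    "uspeh": 1,   # "success"
    "dober": 1,   # "good"
    "kriza": -1,  # "crisis"
    "slab": -1,   # "bad"
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def sentiment_score(text, lexicon):
    """Sum the polarities of all tokens found in the lexicon.

    Tokens that are not in the lexicon contribute 0.
    """
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def classify(text, lexicon):
    """Map the score to a label by its sign (an assumed decision rule)."""
    score = sentiment_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in [
        "Podjetje je doseglo velik uspeh.",
        "Kriza in slab rezultat.",
        "Vlada je sprejela zakon.",
    ]:
        print(sentence, "->", classify(sentence, TOY_LEXICON))
```

The sketch matches exact word forms only. Slovenian is a highly inflected language, so a practical system would most likely need lemmatization or a similar normalization step before dictionary lookup; that step is omitted here.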