This thesis addresses machine learning-based classification of the sentiment of comments posted alongside news articles on the web. In recent years sentiment analysis has become an important research topic, with a substantial number of publications on English texts, whereas for Slovene the topic has not been well explored, apart from a recent thesis at the University of Ljubljana, Faculty of Computer and Information Science. The particular features of the Slovene language posed an additional challenge. Our goal was to correctly classify these comments as positive or negative. We also examined how this problem differs from topical text classification. Our work shows that the problem is hard and that a typical application of machine learning based on a k-mer representation of text does not yield the expected results. Possible reasons for the poor performance are the lack of semantic information in such a representation and the short length of the texts.
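To illustrate what a k-mer representation of a short comment might look like, the following is a minimal sketch. It assumes that k-mers are overlapping character substrings of length k and that the text is lowercased first. The function name `kmer_counts`, the choice k = 3, and the sample comment are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def kmer_counts(text: str, k: int = 3) -> Counter:
    """Count overlapping character k-mers in a lowercased text.

    Assumption: a k-mer is any substring of exactly k characters,
    including spaces. The thesis may define it differently.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    normalized = text.lower()
    # Slide a window of width k over the text.
    # Texts shorter than k produce an empty Counter.
    return Counter(
        normalized[i:i + k] for i in range(len(normalized) - k + 1)
    )


# Hypothetical short comment, used only for illustration.
comment = "Dober dan"
features = kmer_counts(comment, k=3)
```

Such counts carry only surface character patterns, and a short comment yields few of them, which is consistent with the two possible reasons for poor performance named above.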