Topic and sentiment analysis of Slovene media using natural language processing tools
Bajt, Jan (Author), Robnik Šikonja, Marko (Mentor)

PDF - Presentation file (1.44 MB)
MD5: 1DCF397CD6BE3127A66555A3EF993F3F

Abstract
In this thesis, we compare Slovenian media through topic and sentiment analysis of their articles. Our aim was to analyse the differing stances of media outlets towards specific political events and topics. We modeled topics with LDA (latent Dirichlet allocation) and used the model to find the articles with political content in a collection of Slovenian articles. For sentiment detection, we fine-tuned the SloBERTa model and used it to classify the selected articles into one of three labels (positive, neutral, negative). We compare the media on several political topics and observe some differences between groups of media. We present the results, point out some weaknesses of our system, and suggest improvements.

Language: Slovenian
Keywords: natural language processing, BERT model, latent Dirichlet allocation, topic modeling, sentiment detection, Slovenian media
Work type: Bachelor thesis/paper
Organization: FRI - Faculty of Computer and Information Science
Year: 2021
PID: 20.500.12556/RUL-130324
COBISS.SI-ID: 77669123
Publication date in RUL: 13.09.2021

Secondary language

Language: English
Title: Topic and sentiment analysis of Slovene media using natural language processing tools
Abstract:
We compare Slovenian media by analysing the topics and sentiment of their articles. We aim to analyse the different stances of media towards specific political events or topics. We used an LDA model for topic modeling and, based on its results, selected articles with political content. For the sentiment analysis task, we fine-tuned the Slovenian SloBERTa model and used it to classify articles into one of three sentiment labels (positive, neutral, negative). We compare the media on a few political topics, where we notice some differences between groups of media. We present the results, highlight weaknesses of our system, and suggest improvements.

Keywords: natural language processing, BERT model, latent Dirichlet allocation, topic modeling, sentiment detection, Slovenian media
