Hate speech paraphraser
Pesjak, Drejc (Author); Bosnić, Zoran (Mentor); Robnik Šikonja, Marko (Co-mentor)

PDF - Presentation file (753.50 KB)
MD5: 56A15C3E11790DFA42A53DD8ADE72F14

Abstract
Hate speech is widespread on the web, and the possibility of remaining anonymous encourages it further. Many forums and news websites fight it by employing large numbers of moderators who remove hateful comments. Because the daily volume of comments is large, they also rely on automated hate speech detection software. We propose DPhate, a system that outputs a non-hateful alternative to a posted hateful comment. The system uses a series of pre-trained paraphrasing models that generate non-hateful sentences. Automatic evaluation showed that at least one acceptable sentence is generated in 84.37% of cases, whereas human evaluators deemed only 67.90% of the paraphrases acceptable.
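The generate-then-filter idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: `toy_detector`, `soften`, `echo`, `acceptable_paraphrases`, `at_least_one_rate` and the word list are all hypothetical stand-ins for the pre-trained paraphrasing models and the hate speech detector, and the acceptance criterion here (the detector no longer fires) is an assumption.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-ins: the thesis uses pre-trained models, not these toys.
Paraphraser = Callable[[str], List[str]]
HateDetector = Callable[[str], bool]

BLOCKLIST = {"idiot", "idiots", "scum"}  # toy vocabulary, assumed for the demo


def toy_detector(text: str) -> bool:
    """Return True if the text contains a blocklisted word (toy heuristic)."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & BLOCKLIST)


def soften(text: str) -> List[str]:
    """Toy paraphraser: replace blocklisted words with a neutral word."""
    out = []
    for w in text.split():
        core = w.strip(".,!?").lower()
        out.append("people" if core in BLOCKLIST else w)
    return [" ".join(out)]


def echo(text: str) -> List[str]:
    """Toy paraphraser that fails to remove the hateful wording."""
    return [text]


def acceptable_paraphrases(
    comment: str,
    paraphrasers: Iterable[Paraphraser],
    is_hateful: HateDetector,
) -> List[str]:
    """Run every paraphraser and keep the candidates the detector accepts."""
    kept: List[str] = []
    for paraphrase in paraphrasers:
        for candidate in paraphrase(comment):
            if not is_hateful(candidate) and candidate not in kept:
                kept.append(candidate)
    return kept


def at_least_one_rate(
    comments: Iterable[str],
    paraphrasers: List[Paraphraser],
    is_hateful: HateDetector,
) -> float:
    """Fraction of comments with at least one accepted paraphrase."""
    comments = list(comments)
    hits = sum(
        1 for c in comments if acceptable_paraphrases(c, paraphrasers, is_hateful)
    )
    return hits / len(comments)


if __name__ == "__main__":
    demo = ["These idiots ruined the game", "You scum"]
    print(acceptable_paraphrases(demo[0], [echo, soften], toy_detector))
    print(at_least_one_rate(demo, [echo, soften], toy_detector))
```

Running several paraphrasers and filtering their outputs means a single failing model (here `echo`) does not prevent a usable suggestion, which is one plausible reading of why a "series" of models is used; `at_least_one_rate` mirrors the shape of the "at least one acceptable sentence" figure, under the assumed acceptance criterion.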

Language: English
Keywords: hate speech, natural language processing, transformers, BERT models, machine learning, paraphrasing
Work type: Bachelor thesis/paper
Typology: 2.11 - Undergraduate Thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2022
PID: 20.500.12556/RUL-135582
COBISS.SI-ID: 102617091
Publication date in RUL: 21.03.2022
Views: 947
Downloads: 132

Secondary language

Language: Slovenian
Title: Parafraziranje sovražnega govora (Hate speech paraphrasing)
Abstract:
The web is full of hate speech, which is further encouraged by the possibility of anonymity. Many forums and news websites defend themselves with moderators who remove harmful comments. Because there are usually many comments (several tens of thousands per day), moderators rely on software for automatic hate speech detection. In this thesis we propose a new system, DPhate, which suggests to a user who posts a hateful comment a non-hateful alternative that preserves the meaning. The system uses several pre-trained models that generate non-hateful sentences by paraphrasing. Automatic evaluation showed that at least one suitable sentence is generated in 84.37% of cases, while human evaluators rated the generated paraphrases as suitable in 67.90% of cases.

Keywords: hate speech, natural language processing, transformers, BERT models, machine learning, paraphrasing
