To BAN or not to BAN : Bayesian attention networks for reliable hate speech detection
Authors: Miok, Kristian; Škrlj, Blaž; Zaharie, Daniela; Robnik Šikonja, Marko
PDF - Presentation file (3.06 MB)
MD5: 10D28EFDE9A05A29C78D6640B6BADA48
URL - Source URL: https://link.springer.com/article/10.1007/s12559-021-09826-9
Abstract
Hate speech is an important problem in the management of user-generated content. To remove offensive content or ban misbehaving users, content moderators need reliable hate speech detectors. Recently, deep neural networks based on the transformer architecture, such as the (multilingual) BERT model, have achieved superior performance in many natural language classification tasks, including hate speech detection. So far, these methods have not been able to quantify their output in terms of reliability. We propose a Bayesian method using Monte Carlo dropout within the attention layers of the transformer models to provide well-calibrated reliability estimates. We evaluate and visualize the results of the proposed approach on hate speech detection problems in several languages. Additionally, we test whether affective dimensions can enhance the information extracted by the BERT model in hate speech classification. Our experiments show that Monte Carlo dropout provides a viable mechanism for reliability estimation in transformer networks. Used within the BERT model, it offers state-of-the-art classification performance and can detect less trusted predictions.
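The Monte Carlo dropout idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper applies dropout inside the attention layers of BERT, whereas the toy example below applies it to the inputs of a single logistic unit, purely to show the mechanism of keeping dropout active at prediction time and summarizing many stochastic passes. All names (`stochastic_forward`, `mc_dropout_predict`), the weights, and the inputs are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def stochastic_forward(features, weights, bias, drop_rate, rng):
    """One forward pass with dropout left ON at prediction time."""
    keep = 1.0 - drop_rate
    total = bias
    for x, w in zip(features, weights):
        # Keep each input with probability `keep`; rescale kept inputs
        # (inverted dropout) so the expected pre-activation is unchanged.
        if rng.random() < keep:
            total += (x / keep) * w
    return sigmoid(total)


def mc_dropout_predict(features, weights, bias, drop_rate=0.3,
                       n_samples=200, seed=0):
    """Return (mean probability, standard deviation) over stochastic passes."""
    rng = random.Random(seed)
    probs = [stochastic_forward(features, weights, bias, drop_rate, rng)
             for _ in range(n_samples)]
    mean = sum(probs) / n_samples
    var = sum((p - mean) ** 2 for p in probs) / n_samples
    return mean, math.sqrt(var)


# Hypothetical toy inputs and weights, not data from the paper.
mean, std = mc_dropout_predict([1.0, -0.5, 2.0], [0.8, 0.4, -0.3], bias=0.1)
print(f"mean probability: {mean:.3f}, spread: {std:.3f}")
```

A larger spread across passes flags a prediction as less trustworthy. With `drop_rate=0.0` every pass is identical and the spread collapses to zero, which corresponds to an ordinary deterministic prediction.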
Language:
English
Keywords:
natural language processing, machine learning, transformer neural networks, Bayesian neural networks, BERT models, prediction uncertainty, reliability estimation, Monte Carlo dropout, Bayesian BERT, sentic computing, model calibration
Type of material:
Journal article
Typology:
1.01 - Original scientific article
Organization:
FRI - Faculty of Computer and Information Science
Publication status:
Published
Publication version:
Published version
Year:
2022
Pages:
pp. 353-371
Numbering:
Vol. 14, iss. 1
PID:
20.500.12556/RUL-144813
UDC:
004.85:81'322.2
ISSN on article:
1866-9956
DOI:
10.1007/s12559-021-09826-9
COBISS.SI-ID:
80879363
Publication date in RUL:
14 March 2023
Views:
801
Downloads:
82
This record is part of a journal
Title:
Cognitive computation
Shortened title:
Cogn. comput.
Publisher:
Springer Nature
ISSN:
1866-9956
COBISS.SI-ID:
80861443
Licences
Licence:
CC BY 4.0, Creative Commons Attribution 4.0 International
Link:
http://creativecommons.org/licenses/by/4.0/deed.sl
Description:
This is the standard Creative Commons licence that gives users the widest scope for reusing the work, provided they credit the author.
Secondary language
Language:
Slovenian
Keywords:
obdelava naravnega jezika, strojno učenje, nevronske mreže transformer, bayesovske nevronske mreže, modeli BERT
Projects
Funder:
ARRS - Slovenian Research Agency
Project number:
P6-0411
Name:
Language Resources and Technologies for Slovene
Funder:
EC - European Commission
Funding programme:
H2020
Project number:
825153
Name:
Cross-Lingual Embeddings for Less-Represented Languages in European News Media
Acronym:
EMBEDDIA