Details

Negativno zaznamovano besedišče v Slovarju sopomenk sodobne slovenščine 2.0
Arhar Holdt, Špela (Author), Kosem, Iztok (Author), Pori, Eva (Author), Gorjanc, Vojko (Author), Krek, Simon (Author), Gantar, Polona (Author)

PDF - Presentation file (410,00 KB)
MD5: DDE168F71F99A666617DC7E018345EFB
Source URL: https://journals.uni-lj.si/slovenscina2/article/view/12062

Abstract
This paper presents solutions for identifying and labelling marked vocabulary within the concept of the responsive Thesaurus of Modern Slovene (Slovar sopomenk sodobne slovenščine). As this is the first project of its kind, the solutions are largely innovative and are set within the broader issues of automatic, machine-based dictionary compilation, the dictionary's openness, and the involvement of the user community. The paper describes the procedure for identifying hateful and coarse vocabulary and for assigning labels, warning icons, and longer explanations. We address both the technical and the content-related questions of labelling. In terms of content, the labels are based on communicative purpose and effect, their essence being information about the possible consequences of use; in the technical solutions, we pay close attention to the digital medium and to how the solutions are visualized in it. Because responsiveness is one of the dictionary's key concepts, we recognize the importance of cooperating with the user community in labelling, and therefore also propose solutions for community involvement in adding labels. The original conference paper has been expanded in all sections, and an entirely new section has been added on the treatment of polysemous headwords, their sense division, and sense description, with examples of senses that carry negative markedness.

Language:Slovenian
Keywords:slovenščina, slovar sopomenk, odzivni slovar, slovarske oznake, sporočanjski namen, uporabniška skupnost
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FF - Faculty of Arts
FRI - Faculty of Computer and Information Science
Publication status:Published
Publication version:Version of Record
Year:2023
Number of pages:pp. 8-32
Numbering:Vol. 11, No. 1
PID:20.500.12556/RUL-166522
UDC:811.163.6'373.421'374
ISSN on article:2335-2736
DOI:10.4312/slo2.0.2023.1.8-32
COBISS.SI-ID:165689347
Publication date in RUL:16.01.2025
Views:353
Downloads:95

Record is a part of a journal

Title:Slovenščina 2.0 : empirične, aplikativne in interdisciplinarne raziskave
Publisher:Trojina, zavod za uporabno slovenistiko, Znanstvena založba Filozofske fakultete, Založba Univerze v Ljubljani
ISSN:2335-2736
COBISS.SI-ID:264547328

Licences

License:CC BY-SA 4.0, Creative Commons Attribution-ShareAlike 4.0 International
Link:http://creativecommons.org/licenses/by-sa/4.0/
Description:This Creative Commons license is very similar to the regular Attribution license, but requires the release of all derivative works under this same license.

Secondary language

Language:English
Title:Negative vocabulary in the thesaurus of modern Slovene 2.0
Abstract:
The paper describes an upgraded version of the Thesaurus of Modern Slovene 1.0, which is currently the largest open-access collection of Slovene synonyms generated automatically. The creation of the thesaurus has introduced a new type of dictionary, referred to as a responsive dictionary, which allows the data to respond continuously to the opinions of the contributing language community. The upgrade was motivated by the results of a survey of the user community’s attitudes towards the Thesaurus of Modern Slovene, which revealed a lack of dictionary labels, particularly for non-neutral vocabulary. As a result, the updated version of the thesaurus focuses on developing solutions for identifying and annotating extremely offensive and vulgar vocabulary. To address this, the digital medium is utilized to display information about potentially problematic vocabulary in new ways. The updated version of the thesaurus incorporates a combination of warning icons and longer explanations to provide a clear visual tag as well as an explanation about the potential consequences of word use. The identification of potentially negative words was primarily conducted manually. Synonym sets were exported from the dictionary database, ordered in semantic clusters, and reviewed by students who were provided with brief instructions to identify potentially negative words, such as elements of hate speech (discrimination based on race, ethnicity, gender, sexual orientation, or disability), negative attitudes (related to social status, wealth, behaviour and character, appearance, etc.), and vulgarity (related to taboo topics, e.g., sexuality, bodily excretions, and violence, in the typical informal speech situation). The decisions made by the students were reviewed and modified by a team of linguists, based on corpus data. 
As responsiveness is a key concept of the thesaurus, involving the user community in future labelling procedures is an important part of the preparation of final labelling solutions.

Keywords:Slovene, thesaurus, responsive dictionary, dictionary labels, communicative purpose, user community

Projects

Funder:Other - Other funder or multiple funders
Funding programme:Ministrstvo za kulturo Republike Slovenije
Project number:SOKOL
Name:Nadgradnja temeljnih slovarskih virov in podatkovnih baz CJVT UL

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P6-0411
Name:Jezikovni viri in tehnologije za slovenski jezik

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P6-0215
Name:Slovenski jezik - bazične, kontrastivne in aplikativne raziskave

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:P6-0436
Name:Digitalna humanistika: viri, orodja in metode
