
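Luščenje slovenskih in angleških označevalcev semantičnih relacij iz specializiranega korpusa, njihova produktivnost in natančnost (Extraction of Slovene and English semantic relation markers from a specialized corpus, their productivity and accuracy)
Hadalin, Teja (Author), Vintar, Špela (Mentor)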


In this master's thesis, we analysed the productivity and accuracy of lexical markers that introduce different semantic relations. In the first phase of the analysis, we used queries to extract examples containing the selected markers from a corpus specialized in karstology. We manually checked the search results and assessed whether the markers really introduce the specific relation. We found the analysed markers to be effective, since they achieved an average accuracy of more than 50%. On the other hand, the analysis showed that markers can be polysemous and express different relations, which can cause difficulties in developing methods for automatic relation extraction. To test the effectiveness of the markers in general texts, we also analysed their accuracy in a Slovene corpus of general texts and found that their performance is comparable to that in the specialized corpus. With this research, we wanted to contribute to the development of different natural language processing methods, especially in the field of (semi)automatic relation extraction from Slovene texts.

Keywords: lexical markers, semantic relations, knowledge patterns, relation extraction, accuracy and productivity
Work type: Master's thesis/paper
Organization: FF - Faculty of Arts
PID: 20.500.12556/RUL-142357
COBISS.SI-ID: 130845955
Publication date in RUL: 02.11.2022
Secondary language

Title: Extraction of Semantic Relation Markers in Slovene and English, their productivity and accuracy
In this master’s thesis, we analysed the productivity and accuracy of lexical markers that introduce different semantic relations. In the first stage of the analysis, we ran search queries on a corpus specialized in karstology to gather examples of use of the selected markers. We then manually evaluated the results and determined whether each marker truly indicates the expected semantic relation. Based on this analysis, the markers were deemed effective, with an average accuracy of more than 50%. On the other hand, some markers turned out to be ambiguous and able to express different types of relations, which can hinder the development of automatic relation extraction methods. To test the applicability of lexical markers in more general texts, we also performed a short analysis of their accuracy in a corpus of written standard Slovene, which yielded results similar to those from the domain-specific corpus. This thesis aims to contribute to the development of natural language processing (NLP) techniques, especially (semi)automatic relation extraction from Slovene texts.
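
The evaluation described above can be illustrated with a minimal sketch. It assumes that productivity means the number of corpus hits per marker and that accuracy means the share of those hits that a human reviewer judged to introduce the intended relation; the record does not give the thesis's exact definitions. The function name `score_markers`, the sample markers, and the annotations are hypothetical and do not come from the thesis.

```python
from collections import defaultdict


def score_markers(annotations):
    """Summarise manual judgements per marker.

    annotations: iterable of (marker, is_correct) pairs, one per corpus hit,
    where is_correct records the reviewer's judgement (assumed format).
    """
    hits = defaultdict(int)
    correct = defaultdict(int)
    for marker, is_correct in annotations:
        hits[marker] += 1
        if is_correct:
            correct[marker] += 1
    # Productivity = number of hits; accuracy = correct hits / all hits.
    return {
        marker: {"hits": hits[marker], "accuracy": correct[marker] / hits[marker]}
        for marker in hits
    }


# Hypothetical judgements, invented purely for illustration.
sample = [
    ("is a type of", True),
    ("is a type of", True),
    ("is a type of", False),
    ("consists of", True),
    ("consists of", False),
]
scores = score_markers(sample)
```

Averaging the per-marker accuracy values in `scores` would give a figure of the kind the abstract reports, although the thesis may have aggregated its results differently.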

Keywords: lexical markers, semantic relations, knowledge patterns, relation extraction, accuracy and productivity

