Detekcija antonimov z vektorskimi vložitvami besed

Pegan, Jasmina

Pegan, Jasmina (Author), Robnik Šikonja, Marko (Mentor), Gantar, Apolonija (Co-mentor)


Abstract

The goal of this thesis is to develop a classifier for recognizing antonyms. The solution uses a database of pre-built word embeddings for Slovene. We first assembled a training set of antonyms and synonyms. We then searched for the most suitable classification model, examining several support vector machine models and several deep neural networks. For selected words, we retrieved semantically related words and applied the trained model to them, which yielded candidate pairs of antonyms and synonyms. We evaluated the accuracy of the results on a test set. The best-performing model achieves a classification accuracy of 0.70.

Language:	Slovenian
Keywords:	antonyms, synonyms, word embeddings, machine learning, classification
Work type:	Bachelor thesis/paper
Organization:	FRI - Faculty of Computer and Information Science
Year:	2019
PID:	20.500.12556/RUL-110533
COBISS.SI-ID:	1538361795
Publication date in RUL:	16.09.2019
Views:	2131
Downloads:	285

Secondary language

Language:	English
Title:	Antonym detection with word embeddings
Abstract:	This thesis aims to develop a classifier for antonym detection. The solution is built on a database of pre-built word embeddings for Slovene. First, we collected a training set consisting of synonyms and antonyms. We then searched for the most appropriate classification model, examining several support vector machine models and several deep neural networks. We applied the trained model to the groups of words closest to the selected words and thus obtained candidate pairs of synonyms and antonyms. The accuracy of the results was evaluated on the test set. The best-rated model reaches a classification accuracy of 0.70.
Keywords:	antonyms, synonyms, word embeddings, machine learning, classification
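
The candidate-generation step described in the abstract (finding the words closest to a selected word and pairing them with it) might be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the toy 3-dimensional vectors, the English example words, the function names, and the choice to represent a pair by concatenating its two vectors are all assumptions made here. In the thesis, the vectors would come from the pre-built Slovene embeddings, and the pair features would be passed to the trained classifier (a support vector machine or a neural network).

```python
import math

# Hypothetical toy vectors standing in for pre-built Slovene word embeddings.
EMBEDDINGS = {
    "hot":   [0.9, 0.1, 0.3],
    "warm":  [0.8, 0.2, 0.3],
    "cold":  [0.7, -0.2, 0.4],
    "table": [-0.5, 0.9, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbours(word, embeddings, k=2):
    """Return the k words whose vectors are most similar to `word`."""
    target = embeddings[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


def pair_features(w1, w2, embeddings):
    """Assumed pair representation: the two word vectors concatenated."""
    return embeddings[w1] + embeddings[w2]


# Candidate pairs for one selected word; a trained classifier would then
# label each feature vector as synonym or antonym.
candidates = [("hot", n) for n in nearest_neighbours("hot", EMBEDDINGS)]
features = [pair_features(a, b, EMBEDDINGS) for a, b in candidates]
```

Both a near-synonym ("warm") and an antonym ("cold") land among the nearest neighbours of "hot" in this toy data, which illustrates why similarity alone cannot separate the two relations and a classifier over the pair is needed.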
