The goal of this work is to develop a methodology for sense-based synonym and antonym detection. Specifically, we ask whether two words, each given in a context, are synonyms or antonyms.
Our approach consists of sense clustering over a set of words in context, determination of the matching sense for a candidate word pair, and two separate models for contextual synonym and antonym classification. We use contextual word embeddings from BERT models, which encode information about both a word and its context. These components have potential uses in lexicography, machine translation, automated text summarization, and information extraction.
The best-scoring word sense clustering achieves an average adjusted Rand index (ARI) of 0.30. Our best method for determining sense pairs reaches a classification accuracy of 0.78 on synonyms and 0.73 on antonyms. The best CroSloEngual BERT-based model for antonym detection achieves 90 % precision, 61 % recall, and 60 % accuracy; the best model for synonym detection achieves 99 % precision, 50 % recall, and 51 % accuracy.
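For readers unfamiliar with the reported metrics, the following minimal Python sketch shows how ARI and precision, recall, and accuracy are conventionally computed. The function names `adjusted_rand_index` and `binary_metrics` and the toy labels are illustrative assumptions; they are not taken from this work's code or data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard ARI between two labelings of the same items.

    1.0 means identical partitions; values near 0 mean chance-level agreement.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions trivially agree

    # Pairs of items that share a cluster in both, in the gold, and in the
    # predicted labeling, respectively.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


def binary_metrics(y_true, y_pred):
    """Precision, recall, and accuracy for 0/1 labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, accuracy


# Toy example (hypothetical labels): gold senses vs. predicted clusters.
gold_senses = [0, 0, 1, 1]
predicted_clusters = [1, 1, 0, 0]  # same grouping, different cluster names
ari = adjusted_rand_index(gold_senses, predicted_clusters)

# Toy example (hypothetical labels): 1 = synonym pair, 0 = not a synonym pair.
precision, recall, accuracy = binary_metrics([1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 0, 0])
```

ARI does not depend on how clusters are named, so it suits sense clustering, where predicted cluster identifiers carry no meaning of their own. The second toy example also shows how a classifier can combine high precision with low recall when it predicts the positive class rarely.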