Finding relations between entities in text is a task in natural language processing. For example, in the sentence "Ljubljana is the capital of Slovenia", we want to find the relation capital between the entities Ljubljana and Slovenia.
We first reviewed the methods used for training models to predict relations. We then chose three methods with different approaches: a method based on a long short-term memory neural network, a method that uses BERT encoder representations, and RECON, a method that uses graph attention networks.
To train the models, we used a Slovenian corpus generated semi-automatically from the text of the Slovenian Wikipedia. We tested the models on a test corpus from the Slovenian Wikipedia and on a test corpus of articles from 24ur.com. All three methods achieved high recall and precision on the Slovenian Wikipedia test corpus, with RECON performing best. Results were worse on the test set of 24ur.com articles, where the method using CroSloEngual BERT encoder representations achieved the best results.