Many complex real-world systems can be modeled as heterogeneous networks. Link prediction in such networks can be used to detect missing information or to predict future relationships from currently observed connections. In this thesis, we compare several methods for link prediction on heterogeneous networks. We implement four models, all based on embedding nodes in a vector space: a simple model with manually selected link features, a method based on random walks over meta paths in the graph, an autoencoder model with graph convolutional networks for homogeneous networks, and an adaptation of that autoencoder for heterogeneous networks. We evaluate the algorithms' performance using the area under the ROC curve (AUC). We conduct experiments on four real-world datasets, which provide a variety of edge types to test on. To determine whether the differences between classifiers are statistically significant, we use non-parametric statistical tests such as the Friedman test and the post-hoc Nemenyi test. The results show that the graph autoencoder model modified for heterogeneous networks outperforms the other methods. The main drawback of deep learning models is that they are not interpretable: their decision-making process cannot be explained. In some cases, it is therefore better to use them in combination with other methods.
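As a rough illustration of the evaluation protocol summarized above, the following sketch computes AUC for a single edge type and the Friedman statistic over a table of AUC scores. It is not the thesis implementation. The function names (`roc_auc`, `average_ranks`, `friedman_statistic`, `nemenyi_critical_difference`) and the numbers in the demo are hypothetical and chosen only for illustration, and the Friedman statistic is the basic form without a tie correction.

```python
import math


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_ranks(row):
    """Rank the values in one row (1 = highest), averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to row[order[i]].
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square (no tie correction) and mean rank per method.

    `table[i][j]` is the score of method j on test case i, for example
    the AUC of method j on one edge type of one dataset.
    """
    n, k = len(table), len(table[0])
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12.0 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4.0
    )
    return chi2, mean_ranks


def nemenyi_critical_difference(q_alpha, k, n):
    """Mean ranks differing by more than this value differ significantly.

    `q_alpha` must be looked up in a table of Nemenyi critical values
    for the chosen significance level and number of methods k.
    """
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


if __name__ == "__main__":
    # Made-up AUC scores: 3 edge types (rows) x 3 methods (columns).
    demo = [[0.90, 0.80, 0.70], [0.95, 0.85, 0.75], [0.80, 0.90, 0.70]]
    chi2, mean_ranks = friedman_statistic(demo)
    print("Friedman chi-square:", chi2)
    print("mean ranks:", mean_ranks)
```

The Friedman test asks whether the mean ranks differ more than chance would allow. Only if it rejects that hypothesis is the Nemenyi critical difference used to decide which pairs of methods differ.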