New biological research is conducted every year in the field of bioinformatics. However, its outcomes and insights remain scattered across separate, unconnected databases, which are often not accessible online. There is growing interest in the scientific community in connecting these datasets and uncovering potential relationships among them. This thesis presents an algorithm and data structure for connecting multiple datasets, focusing on uncovering relationships in the data with a multimodal convolutional autoencoder. The solution is evaluated with the DFMF matrix factorization algorithm. The results show that encoding data into a common lower-dimensional space and decoding it from that space reveals dependent relationships in the data.