Deep learning on genomic and phylogenetic data (Globoko učenje na genomskih in filogenetskih podatkih)

Mrzelj, Nina

Mrzelj, Nina (Author), Zupan, Blaž (Mentor)

PDF - Presentation file, Download (656,50 KB)
MD5: 1295D0BD55A757638471575D297D9BBE
PID: 20.500.12556/rul/b5538b15-b119-40eb-a4c0-3f69ba914f4d

Abstract

Deep learning methods achieve outstanding results in practice on problems from many fields, including genomics. In this thesis we addressed the classification of bacterial gene sequences into taxonomic groups. The goal was to build a model that, given the sequence of a bacterium's 16S rRNA gene, assigns the bacterium to the correct phylum, class, order, family, and genus. Using deep learning methods, we built several classification models and evaluated their performance by classification accuracy and F1 score. We compared convolutional neural networks, simple recurrent neural networks, bidirectional recurrent neural networks, combined models with recurrent and convolutional neural networks, and the random forest method. We ran the experiments on two datasets of different sizes, and we also examined how the models perform when only a shorter part of the gene sequence is available. The results show that convolutional neural networks are the most suitable for problems of this kind.

Language:	Slovenian
Keywords:	deep learning, classification, neural networks
Work type:	Bachelor thesis/paper
Organization:	FRI - Faculty of Computer and Information Science
Year:	2016
PID:	20.500.12556/RUL-85515
Publication date in RUL:	15.09.2016
Views:	3655
Downloads:	601

Secondary language

Abstract:
Language:	English
Title:	Deep learning on genomic and phylogenetic data
Deep learning methods have achieved strong results on problems in many fields, genomics among them. In this thesis, deep learning methods were used to classify bacterial DNA sequences into taxonomic ranks. The goal was to build a model that classifies a bacterium by phylum, class, order, family, and genus from its 16S rRNA gene sequence. Five models were built and compared in terms of accuracy and F1 score: a convolutional neural network, a simple recurrent neural network, a bidirectional recurrent neural network, a hybrid model that combines convolutional and recurrent neural networks, and a random forest. Two experiments were conducted: in the first, classification was based on the whole sequence; in the second, only a short sequence fragment was used. The models were evaluated on two datasets of different sizes. The results show that convolutional neural networks outperformed the other models in all cases.
Keywords:	deep learning, classification, neural networks
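
As a rough illustration of the setup the abstract describes, the sketch below shows one common way to turn a DNA sequence into a fixed-length one-hot matrix (a typical input for a convolutional network) and to compute accuracy and F1 from predicted labels. It is not taken from the thesis: the function names, the all-zero encoding of ambiguous bases, the zero-padding scheme, and the macro averaging of F1 are all assumptions made for illustration.

```python
ALPHABET = "ACGT"  # assumption: any other symbol (e.g. N) becomes an all-zero row


def one_hot(seq, length):
    """Encode a DNA string as a `length` x 4 matrix.

    Longer sequences are truncated and shorter ones are zero-padded, so a
    small `length` can also mimic classification from a short fragment.
    """
    rows = [[1 if base == b else 0 for b in ALPHABET]
            for base in seq.upper()[:length]]
    rows.extend([0, 0, 0, 0] for _ in range(length - len(rows)))
    return rows


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (averaging mode is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical phylum-level labels, used only to exercise the metrics.
y_true = ["Firmicutes", "Firmicutes", "Proteobacteria", "Actinobacteria"]
y_pred = ["Firmicutes", "Proteobacteria", "Proteobacteria", "Actinobacteria"]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

The same two metric functions could be applied separately at each taxonomic rank (phylum through genus), since each rank is its own multi-class problem.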
