Advancing RNA dynamics analysis through novel machine learning on multiomics data
Novljan, Jona (Author), Modic, Miha (Mentor), Chakrabarti, Anob (Comentor)

PDF - Presentation file (2,85 MB)
MD5: BBC9AC4DF325606DD9E65A78C0B07299

Abstract
The RNA lifecycle greatly influences and directs the functionality of the cell. Its progression is orchestrated by multiple RNA-binding proteins (RBPs), forming complex and interconnected networks of transcript regulation. To disentangle this network and extract the RNA features that influence transcript fate, we often work with a multitude of single-omics or multiomics datasets for the biological groups we are attempting to differentiate. Commonly, multiple statistical analyses and workflows are employed to manually extract and compare the differentiating features. However, this can introduce confirmation bias, and techniques able to compare the raw data need to be further explored. In this master's thesis, we developed a machine learning (ML) workflow capable of classifying transcripts based on positional features such as sequence, structure, RBP binding and methylation. Employing this workflow, we efficiently classified the transcripts stabilized and destabilized by LIN28A in mouse embryonic stem cells (mESC), finding multivalent AU-rich regions toward the ends of 3’UTRs to be predictive of LIN28A-mediated destabilization of transcripts. Expanding this methodology to include multiomics datasets in the second case study, we found the common features of condensation-prone RNA in mESC to be mainly structured C-rich coding regions together with highly protein-bound ends of 3’ untranslated regions. This demonstrates an effective use of ML to offer unique biological insights into the features governing different networks of RBP-RNA interactions. Such models are therefore anticipated to further expand the bioinformatics toolset and enable a more unbiased view into the diverse roles of RNA regulation through highly predictive and explainable ML models.

Language:English
Keywords:regulation of RNA expression, RNA-binding proteins, phase separation, machine learning, multiomic approaches, RNA characteristics
Work type:Master's thesis/paper
Typology:2.09 - Master's Thesis
Organization:BF - Biotechnical Faculty
Year:2024
PID:20.500.12556/RUL-161850
COBISS.SI-ID:207607811
Publication date in RUL:15.09.2024
Views:182
Downloads:172

Secondary language

Language:Slovenian
Title:Izboljšanje analize dinamike RNA z uporabo strojnega učenja na multiomskih podatkih
Abstract:
Življenjski cikel RNA in njegova regulacija ima velik vpliv na pravilno delovanje celice. Njegov potek je organiziran preko več proteinov, ki vežejo RNA (RBP), in tvorijo kompleksna, med seboj povezana omrežja regulacije transkriptov. Da bi razvozlali to mrežo in izluščili značilnosti RNA, ki imajo največji vpliv na njeno usodo, pogosto analiziramo množico omskih informacij, da bi izluščili tiste, ki razložijo različen biološki odziv določenih RNA. Običajno zato uporabimo več različnih statističnih analiz, kjer ročno definiramo in primerjamo te značilnosti med skupinami. S takšno izbiro pa v analizo dodamo določeno pristranskost do naših hipotez. Zato je pomembno razviti nove načine analiz, ki lahko primerjajo še neobdelane podatke. V tej magistrski nalogi smo razvili protokol strojnega učenja (ML), ki omogoča razvrščanje transkriptov na podlagi značilnosti, kot so nukleotidno zaporedje, struktura, vezava na RBP in metilacija. Tako smo lahko učinkovito razvrstili z LIN28A stabilizirane in destabilizirane transkripte v izvornih celicah mišjih zarodkov (mESC), pri čemer smo ugotovili, da so z AU bogate multivalentne regije, ki se nahajajo ob koncu 3'UTR tiste, ki napovedujejo destabilizacijo. Z razširitvijo te metodologije na nabor več omskih podatkov smo uspeli izluščiti skupne značilnosti RNA, ki so nagnjene k kondenzaciji v mESC. Model je pokazal na strukturirane in s citozini bogate kodirajoče regije in z RBP vezane konce 3’ neprevedene regije, kot glavne lastnosti takšnih mRNA. Ta magistrska naloga torej prikaže dva primera učinkovite uporabe ML za pridobitev biološkega vpogleda v značilnosti, ki definirajo različna omrežja RBP-RNA interakcij. Predvidevamo, da bo v prihodnje uporaba takšnih modelov razširila nabor orodij za bioinformatske analize in omogočila nadaljnji nepristranski vpogled v regulacijo RNA z visoko zmogljivimi in razložljivimi modeli ML.

Keywords:regulacija RNA ekspresije, RNA-vezavni proteini, fazna separacija, strojno učenje, multiomski pristopi, značilnosti RNA
