Modeling cell-to-cell gene expression variability from DNA sequences
Kojanec, Patrik (Author), Curk, Tomaž (Mentor), Sanguinetti, Guido (Co-mentor)

Abstract
Cell-to-cell variability in gene expression is often associated with cell differentiation during embryonic development or with the emergence of cancer. Although some of the variability observed in single-cell RNA sequencing (scRNA-seq) experiments stems from technical noise, a significant proportion is attributed to biological processes within the cell. In this Master's thesis, we propose a novel approach to predict cell-to-cell gene expression variability and mean expression directly from the DNA sequence. We use Enformer, a deep learning transformer model, to embed the DNA sequence into a more favorable feature space, from which we predict the mean expression and overdispersion of scRNA-seq gene expression. We evaluated our approach on mouse and human data gathered with two scRNA-seq protocols. Our approach explains up to 60% of the variance of overdispersion in the mouse datasets and up to 25% in the human datasets. We also examine how differences between the scRNA-seq protocols affect the performance of our models.
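
The pipeline summarized in the abstract (sequence embeddings, then a model that predicts overdispersion, scored by explained variance) might be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis implementation. The embeddings are random stand-ins rather than Enformer outputs. Overdispersion is taken to be the negative binomial parameter phi in Var = mu + phi * mu^2 and is estimated by the method of moments. The regressor is a closed-form ridge model. All function names, the pseudocount, and the simulation settings are hypothetical.

```python
import numpy as np

def nb_moments(counts):
    """Per-gene mean and method-of-moments overdispersion.

    counts has shape (cells, genes). Assumes Var = mu + phi * mu**2,
    which is an assumption of this sketch, not a detail from the thesis.
    """
    mu = counts.mean(axis=0)
    var = counts.var(axis=0, ddof=1)
    phi = np.maximum(var - mu, 0.0) / np.maximum(mu, 1e-12) ** 2
    return mu, phi

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def r2_score(y_true, y_pred):
    """Fraction of variance explained (coefficient of determination)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)
n_genes, n_cells, n_features = 400, 300, 16

# Synthetic stand-in for per-gene sequence embeddings (NOT Enformer output).
embeddings = rng.normal(size=(n_genes, n_features))

# Simulated ground truth: log-overdispersion is linear in the embedding.
true_w = rng.normal(scale=0.2, size=n_features)
phi_true = np.exp(-0.5 + embeddings @ true_w)
mu_true = rng.uniform(20.0, 100.0, size=n_genes)

# Negative binomial counts drawn as a gamma-Poisson mixture.
rates = rng.gamma(shape=1.0 / phi_true, scale=mu_true * phi_true,
                  size=(n_cells, n_genes))
counts = rng.poisson(rates)

mu_hat, phi_hat = nb_moments(counts)
log_phi_hat = np.log(phi_hat + 1e-3)  # small pseudocount, an arbitrary choice

# Fit on the first 300 genes and evaluate on the 100 held-out genes.
train, test = slice(0, 300), slice(300, None)
w, b = fit_ridge(embeddings[train], log_phi_hat[train])
r2 = r2_score(log_phi_hat[test], embeddings[test] @ w + b)
print(f"held-out R^2 for log-overdispersion: {r2:.2f}")
```

The held-out R^2 printed at the end is the same kind of quantity as the "variance explained" figures quoted in the abstract, although the number produced here reflects only the simulation and says nothing about the real mouse or human data.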

Language: English
Keywords: scRNA-seq, gene expression variability, deep learning
Work type: Master's thesis/paper
Typology: 2.09 - Master's Thesis
Organization: FRI - Faculty of Computer and Information Science
Year: 2022
PID: 20.500.12556/RUL-141443
COBISS.SI-ID: 124837891
Publication date in RUL: 29.09.2022

Secondary language

Language: Slovenian
Title: Modeliranje variabilnosti genskega izražanja posameznih celic na podlagi sekvenc DNA
Abstract (English translation):
Gene expression variability is often associated with factors that regulate cell differentiation in the early stages of embryonic development or with the formation of cancer cells. Cell-to-cell gene expression variability can be measured with scRNA-seq, but these measurements are very noisy because of technical limitations. In this Master's thesis, we present a novel approach for predicting gene expression variability from DNA sequences. We use the deep learning model Enformer, which embeds DNA sequences into a more effective feature space. Using linear models, we then predict the mean gene expression and the overdispersion of scRNA-seq data from the sequence embeddings. We evaluated the proposed approach on data from two different organisms, obtained with two different scRNA-seq protocols. The proposed approach explains up to 60% of the variance of gene expression overdispersion on the mouse dataset and up to 25% on the human dataset.

Keywords: scRNA-seq, gene expression variability, deep learning
