
Development of an algorithm for reliable recognition of short DNA tag sequences at high sequencing error rates
Močivnik, Luka (Author), Skrbinšek, Tomaž (Mentor)

PDF - Presentation file (2.79 MB)
MD5: AC65139F2E870636D871535BB0E8EF2A
ZIP - Appendix (132.25 MB)
MD5: 9297C629E4A10665C628FF09E47A8A0A

Abstract
Third-generation sequencing technologies, particularly nanopore technology, enable fast sequencing of long DNA sequences. Their drawback is a high error rate. In this master's thesis, we present a bioinformatics pipeline for processing microsatellite sequences obtained by third-generation sequencing. To test the pipeline, we used microsatellite sequences obtained from non-invasive genetic samples of brown bear (Ursus arctos) and sequenced on an Illumina sequencer. In these sequences, we simulated substitutions, insertions, and deletions in various combinations and at various total error rates. In addition to the 8 bp DNA sample tags already in use, we also tested tags 12 and 16 bp long. The pipeline proved effective with substitutions alone, but not when all three error types were simulated. Nevertheless, we found that the currently used 8 bp tags are unusable at high error rates, especially when all three error types are simulated, and that successful sample identification requires longer tags, preferably 16 bp. We also found that problems can arise in the search for oligonucleotide primers and, consequently, in the identification of the loci they mark. We identified weak points in the pipeline and propose possible solutions. The presented bioinformatics pipeline is therefore suitable as a basis for further work.
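
The error simulation described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's code: the function name `simulate_errors`, the independent per-base error model, and the uniform choice among the enabled error types are all assumptions made for illustration.

```python
import random

BASES = "ACGT"

def simulate_errors(seq, error_rate, kinds=("sub", "ins", "del"), rng=None):
    """Return a copy of seq with random errors (hypothetical helper).

    Assumption: each base is hit independently with probability
    error_rate, and the error type is drawn uniformly from kinds.
    """
    rng = rng or random.Random()
    out = []
    for base in seq:
        if rng.random() >= error_rate:
            out.append(base)  # base is read correctly
            continue
        kind = rng.choice(kinds)
        if kind == "sub":
            # substitution: replace with a different base
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "ins":
            # insertion: keep the base and add a random one after it
            out.append(base)
            out.append(rng.choice(BASES))
        elif kind != "del":
            raise ValueError(f"unknown error kind: {kind}")
        # deletion: the base is simply dropped
    return "".join(out)

# Example: substitutions only, total error rate of 10 %
noisy = simulate_errors("ACGTACGTACGTACGT", 0.10, kinds=("sub",),
                        rng=random.Random(1))
```

Restricting `kinds` reproduces the "substitutions only" scenario, while the default enables all three error types. Substitutions preserve sequence length, whereas insertions and deletions shift every downstream position.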

Language:Slovenian
Keywords:third-generation sequencing, sequencing with high error rates, microsatellites, short DNA sequences
Work type:Master's thesis/paper
Organization:BF - Biotechnical Faculty
Year:2021
PID:20.500.12556/RUL-125033
COBISS.SI-ID:55304195
Publication date in RUL:02.03.2021
Views:916
Downloads:172

Secondary language

Language:English
Title:Algorithm for reliable recognition of short DNA tag sequences in presence of high sequencing error rates
Abstract:
Third-generation sequencing technologies, especially nanopore sequencing, enable fast DNA sequencing and long reads. Their downside is a high error rate. In this thesis, we present a bioinformatics pipeline for processing microsatellite sequences obtained using third-generation sequencing. For testing, we used brown bear (Ursus arctos) microsatellite sequences obtained from non-invasive genetic samples and sequenced on the Illumina platform. In these sequences, we simulated substitutions, insertions, and deletions in various combinations and at different total error rates. Aside from the previously used 8 bp DNA tags for sample marking, we also tested longer 12 and 16 bp tags. Our bioinformatics pipeline was effective when dealing with substitutions only. It was ineffective when all three error types were simulated. Nonetheless, we found that the currently used 8 bp tags are not useful at high error rates, especially when dealing with all three error types. We also found issues with the primer search and, consequently, with the identification of the loci marked by the primers. We identified weak points in the pipeline and suggest possible solutions. The presented bioinformatics pipeline should therefore provide a useful basis for further work.

Keywords:third generation sequencing, sequencing with high error rates, microsatellites, short DNA sequences
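
To illustrate why tag identification becomes harder once insertions and deletions are present, the following Python sketch assigns an observed tag to the closest known tag by Levenshtein (edit) distance and rejects distant or ambiguous matches. This is an assumed approach for illustration only, not the method used in the thesis. The names `edit_distance`, `assign_tag`, and `max_distance` are hypothetical, and the sketch assumes the tag region has already been cut out of the read.

```python
def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions, deletions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]

def assign_tag(observed, known_tags, max_distance):
    """Return the unique closest known tag, or None (hypothetical helper).

    None is returned when no tag is within max_distance or when two
    tags tie for the best distance, so the sample cannot be identified.
    """
    if not known_tags:
        return None
    scored = sorted((edit_distance(observed, t), t) for t in known_tags)
    best_dist, best_tag = scored[0]
    if best_dist > max_distance:
        return None
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None
    return best_tag

# Example with made-up 8 bp tags; the observed tag has one substitution
tags = ["ACGTACGT", "TTTTGGGG"]
sample = assign_tag("ACGTACGA", tags, max_distance=2)
```

With a well-separated tag set, longer tags can leave a larger minimum edit distance between any two tags, so more errors can be tolerated before an assignment becomes ambiguous.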
