Details

Knots and $\theta$-curves identification in polymeric chains and native proteins using neural networks
da Silva, Fernando Bruno (Author), Gabrovšek, Boštjan (Author), Korpacz, Marta (Author), Luczkiewicz, Kamil (Author), Niewieczerzal, Szymon (Author), Sikora, Maciej (Author), Sulkowska, Joanna I. (Author)

PDF - Presentation file (3.47 MB)
MD5: 43918B104CED420C3C162B307FB6B591
URL - Source URL: https://pubs.acs.org/doi/10.1021/acs.macromol.3c02479

Abstract
Entanglement in proteins is a fascinating structural motif that is neither easy to detect via traditional methods nor fully understood. Recent advancements in AI-driven models have predicted that millions of proteins could potentially have a nontrivial topology. Herein, we have shown that a long short-term memory (LSTM)-based neural network (NN) architecture can be applied to detect, classify, and predict entanglement not only in closed polymeric chains but also in polymers and protein-like structures with open knots, actual protein configurations, and $\theta$-curve motifs. The analysis revealed that the LSTM model can accurately predict classes (up to the $6_1$ knot) for closed knots and for open polymeric chains resembling real proteins. In the case of open knots formed by protein-like structures, the model displays robust prediction capabilities with an accuracy of 99%. Moreover, the LSTM model with proper features, tested on hundreds of thousands of knotted and unknotted protein structures with different architectures predicted by AlphaFold 2, can distinguish between the trivial and nontrivial topology of the native state of the protein with an accuracy of 93%.

Language:English
Keywords:machine learning, topology, protein databases, entanglements, open knots, closed knots
Work type:Article
Typology:1.01 - Original Scientific Article
Organization:FS - Faculty of Mechanical Engineering
FMF - Faculty of Mathematics and Physics
Publication status:Published
Publication version:Version of Record
Year:2024
Number of pages:pp. 4599-4608
Numbering:Vol. 57, iss. 9
PID:20.500.12556/RUL-166791
UDC:004.85:004.725.4
ISSN on article:0024-9297
DOI:10.1021/acs.macromol.3c02479
COBISS.SI-ID:194735875
Publication date in RUL:24.01.2025
Views:665
Downloads:158

Record is a part of a journal

Title:Macromolecules
Shortened title:Macromolecules
Publisher:American Chemical Society
ISSN:0024-9297
COBISS.SI-ID:25886464

Licences

License:CC BY 4.0, Creative Commons Attribution 4.0 International
Link:http://creativecommons.org/licenses/by/4.0/
Description:This is the standard Creative Commons license that gives others maximum freedom to do what they want with the work as long as they credit the author.

Secondary language

Language:Slovenian
Keywords:strojno učenje, topologija, proteinska baza podatkov, zavozlanost, odprti vozli, sklenjeni vozli

Projects

Funder:ARIS - Slovenian Research and Innovation Agency
Project number:N1-0278-2023
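Name:Biološka koda vozlov - identifikacija vzorcev vozlanja v biomolekulah z uporabo umetne inteligence (Biological code of knots: identification of knotting patterns in biomolecules using artificial intelligence)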

Funder:Other - Other funder or multiple funders
Funding programme:NCN - National Science Centre, Poland
Project number:2021/43/I/NZ1/03341

Funder:Other - Other funder or multiple funders
Funding programme:NCN - National Science Centre, Poland
Project number:2022/47/B/NZ1/03480
