Boolean networks provide an efficient and compact framework for modeling the interactions and dynamics of gene regulatory networks. However, inferring Boolean descriptions of these networks remains a challenging task. Gene expression datasets often contain significantly fewer experiments and time points than the number of genes in the regulatory network. Additionally, missing values and inherent noise in gene expression data further complicate the inference process. The binarization of gene expression data, while necessary for Boolean modeling, can lead to information loss and oversimplification of complex dynamics, resulting in erroneous or inconsistent data. Consequently, inferring Boolean models solely from binarized gene expression data often produces overfitted, complex models with low biological relevance.
To obtain more biologically relevant Boolean descriptions of gene regulatory networks, we incorporate multiple data sources and prior knowledge, particularly regarding the expected topological structure and properties of the inferred networks. In this thesis, we present SAILoR (Structure-Aware Inference of Logic Rules), a novel inference method designed to improve the accuracy and robustness of Boolean network reconstruction.
Our method integrates continuous and binarized time-series gene expression data with prior knowledge in the form of reference networks to infer accurate and biologically relevant models of gene regulatory networks. SAILoR automatically extracts topological properties from these reference networks, which can represent either general structural expectations or specific interactions. By balancing two main objectives, topological similarity to the reference network and consistency with gene expression data, SAILoR navigates the trade-offs inherent in Boolean network inference. Using the multi-objective genetic algorithm NSGA-II, our approach follows the wisdom-of-the-crowds principle: it infers multiple networks simultaneously and selects the best-performing one. Our results demonstrate that SAILoR can infer gene regulatory networks that are both structurally and dynamically accurate, enhancing their biological relevance.
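The trade-off between the two objectives can be illustrated with a minimal Python sketch of Pareto dominance, the relation on which NSGA-II builds its ranking. This is not the SAILoR implementation: the function names, the assumption that both objectives are expressed as quantities to minimize, and the numeric scores are hypothetical and chosen only for illustration. NSGA-II itself additionally uses non-dominated sorting into successive fronts and crowding distance, which are omitted here.

```python
from typing import List, Sequence, Tuple

# Assumed encoding: (topological distance to reference, dynamic error), both minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Sequence[Objectives]) -> List[int]:
    """Return indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores for five candidate networks (illustrative values only).
candidates = [(0.10, 0.40), (0.20, 0.20), (0.40, 0.10), (0.30, 0.30), (0.50, 0.50)]
front = pareto_front(candidates)
```

In this toy example the first three candidates are mutually non-dominated, since each improves one objective at the expense of the other, while the last two are dominated by the second. How a single network is chosen from such a set is not specified in this summary, so the sketch stops at the non-dominated set.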
Through extensive simulations and comparisons with existing methods, we show that SAILoR improves structural correctness relative to dynGENIE3 while preserving dynamic accuracy. We also compare our method with other Boolean inference approaches, including Best-Fit Extension, REVEAL, MIBNI, GABNI, ATEN, and LogBTF; relative to these methods, SAILoR produces models with improved structural accuracy while maintaining dynamic consistency. Furthermore, we apply SAILoR to infer context-specific gene regulatory subnetworks in female Drosophila melanogaster before and after mating, demonstrating its applicability in real-world biological scenarios.