Machine learning is a branch of artificial intelligence that enables the analysis of large datasets, recognition of complex patterns, and prediction of outcomes, placing it at the forefront of modern drug design. Indoleamine 2,3-dioxygenase 1 (IDO1) is an enzyme that breaks down tryptophan into kynurenine and plays a crucial role in immunosuppression, allowing cancer cells to evade the
immune response. IDO1 is therefore an important target in cancer immunotherapy; however, no IDO1 inhibitor has yet received clinical approval. Machine learning is well suited to the design of novel IDO1 inhibitors, as it facilitates efficient analysis of
chemical datasets, prediction of compound biological activity, and optimization of potential inhibitor properties. In this study, we employed innovative machine learning approaches to identify potential IDO1 inhibitors. The research was divided into three main stages. In the first stage, we collected, prepared, and analyzed data from public cheminformatics databases such as ChEMBL and BindingDB, creating a curated dataset containing information on molecular structure and compound activity. We then statistically analyzed the physicochemical properties of active and inactive molecules. In the second stage, we developed machine learning models using molecular
fingerprints to classify compounds as active or inactive. Various combinations of fingerprints, algorithms, and preprocessing techniques were tested to determine the best-performing configuration. In the final stage, the selected model was applied to compounds from the Molport database,
and the properties of the predicted inhibitors were analyzed to identify the most promising molecules. The results showed statistically significant differences in the physicochemical properties of
known active and inactive molecules, with key distinctions observed in molecular mass, logP values, the number of nitrogen atoms, and other parameters. Furthermore, the structural features of the most promising predicted molecules correspond to structural elements found in compounds that have already entered clinical trials.