The result of a bioequivalence study is a decisive milestone in the development of a generic drug. Many variables influence this outcome, which makes assessing bioequivalence risk complex. In the early stages of development, there is usually little data on the pharmaceutical form, critical material attributes, process parameters and in vitro-in vivo correlations (IVIVC), but ample information on the active pharmaceutical ingredient (API). The literature offers no methodology for a data-driven yet simple bioequivalence risk assessment in the early phases of generic drug development. Therefore, the aim of this dissertation was to develop a methodology that predicts the bioequivalence outcome and to use it to assess bioequivalence risk in the early stages of drug development.
We started by identifying properties of the API that significantly influence the bioequivalence outcome and are available early in development. In a dataset of 198 bioequivalence studies (conducted under Sandoz sponsorship), we confirmed the importance of the Biopharmaceutics Classification System (BCS) and a lower bioequivalence risk for highly soluble compounds. In a subset of 128 bioequivalence studies with poorly soluble APIs, the significant parameters were absolute bioavailability (30% of studies failed when bioavailability was below 40%), permeability, time to maximum plasma concentration, lipophilicity, first-pass metabolism, and acid-base properties together with the associated solubility at pH values relevant to the gastrointestinal tract.
We used the subset of 128 bioequivalence studies, together with the optimal set of properties of poorly soluble compounds, to develop and compare four methods for predicting bioequivalence outcomes: logistic regression, a naïve Bayes classifier, and two machine learning methods, random forest and XGBoost. The final random forest model predicted the bioequivalence outcomes in the test set with 84% accuracy using ten API features. This best-performing model was then adapted to predict bioequivalence risk, categorizing drugs into low-, medium- and high-risk groups based on API properties. The resulting methodology provides an objective, quantitative assessment of bioequivalence risk.
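The last step described above, turning a model's predicted probability of a failed study into a low, medium or high risk group, can be illustrated with a minimal Python sketch. The function names and the two cut-off values are hypothetical and chosen only for illustration; the abstract does not state the thresholds actually used, and the probability is assumed to come from a fitted classifier such as the random forest.

```python
from typing import List

# Hypothetical cut-offs for illustration only; the source does not
# report the thresholds used to separate the risk groups.
LOW_MAX = 0.2   # below this predicted failure probability: low risk
HIGH_MIN = 0.5  # at or above this predicted failure probability: high risk


def risk_category(p_fail: float,
                  low_max: float = LOW_MAX,
                  high_min: float = HIGH_MIN) -> str:
    """Map a predicted probability of a failed bioequivalence study
    to a risk group ("low", "medium" or "high")."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be a probability in [0, 1]")
    if p_fail < low_max:
        return "low"
    if p_fail < high_min:
        return "medium"
    return "high"


def categorize(probabilities: List[float]) -> List[str]:
    """Apply risk_category to a list of predicted failure probabilities."""
    return [risk_category(p) for p in probabilities]
```

In practice the cut-offs would be calibrated against observed failure rates in the study dataset rather than fixed in advance.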