Decisions of complex machine learning algorithms such as random forests and neural networks are difficult to explain. This problem can be addressed with perturbation-based explanation algorithms such as SHAP, which assigns credit for a prediction to individual attribute values.
Our goal was to check whether the output of SHAP matches background knowledge. We trained the XGBoost model on several data sets in which the attributes are proteins and explained the model with the SHAP algorithm. We then checked whether there are known biological interactions between the proteins that SHAP marks as important. This approach could turn SHAP into an interaction discovery algorithm. The number of interactions found differs depending on the chosen data set and knowledge base. Our results hint at the potential usefulness of explanation algorithms for finding interactions.
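A minimal sketch of the described workflow follows, using synthetic data in place of the real protein data sets; the number of proteins, the model hyperparameters, the top-20 cutoff, and the knowledge bases named in the comments are illustrative assumptions, not the exact procedure used in the study.

```python
from itertools import combinations

import numpy as np
import pandas as pd
import xgboost
import shap

# Synthetic stand-in for a protein data set: rows are samples, columns are proteins.
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(200, 30)),
    columns=[f"protein_{i}" for i in range(30)],
)
y = (X["protein_0"] * X["protein_1"] > 0).astype(int)  # toy binary label

# Train the XGBoost classifier on the protein-level attributes.
model = xgboost.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)

# Explain the model with SHAP; TreeExplainer handles tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_proteins)

# Rank proteins by mean absolute SHAP value and keep the top candidates.
importance = np.abs(shap_values).mean(axis=0)
top_proteins = X.columns[np.argsort(importance)[::-1][:20]]

# Candidate pairs among the top-ranked proteins; each pair would then be
# looked up in an interaction knowledge base (e.g. STRING or BioGRID).
candidate_pairs = list(combinations(top_proteins, 2))
```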