With the wide adoption of complex black-box models, instance-based post hoc explanation tools such as SHAP and LIME have become popular.
These tools produce explanations as contributions of individual features to a given prediction. Such feature-level explanations are not necessarily understandable to human experts, as their connection to background knowledge is often unclear. We propose ReEx (Reasoning from Explanations), a method applicable to explanations generated by instance-level explainers. Using background knowledge in the form of ontologies, ReEx generalizes instance explanations via the least general generalization principle. The resulting symbolic descriptions are specific to individual classes and offer generalizations based on the explainer's output. The derived semantic explanations are potentially more informative, as they describe the key attributes in the context of background knowledge. ReEx is available as a Python library and is compatible with explanation approaches such as SHAP and LIME. To evaluate ReEx's performance, we define measures of the generalization and overlap of explanations and conduct experiments on three textual datasets.
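The following is a minimal sketch of the least general generalization idea described above, not the ReEx API: the toy ontology and helper names are hypothetical. Terms that an explainer marks as important for a class are mapped to ontology concepts and then generalized to their deepest shared ancestor.

```python
# Toy is-a ontology: child concept -> parent concept (hypothetical example).
PARENT = {
    "poodle": "dog",
    "beagle": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "animal": None,
}

def ancestors(concept):
    """Return the path from a concept up to the ontology root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path

def least_general_generalization(concepts):
    """Deepest ancestor shared by all concepts (their LGG in the is-a tree)."""
    common = set(ancestors(concepts[0]))
    for c in concepts[1:]:
        common &= set(ancestors(c))
    # Among the shared ancestors, pick the one farthest from the root,
    # i.e., the least general concept covering all inputs.
    return max(common, key=lambda c: len(ancestors(c)))

# Terms an explainer such as SHAP or LIME marked as important for one class:
important_terms = ["poodle", "beagle", "cat"]
print(least_general_generalization(important_terms))  # -> "mammal"
```

In this toy example, the feature-level terms "poodle", "beagle", and "cat" are replaced by the single semantic concept "mammal", illustrating how generalization over an ontology can yield a more informative class-level description.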