Rationalising Deep Learning Model Decisions
There is a clear trade-off between the performance of a machine learning model and its ability to produce explainable and interpretable predictions. The improved predictive accuracy of deep learning models has often been achieved through increased model complexity, inherently reducing their ability to explain their inner workings and mechanisms. As a result, their predictions are hard to interpret. Systems whose decisions cannot be well-interpreted are difficult to be trusted, especially in sectors such as healthcare or autonomous vehicles.
There is a need to rationalise a deep neural network model’s decision.
Solution #1: FLEX( Faithful Linguistic Explanations) is a framework developed to associate the features that are responsible for the decision, with words, and design a new decision-relevance metric that measures the faithfulness of a linguistic explanation to the model’s reasoning. FLEX generates post-hoc linguistic explanations to identify visual concepts that have contributed to the model decision and maps them to word phrases such that the generated explanations are intuitive, descriptive, and faithful. (FLEX: Faithful Linguistic Explanations for Neural Net Based Model Decisions )
Solution #2: A comprehensible CNN(CCNN) is a fully interpretable CNN that learns human-understandable features and explains its decision as a linear combination of features. An additional concept layer is introduced into a standard CNN-based architecture to guide the learning and a new training objective function is designed that considers concept uniqueness and mapping consistency together with classification accuracy. Experiment results have shown that CCNN can learn concepts that are consistent with human perception without compromising accuracy. (Comprehensible Convolutional Neural Networks via Guided Concept Learning )
A major insight for the interpretability of deep learning models in medical image classification applications is to utilise concept-based explanations to automatically augment the dataset with new images that can cover the under-represented regions to improve the model performance. A framework that uses the explanations generated by both interpretable classifiers and post-hoc explanations from black-box classifiers has been designed to identify high-quality samples that, when added to the training dataset, can improve the performance of the models compared to state-of-the-art augmentation strategies.
Interpretable and accurate deep learning models are critical in fields such as healthcare, autonomous vehicles, finance, regulatory compliance and manufacturing etc.