Explainable AI
When a sophisticated machine learning algorithm is deployed to help a bank determine a customer's credit-worthiness, when a drug is recommended to stabilize the condition of a patient in the ICU, or when an automated car avoids a pedestrian crossing in front of the vehicle, it is essential that we can understand how the model arrived at its decision.
Why?
If a customer truly is a credit risk, the bank needs to be able to explain clearly why it will not be granting them a loan. In the case of dispensing a life-saving drug, the treating doctors need to be aware of all the factors that make this medicine the right choice. Understanding why the car was able to stop in time is important information that helps engineers improve the software running self-driving cars.
In addition to the business insights that model explainability brings, it is also important for detecting and preventing bias that can creep into model decisions, and for improving the transparency of influential applications that have the potential to change people's lives. Explainability is essential for building trust in the various processes in which these modeling solutions are used.
It is often said that machine learning is a bit of a black box, and this is all too often true, even for the data scientists who build the models. For example, the developer of a multi-layered neural network designed to detect behavioral anomalies in a video stream does not necessarily understand what each layer is doing. As long as the detection is 'working', not too many questions are asked.
Not understanding how a model works creates several problems. It is harder to use the model responsibly and to avoid perpetuating societal inequities. It is more difficult to identify opportunities for improvement or innovation. It is certainly harder to understand the reasons for a particular prediction, and debugging applications that rely on the model also becomes more difficult.
What are some ways AI and ML models can be explained?
For some model types, such as decision trees and tree ensembles, the importance of each data feature can be derived using feature importance techniques. This allows users of a model to determine how much each feature used to train it contributed to its predictions.
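As a minimal sketch of what this looks like in practice, the snippet below assumes a scikit-learn workflow and uses a public illustrative dataset in place of real customer data; tree-based models expose their feature importances directly:

```python
# Minimal sketch: feature importance from a tree-based model (scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

# Public illustrative dataset standing in for real business data.
data = load_breast_cancer()
X, y = data.data, data.target

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# feature_importances_ reports each feature's overall contribution to the model.
top_features = sorted(
    zip(data.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)[:5]
for name, importance in top_features:
    print(f"{name}: {importance:.3f}")
```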
Another commonly used method is Shapley values, often referred to as SHAP values. This method assigns a value to each feature for every individual prediction, which, similar to feature importance, highlights how much each feature contributes to that prediction. It can, however, be used with a much wider variety of model types than just decision trees.
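As a sketch only, assuming the open-source shap package and reusing the model and data fitted in the previous snippet, per-prediction contributions can be computed like this:

```python
# Minimal sketch: per-prediction SHAP values (assumes the open-source `shap`
# package; reuses `model` and `X` from the feature-importance snippet above).
import numpy as np
import shap

explainer = shap.TreeExplainer(model)        # fast, exact path for tree models
shap_values = explainer.shap_values(X[:10])  # explain the first ten predictions

# Every prediction receives one signed contribution per feature, so the result
# has ten rows and one column per feature (per class for classifiers).
print(np.shape(shap_values))
```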
Local Interpretable Model-Agnostic Explanations (LIME) is another explainability method. It builds a simple, locally interpretable model that approximates the behavior of the original model around a single prediction.
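A minimal sketch, assuming the open-source lime package and the same model and data as above, might explain one row like this:

```python
# Minimal sketch: a local explanation for one prediction with LIME (assumes
# the open-source `lime` package; reuses `model`, `X`, and `data` from above).
from lime.lime_tabular import LimeTabularExplainer

explainer = LimeTabularExplainer(
    X,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

# Fit a simple local model around one row and report the handful of features
# that most influenced that single prediction.
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(explanation.as_list())
```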
Creating model surrogates is another approach: a simpler, more interpretable model is trained to approximate the behavior of the original model, and inspecting the surrogate then explains the original model in a more understandable way.
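As a rough sketch of a global surrogate, again reusing the model and data above, a shallow decision tree can be trained on the black-box model's own predictions:

```python
# Minimal sketch: a global surrogate. A shallow decision tree is trained to
# mimic the predictions of the black-box model (reuses `model` and `X` above).
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

black_box_predictions = model.predict(X)

surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box_predictions)  # learn from the model, not the labels

# Fidelity: how closely the simple surrogate reproduces the original model.
fidelity = accuracy_score(black_box_predictions, surrogate.predict(X))
print(f"Surrogate fidelity: {fidelity:.2%}")
```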
Rule extraction is another useful method for explaining how models operate. It involves deriving decision rules from the structure and learned parameters of the trained model. These rules, which are often complex, can then be simplified by removing irrelevant features or combining similar rules.
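One simple flavor of this, sketched below under the same assumptions as the previous snippets, is to read the decision paths out of an interpretable tree (here, the surrogate fitted above), where each root-to-leaf path is effectively one if/then rule:

```python
# Minimal sketch: extracting readable decision rules from the surrogate tree
# fitted above. Each root-to-leaf path is one if/then rule that can later be
# simplified or merged with similar rules.
from sklearn.tree import export_text

rules = export_text(surrogate, feature_names=list(data.feature_names))
print(rules)
```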
These are just some of the methods used by data scientists to make their work more accessible to both other data scientists and the end users who rely on their models. Which method is selected will be up to the particular modeler and is likely dependent on their toolset knowledge and the model type.
If you would like to improve your understanding of how your models operate, or you have another machine learning challenge, please feel free to reach out to continue the conversation.