
AI Explained Video Series: The AI Concepts You Need to Understand

As businesses recognize the need for enhanced digital capabilities and build out more robust and advanced Data Science teams, AI is one of the areas being most heavily invested in. It is viewed within many organizations as a potential panacea: How will I forecast demand, make business recommendations, or combat customer churn? AI. How will I detect fraud, make a lending decision, or optimize costs? AI. But before putting AI into production, there are many concepts that need to be understood to ensure your AI is transparent, accountable, ethical, and reliable.

In our AI Explained video series, we’re diving into some of these concepts in bite-sized videos to help you understand the meaning behind the hype:

  1. What are Shapley Values?
  2. What are Integrated Gradients?
  3. 5 Types of Explanation Methods
  4. Feature Importance
  5. Explainable ML Monitoring

What are some of the other AI concepts you want to learn more about? If you have an idea for a topic for our next video, let us know. If you’d like to know when we add new topics to this series, subscribe here.

1. What are Shapley Values?

The Shapley value is an attribution method from Cooperative Game Theory dating back to 1951. The basic question it answers is how to fairly distribute surplus value across a coalition of ‘players’ who all contributed to the overall collective gain, but at varying levels. The Shapley value was developed by Lloyd Shapley, who later won the Nobel Prize in Economics, and has been a popular tool in economics for decades. In this video, we discuss Shapley values as an attribution method:

1. What do Shapley values have to do with explaining ML models?
In data science, attribution allows you to attribute a model’s prediction on an input to features of the input. A key question in explaining predictions made by ML models is: “Why did the model make this prediction?” One way to answer this question is to quantify the importance of each input, also known as a feature, in the prediction. Shapley values are routinely applied to ML models in the lending industry, for example, to create adverse action notices, i.e., explanations for why a loan request was denied.

2. The marginal contribution challenge
To compute Shapley values we need to measure the marginal contribution of a player, i.e., a feature. This means we need to know the model’s prediction when a certain feature is absent. But how do we make a feature absent? A common approach is to simulate absence by replacing the feature’s value with draws from a reference (background) distribution and averaging the model’s predictions. The choice of this distribution is an important design choice that has implications for the attributions you compute. Our recent preprint explores this choice in detail and provides guidance on how to pick the distribution.
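As a minimal, hypothetical sketch of this replacement trick (the model, background data, and single numeric input x below are placeholders), here is one way to measure the prediction with and without a feature, holding all other features at their observed values:

```python
import numpy as np

def prediction_without_feature(model, x, feature_idx, background):
    """Average prediction on x when feature_idx is 'absent', i.e., replaced by background values."""
    counterfactuals = np.tile(x, (len(background), 1))            # one copy of x per background row
    counterfactuals[:, feature_idx] = background[:, feature_idx]  # overwrite the feature of interest
    return model.predict(counterfactuals).mean()                  # expectation over the background data

def contribution_with_all_others_present(model, x, feature_idx, background):
    """Prediction change when the feature is present vs. absent (all other features held at x)."""
    return model.predict(x.reshape(1, -1))[0] - prediction_without_feature(model, x, feature_idx, background)
```

This only measures a feature’s contribution with every other feature present; the full Shapley computation also varies which of the other features are present or absent.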

3. The computation cost challenge
There is a computation cost to going through all possible orderings: with n features there are n! (n factorial) orderings. With some algebra we can bring it down to 2^n model invocations, but this is still computationally expensive. To combat this, most approaches use some form of sampling to make the computation tractable. Note that sampling introduces uncertainty, so it is important to quantify that uncertainty via confidence intervals over the attributions.
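Building on the sketch above, here is a rough Monte Carlo estimator (again with placeholder names) that samples random feature orderings, accumulates each feature’s marginal contribution as it is added, and reports a confidence interval from the sampling spread:

```python
import numpy as np

def sampled_shapley(model, x, background, n_samples=200, seed=0):
    """Monte Carlo Shapley estimate for a single input x (1-D array).

    'Absent' features take their values from a randomly chosen background row."""
    rng = np.random.default_rng(seed)
    d = len(x)
    contribs = np.zeros((n_samples, d))
    for s in range(n_samples):
        order = rng.permutation(d)                                   # a random ordering of the features
        current = background[rng.integers(len(background))].copy()   # start with all features "absent"
        prev = model.predict(current.reshape(1, -1))[0]
        for i in order:                                              # add features one at a time
            current[i] = x[i]
            new = model.predict(current.reshape(1, -1))[0]
            contribs[s, i] = new - prev                              # marginal contribution of feature i
            prev = new
    mean = contribs.mean(axis=0)
    ci = 1.96 * contribs.std(axis=0, ddof=1) / np.sqrt(n_samples)    # ~95% confidence half-width
    return mean, ci
```

Libraries such as SHAP package more efficient estimators of the same quantity, but the structure is the same: probe the model on counterfactual inputs and average marginal contributions.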

Shapley values are a helpful attribution method for ML models, and notably, computing them requires only input-output access to the model: all we do is probe the model on a set of counterfactual inputs. In this sense, it is a black-box explanation method. The method places no constraints on the type of model used, and the model function need not be smooth, differentiable, or even continuous.

2. What are Integrated Gradients?

This video dives into another attribution method: Integrated Gradients. The Integrated Gradients method can be used to explain predictions made by deep neural networks (or any differentiable model for that matter). This method is centered around how to explain the relationship between a model's predictions and that model’s features. This method can be implemented in a few lines of code, and is much faster than Shapley values. The method serves as a popular tool for explaining image classification models in healthcare. 

The Integrated Gradients method is simpler than Shapley values, as it is based on examining gradients of the model’s output with respect to its input. Examining input gradients is one of the first attribution approaches proposed for differentiable models, dating back to at least 2010, and Integrated Gradients builds on it.

This video covers some key points about Integrated Gradients: 

1. How the method applies to deep neural networks, and why we often find bizarre-looking attributions. For instance, for an image model, we find that pixels that seem irrelevant get highlighted. Now why does that happen?

2. How more relevant attributions can be obtained by examining gradients across multiple counterfactual inputs that interpolate between the input at hand and a certain baseline. This motivates the design of Integrated Gradients.

3. An overview of baselines: the baseline is meant to be an information-less input, often an all-zero input. For an image, it could be the all-black image.

4. The justification behind the Integrated Gradients method.
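
To make the recipe concrete, here is a minimal sketch of the method in PyTorch (the model, input x, baseline, and target_class below are placeholder assumptions; a simple Riemann sum approximates the path integral):

```python
import torch

def integrated_gradients(model, x, baseline, target_class, steps=50):
    """Approximate Integrated Gradients for one input x (e.g., an image tensor)."""
    # Interpolation coefficients 0 -> 1, shaped so they broadcast against x
    alphas = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * x.dim()))
    # Counterfactual inputs along the straight line from the baseline to x
    path = (baseline + alphas * (x - baseline)).detach().requires_grad_(True)
    scores = model(path)[:, target_class]                # model output along the path
    grads = torch.autograd.grad(scores.sum(), path)[0]   # gradient at each interpolated input
    avg_grads = grads.mean(dim=0)                        # Riemann approximation of the path integral
    return (x - baseline) * avg_grads                    # attributions, same shape as x
```

For an image classifier, x might be a normalized image tensor and baseline an all-black image such as torch.zeros_like(x).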

An important caveat for Integrated Gradients: unlike Shapley values, which place no restriction on the model function and only require black-box access, vanilla Integrated Gradients requires differentiability and access to gradients. Consequently, the method cannot be applied directly to non-differentiable tree ensemble models (e.g., random forests, boosted trees). Interested readers can check out recent work on generalizing Integrated Gradients to non-differentiable models.

3. 5 Types of Explanation Methods

Increasingly, we are seeing high-stakes industries like insurance, healthcare, and criminal justice adopt complex, opaque ML models. As a result, the need for transparency in these models is also increasing. Depending on the type of model and the specific use case, certain explanation methods can be more pertinent than others. This video covers an overview of five types of explanation methods:

1. Surrogate Model Based Explanations: This method essentially uses one model to explain another. Suppose you have a complex black-box model; to interpret it, you build a new, more interpretable model, the ‘surrogate model,’ that mimics the original by predicting the original model’s predictions rather than the true labels (a minimal sketch follows this list).

2. Attribution Based Explanations: The idea behind this method is to explain a prediction by attributing it to features of the input. This is also known as the feature-importance method or the salience method because it highlights the salient reasons behind a particular prediction. There are different subdivisions within attribution-based methods, some of which are discussed above (Shapley values and Integrated Gradients).

3. Contrastive Explanations: This method is centered around highlighting the features that ought to be present for a certain prediction to occur, and those that ought to be absent. For example, for a loan application to be approved, the applicant’s income ought to be above a certain level and the number of delinquencies ought to be below a certain level.

4. Counterfactual / Recourse Based Explanations: This method is getting more popular, and has a key difference from the first three discussed. All three previous methods are centered around explaining what factors went into a particular prediction. Counterfactual / recourse based explanations are asking a different question: how should the input change to achieve a different prediction? This is often used for instances that get unfavorable predictions where you would like the outcome to change. For example, for the loan that was denied, what is the path to recourse towards a favorable outcome?

5. Example Based Explanations: This method explains a prediction on an input by highlighting similar examples from the dataset. When purchasing a house, for instance, the best way to justify its price is to compare it to the prices of houses with similar attributes (in the same neighborhood, similar square footage, etc.).
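
As promised above, here is a minimal surrogate-model sketch using scikit-learn; black_box, X, and feature_names are placeholders, and a shallow decision tree is trained on the black-box model’s own predictions so its rules can serve as an approximate explanation:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

def fit_surrogate(black_box, X, max_depth=3):
    """Train an interpretable tree to mimic a black-box classifier's predictions."""
    y_hat = black_box.predict(X)                         # the original model's predictions, not true labels
    surrogate = DecisionTreeClassifier(max_depth=max_depth).fit(X, y_hat)
    fidelity = surrogate.score(X, y_hat)                 # how faithfully the tree mimics the black box
    return surrogate, fidelity

# surrogate, fidelity = fit_surrogate(black_box, X_train)
# print(export_text(surrogate, feature_names=list(feature_names)))  # human-readable rules
```

The fidelity score is worth reporting alongside the explanation: a surrogate that mimics the original model poorly is not a trustworthy explanation of it.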

4. Feature Importance

In data science, we spend much of our time preparing and cleaning data. What little time is left is often spent feeding that data into existing models that we don’t understand deeply, hoping things come out OK on the other side. This video discusses feature importance, a practice that can be used alongside analysis of existing models to improve both their performance and our understanding of them.

The basic idea is that for each feature already in a model, or planned for it, we prevent the model from using that feature and compute how much its absence decreases model accuracy. By comparing this measure across all features, we can determine their relative importance within the model. This practice does not replace good analysis, but done properly, it can accelerate and enrich exploratory data analysis and feature selection.
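One direct, if slow, way to implement this idea is drop-column retraining: refit the model without each feature and compare held-out scores. A rough sketch, assuming scikit-learn-style estimators and pandas DataFrames (all names are placeholders):

```python
from sklearn.base import clone

def drop_column_importance(estimator, X_train, y_train, X_val, y_val):
    """Score drop when the model is retrained without each feature."""
    baseline = clone(estimator).fit(X_train, y_train).score(X_val, y_val)
    importances = {}
    for col in X_train.columns:
        X_tr, X_v = X_train.drop(columns=[col]), X_val.drop(columns=[col])
        score = clone(estimator).fit(X_tr, y_train).score(X_v, y_val)
        importances[col] = baseline - score          # larger drop = more important feature
    return importances
```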

This video discusses different types of feature importance, how we can use them, and the pros and cons of each:

1. Core techniques: These are the techniques most commonly used for determining feature importance.

- Permutation Feature Importance: This technique prevents the model from using a feature by scrambling (randomly permuting) that feature’s values, one feature at a time (aka random ablation), and measuring the resulting drop in performance. It is fast and works with any model (see the scikit-learn sketch after this list).

- Leave-One-Out Retraining: This is a slower option than permutation. Here, rather than permuting your inputs, you keep the data the same and retrain your model once per feature, each time leaving that feature out. This can be quite slow, especially if your model has many features, but it is directly tied to the goal of optimizing model performance and can help identify when it’s OK to drop a correlated feature.

- Built-In Model-Specific Feature Importance Measures: This is the most popular method, as tree ensemble methods like Random Forests and Gradient Boosted Trees offer built-in feature importance measurements. It incurs no additional computing cost, so it should be used whenever available, with two caveats: 1) it is biased, because it measures how the model’s in-sample fit improves with each feature rather than its out-of-sample performance, and 2) it artificially inflates the importance of numerical features and high-cardinality categorical features. Due to these drawbacks, we encourage you to invest a bit more time in adding permutation tests or additional retrainings when employing this technique.

2. Variations on core techniques: These techniques are less common, but we find them to be important in the context of understanding potential limitations or debugging a model (while core techniques tend to focus on increasing model performance).

- Slicing your dataset: This involves taking a specific, meaningful subset (slice) of your data and computing the performance metric on that slice, with the overall dataset as a reference. By comparing feature importance on the slice to feature importance on the full dataset, you can better understand how things differ for certain subsets of your data. This gives you a more nuanced understanding of your model in the context of its application.

- Prediction sensitivity after permutations: This technique looks at prediction sensitivity rather than performance sensitivity. It helps you understand how sensitive the model’s outputs are to a specific input, even when changes in that input do not move the performance score. A feature may leave the performance metric untouched while still substantially changing the model’s behavior, and that is worth knowing.
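
As an illustration of the permutation technique referenced above, scikit-learn ships a ready-made implementation; the fitted model, held-out data, and feature_names below are placeholders:

```python
from sklearn.inspection import permutation_importance

# 'model' is any fitted estimator; X_val / y_val are held-out data (placeholders).
result = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
for name, mean, std in sorted(
        zip(feature_names, result.importances_mean, result.importances_std),
        key=lambda t: -t[1]):
    print(f"{name}: {mean:.4f} +/- {std:.4f}")   # score drop when the feature is scrambled
```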

5. Explainable ML Monitoring

It’s no secret that deployed AI systems are error-prone, and as AI is integrated into more and more businesses, pain points are becoming more predictable. This video introduces Explainable ML Monitoring, which extends traditional monitoring to provide deep model insights with actionable steps. With monitoring, users can understand the problem drivers, root cause issues, and analyze the model to prevent a repeat, saving considerable time and increasing trust in AI in production. The video covers an overview of some of the risks of AI, the need for explainable monitoring, and what exactly we mean when we talk about it:

1. Why do we need explainable monitoring?
A lot can go wrong with deployed AI. The data you see in production is very rarely the same as the dataset that a model was trained on, resulting in data drift, model decay, bias built into models, or data pipeline issues. The opaque nature of AI models creates confusion and doubt - if you know your model is providing low-quality predictions but don’t know which inputs are causing the issues, it can be close to impossible to fix. Poor predictions can cause doubts at every level of your business - business owners, customers, customer support, IT and operations, developers, and internal and external regulators.
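As one small, hypothetical example of what a drift check can look like in code, a two-sample test can compare a feature’s training and production distributions (the arrays, column names, and threshold here are placeholders; production monitoring systems use richer metrics than this):

```python
from scipy.stats import ks_2samp

def has_drifted(train_values, prod_values, alpha=0.01):
    """Flag a feature whose production distribution differs from its training distribution."""
    result = ks_2samp(train_values, prod_values)   # Kolmogorov-Smirnov two-sample test
    return result.pvalue < alpha, result.statistic

# drifted, score = has_drifted(X_train["income"], X_prod["income"])  # hypothetical columns
```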

2. What is explainable AI?
Explainable AI refers to the process by which the outputs (decisions) of an AI model are explained in terms of its inputs (data). Data goes into the model and a prediction comes out; explainable AI adds a feedback loop to this process, enabling you to explain why the model behaved the way it did for a given input. Explainable AI helps provide clear, transparent decisions and builds trust in the outcomes. When AI with explainability is in production, you have the ability to monitor data as it is fed into the model, helping to ensure fairness and high performance. Actionable insights allow you to drive improvements in your models.

3. What is the state of monitoring?
With the advent of AI, a new monitoring paradigm has surfaced. In the past, we had business metrics monitoring, where business users would monitor business metrics. Then engineering and DevOps monitoring provided the ability to monitor how well your servers and overall IT infrastructure were behaving. With ML, a new kind of monitoring is required: you need to be able to track ML model health and performance with ML-specific metrics that existing capabilities do not support.

4. What is an explainable monitoring solution?
To successfully understand your AI in production, you need a solution that can monitor and drill down into key areas, allowing you to detect and address performance degradation, inadvertent bias, data quality issues, and otherwise undetected problems, and to surface alternative indicators of performance, bringing transparency to black-box models.

What are some of the other AI concepts you want to learn more about? If you have an idea for a topic for our next video, let us know. And if you’d like to know when we add new topics to this series, subscribe here.