Building Trust With AI in the Financial Services Industry

Published

May 11, 2021

Last Edited

July 15, 2025

Henry Lim

Former

Fiddler AI

We’re in the midst of a revolution where every company, big or small, is trying to incorporate AI decision-making into their product and business workflows. However, operationalizing AI in a responsible and trustworthy manner is one of the hardest challenges out there, especially for banks and financial institutions. Krishna Gade, CEO of Fiddler, recently discussed these challenges and how Fiddler is set up to solve them, as a guest on the FinRegLab podcast with FinRegLab CEO Melissa Koide. You can listen to the podcast recording here and read a condensed version of the discussion below.

Introducing FinRegLab and Fiddler

FinRegLab and a team of researchers from the Stanford Graduate School of Business are collaborating on an evaluation of machine learning in credit underwriting. This research project is meant to address questions about the transparency and fairness of machine learning tools in the financial services industry. As part of their research, FinRegLab has engaged private sector firms that have built machine learning explainability tools and techniques.

At Fiddler, we’ve been excited to work with FinRegLab on this project. Fiddler is an explainable AI platform that helps companies build trustworthy AI. Our mission is to enable every enterprise in the world to build trust with AI and incorporate it into their business workflows in a safe and responsible manner.

The challenges of operationalizing AI in the financial industry

There are a lot of benefits to using AI in production to decrease manual work and unlock ROI. But especially for use cases that impact people’s livelihoods, the risks are high, both for the business’s reputation and for society at large. When a bank wants to adopt AI, for example for credit underwriting or fraud detection, they encounter four major problems:

1. Lack of transparency into why the model made a decision

Many banks have model risk management teams whose job is to validate each model and make sure its decisions are explainable to business stakeholders and regulators. Many years ago, when these teams were created, the traditional statistical models could be manually tested and understood by a human evaluator. That’s no longer true.

When a modern deep learning model takes a set of inputs and generates an output, the underlying structure of how it arrives at that prediction is so complex that humans just can’t understand it. It’s a black box. The data science practitioner, the business stakeholder, and the regulator will all be in the dark as to why the model disapproved a loan or marked a transaction as fraudulent.

2. Lack of visibility into how the models are performing in production

Monitoring models for changes in performance is a very important issue for model risk management teams as well. Unlike traditional statistical models, AI models can suffer from data drift in production. What this means is that because models are trained on historical data, when the live data changes, the model may not continue to work as expected. We’ve seen this happen dramatically because of COVID-19, where, for example, there has been a huge change in the distribution of loan applicants.

3. Potential bias that the model could be generating for end users

Financial institutions that want to operationalize machine learning must have a gameplan for dealing with bias in their systems. No one wants to have an incident like the Apple Card, which faced major allegations of gender bias shortly after launch. But how do you actually validate a model for bias? This is a very hard problem, and no universal metrics for quantifying bias currently exist.
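To make the validation question concrete, here is a minimal sketch of one commonly used fairness check: comparing approval rates across groups. The function names and the sample decisions are hypothetical, and this is only one of many possible metrics, not a universal test and not a description of how any particular product measures bias.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the approval rate per group from (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest group approval rate to the highest (1.0 = parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved in any group, so rates are equal
    return min(rates.values()) / highest


# Hypothetical loan decisions: group A has 8 of 10 approved, group B has 5 of 10.
sample = ([("A", True)] * 8 + [("A", False)] * 2
          + [("B", True)] * 5 + [("B", False)] * 5)

rates = approval_rates(sample)
ratio = disparate_impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 flags a gap worth investigating, but it does not by itself establish that a model is biased; different fairness metrics can disagree on the same data, which is part of why this problem is hard.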

4. Potential non-compliance of models

The financial services industry is under intense regulatory pressure. Even if institutions could see millions of dollars of ROI for launching new, complex AI models, these ideas often remain stuck in the lab because they can’t get past compliance teams. This happens, rightfully, for all of the reasons above: the model can’t be explained, properly validated, safeguarded from bias, or monitored.

We often wish that tech companies would implement more of the rigor that banks have for validating their models. On the other hand, we hope that banks can adopt more of the tools that tech companies use to help them explain and monitor AI so that they can successfully launch more models into production.

How Fiddler works

The inspiration for Fiddler came from Krishna’s work at Facebook, where he led a team to develop infrastructure for explaining Newsfeed rankings and predictions in a human-readable manner. Krishna started Fiddler to create a platform that any company could use to productionize AI in a trustworthy, responsible way.

Fiddler is working with two of the largest banks in the US and helping them implement a centralized explainability and monitoring platform for their compliance programs. This means model risk management teams can assess risks before launch and have continuous visibility into model performance in production. Furthermore, Fiddler is a tool for the entire organization to use, providing a shared, transparent view for everyone from data scientists to business and compliance stakeholders.

Fiddler explains models

Fiddler is a pluggable, general purpose platform that makes models transparent. This is essential for building trust with AI, and lets teams understand where bias might exist and how models can be improved.

Teams can import a variety of sophisticated machine learning models into Fiddler, whether they were built in-house or adopted from a vendor. If you wanted to understand why your loan model decided not to approve an applicant, Fiddler could provide an explanation using accessible language: maybe the loan amount was too high, or the applicant’s FICO score was too low.

Depending upon the model that our customer is trying to explain, we offer multiple techniques they can choose from. Fiddler's explanation algorithms rely on a concept from game theory called Shapley values, invented by the Nobel Prize-winning economist Lloyd Shapley. In essence, Shapley values probe the model with “what if” questions: this person was approved for a loan with a salary of $100K, so what if their salary were $80K? Would they still be approved? When there are huge numbers of input possibilities to consider, like in text or image processing, we use integrated gradients, a gradient-based attribution method that avoids enumerating those possibilities.
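The “what if” idea can be illustrated with a small, exact Shapley computation. This is a sketch under stated assumptions, not Fiddler's implementation: the scoring function `approve_score`, its weights, the applicant, and the baseline applicant are all hypothetical, and a "missing" feature is simulated by substituting the baseline's value.

```python
from itertools import combinations
from math import factorial


def approve_score(applicant):
    """Hypothetical loan-scoring model, used only for illustration."""
    score = 0.4 * (applicant["salary"] / 100_000)
    score += 0.4 * (applicant["fico"] / 850)
    score -= 0.2 * (applicant["loan_amount"] / 50_000)
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values: each feature's average marginal contribution
    over all subsets of the other features. Features outside a subset take
    the baseline's value. Cost grows exponentially with the feature count."""
    features = list(instance)
    n = len(features)
    values = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: instance[f] if f in subset else baseline[f]
                           for f in features}
                with_feature = dict(without)
                with_feature[feature] = instance[feature]
                total += weight * (model(with_feature) - model(without))
        values[feature] = total
    return values


# Hypothetical applicant compared against a hypothetical reference applicant.
applicant = {"salary": 80_000, "fico": 680, "loan_amount": 40_000}
reference = {"salary": 100_000, "fico": 720, "loan_amount": 25_000}

phi = shapley_values(approve_score, applicant, reference)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.4f}")
```

The contributions sum to the difference between the applicant's score and the reference score, which is what lets an explanation say how much each input pushed the decision up or down. Because exact enumeration doubles in cost with every added feature, practical tools rely on approximations for anything beyond a handful of inputs.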

Fiddler monitors models in production

Fiddler continuously compares your current model performance with how it performed on the training set, so that you know if there are major changes happening in production. Our users can configure alerts that fire when the drift goes beyond a certain threshold (like 10% or 20%). And they can clearly pinpoint the data that changed, for example, if there was a shift in applicants’ debt-to-income ratio between training and production. This helps teams decide how to retrain the model and/or apply safeguards to their business logic.
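As a rough illustration of threshold-based drift alerting, the sketch below scores the shift in a single feature with the population stability index (PSI), a metric widely used in credit risk. The choice of PSI, the function names, the sample debt-to-income values, and the 0.2 cutoff are all assumptions made for this example; the text above does not say which drift metric Fiddler computes.

```python
import math


def bin_proportions(values, lo, width, bins, eps):
    """Share of values falling in each equal-width bin, floored at eps."""
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(training, production, bins=10, eps=1e-6):
    """PSI between a training sample and a production sample of one feature.
    Bin edges are derived from the training data."""
    lo, hi = min(training), max(training)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # constant feature: everything lands in one bin
    expected = bin_proportions(training, lo, width, bins, eps)
    actual = bin_proportions(production, lo, width, bins, eps)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def drift_alert(psi, threshold=0.2):
    """Return True when the drift score exceeds the configured threshold."""
    return psi > threshold


# Hypothetical debt-to-income ratios: production has shifted upward by 0.10.
training_dti = [0.20 + 0.01 * i for i in range(30)]
production_dti = [v + 0.10 for v in training_dti]

psi = population_stability_index(training_dti, production_dti)
print(f"PSI = {psi:.2f}, alert = {drift_alert(psi)}")
```

Running the same check per feature is one simple way to pinpoint which input shifted. The threshold is a tuning choice: a lower value catches drift earlier at the cost of more false alarms.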

Conclusion

Our mission is to help companies that are on the path of operationalizing AI for their real business processes. Too often, AI ideas fail to make it out of the lab. We’re here to help teams obtain the value of AI in a responsible manner by continuously monitoring and explaining their models across the organization. Let us know how we can be a part of your team’s AI journey. Contact us to talk to a Fiddler expert!