To Err is AI: Why and How to Understand the Decisions AI Makes

Zurana Mehrin Ruhi
Data Scientist

My interest in Explainable AI (XAI) started with a simple yet striking blog post titled 'Predictive policing is still racist—whatever data it uses'. While AI errors aren't new, what's truly dangerous is when bias and harm are embedded so deeply in training data that they slip past the engineers who built the model.

Unlike the standalone models of the past, today's foundation models are incredibly complex and trained on vast, often uncurated datasets. They're powerful and perform impressively across domains, but that scale and complexity make them prone to subtle, undetected biases. Even when AI doesn't technically 'make a mistake,' hidden prejudices can still influence decisions—and they often go unnoticed until someone calls them out in a blog post.

As AI becomes integrated into medicine, finance, and media, understanding why AI makes decisions isn't just useful, it's critical. IBM has documented real-life examples where AI bias skews decision-making and disproportionately harms certain groups, and Amazon scrapped its AI recruiting tool after it was found to be biased against female candidates. These cases underscore the urgent need to investigate AI products before deployment, preventing harm that robust explainability and AI governance could avert. This is where Explainable AI (XAI) steps in to illuminate the 'black box' of modern AI.

XAI and the Need for Transparency

Under the EU AI Act, transparency in AI means:

AI systems are developed and used in a way that allows appropriate traceability and explainability, while making humans aware that they communicate or interact with an AI system, as well as duly informing deployers of the capabilities and limitations of that AI system.

Despite legal and ethical pressure for explainability, most industries still don't fully integrate XAI techniques into their AI workflows. Medical AI and finance see higher XAI adoption because misinterpretations there can have life-and-death or financially ruinous consequences. In fields like media, content curation, and recommendation systems, XAI is often overlooked, despite its potential to improve fairness and trust.

Consider a news recommendation model. If an AI system determines which news articles appear on a homepage, it should mimic the decision-making patterns of a human editor. But AI often learns shortcuts, leading to biased or click-driven content selection that may go against the platform's public values. Using a biased model for editing or summarizing can reinforce confirmation bias and amplify social and cultural prejudices. Integrating XAI methods into these models could reveal the key factors influencing AI-driven editorial choices, allowing for human oversight and ethical alignment.
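To make this concrete, here is a minimal, hypothetical sketch of what such an inspection could look like: a toy 'show on homepage' classifier explained with the open-source SHAP library. The feature names and data are invented for illustration, not taken from any real system; the point is that a dominant 'predicted_clicks' attribution would immediately flag click-driven selection.

```python
# A minimal, hypothetical sketch (not production code): a toy
# "show on homepage" classifier explained with the SHAP library.
# Feature names and data are invented for illustration.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["recency_hours", "predicted_clicks", "topic_diversity",
                 "source_reliability", "editorial_priority"]

# Toy data standing in for real article features and past editor decisions.
X = rng.random((500, len(feature_names)))
y = (0.7 * X[:, 1] + 0.3 * X[:, 0] > 0.5).astype(int)  # a click-driven shortcut

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# SHAP attributes each recommendation to the input features; a dominant
# 'predicted_clicks' contribution flags click-driven selection.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:50])
# Older SHAP versions return a list per class; newer ones a 3D array.
vals = shap_values[1] if isinstance(shap_values, list) else shap_values[:, :, 1]
mean_abs = np.abs(vals).mean(axis=0)
for name, score in sorted(zip(feature_names, mean_abs), key=lambda t: -t[1]):
    print(f"{name:20s} {score:.3f}")
```

An editor or engineer reading this output can then decide whether the model's priorities match the newsroom's, long before the model touches the homepage.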

At Sparks, identifying and addressing these concerns begins with transparency and a set of policies that allow us to manage and monitor our AI solutions. These guidelines also enable us to ensure we deliver products aligned with ZDF's values as a public broadcasting service.

Beyond Legal Compliance: How Sparks Developers Use XAI

While non-technical users may rely on XAI to gain insights into AI's reasoning, our ML engineers and data scientists explore these methods as a debugging tool to perform safety checks, detect biases, assess risks, and identify performance bottlenecks. Accuracy matters, but so does our responsibility to deliver products that won't harm users or treat anyone unfairly.

Sparks follows the latest academic research, seeking the most feasible solutions to integrate into our AI products. We attend workshops and technical events that showcase the latest developments in the industry. Although mature open-source XAI tooling for LLMs is still scarce, we focus on understanding what is technically possible and developing in-house solutions as needed.

But How Exactly? Foundation Models Are Complex Enough Already!

Large language models take complexity to another level: unlike traditional models, LLMs have billions of parameters, are trained on vast and often unverified web data, require significant computational power, and the list goes on.

Many companies are integrating aspects of Responsible AI and AI Governance into their AI workflows, with Google's Responsible AI team and its XAI tools as prime examples. Academic research offers various XAI techniques for LLMs, but real-world deployment remains largely theoretical: most deployed LLMs rarely integrate such techniques, so much of the advancement in XAI stays in research labs.

At Sparks, we actively integrate these methods and train our employees to stay at the forefront of AI advancements through hands-on workshops. Let's explore one promising method we recently investigated.

Concept Bottleneck Models: Teaching AI Our Way!

AI models are heavily prone to learning shortcuts, no matter how powerful they are. What if we showed them shortcuts that actually make sense to us?

Concept Bottleneck Models (CBMs) offer a promising XAI approach. Unlike traditional deep learning methods that try to explain model decisions through raw features like pixels or token embeddings, CBMs introduce an intermediate layer where the model must explicitly predict human-understandable concepts before making its final decision.

For instance, instead of a vision model simply predicting 'bird', a CBM would first identify attributes like wings, beak, tail, and feathers—mimicking how a human would logically arrive at the same conclusion. CBMs have been particularly effective in vision models, moving beyond ambiguous heatmaps that non-experts struggle to interpret.
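As a minimal sketch of the idea (assuming a PyTorch setup; the dimensions and the four-concept set are illustrative, not any specific paper's architecture), the network below is forced to route every prediction through named concepts, and both the concepts and the final label are supervised during training:

```python
# A minimal, illustrative Concept Bottleneck Model in PyTorch: the final
# label is computed only from predicted, human-readable concepts
# (e.g. wings, beak, tail, feathers).
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, input_dim: int, n_concepts: int, n_classes: int):
        super().__init__()
        # x -> concepts: the interpretable bottleneck
        self.concept_net = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, n_concepts),
        )
        # concepts -> label: the decision head sees only the concepts
        self.label_net = nn.Linear(n_concepts, n_classes)

    def forward(self, x):
        concept_logits = self.concept_net(x)
        label_logits = self.label_net(torch.sigmoid(concept_logits))
        return concept_logits, label_logits

# Joint training: supervise both the concepts and the final label.
model = ConceptBottleneckModel(input_dim=2048, n_concepts=4, n_classes=200)
x = torch.randn(8, 2048)                 # e.g. image embeddings
c = torch.randint(0, 2, (8, 4)).float()  # concept annotations
y = torch.randint(0, 200, (8,))          # class labels

concept_logits, label_logits = model(x)
loss = (nn.functional.binary_cross_entropy_with_logits(concept_logits, c)
        + nn.functional.cross_entropy(label_logits, y))
loss.backward()
```

Because the label head only ever sees concept activations, a human can inspect the predicted concepts, and even correct them at inference time, to see how the final decision changes.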

By enforcing concept-based learning, CBMs make AI decisions more interpretable, verifiable, and debuggable. Researchers have also extended this approach to Large Language Models (LLMs), integrating language-guided concept bottlenecks to improve transparency.

Figure: Language in a Bottle (concept bottleneck model diagram)

While defining meaningful concept spaces can be time-consuming, researchers have successfully used LLMs to generate candidate concepts and have developed various methods to optimize concept selection. CBMs can also enable concept 'unlearning' in LLMs to remove biases and guide the model towards a more accurate understanding of text.
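For a rough illustration of that concept-generation step, in the spirit of Language in a Bottle, the sketch below prompts a general-purpose LLM for candidate visual attributes per class. It assumes the openai Python client with a configured API key; the model name and prompt wording are our own illustrative choices, not the paper's exact setup.

```python
# A hedged sketch of LLM-generated candidate concepts: ask a general-purpose
# LLM for short, discriminative attributes of a class, then feed the filtered
# list into a concept bottleneck. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def candidate_concepts(class_name: str, n: int = 10) -> list[str]:
    prompt = (f"List {n} short visual attributes that help identify "
              f"a {class_name}. One attribute per line, no numbering.")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    # One candidate concept per non-empty line of the reply.
    return [line.strip() for line in
            response.choices[0].message.content.splitlines() if line.strip()]

print(candidate_concepts("sparrow"))
```

In practice, such candidates still need de-duplication and a selection step before they are useful as a bottleneck.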

These methods can derive post-hoc explanations or create inherently interpretable AI models, aligning machine reasoning more closely with human intuition. However, like other XAI methods, concept-based reasoning may sometimes degrade model performance—a trade-off worth considering.

Final Thoughts

As AI advances at an unprecedented pace, understanding why it makes decisions is just as important as the decisions themselves!

Explainable AI isn't just about compliance; it's about trust, accountability, and the responsible deployment of technology that is shaping our future. Yet the gap between research and real-world deployment remains wide. At Sparks, we are actively working to bridge this gap, committed to ethical, transparent, and unbiased decision-making. By proactively integrating these methods, we aim to ensure ZDF not only meets regulatory and ethical standards but also maintains the public trust that defines our mission.

"Note: Some of the visuals in this blog post were created using AI technology."

AI with Purpose. Innovation with Integrity.
ZDF Sparks GmbH
Office: Hausvogteiplatz 3-4, 10117 Berlin