Building trust in autonomous AI

After five years of rapid AI growth in India, leaders have learned that scaling AI is anything but simple: it is complex, difficult and, where controls are weak, unsustainable. Behind the headlines of successful models lies a harder truth: success depends less on algorithms and more on governance, trust and operational discipline. Agentic AI systems promise speed and scalability, but they also introduce new risks: opaque decision-making, ethical blind spots and accountability gaps. The question is not whether businesses should adopt agentic AI; it is how they can do so responsibly while building trust at every level.

A recent analysis found that when global AI models were tested with India-specific prompts, they sometimes reflected unintended cultural or contextual biases. Similarly, evaluation on the Indian-BhED dataset has shown that large language models (LLMs) can generate responses that echo stereotypes and assumptions present in local data. Both examples underscore the importance of rigorous local auditing and tuning to ensure that AI systems behave fairly and accurately in diverse environments.

The rapid rise of agentic AI is also introducing new threats that businesses cannot ignore.

First, model drift and unexpected behavior are becoming more common as autonomous systems update themselves from live data streams. A June 2024 incident at a major Indian payments platform forced a temporary halt when its self-adaptive fraud-detection engine began flagging legitimate transactions from small merchants, disrupting their sales.

Second, privacy-by-design is being tested in real time. During an early 2024 public beta of a new generative-AI model, the system inadvertently cached snippets of users’ health-related questions, leading to an investigation by the European Data Protection Board and the rapid rollout of an automated “audit-log” feature.

Third, bias is surfacing in areas well beyond language. A 2024 audit of a regional insurance-tech startup revealed that its AI-powered claims-approval workflow rejected roughly 68% of policies from tier-2 and tier-3 districts, a pattern traced to training data that over-represented urban loss patterns.

Fourth, the weaponization of generative tools is moving from research labs to everyday attacks. In early 2024, a deepfake scam hit the Hong Kong office of a multinational company: fraudsters used AI-generated video and voices to impersonate senior executives on a conference call and tricked an employee into transferring roughly US$25.6 million.

Finally, regulatory pressure is tightening. RBI’s digital lending directions now require an explicit human-in-the-loop checkpoint for any AI-driven credit decision, and the US SEC has started flagging AI-generated disclosures that lack traceability.

Together, these examples show that autonomy without transparent guardrails can quickly translate into operational, reputational and compliance consequences, making a “glass-box” approach to AI governance essential for sustainable adoption.

Designing trustworthy agentic AI rests on three core principles that leading companies are already putting into practice.

First, transparent, auditable AI starts with a decision-by-decision log that captures the input data, model version and a concise business rationale for each inference. Hallucinations, instances where the model confidently generates false information, remain the biggest hurdle in moving from flashy demos to trustworthy corporate tools. To overcome this, engineers now use three-layer protection: grounding the model with retrieval-augmented generation, guiding its reasoning through structured prompts and temperature control, and governing the output with guardrails and dedicated “critic” agents (minimal code sketches of these layers follow the list below).

* Grounding: Retrieval-Augmented Generation (RAG): The most effective way to prevent hallucinations is to stop the AI from relying on its memory alone. Instead of letting the model “guess” from its training data, you give it an open-book test: when a user asks a question, the system first searches your private, verified documents (PDFs, wikis, databases) for the answer. This forces the AI to act as a librarian rather than a storyteller.

* Guiding: Structured Prompting: Sometimes an AI hallucinates because it jumps to an answer too quickly. Specific prompting techniques can slow it down, such as:

* Chain of Thought (CoT): You force the AI to explain its reasoning step by step before giving a final answer. If the logic is flawed, the error is easy to spot.

* Temperature control: In the technical settings, you set the temperature to 0. This makes the AI less “creative” and more deterministic, so it gives the same grounded response every time. One example is Deloitte’s Tax Pragya, which has turned its AI-based search-and-summary platform into a “glass box” for India’s tax professionals. It has been trained on over 1.2 million tax cases and over 5,000 Deloitte technical papers, solutions and proprietary expert insights to deliver virtually zero hallucinations.

* Governing: Guardrails and “Guardian Agents”: Modern systems use a second AI to “police” the first one.

* NeMo Guardrails / Llama Guard: These are separate safety layers, such as NVIDIA’s NeMo Guardrails toolkit or Meta’s smaller Llama Guard model, that sit between the AI and the user. They scan the AI’s output for toxic language, policy violations or hallucination patterns before the user sees the text.

* “Critic” agent: In an agentic workflow, you often run two agents: one generates the response (the generator) while the other fact-checks it (the critic) against trusted data sources.
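To make the grounding and guiding layers concrete, here is a minimal sketch in Python. It assumes the OpenAI Python client with an API key in the environment; the document list, the keyword-overlap retriever and the model name are illustrative placeholders standing in for a real vector store and production models.

```python
# Minimal sketch of the "grounding" and "guiding" layers described above.
# Assumptions: the OpenAI Python client (pip install openai) and an
# OPENAI_API_KEY in the environment; documents and model name are placeholders.
from openai import OpenAI

client = OpenAI()

# Toy "verified document" store standing in for a real vector database.
DOCUMENTS = [
    "Policy 12-B: claims above Rs 50,000 require two-level approval.",
    "Circular 2024/07: credit decisions must record model version and rationale.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Grounding: pick the k documents sharing the most words with the question."""
    q_words = set(question.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def grounded_answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    # Guiding: a chain-of-thought style prompt, temperature 0, and an
    # explicit instruction to refuse rather than guess.
    prompt = (
        "Answer ONLY from the context below. Think step by step, then give a "
        "final answer. If the context does not contain the answer, say "
        "'Not found in the provided documents.'\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",          # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        temperature=0,                # deterministic, less "creative" output
    )
    return response.choices[0].message.content

print(grounded_answer("What approvals are needed for a Rs 80,000 claim?"))
```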
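A minimal sketch of the governing layer follows the same pattern: a second model call acts as the critic that fact-checks the generator’s draft against the retrieved context before it reaches the user. It again assumes the OpenAI Python client and a placeholder model name; a production deployment might use a dedicated guardrail tool such as NeMo Guardrails or Llama Guard instead of this hand-rolled check.

```python
# Minimal sketch of the "governing" layer: a second model acts as a critic
# that fact-checks the generator's answer against the retrieved context
# before it reaches the user. Model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def critic_check(question: str, context: str, answer: str) -> bool:
    """Return True only if the critic judges the answer supported by the context."""
    verdict = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{
            "role": "user",
            "content": (
                "You are a strict fact-checker. Reply with exactly SUPPORTED or "
                "UNSUPPORTED: is the answer fully backed by the context?\n\n"
                f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer: {answer}"
            ),
        }],
        temperature=0,
    )
    return verdict.choices[0].message.content.strip().upper().startswith("SUPPORTED")

def governed_answer(question: str, context: str, draft_answer: str) -> str:
    if critic_check(question, context, draft_answer):
        return draft_answer
    # Block unsupported output; in practice this would also be logged and escalated.
    return "The draft answer could not be verified against approved sources."
```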

Second, human-in-the-loop safeguards serve as a vital safety net for high-risk decisions. When the AI’s confidence score drops below a set threshold, the process pauses and the decision is handed to a human expert. For example, RBI-compliant fintechs route low-confidence credit decisions to senior loan officers, while health-tech tools require a radiologist’s sign-off before critical imaging results reach patients. Every intervention is logged alongside the AI’s output, creating a complete audit trail that ensures compliance without sacrificing speed or scalability.
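A minimal sketch of such a confidence-threshold checkpoint, combining the escalation rule with the kind of decision-by-decision log described earlier. The 0.85 threshold, field names, file path and in-memory review queue are illustrative assumptions, not regulatory requirements.

```python
# Minimal sketch of a human-in-the-loop checkpoint with an audit trail.
# The 0.85 threshold, field names and review queue are illustrative only.
import json, time, uuid

CONFIDENCE_THRESHOLD = 0.85
AUDIT_LOG_PATH = "decision_audit.jsonl"
review_queue: list[dict] = []   # stand-in for a real case-management system

def record(entry: dict) -> None:
    """Append one decision-by-decision log entry (input, model version, outcome)."""
    with open(AUDIT_LOG_PATH, "a") as f:
        f.write(json.dumps(entry) + "\n")

def route_decision(application_id: str, model_version: str,
                   score: float, rationale: str) -> str:
    entry = {
        "id": str(uuid.uuid4()),
        "timestamp": time.time(),
        "application_id": application_id,
        "model_version": model_version,
        "confidence": score,
        "rationale": rationale,
    }
    if score >= CONFIDENCE_THRESHOLD:
        entry["outcome"] = "auto_approved"
    else:
        # Low confidence: pause the automated flow and queue for a human expert.
        entry["outcome"] = "escalated_to_human"
        review_queue.append(entry)
    record(entry)
    return entry["outcome"]

print(route_decision("APP-1042", "credit-model-v3.2", 0.62, "thin credit history"))
```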

Third, continuous risk monitoring and drift control act as an early-warning system for autonomous AI. Real-time dashboards track key metrics against agreed baselines, while automated alerts and rollbacks kick in when thresholds are breached, preventing large-scale errors. Each alert carries root-cause data, helping analysts quickly identify issues such as feature drift or shifts in user behavior and restore stability.

At one e-commerce giant, the monitoring layer noticed a sudden increase in legitimate orders being flagged as fraudulent by its fraud-detection engine, triggered an alert within minutes and automatically rolled the model back to a previous stable version. Analysts then identified and fixed a misaligned feature, restoring normal transaction flow and protecting merchants’ sales.
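To illustrate the monitoring pattern above, here is a minimal sketch of a threshold check with automatic rollback. The false-positive-rate metric, the 20% tolerance, the model version names and the rollback hook are illustrative assumptions rather than any particular platform’s API.

```python
# Minimal sketch of continuous drift monitoring with alert and rollback.
# Metric, tolerance, version names and rollback hook are illustrative only.

BASELINE_FALSE_POSITIVE_RATE = 0.02   # agreed baseline for flagged-but-legitimate orders
TOLERANCE = 0.20                      # alert if the metric drifts >20% above baseline

ACTIVE_MODEL = "fraud-model-v5.1"
LAST_STABLE_MODEL = "fraud-model-v5.0"

def rollback(to_version: str) -> None:
    """Stand-in for the deployment system's rollback call."""
    global ACTIVE_MODEL
    ACTIVE_MODEL = to_version
    print(f"Rolled back to {to_version}")

def check_drift(current_false_positive_rate: float) -> None:
    limit = BASELINE_FALSE_POSITIVE_RATE * (1 + TOLERANCE)
    if current_false_positive_rate > limit:
        # Alert with root-cause data, then restore the last stable version.
        print(f"ALERT: false-positive rate {current_false_positive_rate:.3f} "
              f"exceeds limit {limit:.3f} on {ACTIVE_MODEL}")
        rollback(LAST_STABLE_MODEL)

check_drift(0.05)   # e.g. 5% of legitimate orders flagged -> alert and rollback
```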

All these safeguards are overseen by an AI ethics board: a body that reviews each autonomous system before it is launched, sets clear policy standards, runs regular bias and risk audits, and updates the governance framework as the models evolve. By combining transparent logging, human-in-the-loop checks and a dedicated ethics committee, organizations can ensure robust oversight.

One exemplary model is that of the Australian Defence Force, where the ethics board actively collaborates with technical and operational teams, conducting rigorous pre-deployment reviews and ongoing monitoring to ensure that AI systems remain transparent, accountable and aligned with organizational values.

Transforming autonomous AI from a flashy demo into a trusted corporate tool requires more than clever prompts or big models. It demands a disciplined, layered defense: grounding outputs in verified data, guiding reasoning with structured prompts, and governing behavior through guardrails and independent critics. When these technical measures are combined with business-centric practices such as audit-ready logs, human checkpoints and real-time drift detection, organizations can reap the efficiency benefits of agentic AI without exposing themselves to ethical, legal or reputational risk. For Indian enterprises standing at the intersection of innovation and regulation, the way forward is clear: adopt a glass-box mentality.

This article is written by Sudipta Veerapaneni, Partner and Chief Innovation Officer, Deloitte India.

