Agentic AI is gaining traction across the financial industry, but the biggest hurdle is no longer whether the models are powerful enough. The harder question is whether banks, asset managers, and treasury desks have the infrastructure in place to delegate financial tasks to autonomous systems without losing control of money management, accountability, and compliance.
A Deloitte poll of more than 3,300 finance and accounting professionals highlights the gap: 80.5% said AI-powered tools such as agents and GenAI chatbots could become the norm within five years, yet only 13.5% said their organizations were already using agentic AI.
Citi Sky showed why the infrastructure debate matters
On April 22, Citi announced Citi Sky, an AI-powered wealth assistant built with Google Cloud and Google DeepMind technology. The tool was developed using Google’s Gemini Enterprise Agent Platform and will be gradually rolled out to Citigold customers in the US this summer.
The announcement gives the agentic AI debate a live banking example. Dipendra Malhotra, head of technology at Citi Wealth, cited memory as a central limitation of high-stakes advisory AI, asking how long a client can carry on a conversation before the system starts to hallucinate.
Most agents rely on retrieval-augmented generation (RAG) to extend memory through external databases, but the context window still caps how much information the agent can hold at any one time.
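The mechanic is simple to illustrate. The sketch below is a toy, hypothetical RAG loop (all names and the word-count budget are invented, and word counts stand in for tokens): older conversation turns live in an external store, and only the turns most relevant to the current query are pulled back into a fixed-size context.

```python
import re

CONTEXT_BUDGET = 50  # max words the "context window" can hold (toy number)

def tokens(text: str) -> set[str]:
    """Lowercase word set, punctuation stripped."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def score(query: str, turn: str) -> int:
    """Crude relevance score: number of shared words with the query."""
    return len(tokens(query) & tokens(turn))

def build_context(query: str, memory: list[str]) -> list[str]:
    """Pull the most relevant stored turns that fit the context budget."""
    ranked = sorted(memory, key=lambda t: score(query, t), reverse=True)
    context, used = [], 0
    for turn in ranked:
        words = len(turn.split())
        if used + words > CONTEXT_BUDGET:
            break
        context.append(turn)
        used += words
    return context

# External "memory" of prior conversation turns (hypothetical examples).
memory = [
    "Client holds a 60/40 portfolio and prefers municipal bonds.",
    "Client asked about mortgage refinancing last spring.",
    "Client's risk tolerance was rated moderate in the annual review.",
]
context = build_context("What bonds suit the client's portfolio?", memory)
```

Real systems rank with vector embeddings rather than word overlap, but the constraint is the same: whatever the store holds, only what fits the budget reaches the model on a given turn.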
In financial advice, money management, or portfolio execution, memory limits are more than a technical issue; they are an operational risk.
CoinFello co-founder MihnChi Park said the conditions for trusted delegation are simple: the agent acts only within the scope of the user’s instructions, the user can stop it at any time, and the underlying assets are never transferred to a third party.
Ethereum drafts an on-chain primitive for agent identity
Ethereum proposal ERC-8004 introduces a system for agent identity, reputation, and validation. The draft standard specifies three registries: an identity registry, a reputation registry, and a validation registry.
These are intended to help autonomous agents prove their identity, build a record of their actions, and support verification by other market participants.
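ERC-8004 defines Solidity contract interfaces, not a Python API, but the division of labor among the three registries can be sketched off-chain. Everything below (class names, methods, data shapes) is a hypothetical illustration of the draft's structure, not its actual interface.

```python
class IdentityRegistry:
    """Maps an agent address to a unique, durable identity."""
    def __init__(self):
        self._agents, self._next_id = {}, 1
    def register(self, address: str, metadata: str) -> int:
        agent_id = self._next_id
        self._next_id += 1
        self._agents[agent_id] = {"address": address, "metadata": metadata}
        return agent_id

class ReputationRegistry:
    """Accumulates feedback about an agent's past actions."""
    def __init__(self):
        self._feedback = {}
    def give_feedback(self, agent_id: int, score: int, comment: str) -> None:
        self._feedback.setdefault(agent_id, []).append((score, comment))
    def history(self, agent_id: int) -> list:
        return self._feedback.get(agent_id, [])

class ValidationRegistry:
    """Records third-party attestations of an agent's work."""
    def __init__(self):
        self._validations = {}
    def attest(self, agent_id: int, task: str, passed: bool) -> None:
        self._validations.setdefault(agent_id, []).append((task, passed))
    def record(self, agent_id: int) -> list:
        return self._validations.get(agent_id, [])

# A market participant could then check all three before delegating.
identity = IdentityRegistry()
reputation = ReputationRegistry()
validation = ValidationRegistry()

aid = identity.register("0xABC...", "portfolio-rebalancing agent")
reputation.give_feedback(aid, 5, "executed within mandate")
validation.attest(aid, "rebalance-task", passed=True)
```

The point of splitting the registries is that identity, accumulated feedback, and independent validation are separate trust signals; a counterparty can weigh each on its own.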
ERC-8183 takes a narrower route. It proposes a job escrow standard with validator attestation, in which the client funds the job, the provider submits the work, and the validator approves or rejects the result.
The proposal does not provide arbitration or formal dispute resolution, but it does give agent-based marketplaces a framework for escrowed tasks and verifiable completion.
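The escrow flow described above reduces to a small state machine. The sketch below is a hypothetical, off-chain Python illustration of that flow (the actual proposal specifies contract interfaces); note there is no arbitration branch, matching the approve-or-reject design.

```python
from enum import Enum, auto

class JobState(Enum):
    FUNDED = auto()      # client has escrowed funds
    SUBMITTED = auto()   # provider has delivered work
    APPROVED = auto()    # validator accepted; funds go to provider
    REJECTED = auto()    # validator rejected; funds return to client

class EscrowJob:
    """Toy escrow: client funds, provider submits, validator decides."""
    def __init__(self, client: str, provider: str, validator: str, amount: int):
        self.client, self.provider, self.validator = client, provider, validator
        self.amount = amount
        self.state = JobState.FUNDED  # funds are escrowed at creation
        self.work = None

    def submit(self, caller: str, work: str) -> None:
        assert caller == self.provider and self.state == JobState.FUNDED
        self.work, self.state = work, JobState.SUBMITTED

    def decide(self, caller: str, passed: bool) -> tuple[str, int]:
        """Validator's attestation releases or refunds the escrow."""
        assert caller == self.validator and self.state == JobState.SUBMITTED
        if passed:
            self.state = JobState.APPROVED
            return (self.provider, self.amount)  # released to provider
        self.state = JobState.REJECTED
        return (self.client, self.amount)        # returned to client

job = EscrowJob("client", "provider", "validator", 100)
job.submit("provider", "rebalance report")
payee, amount = job.decide("validator", passed=True)
```

Because the validator's decision is final, the trust model leans entirely on validator honesty; that is the gap the article notes when it says the proposal offers no dispute resolution.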
The arXiv paper “The Agent Economy: A Blockchain-Based Foundation for Autonomous AI Agents” maps a five-layer architecture for this shift, covering physical infrastructure, on-chain identity, cognitive tools, economic payments, and collective governance.
Structural vulnerabilities still exist in the reputation layer. Agents can generate activity at a speed and scale that humans cannot match, allowing trust signals to accumulate far faster than they can be meaningfully vetted.
That leaves financial institutions with difficult questions. If an agent has a good record, is that record evidence of trustworthiness or just evidence of repeated automated activity?
McKinsey says 50% to 60% of banking work sits in operations
McKinsey estimates that 50% to 60% of a bank’s full-time equivalents are engaged in operations. Experts have warned of “pilot purgatory,” where institutions run narrow proofs of concept without rewiring their operating models.
As Cryptopolitan reported from the Hong Kong Web3 Festival, McKinsey predicted that the agent AI market will grow from $5.25 billion in 2024 to approximately $200 billion by 2034.
“Enterprises have no way to see, control, or audit what autonomous systems are doing with their money. Human oversight isn’t going away; it’s just moving up the stack,” said Porter Stowell, CEO of W3.io.
Who is responsible if an AI agent causes economic loss? Can the reputation of an AI agent be trusted? Who will manage these systems if they are deployed at scale? Few regulatory frameworks specify what applies when agents act out of bounds.

