The Biggest Risk in Payments AI Isn’t the Model, It’s the Data Underneath
The card payments industry is moving fast on AI. Forecasting costs and revenue, billing management, performance tracking, process automation: the AI use cases are compelling, the vendors are persuasive, and the investment is following.
But there is an old principle that applies here with uncomfortable precision: garbage in, garbage out. Feed an AI model bad inputs and it will produce bad outputs. No model advancement changes that, and no engineering team can engineer around it. When a model is trained on data that is fragmented, inconsistent, or poorly understood, the outputs will be wrong in ways that are hard to detect. In financial services, that is the definition of dangerous.
The AI is not the hard part. The data layer is.
Advanced AI models have become increasingly accessible, including in secure environments where privately deployed models can be integrated and trained on private data. The root challenge remains the source data: the very input that training depends on.
The organisations that will generate the most value from AI investment are not necessarily those that move fastest on model implementation. They are the ones that take the time to assess and solve the core data problem first — and understand why, in card payments specifically, that problem is harder to solve than it appears from the outside.
Two problems, not one
It helps to be precise about the root cause of payments data challenges.
The first problem is fragmentation. A single card transaction is recorded simultaneously across multiple systems that were built by different organisations, for different purposes, at different points in time. At a minimum, a bank’s or fintech’s data stack comprises the card scheme’s clearing records, settlement records, the payment processor’s transaction logs, and the institution’s own internal ledgers. Further along the data flow, the same transaction event ripples into bank statements, fee billing files, supplier invoices, regulatory returns, and management reporting. None of these systems was built to agree with the others. Each records the transaction from a different vantage point, using different identifiers, different timing conventions, and different levels of aggregation. In practice, these transaction records do not reconcile cleanly or automatically. The gaps accumulate, and the institutional knowledge required to navigate them typically lives in people rather than systems.
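To make the reconciliation gap concrete, here is a minimal sketch in Python. It assumes that both sources expose a shared transaction reference and that each reference appears at most once per source; in real card data neither assumption is safe, and that is the point of this section. The `Record` type, the source labels, and the `reconcile` function are hypothetical names chosen for illustration, not part of any scheme or processor format.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Record:
    source: str        # hypothetical label, e.g. "clearing" or "ledger"
    reference: str     # shared transaction reference (assumed to exist)
    amount: Decimal


def reconcile(clearing, ledger):
    """Classify every reference seen in either source.

    Assumes each reference is unique within a source.
    """
    clearing_by_ref = {r.reference: r for r in clearing}
    ledger_by_ref = {r.reference: r for r in ledger}
    report = {
        "matched": [],
        "amount_mismatch": [],
        "missing_in_ledger": [],
        "missing_in_clearing": [],
    }
    for ref in sorted(clearing_by_ref.keys() | ledger_by_ref.keys()):
        c = clearing_by_ref.get(ref)
        l = ledger_by_ref.get(ref)
        if c is None:
            report["missing_in_clearing"].append(ref)
        elif l is None:
            report["missing_in_ledger"].append(ref)
        elif c.amount != l.amount:
            report["amount_mismatch"].append(ref)
        else:
            report["matched"].append(ref)
    return report


clearing = [
    Record("clearing", "TX1", Decimal("10.00")),
    Record("clearing", "TX2", Decimal("25.50")),
    Record("clearing", "TX3", Decimal("7.25")),
]
ledger = [
    Record("ledger", "TX1", Decimal("10.00")),
    Record("ledger", "TX2", Decimal("25.05")),
    Record("ledger", "TX4", Decimal("3.00")),
]
report = reconcile(clearing, ledger)
```

Even in this toy case, only one of four references matches cleanly. Every other bucket is a gap that someone has to investigate before the data is fit to train on.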
The second problem is that each individual dataset is complex, non-standard, and actively changing. Scheme data from Visa and Mastercard is the clearest example: proprietary formats differ between schemes, evolve continuously, and require deep specialist knowledge to interpret correctly.
Together, these two problems mean that the payments data environment is not just messy: it resists cleaning unless card-specific expertise and systems are in place to resolve these obstacles before any AI model ingests the data.
Garbage in, garbage out. In card payments, the inputs are not just fragmented across systems that don’t agree — each of those systems is itself a specialist domain. That combination is what makes the payments data problem uniquely hard to solve.
So what does solving it actually require? Formatting and error-removal don't come close.
What ‘clean’ actually means in card payments
Clean payments data, in the context of an AI-ready infrastructure, is not simply data that has been formatted consistently or had obvious errors removed. It is data that has four specific characteristics:
1. Reconciled across systems. The clearing record, the settlement record, the processor log, and the internal ledger have been brought into agreement. Discrepancies have been identified, investigated, and resolved or documented. There are no unresolved gaps accumulating in the background.
2. Normalised across schemes. Visa and Mastercard outputs are mapped to a common structure. Fields that mean different things in different scheme formats have been standardised. The data can be read and analysed without requiring scheme-specific expertise at the point of consumption.
3. Allocated correctly. Scheme fees — which may now span dozens of distinct charge categories in a single billing file — are attributed accurately to the transactions and programmes that generated them. Misallocated fees produce systematically wrong cost models; correct allocation is the foundation of any meaningful profitability analysis.
4. Structured for use. The transformation and reconciliation work has been done upstream, allowing BI tools, forecasting models, and AI systems to consume the data directly. The coherent view exists as infrastructure, not as a one-off exercise repeated each reporting cycle.
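The allocation step can be illustrated with a short Python sketch. It assumes the simplest possible rule: a billed fee is split across programmes in proportion to transaction volume. Real scheme fees often have their own drivers, so treat the pro-rata rule as a stand-in, not as how any scheme bills. The `allocate_fee` function and the programme names are hypothetical, and the fee is assumed to be non-negative and expressed to two decimal places.

```python
from decimal import Decimal, ROUND_DOWN

CENT = Decimal("0.01")


def allocate_fee(total_fee: Decimal, volumes: dict[str, Decimal]) -> dict[str, Decimal]:
    """Split total_fee across programmes pro rata to volume (illustrative rule).

    Shares are rounded down to the cent, then leftover cents go to the
    largest programmes so the allocations always sum to total_fee.
    """
    total_volume = sum(volumes.values(), Decimal("0"))
    if total_volume <= 0:
        raise ValueError("total volume must be positive")

    shares = {
        name: (total_fee * vol / total_volume).quantize(CENT, rounding=ROUND_DOWN)
        for name, vol in volumes.items()
    }
    remainder = total_fee - sum(shares.values(), Decimal("0"))

    # Hand out leftover cents one at a time, largest volume first.
    for name in sorted(volumes, key=volumes.get, reverse=True):
        if remainder <= 0:
            break
        shares[name] += CENT
        remainder -= CENT
    return shares


allocation = allocate_fee(
    Decimal("100.00"),
    {
        "consumer_debit": Decimal("6000"),
        "business_credit": Decimal("3000"),
        "prepaid": Decimal("1000"),
    },
)
```

The detail that matters is the remainder handling. If rounding is ignored, the allocated amounts drift away from the billed total, and that drift is exactly the kind of quiet error a cost model built on top will inherit.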
With that foundation in place, the AI use cases that currently disappoint can begin to deliver.
Cambrist’s infrastructure layer between raw payments data and useful AI
Cambrist’s Data Analytics Platform sits between the raw data sources — scheme files from Visa and Mastercard, processor transaction records, internal ledger outputs — and the AI systems that banks and fintechs are building.
That is what payments data infrastructure actually means in practice. Not a dashboard. Not a reporting layer. A system that stands between raw scheme outputs and the analytical tools, absorbing the complexity of what schemes and processors produce and delivering something coherent on the other side.
When the foundation is right, the AI investment above it works. When it isn’t, no model compensates for it.
AI readiness is data readiness
The payments businesses that will generate the most value from AI are the ones that solve the data problem first. The return on AI investment in payments is not primarily a function of which model is chosen or which vendor is selected. It is a function of whether the data those models are trained on reflects a coherent, reconciled view of reality across all the systems that record it.
Most AI strategies in payments don’t account for this reality. Most will discover it the hard way — after the budget has been spent and the outputs disappoint.
Garbage in, garbage out. The question for every payments business investing in AI is the same: how confident are you in what you’re putting in?
That’s the problem Cambrist is built to solve. If you’re building an AI strategy for your payments business and want to understand what your data foundation actually looks like, we’d be glad to show you.