Every week, another asset manager announces an AI initiative. Board decks are full of it. Vendor pitches are relentless. The pressure to "do something with AI" has never been higher.
But the uncomfortable truth is this: the vast majority of AI projects in financial services fail. Not because the models are wrong. Because the data underneath them is broken.
I’ve seen this pattern repeat across every engagement we’ve done at Datafabric. The firms that succeed with AI never start with a model. They start with their data.
The AI Hype vs Reality Gap in Asset Management
According to McKinsey, AI-enabled workflow reimagination could allow asset managers to capture 25–40% of their total cost base in efficiencies. That figure has been cited in countless boardrooms. What gets cited less often is Gartner’s prediction that through 2026, organisations will abandon 60% of AI projects that lack AI-ready data.
The gap between promise and delivery is not a technology problem. It is a data problem.
Asset managers operate in one of the most data-intensive corners of financial services. On any given day, a mid-sized fund manager with $3 billion in funds under administration (FUA) might pull data from a custodian, a fund administrator, three platform providers, a CRM, two market data vendors, a registry, and internal spreadsheets. That is nine or more sources before anyone has opened a dashboard.
When an AI model is built on top of fragmented, inconsistent, or stale data, it does not produce insights. It produces confident-sounding nonsense — and in a regulated industry, that is worse than having no AI at all.
Why Most AI Projects Fail: They Skip the Data Layer
There is a predictable sequence to how AI initiatives go wrong in asset management:
- The executive mandate. The board or CEO says "we need an AI strategy."
- The vendor selection. A tool is purchased — often a general-purpose LLM or analytics platform.
- The integration attempt. The team tries to connect the tool to internal data. They discover that FUA figures in the custodian do not match the CRM. Platform flow data arrives in four different formats. Client names are spelled three different ways.
- The workaround. Someone builds a manual data pipeline in Excel or Python. It works for the demo. It breaks the following week.
- The stall. Six months and $200,000 later, the project is quietly shelved.
The mistake is always the same: treating AI as the starting point rather than the outcome of a data capability.
The 6 Dimensions of Data Quality
Data quality isn’t binary. At Datafabric we assess data across six dimensions, each of which needs to meet a minimum threshold before AI can be applied reliably.
Completeness
Are all required fields populated? A fund record missing its APIR code is incomplete. A client record without an adviser relationship is incomplete. Across the asset managers we work with, completeness gaps average 12–18% on initial assessment.
Accuracy
Does the data reflect reality? If your CRM says a client has $50 million FUA but the custodian says $47 million, which is correct? Accuracy errors compound when data flows downstream into reports, analytics, and AI outputs.
Consistency
Is the same entity represented the same way across systems? “ANZ,” “Australia and New Zealand Banking Group,” and “ANZ Banking Grp” might all refer to the same organisation — but a machine will treat them as three separate entities.
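To make the point concrete, here is a minimal sketch of how a normalisation-plus-alias step resolves name variants to one entity. The alias table and function names are invented for illustration; this is not The Foundry’s actual matching logic:

```python
import re

# Hypothetical alias table mapping normalised name variants to a canonical entity.
ALIASES = {
    "australia and new zealand banking group": "ANZ",
    "anz banking grp": "ANZ",
    "anz": "ANZ",
}

def canonical_entity(raw_name: str) -> str:
    """Lowercase, strip punctuation, then resolve known aliases."""
    key = re.sub(r"[^a-z0-9 ]", "", raw_name.lower()).strip()
    return ALIASES.get(key, raw_name.strip())

# Three spellings, one entity.
names = ["ANZ", "Australia and New Zealand Banking Group", "ANZ Banking Grp"]
print({canonical_entity(n) for n in names})  # {'ANZ'}
```

Real entity resolution adds fuzzy matching and human review for ambiguous cases, but even this toy version shows why the work has to happen before, not after, a model sees the data.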
Timeliness
How quickly does data arrive after the event it represents? If platform flow data arrives with a two-week lag, any analytics built on it are already stale. For distribution teams making decisions about where to travel next week, stale data is useless data. This is why we process platform files as soon as they’re shared with us, rather than waiting until the end of the month.
Uniqueness
Are there duplicate records? Duplicate client records are one of the most common — and most damaging — data quality issues we encounter. One firm we onboarded had 340 duplicate adviser records in their CRM, inflating their prospect count by 22%.
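A first pass at surfacing duplicates like these can be as simple as grouping records on a normalised key. The record shape and field names below are assumptions for illustration, not a real CRM schema:

```python
from collections import defaultdict

# Hypothetical adviser records; in practice these would come from a CRM export.
advisers = [
    {"id": 1, "name": "Jane Smith", "email": "jane.smith@example.com"},
    {"id": 2, "name": "Jane  Smith ", "email": "JANE.SMITH@EXAMPLE.COM"},
    {"id": 3, "name": "Raj Patel", "email": "raj.patel@example.com"},
]

def dedup_key(record: dict) -> str:
    """Match on a lowercased email; names alone are too noisy to key on."""
    return record["email"].strip().lower()

groups = defaultdict(list)
for record in advisers:
    groups[dedup_key(record)].append(record)

duplicates = {key: recs for key, recs in groups.items() if len(recs) > 1}
print(len(duplicates))  # 1 duplicate group: the two Jane Smith records
```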
Freshness
When was the data last updated? A record that was accurate six months ago may not be accurate today. Freshness is particularly critical for market data, flow data, and compliance records.
What “Data Centralisation” Actually Means for an Asset Manager
When we talk about data centralisation, we are not talking about ripping out your existing systems. We are not suggesting you replace your CRM, change your custodian, or migrate off your fund administrator.
Data centralisation means creating a single, trusted layer that sits on top of your existing systems and harmonises the data they produce. Think of it as a translation layer: each source speaks its own language, and the centralised layer ensures they all say the same thing.
For a growing asset manager with $500 million to $10 billion in FUA, this typically means connecting 15 to 20 data sources — custodians, fund administrators, platform providers, CRMs, market data feeds, registries, and internal spreadsheets — into a single environment where data is cleaned, matched, deduplicated, and quality-scored every day.
How The Foundry Addresses This
The Foundry is Datafabric’s data centralisation and quality engine. It connects to 18+ source types through 47 pre-built integrations, ingests data on a daily cycle, and applies trust metrics across all six quality dimensions.
Here is what that looks like in practice:
- Ingestion. Data is pulled from each source automatically. No manual uploads. No CSV exports. The Foundry connects directly to custodian portals, platform APIs, CRM systems, and file-based feeds.
- Harmonisation. Entity matching resolves the “ANZ vs ANZ Banking Grp” problem. Fields are standardised. Units are normalised. Dates are aligned to a single timezone.
- Quality scoring. Every record receives a trust score based on the six dimensions above. Users can see at a glance which data is reliable and which needs attention.
- Alerting. When quality drops below a threshold — a feed fails, a field goes blank, a duplicate appears — the system flags it immediately rather than letting it propagate downstream.
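The alerting step above amounts to a threshold check over per-feed quality scores. The feed names, scores, and threshold here are made up to illustrate the idea, not taken from The Foundry:

```python
ALERT_THRESHOLD = 0.9  # hypothetical minimum acceptable quality score

# Hypothetical daily quality scores per feed, each in [0, 1].
feed_scores = {
    "custodian_positions": 0.98,
    "platform_flows": 0.72,   # e.g. a field went blank upstream
    "crm_advisers": 0.95,
}

def quality_alerts(scores: dict, threshold: float) -> list:
    """Flag any feed whose score drops below the threshold, before it propagates."""
    return sorted(feed for feed, score in scores.items() if score < threshold)

print(quality_alerts(feed_scores, ALERT_THRESHOLD))  # ['platform_flows']
```

The design point is where the check sits: at ingestion, so a failed feed raises a flag the same day rather than surfacing weeks later as a wrong number in a board pack.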
The result is not just clean data. It is data that the organisation can trust, and that AI models can rely on.
The Compounding Effect: Trusted Data Unlocks Everything Else
This is where the investment in data quality pays off many times over. Once The Foundry has established a trusted data layer, every other capability on the Datafabric platform benefits:
- Compass (analytics and reports) draws from the same trusted source, so dashboards and board packs reflect reality rather than a best guess.
- CoPilot (enterprise AI) can answer questions like “What were our net flows by platform last quarter?” with confidence, because it is querying verified data rather than a patchwork of spreadsheets.
- OpsFlow (operations and compliance) can automate workflows knowing that the underlying data has been quality-checked and is audit-ready.
Without the data layer, each of these capabilities would require its own data pipeline, its own reconciliation process, and its own workarounds. With it, they share a single source of truth.
What This Looks Like in Practice
One of our clients, a specialist asset manager, came to us with data spread across multiple systems. Their team was manually reconciling platform flow data, pulling reports from different sources and cross-checking them in Excel.
We deployed The Foundry and connected 27 integrations; each data source is now processed within an hour of ingestion. The first quality assessment surfaced completeness gaps and data lags that nobody had previously flagged, and remediation delivered measurable gains:
- Adviser count accuracy improved from 74% to 95% across approximately 750 adviser records
- Funds under management accuracy improved from 69% to 94%
That is the real return on data quality: not a dashboard, but trusted data you can actually act on.
Getting Started: It Is Faster Than You Think
The most common objection I hear is “we know our data is a mess, but fixing it will take forever.” It doesn’t have to. The Foundry delivers a baseline data layer in weeks, not months. Our implementation model, which we call Service as Software, means Datafabric handles the integration, configuration, and ongoing operation. Your team doesn’t need to hire a data engineer or manage a pipeline.
If you’re evaluating AI for your asset management business, start with the data. Everything else follows.