Enterprise AI adoption is accelerating, but the data problems holding it back haven't changed. Technology leaders are confronting a hard truth: the same governance gaps, fragmented architectures, and integration headaches that once derailed self-service BI are now the primary obstacles standing between their organizations and scalable AI. The tools have evolved. The underlying data problems have not.
An ETR Insights panel brought together three senior technology leaders: an AI Strategy and Engineering Lead at a large insurance firm, a Head of Delivery Assurance at a large healthcare business services firm, and a Head of Data and Analytics at a large financial services firm. They gathered to discuss what it actually takes to build a data foundation capable of supporting enterprise AI. What emerged was a candid, ground-level view of the gap between AI ambition and data reality.
If your organization struggled with self-service BI, prepare for déjà vu. The dynamics that complicated analytics a decade ago, namely business-unit fragmentation, inconsistent data quality, and incompatible systems, are the same forces slowing AI today.
One panelist, who oversees a hyperscale data-fabric platform, put it plainly: data centralization is a goal that never fully arrives. "Those data formations, those data repos obviously exist and grow, but data will never be fully centralized. The term 'data silo' has a negative connotation, but the consensus of reality that we accepted is that the data will live in disparate places, with a very distributed footprint." His organization has responded not by chasing centralization, but by investing in a semantic layer that bridges Teradata, Cloudera, Databricks, and Snowflake into a coherent operational picture.
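To make the idea concrete, here is a minimal sketch of what a semantic layer does at its core: one logical metric name maps to engine-specific SQL, so consumers reference the business concept while the data stays physically distributed. The metric, table names, and `resolve` helper are hypothetical illustrations, not the panelist's actual platform.

```python
# Minimal sketch of a semantic-layer lookup: one logical metric, many engines.
# The metric name and table mappings are hypothetical stand-ins for what a
# real semantic layer would manage across Teradata, Databricks, Snowflake, etc.

SEMANTIC_LAYER = {
    "active_policies": {  # one business definition, several physical homes
        "teradata":   "SELECT COUNT(*) FROM prod.policies WHERE status = 'A'",
        "databricks": "SELECT COUNT(*) FROM lakehouse.policies WHERE status = 'A'",
        "snowflake":  "SELECT COUNT(*) FROM ANALYTICS.POLICIES WHERE STATUS = 'A'",
    },
}

def resolve(metric: str, engine: str) -> str:
    """Return the engine-specific SQL for a logical metric name."""
    try:
        return SEMANTIC_LAYER[metric][engine]
    except KeyError:
        raise ValueError(f"No mapping for metric {metric!r} on engine {engine!r}")

# Consumers ask for the business concept, not the physical location:
print(resolve("active_policies", "snowflake"))
```

The design choice matters: the layer accepts that data stays distributed and standardizes only the vocabulary on top of it.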
The healthcare panelist echoed this, describing a sprawling multi-platform environment born from a "horses for courses" philosophy: one business unit anchored to its traditional data warehouse, another standardized on Snowflake, a third evaluating Microsoft Fabric as a potential unifying layer. "Data is our lifeblood, so we have everything from data lakes, to data warehouses, to different kinds of data stores, etc.," he noted. "Because we believe in best-of-breed, each of our business units does its own thing."
The data supports what these executives are experiencing. According to our September 2025 AI Product Series survey, data quality and availability issues have risen to tie for the top technical or organizational challenge companies face when developing their own AI applications, a notable increase since the March 2025 survey that puts them on par with lack of skilled personnel.
Perhaps the most clarifying insight from the panel came from the financial services executive, who reframed the AI readiness conversation entirely. "It's not a magic wand where you bring in one solution and everything gets automated end to end," he said. "It's about building the right strategy, having awareness of what AI means to us, and how we can reach the north star of our AI roadmap. It comes in smaller phases: awareness, defining the right use cases, and keeping humans in the loop when it comes to AI."
This framing matters because the alternative, treating AI as a technology procurement problem, produces exactly the wrong outcomes. Unrealistic expectations lead to misplaced blame on data teams when AI pilots underperform. "There will never be perfect data in any organization," the same panelist noted, "but you need to find out those pockets where the data quality issues are minimal, where the integrations are working fine, and how we can bring AI on top of that."
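Finding those pockets implies measuring them. Below is a minimal sketch of one such readiness check, scoring field-level completeness so teams can rank candidate datasets; the records and the 95% threshold are hypothetical, and a real assessment would also weigh freshness, conformity, and duplication.

```python
# Minimal sketch: score field-level completeness to find AI-ready "pockets."
# The dataset and 0.95 threshold are hypothetical illustrations.

records = [
    {"claim_id": "C1", "amount": 120.0, "diagnosis": "J20"},
    {"claim_id": "C2", "amount": None,  "diagnosis": "J20"},
    {"claim_id": "C3", "amount": 85.5,  "diagnosis": None},
]

def completeness(rows, field):
    """Fraction of rows where the field is populated."""
    filled = sum(1 for r in rows if r.get(field) is not None)
    return filled / len(rows)

for field in ("claim_id", "amount", "diagnosis"):
    score = completeness(records, field)
    flag = "ready" if score >= 0.95 else "needs remediation"
    print(f"{field}: {score:.0%} complete -> {flag}")
```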
The playbook that is working: phased rollouts that begin with productivity augmentation before moving toward greater automation. "Primarily, we were increasing the productivity of our existing associates, and then looking later at how we can completely remove the human-in-the-loop need." In sectors with entrenched competition, AI is also serving as a product differentiator, particularly in document intelligence, OCR processing, and sentiment analysis at scale, though compliance requirements impose real constraints. NIST standards, EU AI Act mandates, and global regulatory frameworks shape every architectural decision, from model design to data storage.
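In code, that phased pattern often reduces to a confidence gate: outputs the model is unsure about route to a human reviewer, and the threshold tightens or loosens as trust grows. The sketch below assumes a stubbed model call and an illustrative 90% threshold; neither reflects any panelist's actual pipeline.

```python
# Sketch of a human-in-the-loop gate for a document-intelligence pipeline.
# The classifier is a stub and the 0.90 threshold is an illustrative
# assumption; a higher threshold keeps more decisions with humans.

CONFIDENCE_THRESHOLD = 0.90

def classify_document(text: str) -> tuple[str, float]:
    """Stand-in for a real OCR/classification model call."""
    return ("invoice", 0.87)

def route(text: str) -> str:
    label, confidence = classify_document(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return f"auto-processed as {label}"  # later phase: full automation
    # early phase: augmentation, with a person making the final call
    return f"queued for human review ({label}, {confidence:.0%})"

print(route("Remit to: ..."))
```

Raising the threshold over successive phases is how "augment first, automate later" becomes an operational dial rather than a slogan.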
One of the more pointed conversations from the panel concerned the long-term viability of standalone data governance and quality vendors. Collibra, Informatica, Alteryx, and Alation are now losing ground as hyperscalers bundle governance, lineage, and data management capabilities directly into their cloud ecosystems.
"Those [legacy] architectures have kind of plateaued," said one executive. Another described a more structural shift: the emergence of AI has introduced entirely new data types, including vector databases, unstructured content pipelines, and enterprise content management, that legacy vendors were simply not built to handle. "In the context of AI, content management also becomes data and has to be treated as data, because it goes into foundational models."
The Technology Spending Intentions Survey (TSIS) puts hard numbers to this trend. From January 2024 to October 2025, Alation's Net Score fell from 41% to 12%; Alteryx's Net Score fell from 9% to -22%; Collibra's Net Score fell from 20% to 4%; and Informatica's Net Score fell from -9% to -16%. These are not minor fluctuations. They represent a meaningful contraction in enterprise confidence in these platforms.
One panelist offered a more optimistic counter-view: advanced AI systems may eventually offset data quality problems by building ontologies and knowledge graphs capable of interpreting inconsistent inputs. He cited Palantir as an example of a provider that "can actually make sense out of your mess." But he was equally clear-eyed about the inverse risk: as AI models become more sophisticated, governance of regulated and confidential data becomes more critical, not less. PII management, model auditability, and the ability to defend decisions under regulatory scrutiny are non-negotiable in healthcare and financial services.
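At its smallest scale, "making sense out of your mess" looks something like the sketch below: an ontology of canonical entities with known aliases, against which inconsistent source values resolve to a single concept. The entity and aliases are hypothetical, and this is not a depiction of Palantir's implementation; production knowledge graphs add relationships, provenance, and fuzzy matching.

```python
# Tiny sketch of ontology-backed reconciliation: inconsistent source values
# resolve to one canonical entity. Aliases here are hypothetical examples.

ONTOLOGY = {
    "Acme Insurance Co.": {"acme", "acme ins", "acme insurance", "ACME CO"},
}

def canonicalize(raw: str) -> str:
    """Map a messy source value to its canonical entity, if one is known."""
    cleaned = raw.strip().lower().rstrip(".")
    for canonical, aliases in ONTOLOGY.items():
        if cleaned in {a.lower() for a in aliases} or cleaned == canonical.lower().rstrip("."):
            return canonical
    return raw  # unresolved values surface for human curation

for messy in ("ACME CO", "acme ins", "Acme Insurance Co."):
    print(f"{messy!r} -> {canonicalize(messy)!r}")
```

Note the fallback: anything the ontology cannot resolve is returned untouched, which is where the governance and human-curation work the panelists described comes back in.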
Despite intense competition from hyperscaler-native offerings, Snowflake and Databricks maintain strong positions across regulated industries, largely because they let organizations operate across AWS, Azure, and GCP without locking into any one cloud provider's native stack. "That would give initial momentum to Snowflake, that it maintained this cloud-agnostic posture, while at the same time being probably the best-in-class next generation data warehouse," noted one panelist, who indicated his organization deliberately avoids native offerings like BigQuery for this reason.
Databricks benefits from its open-source lineage, particularly its best-in-class curation of Apache Spark and MLflow for pre-generative AI (pre-GenAI) machine learning workloads. For financial services firms, Snowflake's curated third-party data marketplace provides additional strategic value. "I believe they are uniquely positioned to capture significant market share, particularly in the financial industry, because of this 'data mart,'" the financial services panelist said. Although Snowflake has narrowed the gap with Databricks on AI capabilities, most enterprises use both in parallel, each optimized for different workloads, pricing models, and performance requirements. The skill transferability across both platforms also reduces retraining burden. "It's easy to train our engineers with AI capabilities when they already know Snowflake or Databricks from their past experience."
Storage was supposed to be a solved problem, a commodity line item managed quietly in the background. AI changed that calculus. As workloads scale and data complexity grows, organizations are reassessing their storage and streaming architectures with fresh urgency.
In financial services, real-time streaming has long been foundational. Healthcare has historically adopted streaming only for niche use cases, but AI is forcing a broader reassessment. In regulated environments, storage is not just a performance question. It is a compliance one. Tiered storage architectures (tiers 1, 2, and 3), validation certifications, and data residency requirements all factor into vendor selection.
MinIO and Pure Storage are emerging as preferred options in this environment. "MinIO and Pure Storage have become more cost-palatable over time. That gives us kind of a comfort level of actually scaling them, even though they're not the lowest cost." In financial services, the calculus is about efficiency at scale. "Storage has been relegated to commodity for a long time, but because of the sheer scale, it is becoming material again in terms of squeezing better efficiencies. That's why we work with ODMs such as Broadcom and Supermicro on a white box strategy."
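Part of MinIO's appeal is that it speaks the S3 API, so existing tooling can point at an on-prem cluster without rework. A sketch using the standard boto3 client, where the endpoint, credentials, and bucket name are hypothetical:

```python
# Sketch: because MinIO is S3-API compatible, standard AWS tooling such as
# boto3 can target an on-prem cluster just by overriding the endpoint.
# The endpoint URL, credentials, and bucket name below are hypothetical.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://minio.internal.example.com:9000",  # on-prem MinIO
    aws_access_key_id="AKIA_EXAMPLE",
    aws_secret_access_key="example-secret",
)

# Same calls an AWS S3 bucket would take: upload a model artifact...
s3.upload_file("model.onnx", "ml-artifacts", "models/v1/model.onnx")
# ...and list what's stored, e.g., for a retention or compliance sweep.
for obj in s3.list_objects_v2(Bucket="ml-artifacts").get("Contents", []):
    print(obj["Key"], obj["Size"])
```

That API compatibility is a large part of why these platforms can be swapped into regulated environments without rebuilding the surrounding pipeline.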
The executives on this panel are not pessimists about AI. They are practitioners who have moved past the hype and into the hard work. The consistent message: AI success in the enterprise is won through sequencing, not speed. Identify the use cases where data quality is already strong. Build the semantic layer that bridges your heterogeneous architecture. Invest in phased roadmaps with clear human oversight at each stage. Accept that governance is not a legacy concern. It is an AI prerequisite.
Key Takeaways
Data management challenges from the self-service BI era are directly relevant to AI. Enterprise AI adoption continues to be limited by the same decentralized data landscapes, governance issues, and integration complexities that once challenged self-service BI.
Distributed, multi-cloud architectures are the norm. Organizations are embracing heterogeneous, multi-cloud data environments, often blending Snowflake, Databricks, and other platforms, as they accept that data will remain distributed rather than fully centralized.
Clear AI use cases matter more than technical solutions. The biggest barrier to AI success is often strategic and conceptual, not technical, requiring phased roadmaps, realistic expectations, and careful alignment of use cases with data readiness.
Legacy governance vendors are losing ground to hyperscalers. Vendors like Informatica and Collibra are seeing meaningful Net Score declines as hyperscalers absorb more governance, lineage, and data management capabilities into their cloud ecosystems.
Storage is strategically important again. Growing AI workloads are pushing organizations toward validated, cost-efficient, scalable storage solutions, including Pure Storage and MinIO, across regulated industries.
Opinions about AI's transformative potential are everywhere. What is harder to find is the unfiltered view from technology leaders who are building it under real constraints, with real data, in real regulated environments. That is exactly what ETR's research is designed to surface.