
Why Your Portfolio Company's AI Initiative Is Stalling

Over $200 billion has been invested in AI data centers and semiconductors since 2020. Every PE firm has an AI thesis. Every operating partner has been asked to identify AI use cases across the portfolio. Every portfolio company CEO has been told that AI is a value creation lever.

And most of it is going nowhere.

Bain found that only 20% of portfolio companies have operationalized GenAI with concrete, measurable results. The rest are stuck somewhere between pilot and production. Running proofs of concept that never graduate. Testing tools that work on demo data but fail on real company data. Hiring AI talent that spends 80% of their time on data wrangling instead of model development.

Meanwhile, 40% of PE professionals cite AI deployment as a decade-long challenge for their portfolio. Not a quarter-long challenge. A decade-long one.

This is not a technology problem. The AI tools work. GPT-4 can write. Claude can reason. Machine learning models can predict. The models improve every quarter. The infrastructure is getting cheaper. The APIs are getting easier.

The problem is almost always underneath the AI layer. It is the data.

The uncomfortable truth about AI readiness

Here is what I have observed across dozens of mid-market companies that attempted AI initiatives.

The pilot succeeds. A team builds a proof of concept using clean, curated data. It works. The demo is impressive. Leadership gets excited. Budget is approved.

Then the team tries to operationalize it on real company data. Customer records have duplicates and missing fields. Revenue data does not reconcile across systems. Product catalogs are inconsistent between the ERP and the e-commerce platform. Historical data has gaps from a system migration 18 months ago.

The AI model that worked perfectly on clean demo data produces unreliable results on messy production data. The team spends months cleaning, transforming, and reconciling. By the time the data is good enough, the business priority has shifted, the champion has moved on, or the board has lost patience.

This pattern repeats because companies invest in AI from the top of the stack (models, tools, talent) instead of the bottom (data quality, data infrastructure, data governance). It is the equivalent of hiring a Formula 1 driver and handing them a car with three flat tires.

Five reasons AI initiatives stall

1. No single source of truth for customer data

Every AI use case that touches customers, which is most of them, needs a reliable customer data foundation. Personalization, churn prediction, cross-sell recommendations, customer service automation. All of these require a unified view of who the customer is, what they have purchased, how they have interacted with the company, and what their current status is.

Most mid-market companies do not have this. They have customer data in the CRM, transaction data in the ERP, usage data in the product platform, support data in the ticketing system, and marketing engagement data in the email tool. None of these systems agree on basic facts like how many customers the company has or what their annual spend is.

An AI model trained on fragmented customer data produces fragmented results. A churn prediction model that does not see support ticket history will miss the most important signal. A personalization engine that does not know a customer’s full purchase history will make irrelevant recommendations.

What to do about it. Before any customer-facing AI initiative, build a customer master that reconciles identity across your core systems. You do not need a customer data platform. You need a reconciled list that maps each customer across CRM, ERP, and product systems with a unique identifier. Start with your top 200 customers by revenue. In most mid-market businesses, those accounts represent the large majority of revenue, which is the part that matters.
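The reconciled list does not require heavy tooling. A minimal sketch of the idea, assuming hypothetical CSV-style exports from the CRM and ERP where each record has an `id` and a company `name`, might look like this. The normalization rules and record shapes are illustrative, not a prescription:

```python
import re
import uuid

def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes for matching."""
    name = re.sub(r"[^a-z0-9 ]", "", name.lower())
    for suffix in (" inc", " llc", " ltd", " corp"):
        name = name.removesuffix(suffix)
    return name.strip()

def build_customer_master(crm_records, erp_records):
    """Map each normalized company name to one master ID plus its source-system IDs."""
    master = {}
    for source_field, records in (("crm_id", crm_records), ("erp_id", erp_records)):
        for rec in records:
            key = normalize(rec["name"])
            entry = master.setdefault(key, {"master_id": str(uuid.uuid4())})
            entry[source_field] = rec["id"]
    return master

# Hypothetical exports: the same customer spelled differently in each system.
crm = [{"id": "C-101", "name": "Acme, Inc."}, {"id": "C-102", "name": "Globex LLC"}]
erp = [{"id": "E-9", "name": "ACME Inc"}, {"id": "E-12", "name": "Initech Ltd."}]

master = build_customer_master(crm, erp)
# "Acme, Inc." and "ACME Inc" collapse to a single master record holding both IDs.
```

In practice, name matching alone will miss edge cases (subsidiaries, renames), so the exact-match step is usually followed by a manual review of the unmatched remainder, which is why starting with the top 200 accounts keeps the effort bounded.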

2. Manual data pipelines that break when you need real-time

AI initiatives often require data that is fresher than what your current infrastructure delivers. A predictive model needs yesterday’s data, not last month’s. A customer-facing AI feature needs real-time data, not a batch export that runs overnight.

Most mid-market companies run their data movement through manual or semi-manual processes. Someone exports a CSV from the ERP every morning. Someone else uploads it to the BI tool. A scheduled script moves data from the CRM to the data warehouse once a day, when it does not fail silently.

These pipelines were built for human-speed reporting. They are adequate for a monthly board deck. They are not adequate for AI systems that need current, complete data to function.

What to do about it. Audit your data pipelines. Map every point where data moves between systems. For each one, document whether it is real-time, batch (with frequency), or manual. Identify the AI use cases that need real-time or near-real-time data. For those specific pipelines, invest in automation. You do not need to rebuild everything. You need to upgrade the three or four pipelines that feed AI use cases.
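The audit itself can be as simple as a structured inventory that flags which feeds need attention. A sketch, with entirely illustrative pipeline entries:

```python
from dataclasses import dataclass

@dataclass
class Pipeline:
    source: str
    target: str
    mode: str        # "real-time", "batch", or "manual"
    frequency: str   # for batch/manual: "daily", "monthly", etc.
    feeds_ai: bool   # does an AI use case depend on this feed?

# Hypothetical inventory built during the audit.
inventory = [
    Pipeline("ERP", "data warehouse", "manual", "daily", feeds_ai=True),
    Pipeline("CRM", "data warehouse", "batch", "daily", feeds_ai=True),
    Pipeline("ERP", "BI tool", "batch", "monthly", feeds_ai=False),
    Pipeline("product platform", "data warehouse", "real-time", "n/a", feeds_ai=True),
]

# Upgrade candidates: feeds that AI use cases depend on but that are not yet real-time.
upgrade = [p for p in inventory if p.feeds_ai and p.mode != "real-time"]
for p in upgrade:
    print(f"Upgrade: {p.source} -> {p.target} ({p.mode}, {p.frequency})")
```

The point of the structure is the filter at the end: it turns "audit everything" into a short, defensible list of the three or four pipelines worth automating first.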

3. Data quality too low for ML training

Machine learning models learn from historical data. If that data contains errors, inconsistencies, gaps, and duplicate records, the model learns the wrong patterns. The technical term is “garbage in, garbage out,” but the financial term is more relevant: wasted investment.

A revenue forecasting model trained on three years of inconsistently categorized revenue will produce inconsistent forecasts. A demand prediction model built on inventory data with unexplained discontinuities from a system migration will produce unreliable predictions. A customer segmentation model trained on a customer master with 15% duplicate records will produce meaningless segments.

The data quality bar for ML is higher than the bar for human reporting. A human analyst can look at a chart with a strange spike and think “that was the ERP migration, ignore it.” A model cannot. It treats the spike as signal and adjusts its predictions accordingly.

What to do about it. Before starting any ML initiative, profile the training data. Measure completeness (what percentage of records have all required fields), consistency (do the same metrics match across sources), accuracy (spot-check a sample against source documents), and timeliness (how current is the data). Set minimum thresholds for each dimension. If the data does not meet them, fix the data first. This is not optional. It is the prerequisite.
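Two of those dimensions, completeness and timeliness, can be scored mechanically. A minimal sketch, assuming hypothetical records with an `updated` date field; the thresholds shown are placeholders to be tuned per use case:

```python
from datetime import date

def profile(records, required_fields, max_age_days, as_of):
    """Score completeness and timeliness for a batch of candidate training records."""
    total = len(records)
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields) for r in records
    )
    fresh = sum((as_of - r["updated"]).days <= max_age_days for r in records)
    return {"completeness": complete / total, "timeliness": fresh / total}

# Illustrative minimum thresholds; set these per use case before modeling starts.
THRESHOLDS = {"completeness": 0.95, "timeliness": 0.90}

records = [
    {"customer": "Acme", "segment": "mid", "updated": date(2024, 5, 1)},
    {"customer": "Globex", "segment": "", "updated": date(2023, 1, 1)},
]
scores = profile(records, ["customer", "segment"], max_age_days=90, as_of=date(2024, 6, 1))
ready = all(scores[dim] >= bar for dim, bar in THRESHOLDS.items())
# In this sample, both scores are 0.5: the data fails the gate, so fix it first.
```

Consistency and accuracy are harder to automate (they require cross-source comparison and spot-checks against source documents), but the same pattern applies: measure, compare against a pre-agreed threshold, and treat a failing score as a blocker, not a footnote.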

4. No governance framework for AI outputs

Even when the data is good enough and the model works, companies stall because they have not thought through how AI outputs integrate with business decisions. Who is accountable when an AI recommendation is wrong? What is the escalation path when the model produces an unexpected result? How are AI outputs audited?

This is not hypothetical. A portfolio company deployed an AI pricing tool that recommended a 30% price increase for a customer segment. The recommendation was technically correct based on elasticity data, but it did not account for a long-term strategic relationship with the segment’s largest customer. Nobody caught it because there was no review process for AI recommendations. The customer escalated directly to the CEO.

The governance gap is especially acute in regulated industries. Financial services, healthcare, and government-adjacent businesses face specific requirements around explainability, bias testing, and audit trails for AI-driven decisions. Most mid-market companies have not built these frameworks because they did not need them before AI.

What to do about it. Before deploying any AI system that influences customer-facing or financial decisions, define the governance framework. At minimum, cover four elements. Decision authority (which decisions can AI make autonomously, which require human review). Monitoring (how do you detect when the model produces anomalous results). Audit trail (can you explain any AI-driven decision after the fact). Escalation (what happens when something goes wrong). This framework does not need to be 50 pages. Two pages covering these four elements is enough to start.
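Three of those four elements (decision authority, audit trail, escalation) can be enforced with a thin routing layer in front of the model. A sketch of the idea, using a hypothetical pricing use case with an invented 10% autonomy limit:

```python
from datetime import datetime, timezone

# Hypothetical policy: the AI may apply a price change on its own only within this bound.
AUTONOMY_LIMITS = {"price_change_pct": 10.0}

audit_log = []  # append-only trail, so any AI-driven decision can be explained later

def route_recommendation(customer, price_change_pct, rationale):
    """Auto-apply within the autonomy limit; otherwise escalate to human review."""
    autonomous = abs(price_change_pct) <= AUTONOMY_LIMITS["price_change_pct"]
    decision = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "customer": customer,
        "price_change_pct": price_change_pct,
        "rationale": rationale,
        "route": "auto-apply" if autonomous else "human-review",
    }
    audit_log.append(decision)
    return decision["route"]

route_recommendation("Acme", 4.0, "elasticity model v2")    # within limits: auto-apply
route_recommendation("Globex", 30.0, "elasticity model v2") # escalated to a human
```

A gate like this would have caught the 30% price increase in the story above: the recommendation still gets made, but it lands on a reviewer's desk with its rationale attached instead of going straight to the customer.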

5. Skills gap between data engineering and AI engineering

Most mid-market companies have analytics people. They can build reports, create dashboards, and run ad hoc analyses in Excel or Tableau. What they typically do not have is data engineers who can build reliable data pipelines, ML engineers who can train and deploy models, or AI specialists who can evaluate and implement LLM-based solutions.

The response is often to hire an AI person. A single senior hire expected to identify use cases, build the data infrastructure, develop models, deploy them, and manage the governance framework. This person either burns out, spends all their time on data plumbing, or delivers a pilot that never gets to production because the supporting infrastructure does not exist.

What to do about it. Sequence the hires. The first hire should be a data engineer, not an AI engineer. Someone who can clean up the data pipelines, build the integrations, and create the reliable data foundation that AI requires. Once that foundation exists (typically six to nine months of work), then hire or contract the AI expertise. Alternatively, use the data engineer to prepare the environment and bring in a specialized AI team for a time-boxed engagement to build and deploy specific use cases.

Vista Equity Partners requires quantified GenAI goals as part of annual planning for their portfolio companies. This is the right approach, but the quantification needs to include the data readiness investment, not just the AI investment. A goal of “deploy AI-driven customer churn prediction by Q3” means nothing if the underlying customer data is not unified, clean, and accessible.

The sequence that works

The companies that successfully operationalize AI follow a consistent sequence.

Phase 1. Fix the data (months 1 through 6). Reconcile core data sources. Build reliable pipelines. Clean the data that will feed AI use cases. This is not an AI project. It is a data project that enables AI.

Phase 2. Pick one use case and deploy it (months 4 through 9). Overlap with Phase 1. Choose the use case with the clearest ROI and the most ready data. Deploy it to production with governance. Measure results.

Phase 3. Scale (months 9 through 18). Use the first deployment as proof of concept for the organization. Apply the same data and governance framework to additional use cases. Each subsequent deployment is faster because the foundation exists.

The companies that skip Phase 1 and go directly to Phase 2 are the ones stalling. They end up in an endless loop of pilot, data issues, restart, pivot, new pilot, same data issues.

What to tell the board

If you are an operating partner or a portfolio company CEO being asked about your AI strategy, here is the honest answer.

“We have identified three specific AI use cases with quantified ROI. The first requires customer data that is currently fragmented across four systems. We are investing $X and Y months in building a unified customer data layer. Once that layer is in place, the AI deployment is a 60-day project. We expect the first use case in production by [specific date] with projected annual impact of $Z.”

That is a credible answer. It acknowledges the data prerequisite. It puts a timeline and a dollar figure on the infrastructure work. It connects the AI investment to a specific business outcome.

What does not work is “we are exploring AI use cases” or “we have a pilot underway” without a clear path to production, a timeline, or a measurement framework. The board has heard that for two years now. They are running out of patience.

The bottom line

AI readiness is data readiness. The companies that operationalize AI successfully are the ones that invest in the data layer first, pick specific high-ROI use cases, and deploy with governance. The companies that stall are the ones that buy AI tools before fixing the data those tools need to consume.

The $200 billion flowing into AI infrastructure will create enormous value. But that value will accrue to the companies whose data is ready to absorb it. For everyone else, it will remain an impressive demo that never reaches production.

For more on the connection between data readiness and AI capability, see AI Readiness Starts with Data Readiness.

For a practical framework on getting the data foundation right, see One-Page Data Value Creation Plan.

For a weekly brief on AI deployment, data strategy, and PE portfolio operations, subscribe to Inside the Data Room.