Intelligence · 20 June 2026 · 4 min read

The forty-billion-dollar silence: why most enterprise AI never ships

The world spent a fortune teaching machines to work, and 95% of it returned nothing. A field report on where enterprise AI goes to die, the vendors faking the cure, and the discipline that ships it.

By Taha Zaheer · Founder, Beyond Partners

The world spent a fortune teaching machines to work. Then almost nothing happened. This is a short field report on where enterprise AI actually goes to die, the impostors selling the cure, and the unglamorous discipline that brings it back to life.

The silence

In 2025, enterprises poured tens of billions into generative AI. By the end of the year, 95% of the pilots that money funded had returned nothing measurable. Not low returns. Zero (MIT, 2025). The industry even coined a name for the place these projects go: pilot purgatory. Funded, demoed, applauded, and then quietly never heard from again.

The autopsy

The autopsy held a surprise. The model was rarely the cause of death. MIT found the failure was approach and operating discipline: messy data, no owner, no path from a demo to a workflow that actually runs. And buried in the same research, a tell. Teams that bought from specialists reached production about twice as often as teams that built in-house (MIT, 2025). By 2025, even Gartner was telling CIOs the quiet part out loud: buy, don't build.

A demo is the floor. Production is the moat.

The impostors

Which sounds like good news, until you go shopping. Of the thousands of vendors now selling agentic AI, Gartner reckons only around 130 are the real thing (Gartner, 2025). The rest are agent washing: a chatbot in a trenchcoat, an RPA script with a new logo and a confident deck. The question was never whether to bring in an operator. It was which one actually ships.

The commerce twist

This bites hardest in commerce, for a reason most decks skip. An agent is only ever as good as the data it can read, and a retailer's data is scattered across a website, a handful of marketplaces, and a row of stores that don't agree with each other. Retail leads the world on AI adoption and trails it on production (industry reports). The ambition is there. The plumbing isn't.

The method

So here is the unglamorous truth we built a company around. You don't fix an organisation by pouring more AI into it. Output is never governed by how busy every part looks. It's governed by one constraint: the single place the work actually chokes. You find it. You clear it with one agent, shipped to a number. Then you find the next one. It's the oldest idea in operations, and the AI scramble forgot it entirely. Motion you can measure, not motion that looks busy.
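
For readers who think in code, the constraint idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real system: the stage names, the daily capacities, and the helpers `find_constraint` and `throughput` are all hypothetical, and it assumes a strictly serial workflow in which each unit of work passes through every stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    capacity_per_day: int  # units the stage can complete per day (hypothetical)


def find_constraint(stages: list[Stage]) -> Stage:
    """Return the stage with the lowest capacity: the place the work chokes."""
    if not stages:
        raise ValueError("a workflow needs at least one stage")
    return min(stages, key=lambda s: s.capacity_per_day)


def throughput(stages: list[Stage]) -> int:
    """End-to-end output of a serial workflow is capped by its slowest stage."""
    return find_constraint(stages).capacity_per_day


# Hypothetical commerce workflow; every figure here is invented for illustration.
workflow = [
    Stage("catalog enrichment", 400),
    Stage("marketplace listing", 1200),
    Stage("fulfilment routing", 900),
]
print(find_constraint(workflow).name, throughput(workflow))

# Speeding up a stage that is not the constraint leaves output unchanged.
faster_elsewhere = [
    Stage("catalog enrichment", 400),
    Stage("marketplace listing", 5000),
    Stage("fulfilment routing", 900),
]
print(throughput(faster_elsewhere))

# Clearing the constraint raises output, and a new constraint takes its place.
cleared = [
    Stage("catalog enrichment", 1500),
    Stage("marketplace listing", 1200),
    Stage("fulfilment routing", 900),
]
print(find_constraint(cleared).name, throughput(cleared))
```

In this toy model, quadrupling the listing stage changes nothing, while clearing catalog enrichment lifts output from 400 to 900 and moves the constraint to fulfilment routing. That is the find, clear, repeat loop in miniature.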

There's a stalled workflow in your business right now. Give us thirty minutes and we'll tell you, honestly, whether it's worth clearing. Then, if it is, we ship the agent that clears it, and we run it.

Frequently asked questions

Why do most enterprise AI pilots never reach production?

MIT's 2025 research found the failure is rarely the model. It's approach and operating discipline: messy data, no clear owner, and no path from a demo to a workflow that runs. 95% of pilots returned nothing measurable.

Is it better to build AI in-house or buy from a specialist?

The same MIT research found teams that bought from specialists reached production about twice as often as teams that built in-house. By 2025, Gartner was advising CIOs to move toward commercial solutions for more predictable value: buy, not build.

What is agent washing?

Vendors rebranding chatbots and RPA as 'AI agents' without the capability to ship them into production. Gartner estimates only around 130 of the thousands of agentic-AI vendors are genuine. The question isn't whether to bring in an operator, but which one actually ships.
