The Books That Don't Balance

This is part IV of a four-part series on the state of AI. If you haven't already, please read part I first.

Three times now, this series has shown you a number and, despite the angels singing and the future EBITDA claims, told you it meant something was wrong.

The first piece traced what nine years of data have consistently failed to show: that advances in AI model capability, and the broad adoption that followed, cause any corresponding movement in the proportion of organisations capturing real financial value from artificial intelligence, a proportion lodged stubbornly between five and twenty per cent while the technology generation-shifted three or four times beneath it. The second described the permanent pilot: a quarter of organisations in production, just over half of them forever expecting to arrive there next quarter, an expectation that has recurred in survey after survey for the better part of a decade. The third showed that a third of organisations claim deep transformation even though a majority have not redesigned a single job.

Diagnosis, however, is the easy part. A reader who has followed the argument this far is entitled to the question the series has so far declined to answer: if this is the pattern, and if it has held for nine years across every generation of the technology, what is a board or a chief executive actually supposed to do about it?

This is that piece. It is longer than the others because the prescription is harder than the diagnosis, and because a finale that merely restated the problem at greater length would be its own small act of theatre. What follows is not a framework, a maturity model, or a five-step path to AI excellence. The market is well supplied with those, and their proliferation is itself a symptom of the problem. What follows is an attempt to say plainly what the organisations capturing value are doing differently, and what a board or an executive who wanted to join them would have to change.

What the Numbers Were Telling Us

The three numbers look like three different findings. In reality they are one finding, photographed from three angles.

The value-capture gap, the permanent pilot, and transformation theatre are not separate failures of separate kinds. They are three symptoms of a single deficit, and the deficit is rarely technological. This is the part most executives find hardest to accept, because it cannot be fixed by procurement. Organisations capturing real value from AI have access to precisely the same models, the same vendors, the same consultants, and the same cloud infrastructure as the organisations capturing none. The frontier model is a commodity; anyone with a corporate credit card can rent the best one in the world by the hour. What separates the two groups is not what they have bought. It is what they are able to do with it. And the hard truth is that, in some cases, closing the gap means replacing the board, the CEO, or both.

McKinsey's data makes the point more precisely than rhetoric can. The organisations it classifies as high performers, the six per cent attributing meaningful EBIT impact to their use of AI, are distinguished from the rest not by their technology but by a cluster of organisational practices. They redesign their workflows rather than bolting AI onto the work as it already exists; fundamental workflow redesign is among the strongest single predictors of value capture, and not just in the realm of AI. They have senior managers who demonstrate genuine ownership rather than ceremonial sponsorship. They have defined processes for determining when a human must check the machine's output. They track the value of what they deploy, and they know which initiatives are earning their place. None of these is a technical capability. Every one of them is a governance and management capability. The high performers are not better at AI. They are better at running organisations, and they have applied that competence to AI.

This is also why the headline numbers appear to contradict one another, and why directors are right to feel they are being whiplashed. One widely cited study reports that the overwhelming majority of enterprise AI initiatives fail to show measurable return; another, published weeks later, finds that the great majority of early adopters of agentic systems are seeing positive returns. Both are accurate. They simply describe different views of the same scene. The second studies the Formula 1 of digital adoption (elite engineering, tightly controlled conditions, a pit crew of specialists, generous budgets) and reports, unsurprisingly, impressive lap times. The first studies the daily commute, where data is messy, appetite is uneven, and cultural readiness varies wildly. The average board is not governing a pit crew. It is governing a commute, and it must calibrate its expectations to the road it is actually on rather than the circuit the vendor is selling.

The quiet finding underneath nine years of survey data is considerably more uncomfortable than any of the headline figures. A technology gap can be closed by spending. A capability gap cannot. The organisation that has spent a decade failing to convert old technology into value will not convert new technology into value either, because the thing it lacks is not the technology. It is the institutional capacity to deploy anything well: to decide which problems are worth solving, to change the work so the solution can take hold, to govern the result once it is running, and to stop it when it stops earning its keep. The pattern is an old one in the strategy literature, where it travels under the name of active inertia—the tendency of once-successful organisations to respond to change by accelerating the routines that made them successful, rather than questioning them.

There is a more candid name for what most of this produces at board level, and it is worth using. It is risk washing: the appearance of control substituting for the substance of governance. Nearly a third of boards still do not have AI on the agenda, and two-thirds of directors admit limited to no knowledge of it. Beneath that veneer, staff are effectively a law unto themselves—the better part of half of all employees have uploaded sensitive company information into public tools, most commonly in the very organisations that have banned them from doing so. The policy exists; the behaviour it claims to govern does not even notice it. A board that has issued a stern all-staff email and filed a glossy ethics statement has not governed anything. It has decorated the problem.

The most recent data sharpens this from an observation into something closer to a warning. Three-quarters of companies now intend to deploy autonomous agents—systems that do not merely recommend a course of action but take it—within two years; barely a fifth report a mature model for governing them (Deloitte, 2026). The gap between those figures is the value-capture gap wearing a more dangerous costume. An organisation that could not govern a recommendation engine is proposing to deploy systems that act in the world on its behalf, and to do so before it has built the capacity to govern them. The pattern that produced nine years of disappointing returns is about to be applied to a class of technology whose failure modes are not merely disappointing. This is the stake, and it is why the question of what to do differently has stopped being academic.

What a Board Can Actually Do

A board cannot write code, redesign a workflow, or integrate a system. Its instruments are different, and in this domain they are badly underused. The board's task is not to approve the AI strategy. It is to test, repeatedly and without sentiment, whether the organisation has built the capacity to execute one. That is a different activity from the one most boards perform, which is to receive a quarterly update, observe that the roadmap is green, and move to the next item on the agenda.

The starting point is to stop treating strategy as evidence of capability. A strategy deck describes what management intends; it says nothing about whether the organisation can deliver it. The questions that reveal capability are operational rather than strategic, and they are uncomfortable precisely because they are answerable. Not "what is our AI vision?" but "what is being done differently in the business today than a year ago, and can you show me?" A leadership team that has genuinely changed something can answer in a sentence. A team that has been performing transformation will reach for the roadmap. The reach for the roadmap is the answer.