
Behind the AI glitter, the implementation still needs to be done

The unglamorous half of every AI deployment is the part that decides whether it works. A practical test for what is usually missing from AI proposals.


In November 2024, I led a wellbeing-design workshop with seven teams of clinicians, researchers, and wellbeing teachers. Ninety minutes, flipchart paper, markers, and tape. The posters shared a gap I have been watching in large IT organisations for thirty-five years. When that piece is missing, the new initiative does not land.

Seven concepts came out. A randomised cross-role lunch programme. A break-prompt system combining SMS and signage with dedicated five-to-ten minute physical spaces. A drop-in space staffed by chaplains and arts practitioners. An improv-based interprofessional team-building app. A discussion-and-role-play intervention for early-career professionals. A meta-platform for tailoring interventions across contexts. And an AI communication-coaching tool that role-plays difficult conversations.

That last one was designed in November 2024. It would be straightforward to build today: a capable foundation model with a system prompt describing the role-play scenario; voice in and voice out, so it feels like a real conversation; a thin web interface around the whole thing. A clinician working with a coding assistant could ship a prototype over a long weekend. In November 2024, none of those pieces was mature enough, fast enough, or cheap enough.
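
To make the long-weekend claim concrete, here is a minimal text-only sketch of that prototype core, assuming Python and the OpenAI SDK as the model client. The poster named no provider or stack, so the model name, scenario text, and coach_session helper are all illustrative, and the voice layer and web interface are left out.

```python
# Minimal sketch of the role-play coaching prototype. Assumptions: the
# OpenAI Python SDK as the client, a text-only terminal loop in place
# of voice, and an illustrative scenario. This is the weekend core,
# not the privacy-reviewed, clinically safeguarded product.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The system prompt carries the entire role-play scenario.
SCENARIO = (
    "You are role-playing a difficult workplace conversation for "
    "clinician communication training. Play a colleague who feels "
    "blamed for a near-miss. Stay in character, answer briefly, and "
    "push back realistically if the trainee is dismissive."
)

def coach_session() -> None:
    """Run a turn-by-turn role-play in the terminal; empty input quits."""
    messages = [{"role": "system", "content": SCENARIO}]
    while True:
        user_turn = input("You: ").strip()
        if not user_turn:
            break
        messages.append({"role": "user", "content": user_turn})
        reply = client.chat.completions.create(
            model="gpt-4o",  # assumption: any capable chat model will do
            messages=messages,
        )
        text = reply.choices[0].message.content
        messages.append({"role": "assistant", "content": text})
        print(f"Colleague: {text}")

if __name__ == "__main__":
    coach_session()
```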

The weekend produces the glitter. Privacy review, clinical safeguards, and validation are the unglamorous half, and they do not fit in a weekend.

But building is the easier half of the story. What struck me wasn't what was on the posters. It was what wasn't. None of the seven named outcome metrics. None named trauma-informed safeguards. None mentioned cost or equity for night-shift, part-time, or locum staff. This is not a criticism of the cohort. It is the structural shape of implementing wellbeing in the field. The work needs two halves: an idea, and an implementation plan that survives contact with a real organisation. The cohort produced seven good ideas in ninety minutes. The implementation half is what comes next.

Glasgow and colleagues mapped this gap in 1999 with a framework they called RE-AIM. Five questions you can ask of any intervention: Reach (who actually receives it), Effectiveness (does it work in practice), Adoption (which settings and staff take it up), Implementation (is it delivered consistently, and at what cost), Maintenance (does it last). Each tests whether an idea on paper will translate to real-world impact in a real organisation. Twenty-five years later, a room of inventive practitioners with markers still hits the same gap. The questions matter even more with AI in the loop, because the AI sounds confident regardless of whether the output deserves that confidence.

In IT, the pattern is older than the current AI conversation. A platform getting deployed is not the same as a platform getting used. The rollouts that succeed have an implementation team behind them: people whose actual job is to train the users, embed the tool into the daily workflow, and handle the awkward cases the launch demo never showed. The rollouts that fail tend to skip that work, or assume it will happen on its own.

What an implementation team actually does is harder to specify than what a build team does. They run the training session where someone in the back row grumbles. They sit with the team in week three when the new workflow feels worse than the old one. They notice when a senior practitioner has quietly stopped using the tool, and they ask why. None of that is in the demo.

What AI changes is the speed and cost at which a credible-looking demo can be produced. That is genuinely new. What AI does not change is the need for the implementation half. AI can help with parts of that work: drafting training materials and onboarding documents, or powering chat-based assistants. What it does not yet do is convince people to adopt the tool. That is still the part that decides whether the demo ever lands. Done well, the project realises its full potential. Done badly, the tool gets lip service and the value goes unrealised.

If you are commissioning an AI project, the question I would hold close is this: who owns the implementation half? The training, the embedding, the change to how people actually work, the awkward six months after launch where the tool either becomes part of the practice or becomes a screenshot in someone's leaving deck. If you cannot name that team, you are budgeting only for the model and the prompt. The other half is the part that decides whether it works.

A practical test before you sign off on the next AI proposal is to look at the line items. Model access, prompt engineering, and a pilot cohort are usually there. Change management, sustained training, and the headcount that owns the awkward six months after launch are usually not.

If the proposal has not budgeted for what happens in month seven, it has not budgeted for whether the thing works.


Reference: Glasgow RE, Vogt TM, Boles SM. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. American Journal of Public Health 1999;89(9):1322-1327. Resources and ongoing scholarship at re-aim.org.

Thoughts on this piece? Join the conversation on LinkedIn.


Last updated: 14 May 2026