ai ai-tooling engineering-management team-productivity

The 5 Stages of AI Tooling Adoption Every Engineering Team Goes Through


85% of developers use AI tools. Most teams have no strategy for them. Here's a five-stage framework for what adoption actually looks like — and why almost everyone is stuck at stage two.

Every engineering team I talk to tells me they’re “using AI.” When I dig into what that means, it’s almost always the same story: the company bought Copilot licenses, a few people use Claude or ChatGPT on the side, and everyone has a vague sense that they should be getting more out of it. Nobody knows what “more” looks like.

This isn’t a tools problem. Eighty-five percent of developers now regularly use AI coding tools, according to JetBrains’ 2025 survey of 24,000 developers. The tools are everywhere. What’s missing is a vocabulary for what comes after “we have the tools.” Teams don’t have a way to talk about the gap between where they are and where they could be, so they don’t talk about it at all.

I’ve spent the last year and a half building and using AI coding systems, including an orchestration platform I wrote in F# that coordinates multiple AI agents inside Docker containers. Through that process, and through conversations with other engineering leads, I’ve noticed a pattern. Teams move through roughly five stages of AI tooling adoption. Most are stuck at stage two. The ones generating real value have made it to stages three through five, and the difference isn’t which tools they picked. It’s whether anyone designed the workflow around them.

Here’s the framework.

Stage 1: Curiosity

This is where it starts. A developer on the team tries GitHub Copilot in their editor, or pastes a stack trace into ChatGPT and gets a useful answer. They start using it more. Maybe they mention it in standup. A few teammates try it too.

There’s no organizational awareness at this stage. No budget, no policy, no discussion about how AI tools fit into the team’s workflow. It’s purely individual experimentation. The value is real but small, and it’s siloed inside whoever happened to try the tool first.

Most engineering teams passed through this stage in 2023 or early 2024. The important thing is that curiosity alone doesn’t scale. One developer getting 20% faster at writing boilerplate doesn’t change the team’s throughput in any meaningful way. The gains are personal, invisible to the org, and they plateau quickly.

Stage 2: Adoption

This is where someone in leadership says “we should be using AI.” The company buys Copilot Business licenses. Maybe they add a ChatGPT Team subscription. Everyone gets access.

And then nothing changes.

Developers use the tools the same way they used them before, just now on the company dime. The code review process is identical. Sprint planning hasn’t changed. Documentation still gets written (or not written) the same way. The team has AI tools the way they have a JIRA board: technically available, underutilized, and with nobody rethinking how work gets done just because the tool exists.

This is where most teams sit today. JetBrains found that while 85% of developers regularly use AI tools, only 44% say AI is even partially adopted into their actual workflows. That gap is stage two. The tools are present but the work hasn’t adapted to them.

The numbers confirm it. DX’s 2025 report, covering 135,000 developers across 435 companies, found that time savings from AI tools plateaued at roughly 3.6 hours per week even as adoption climbed from 50% to 91%. More people using the tools didn’t produce more savings. The bottlenecks had shifted to things AI couldn’t touch: unclear requirements, slow code review, flaky CI, organizational friction. Stage two delivers a bump and then flatlines, because the tools are working around the existing process instead of being woven into it.

If your team has AI tools and you’re wondering why you don’t feel 10x more productive, you’re probably here.

Stage 3: Integration

This is where teams start being intentional. Instead of just having AI tools available, they design specific workflows around them.

What this looks like in practice: a team sets up a Claude-powered code review assistant that runs on every pull request, catching common issues before a human reviewer looks at the change. Another team uses AI to generate test cases from their spec documents, not as a one-off experiment but as a defined step in their development process. Documentation gets drafted by AI from code comments and architecture decisions, then reviewed and refined by engineers.
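To make the review step repeatable rather than ad-hoc, the part worth designing is the quality gate: which AI findings block a merge and which just become comments. A minimal sketch, assuming a `Finding` shape and a stubbed `run_ai_review` in place of whatever real model call a team would wire in:

```python
# Sketch of a CI quality gate for an AI code-review step.
# `run_ai_review` is a stand-in for a real model call (an API request
# with the PR diff); the gating logic is the point, not the call.

from dataclasses import dataclass

@dataclass
class Finding:
    severity: str   # "error", "warning", or "nit"
    message: str

def run_ai_review(diff: str) -> list[Finding]:
    # Placeholder: a real implementation would send `diff` to a model
    # and parse structured findings out of the response.
    return []

def gate(findings: list[Finding], block_on: str = "error") -> tuple[bool, list[str]]:
    """Return (should_block, comments): only `block_on`-severity findings
    fail the check; everything else becomes a non-blocking PR comment."""
    blocking = [f.message for f in findings if f.severity == block_on]
    comments = [f"[{f.severity}] {f.message}" for f in findings]
    return (len(blocking) > 0, comments)
```

Keeping the blocking threshold explicit matters: a gate that blocks on everything trains the team to ignore it, and a gate that blocks on nothing is just noise in the PR thread.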

The key distinction between stage two and stage three is that someone sat down and asked: “Where in our workflow could AI do something useful, and how do we make that repeatable?” It’s still tool-by-tool. Each integration is a specific AI tool doing a specific job. But it’s intentional, not ad-hoc.

Getting here requires a few things that stage two doesn’t. Someone needs to own the integration work. The team needs to agree on where AI output needs human review and where it can flow through automatically. There need to be quality gates, because AI-generated code reviews or tests are useful only if you’ve validated that they’re catching real issues and not just generating noise.

This is also where trust becomes a design consideration. AI suggestions need to earn credibility through consistent quality before the team will actually rely on them. That means measuring accuracy, tracking false positives, and being honest about where the AI helps and where it just adds a step.
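One low-effort way to do that measurement: log each AI finding alongside whether a human ultimately accepted it as a real issue, then compute the rates from those logs. A minimal sketch, with illustrative names not taken from any specific tool:

```python
# Sketch: measure whether an AI reviewer is earning trust. Each finding
# is logged with the human verdict (accepted = it was a real issue);
# precision and false-positive rate come straight from those logs.

def precision(accepted: list[bool]) -> float:
    """Share of AI findings that humans confirmed were real issues."""
    return sum(accepted) / len(accepted) if accepted else 0.0

def false_positive_rate(accepted: list[bool]) -> float:
    """Share of AI findings that humans dismissed as noise."""
    return 1.0 - precision(accepted) if accepted else 0.0
```

Tracking these per week is usually enough to see whether a prompt or model change actually moved the needle, and it gives the "does this help or just add a step" conversation real numbers.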

Most teams that reach stage three get there because one person cared enough to build the integrations. It’s a fragile state. If that person leaves, the integrations often decay.

Stage 4: Orchestration

Stage four is a qualitative shift. Instead of individual AI tools each doing their own thing, the team has a coordinated system where multiple AI capabilities work together in a designed pipeline. Humans are still in the loop, but the loop is structured.

Here’s a concrete example. An issue comes in from the backlog. An AI agent reads the issue, pulls the relevant code, and generates an implementation plan. A second agent executes the plan in an isolated Docker container with the full project environment. Automated tests run against the changes. If tests fail, the agent iterates. If they pass, an AI-powered review checks the diff for correctness, style, and potential issues. The result is a pull request that a human engineer reviews, with full context about what was done and why.
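That loop can be sketched in a few lines. This is not any particular system's implementation, just the control flow: each step function stands in for an agent call or CI job, and the retry limit is an assumption.

```python
# Minimal sketch of an orchestration loop: plan -> execute -> test ->
# iterate -> review. Every step is a stub for an agent call or CI job;
# only the shape of the pipeline is the point.

from typing import Callable

def orchestrate(issue: str,
                plan: Callable[[str], str],
                execute: Callable[[str], str],
                run_tests: Callable[[str], bool],
                review: Callable[[str], str],
                max_iters: int = 3) -> dict:
    p = plan(issue)
    diff = execute(p)
    for attempt in range(max_iters):
        if run_tests(diff):
            # Tests pass: AI review runs, then the result goes to a human.
            return {"status": "ready_for_human_review",
                    "diff": diff,
                    "review": review(diff),
                    "attempts": attempt + 1}
        # Tests failed: let the agent iterate on its own output.
        diff = execute(p + f"\n# test failure, attempt {attempt + 1}")
    return {"status": "escalate_to_human", "diff": diff, "attempts": max_iters}
```

Note the two exits: the happy path still ends at a human reviewer, and the retry limit ensures a stuck agent escalates instead of looping forever.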

This is roughly how my system Hivemind works. The point isn’t Hivemind specifically. The point is that orchestration means the whole pipeline is designed, not just individual tools. The AI isn’t assisting a human doing a task. The system is doing the task with human oversight at key decision points.

Getting to stage four requires actual infrastructure. You need isolation so tasks don’t contaminate each other. You need context management so the AI sees the right files and documentation. You need quality gates that are automated, not manual. And you need observability, because when you have multiple AI agents working in parallel, you need to see what they’re doing and where they’re stuck.
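For the isolation piece, a common approach is a throwaway container per task. A hedged sketch that just assembles the `docker run` invocation, where the image name, resource caps, and paths are assumptions:

```python
# Sketch: build the `docker run` command for one isolated agent task.
# Every task gets a fresh, network-isolated, resource-capped environment
# that disappears when the task ends, so tasks can't contaminate each other.

def docker_cmd(task_id: str, image: str = "agent-env:latest") -> list[str]:
    return [
        "docker", "run", "--rm",          # container deleted on exit
        "--name", f"task-{task_id}",
        "--network", "none",              # no cross-task or outbound contamination
        "--memory", "4g", "--cpus", "2",  # cap one runaway agent's blast radius
        "-v", f"./workspaces/task-{task_id}:/workspace",
        "-w", "/workspace",
        image, "./run-task.sh",
    ]
```

The per-task workspace mount is what makes parallelism safe: two agents can work on overlapping files without ever seeing each other's changes.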

This is where the step-function gains happen. Not because the AI is faster at typing, but because the overhead of context switching, environment setup, and manual coordination has been engineered away. The team’s job shifts from writing code to defining specifications, reviewing output, and making architectural decisions.

Very few teams are here. McKinsey’s 2025 State of AI report found that two-thirds of organizations haven’t begun scaling AI beyond pilot or experimental phases. Only 6% qualify as “AI high performers” with measurable business impact, and those organizations are three times more likely to have senior leadership directly owning AI strategy. Orchestration doesn’t happen bottom-up. It needs organizational commitment.

Stage 5: Autonomy

This is the frontier. AI agents handle full task lifecycles with human oversight at decision points, not at every step.

The difference between orchestration and autonomy is where humans spend their attention. In stage four, a human reviews every pull request the system generates. In stage five, the system handles routine tasks end-to-end, and humans get involved when the task is novel, the stakes are high, or the system flags uncertainty.

Think of it like the difference between a junior developer who needs code review on everything and a senior developer who escalates when they hit something unusual. The system has enough track record and enough safeguards that you can trust it with routine work while you focus on the hard problems.
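That junior-versus-senior distinction can be written down as an explicit routing policy. A sketch, where the risk labels and confidence threshold are assumptions rather than values from any real system:

```python
# Sketch of a stage-5 escalation policy: routine, low-risk, familiar work
# flows through automatically; anything novel, high-stakes, or uncertain
# goes to a human. Labels and thresholds are illustrative.

def route(task_risk: str,
          model_confidence: float,
          seen_similar_before: bool,
          confidence_floor: float = 0.85) -> str:
    if task_risk == "high":
        return "human"               # stakes are high: always escalate
    if not seen_similar_before:
        return "human"               # novel work always gets human eyes
    if model_confidence < confidence_floor:
        return "human"               # the system flags its own uncertainty
    return "auto"                    # routine, familiar, confident
```

The interesting knob is `confidence_floor`: teams can start it near 1.0 (everything escalates, equivalent to stage four) and lower it only as the track record justifies it.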

We’re early here. The tooling is evolving fast. Claude Code, Codex, and a growing number of purpose-built coding agents are all pushing in this direction. But the honest assessment is that full autonomy only works today for well-scoped, well-tested codebases with clear specifications. Throw it at a messy legacy system with no tests and vague requirements, and you’ll spend more time fixing the output than you saved.

Autonomy isn’t a destination you arrive at all at once. It’s something that expands gradually as your test coverage improves, your specifications get clearer, and your confidence in the system’s judgment grows. The teams that get here treat it as an engineering problem, not a purchasing decision.

Why Teams Stall at Stage Two

If the higher stages are where the real value lives, why do most teams never get past two?

Nobody owns the transition. AI tools get purchased by leadership and handed to developers. No one is responsible for figuring out how to integrate them into the team’s actual workflow. It’s like buying a CI server and never setting up any pipelines.

The tools work well enough individually. Copilot autocomplete is genuinely useful. ChatGPT answers questions. There’s enough daily value that it doesn’t feel like anything is missing. The gap between “useful” and “transformative” isn’t obvious from the inside.

Workflow redesign is organizational work, not technical work. Moving from stage two to stage three means changing how the team does code review, how documentation gets written, how test coverage is maintained. These are process changes, and process changes require buy-in, discussion, and iteration. Most engineering teams would rather ship features.

The conversation around AI is unhelpful. The public discourse oscillates between “AI will replace all developers” and “AI-generated code is garbage.” Neither framing gives a team anything actionable. Teams hear the hype, try the tools, don’t experience the revolution they were promised, and conclude that AI is just a nice incremental improvement. They’re not wrong about their experience at stage two. They just can’t see what the higher stages look like.

Using This Framework

The value of naming these stages isn’t to make anyone feel behind. It’s to give teams a vocabulary for a conversation most organizations aren’t having.

If you’re an engineering manager, you can look at your team and say “we’re at stage two, and here’s what stage three would look like for us.” That’s a more productive starting point than “we should use AI more.” It turns a vague aspiration into a concrete discussion about workflows, responsibilities, and infrastructure.

The stages also clarify what the actual work is at each transition. Moving from one to two is a procurement problem. Two to three is a workflow design problem. Three to four is an infrastructure problem. Four to five is a trust and specification problem. Each transition requires different skills, different investments, and different people leading the effort.

Most teams don’t need to reach stage five right now. Stage three is a realistic, high-value target for any team willing to invest a few weeks of intentional effort. Pick one workflow, integrate AI into it properly, measure the results, and iterate. That’s enough to break out of the plateau.

The teams that figure this out first will have a compounding advantage. Not because they have better tools, but because they’ve done the unglamorous work of designing how those tools fit into how people actually work.