ai ai-tooling engineering-management software-delivery

Why Faster Coding Made Engineering Management Harder


AI made implementation cheaper, but it increased the management load everywhere downstream: review, predictability, flow, and developer experience.

When vendors start inventing new engineering management frameworks, it’s usually because the old assumptions stopped fitting reality.

This week LinearB launched APEX, a framework organized around AI leverage, predictability, flow efficiency, and developer experience. I don’t think most teams need to adopt APEX as a branded thing. I do think the launch is a useful market signal. When a company that has spent years selling engineering metrics decides it needs a new management model, it’s usually because the management job changed under everyone’s feet.

AI made implementation cheaper. It did not make engineering management simpler.

If anything, it made the job more of a systems problem. Code shows up faster. Review queues fill faster. Weak specs create rework faster. Integration mistakes reach QA faster. Bad assumptions hit production faster. The management question is no longer “Are developers using AI?” It is whether the team still ships predictably without drowning in verification and cleanup.

That is the shift I think a lot of teams still underestimate. They are treating AI as a local productivity upgrade inside the coding step. Management is still operating as if implementation capacity is the scarce resource. On many teams, it no longer is.

The bottleneck moved downstream

For years, engineering management could get away with a fairly simple story. If you wanted more output, you needed more implementation capacity. Hire good people. Reduce interruptions. Improve planning. Keep the roadmap realistic. Most of the management machinery assumed writing code was expensive and relatively scarce.

AI changes that assumption.

When code generation gets cheaper, the cost does not disappear. It moves. Somebody still has to verify the change, review the tradeoffs, test the edge cases, integrate the work, deploy it safely, and support it after release. If those parts of the system do not speed up too, all you did was move the queue.

DORA made this point directly in its recent piece on balancing AI tensions: time saved in code generation can be consumed again in audit, verification, and downstream delivery work. That matches what a lot of teams are feeling right now. The coding step feels faster, but the sprint does not feel calmer.

A good management smell test is this: if AI usage goes up but review latency, rework, and delivery misses also go up, you did not gain leverage. You pushed cost downstream and made it harder to see.
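That smell test can be stated mechanically. The sketch below is hypothetical: the metric names (`review_latency_days`, `rework_rate`, `delivery_miss_rate`) and the numbers are invented for illustration, not drawn from any specific tool's API.

```python
# Hypothetical leverage smell test: if AI usage rose but downstream metrics
# worsened, the cost moved downstream rather than disappearing.

def leverage_smell_test(before: dict, after: dict) -> list[str]:
    """Return the downstream metrics that got worse alongside rising AI usage."""
    if after["ai_usage"] <= before["ai_usage"]:
        return []  # usage did not rise; the test does not apply
    downstream = ("review_latency_days", "rework_rate", "delivery_miss_rate")
    return [m for m in downstream if after[m] > before[m]]

# Invented quarter-over-quarter snapshots.
before = {"ai_usage": 0.40, "review_latency_days": 0.9,
          "rework_rate": 0.12, "delivery_miss_rate": 0.10}
after = {"ai_usage": 0.90, "review_latency_days": 3.1,
         "rework_rate": 0.18, "delivery_miss_rate": 0.10}

print(leverage_smell_test(before, after))
# review latency and rework worsened: no real leverage was gained
```

In this invented snapshot, usage more than doubled while review latency and rework both degraded, which is exactly the "pushed cost downstream" pattern rather than leverage.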

That is why so many AI productivity conversations feel slippery. The local experience is real. A developer can absolutely finish a first draft faster. But the organization does not ship first drafts. It ships reviewed, tested, integrated software.

The METR study is a useful warning here. In a study of 16 experienced open-source developers working on 246 real tasks in repositories they already knew well, developers using AI took 19% longer, even though before the study they expected a 24% speedup. Faster code appearance is not the same thing as faster completed work.

The real AI metric is not adoption

This is why I have become increasingly skeptical of adoption dashboards as the headline management metric.

Seat count tells you procurement happened. Weekly active users tell you curiosity exists. Prompt volume tells you almost nothing. None of those metrics answer the question leadership actually cares about: does the team still ship reliably as AI changes how work is produced?

That question is messier, but it is also the only one that matters.

If 90% of your team uses AI tools and your PR review turnaround slips from same-day to three days, what exactly improved?

If sprint commitments get less reliable because implementation starts cheaply but finishes expensively, is that progress?

If your strongest engineers are now spending their time cleaning up giant AI-assisted diffs and writing the same review comments over and over, are you compounding leverage or just hiding a quality tax inside senior bandwidth?

This is where the APEX launch is interesting. Not because LinearB discovered a magical new acronym. Because the market is clearly converging on the same shape of problem: leverage, predictability, flow, and developer experience now have to be managed together. Optimize one in isolation and the others will punish you.

DORA is saying measure impact, not just adoption. Media coverage is full of leaders backing away from seat-count theater. Vendors are trying to instrument AI contribution at the pull-request level. Different actors, same basic conclusion: the old proxy metrics are not enough anymore.

Most teams do not need another dashboard first

The second mistake I see is assuming this is primarily a visibility problem.

It is partly a visibility problem. Better instrumentation helps. But most teams do not need another dashboard first. They need a management cadence for what gets reviewed weekly, per sprint, monthly, and quarterly.

This is the part of APEX I think is directionally right. Not the branding. The cadence.

A dashboard without a review rhythm is wallpaper. It gives leaders one more place to feel vaguely informed while the actual system keeps drifting.

A more useful operating model looks something like this.

Weekly: review leverage and verification pressure

Every week, leaders should ask where AI is genuinely reducing cycle time and where it is creating new review tax.

Look for signals like:

  • Which changes moved from idea to merged PR materially faster than before?
  • Which changes came back with repeated review comments, weak tests, or obvious edge-case misses?
  • Are PRs getting larger because code is cheaper to generate?
  • Are reviewers becoming the constraint?

This is where a lot of teams fool themselves. They see more code being produced and assume the system is healthier. Sometimes the only thing that changed is that reviewers are now underwater by Wednesday.

Per sprint: review predictability

At the sprint level, the question is not whether the team started more work. It is whether the team finished what it said it would finish without unusual thrash.

That means asking:

  • Did AI help us complete planned work, or did it just make it easier to begin too much of it?
  • How much work bounced back for rework after review, QA, or stakeholder feedback?
  • Did scope stay stable, or did cheap implementation encourage speculative work that expanded mid-sprint?
  • Are estimates becoming less trustworthy because coding got cheaper but integration did not?

Predictability is the metric that keeps leadership honest. A team can feel extremely productive while becoming less forecastable.
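One way to keep the sprint review honest is to reduce those questions to two small ratios. This is a sketch under invented numbers; the field names and the two example sprints are hypothetical, not from any tracker.

```python
# Hypothetical sprint check: are we finishing what we committed to,
# or just starting more work because implementation got cheaper?

def sprint_report(committed: int, completed: int, reworked: int) -> dict:
    """Completion and rework rates for one sprint."""
    return {
        "hit_rate": completed / committed,            # promises kept
        "rework_rate": reworked / max(completed, 1),  # finished work that bounced back
    }

# Two invented sprints: the second starts more work but keeps fewer promises.
pre_ai = sprint_report(committed=10, completed=9, reworked=1)
post_ai = sprint_report(committed=14, completed=9, reworked=4)

print(pre_ai)   # hit_rate 0.90, rework_rate ~0.11
print(post_ai)  # hit_rate ~0.64, rework_rate ~0.44
```

The point of the toy numbers: raw completed work is identical in both sprints, yet the second team is noticeably less forecastable. That divergence is what a velocity-only view hides.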

Monthly: review flow across the whole delivery path

Once a month, step back from individual tasks and inspect where work is actually waiting.

Look across the full path:

  • code review
  • CI and test pipelines
  • QA and validation
  • security and compliance checks
  • release coordination
  • production follow-up

If AI is working, flow should improve across the system, not just inside the editor. If the monthly picture shows work piling up after coding, then the coding gains are being absorbed somewhere else.
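The monthly review can be anchored in one number: flow efficiency, the share of total lead time a change spends being actively worked on rather than waiting in a queue. A minimal sketch, with invented stage timings that mirror the list above:

```python
# Hypothetical flow snapshot: hours one change spent waiting at each stage.
# Stage names mirror the delivery path above; all numbers are invented.
stage_wait_hours = [
    ("code review", 30),
    ("ci and test pipelines", 2),
    ("qa and validation", 18),
    ("security and compliance", 6),
    ("release coordination", 20),
    ("production follow-up", 4),
]
active_hours = 10  # time actually spent working on the change

total_wait = sum(hours for _, hours in stage_wait_hours)
flow_efficiency = active_hours / (active_hours + total_wait)
print(f"flow efficiency: {flow_efficiency:.0%}")
```

With these toy numbers the change is actively worked about one hour in nine; the coding step could get twice as fast and the lead time would barely move. When the monthly picture looks like this, AI gains are being absorbed in the queues, not delivered.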

This is the downstream-bottleneck pattern showing up in the monthly view: the constraint migrates from typing to verification, coordination, and release management.

Quarterly: review developer experience and role design

Quarterly is where you ask the uncomfortable questions.

What is this doing to the actual experience of engineering work?

  • Are senior engineers turning into permanent code janitors?
  • Are junior engineers learning the system, or mostly learning how to generate convincing diffs?
  • Do people trust the outputs more than they should, or less than they should?
  • Have performance expectations quietly become unrealistic because leadership heard “AI makes developers faster” and translated that into quota thinking?

Developer experience matters here not because it is soft. It matters because a team that does not trust its workflow will compensate with heroics, manual checking, and private workarounds. That always shows up later as slower delivery and worse morale.

What the management job is now

I do not think AI made engineering management harder because there is suddenly more data to look at. I think it made it harder because the job moved up a level.

The old center of gravity was implementation throughput. The new center of gravity is system design for delivery: specification quality, verification architecture, review capacity, deployment safety, and realistic expectations around how fast the organization can absorb change.

That is a different job.

It requires leaders to care more about the shape of work than the volume of work. More about review and rework than raw generation speed. More about whether the team can keep promises than whether the IDE feels magical.

It also requires some restraint. Cheap implementation is dangerous for the same reason cheap cloud resources are dangerous. When something gets easier to consume, teams overconsume it before they redesign the surrounding system. Then they act surprised when the bill arrives somewhere else.

Why new frameworks keep showing up

So no, I do not think the main lesson from APEX is that engineering leaders need one more acronym.

I think the lesson is that the market has noticed the same thing practitioners are noticing: AI did not remove the management problem. It changed where the problem lives.

That is why new frameworks are showing up. Not because DORA was wrong. Not because SPACE is obsolete. Because faster coding exposed a set of management concerns that were easier to ignore when implementation itself was slower.

The strongest AI teams over the next few years will not be the ones with the highest adoption numbers. They will be the ones that can absorb cheap implementation without losing predictability, flow, or team trust.

That is the real management challenge now.