
AI Coding Agents in 2026: Field Notes from a Year of Production Use

Date · 2 May, 2026
Cat · Frameworks
Read · 3 min

Twelve months ago, "AI coding agent" still meant a Copilot-style autocomplete with an attitude. In 2026 it means a process you brief once and walk away from. We've shipped meaningful production work — refactors, migrations, full feature deliveries — through agents this year. Here are the lessons that survived contact with reality.

The mental model shifted

Stop thinking of an agent as a faster typist. Think of it as a junior engineer with infinite tabs open and no biological need to sleep. The bottleneck is no longer code generation speed — it is the quality of your instructions and the precision of your verification step.

The teams getting wins in 2026 share three traits:

  • They write tickets in the structure an agent can parse — context, constraint, acceptance criterion.
  • They invest in test scaffolding before invoking the agent, not after.
  • They review diffs the way a senior engineer reviews a junior's PR — not by reading every line, but by spot-checking the high-risk seams.

What works in production

Bounded refactors

Renaming a symbol across 200 files, adding a parameter to every call site, converting a logging API: agents handle these reliably. On multi-language repos they beat IDE refactoring tools, and they catch the usages an IDE tends to miss, such as a stale reference inside a string template.

Test scaffolding

Pointing an agent at an untested module with a sentence like "give me a Jest suite that covers the public API and the error branches" produces 80% of the boilerplate. The 20% you fix yourself is the part where you remember what the module is actually supposed to do.

Migration scripts

Schema migrations, API version bumps, framework upgrades. The agent reads the upstream changelog, scans your code, and produces a plan. You read the plan, push back on three things, and finish in an afternoon what used to take a week.

What does not work

Agents still get lost in ambient context: codebases where the rules for how things are done are implicit, scattered across Slack threads, or stored only in one tenured engineer's head. If your team can't articulate why the build pipeline does what it does, the agent will not magically intuit it.

They also struggle with cross-cutting product decisions. "Refactor checkout to use the new pricing engine" is a product question dressed as an engineering task. The agent will pick a plausible interpretation and run with it. Sometimes that interpretation is wrong in a way you only discover at the next billing cycle.

Cost calibration

The unit economics matter again. A long-running agent task can burn through tokens fast. Our rule of thumb in 2026: budget the cost the same way you'd budget a contractor's day rate. If a task feels like it should cost $20 and you're at $400, something is wrong — usually the agent is in a loop or hallucinating dependencies. Kill it and re-scope.
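The rule of thumb can be turned into a simple guard that compares running spend against the estimate. The sketch below is an assumption about how one might wire that up, not a feature of any agent runtime: `BudgetPolicy`, `checkSpend`, and the 3x and 10x thresholds are hypothetical, and how you obtain the running spend depends on your provider.

```typescript
// Hypothetical budget policy. The multiples are illustrative defaults,
// not recommendations from any vendor.
interface BudgetPolicy {
  estimateUsd: number;  // what the task "feels like" it should cost
  warnMultiple: number; // spend at or above estimate * warnMultiple triggers a warning
  killMultiple: number; // spend at or above estimate * killMultiple stops the run
}

type BudgetVerdict = "ok" | "warn" | "kill";

// Compares spend so far against the policy and returns what to do next.
function checkSpend(spentUsd: number, policy: BudgetPolicy): BudgetVerdict {
  if (spentUsd >= policy.estimateUsd * policy.killMultiple) return "kill";
  if (spentUsd >= policy.estimateUsd * policy.warnMultiple) return "warn";
  return "ok";
}

// The $20 task from the text, with assumed 3x warn and 10x kill thresholds.
const policy: BudgetPolicy = { estimateUsd: 20, warnMultiple: 3, killMultiple: 10 };

for (const spent of [15, 70, 400]) {
  console.log(`$${spent}: ${checkSpend(spent, policy)}`);
}
```

With these assumed thresholds, the $400 case lands well past the kill line, which matches the advice to stop and re-scope rather than let the run continue.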

The unglamorous winners

The features we most often hand to an agent in 2026 are the ones nobody wanted to write themselves:

  1. Internationalisation passes: extracting strings, generating placeholder translations, updating templates.
  2. Accessibility audits: agent runs axe-core, agent fixes contrast/ARIA/labels, agent opens a PR.
  3. Dependency upgrades: the long tail of minor bumps that keep your security scanner quiet.
  4. Documentation refreshes: updating README snippets after API changes.
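For the dependency upgrades in that list, it helps to separate the minor and patch bumps you are happy to delegate from the major ones that deserve a human plan. A minimal sketch, assuming plain MAJOR.MINOR.PATCH version strings: `classifyBump`, `delegable`, and the package names are hypothetical, and prerelease or build suffixes are ignored rather than compared.

```typescript
type BumpKind = "major" | "minor" | "patch" | "none" | "downgrade";

// Parses the leading MAJOR.MINOR.PATCH of a version string.
// Anything after the patch number (prerelease, build metadata) is ignored.
function parseVersion(version: string): [number, number, number] {
  const match = /^(\d+)\.(\d+)\.(\d+)/.exec(version.trim().replace(/^v/, ""));
  if (!match) {
    throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Classifies an upgrade by the most significant component that changed.
function classifyBump(from: string, to: string): BumpKind {
  const [fromMajor, fromMinor, fromPatch] = parseVersion(from);
  const [toMajor, toMinor, toPatch] = parseVersion(to);
  if (toMajor !== fromMajor) return toMajor > fromMajor ? "major" : "downgrade";
  if (toMinor !== fromMinor) return toMinor > fromMinor ? "minor" : "downgrade";
  if (toPatch !== fromPatch) return toPatch > fromPatch ? "patch" : "downgrade";
  return "none";
}

interface Upgrade {
  name: string;
  from: string;
  to: string;
}

// Keeps only the minor and patch bumps, i.e. the long tail worth delegating.
function delegable(upgrades: Upgrade[]): Upgrade[] {
  return upgrades.filter((upgrade) => {
    const kind = classifyBump(upgrade.from, upgrade.to);
    return kind === "minor" || kind === "patch";
  });
}

// Made-up package names, for illustration only.
const pending: Upgrade[] = [
  { name: "example-http", from: "1.4.2", to: "1.5.0" },
  { name: "example-orm", from: "2.9.1", to: "3.0.0" },
  { name: "example-dates", from: "0.8.3", to: "0.8.4" },
];

console.log(delegable(pending).map((upgrade) => upgrade.name));
```

A version bump is only a hint about risk. A minor release can still break you, which is why the acceptance step stays in the loop.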

Where we've landed

Treat agents as force multipliers, not replacements. Our team of six ships roughly what a team of nine shipped two years ago, and the gain comes almost entirely from removing the work nobody enjoyed in the first place. Code review headcount, on the other hand, has not gone down. Senior engineers spend more time reviewing diffs and less time writing them. That is the trade.

The pattern that keeps proving itself: agents are excellent at making your codebase more like itself and dangerous at changing what your codebase is. Use them accordingly.