Constraint decay in AI coding agents: why your rules vanish mid-session and how to fix it
I gave Claude Code a clear brief: refactor a PHP sync module in a Bitrix project. Type everything strictly. No mixed. No global state. All event handlers through explicit classes implementing a defined interface.
The first four methods were exactly what I asked for. Clean signatures, int|null where needed, array over mixed.
An hour later, at method eleven, I see mixed $data. Then mixed $response. Then a function accepting $config with no type at all. Everything I specified at the start had dissolved.
This isn't a Claude bug. It's a pattern I've seen in every LLM I've used for backend work over two years. I call it constraint decay: the gradual erosion of architectural rules as session context grows.
What constraint decay is — and why it's not about running out of context
Constraint decay is the tendency of LLM coding agents to stop following architectural rules specified at the start of a session as the context grows longer. It isn't a context window problem. Claude 4 handles 200k tokens. The model can still see what you wrote at the start of the session.
The issue is statistical weight. When a session starts, your 200-token constraint block makes up a significant fraction of the active context, so it strongly shapes the next generation. By the time you've added 80,000 tokens of code, file contents, and responses, those same 200 tokens are roughly 0.25% of the context. The attention mechanism works fine; your rules are just competing against everything else.
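To put rough numbers on that dilution, here is a minimal sketch. It treats token share as a crude proxy for influence, which is an assumption of this illustration and not a measurement of how attention behaves. The helper name `constraintShare` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Fraction of the context taken up by the constraint block.
 * Token share is only a crude proxy for influence (an assumption of this sketch).
 */
function constraintShare(int $constraintTokens, int $otherTokens): float
{
    if ($constraintTokens < 0 || $otherTokens < 0) {
        throw new InvalidArgumentException('Token counts must be non-negative.');
    }
    $total = $constraintTokens + $otherTokens;

    return $total === 0 ? 0.0 : $constraintTokens / $total;
}

// A 200-token constraint block against a growing session.
foreach ([800, 8_000, 80_000] as $otherTokens) {
    printf("%6d other tokens: %.2f%%\n", $otherTokens, 100 * constraintShare(200, $otherTokens));
}
```

The share falls from about 20% to about 2.4% to about 0.25% as the session grows, while the constraint block itself never changes.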
Constraints don't disappear instantly. They fade gradually — and that's more dangerous than a sudden failure. Gradual decay means the code still looks reasonable. Tests pass. First review might miss it. The architectural violations surface later, when they're harder to trace.
What it looks like in a real PHP session
In Bitrix projects, this is what constraint decay produces in practice.
I specify at session start: all event handlers as standalone classes implementing EventHandlerInterface. No anonymous functions passed to addEventHandler.
Three classes in: perfect compliance.
Five classes in: Claude generates this:
    \Bitrix\Main\EventManager::getInstance()->addEventHandler(
        'sale',
        'OnSaleOrderSaved',
        function ($event) {
            // 40 lines of inline logic
        }
    );
Exactly what I said not to do. I hadn't changed the task. I hadn't asked for exceptions. The constraint had simply faded.
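For contrast, a compliant handler might look roughly like the sketch below. The article never shows the shape of EventHandlerInterface, so the single `handle` method is an assumption, as are the `OrderSavedHandler` class and the `orderId` property on the stand-in event. The parameter is typed `object` only so the example runs without Bitrix installed.

```php
<?php
declare(strict_types=1);

// Hypothetical shape: the article names EventHandlerInterface but never shows it.
interface EventHandlerInterface
{
    public function handle(object $event): void;
}

// One handler, one class, no inline closure.
final class OrderSavedHandler implements EventHandlerInterface
{
    /** @var list<int> */
    private array $handledOrderIds = [];

    public function handle(object $event): void
    {
        // In a Bitrix project this would be a \Bitrix\Main\Event.
        $orderId = $event->orderId ?? null;
        if (is_int($orderId)) {
            $this->handledOrderIds[] = $orderId;
        }
    }

    /** @return list<int> */
    public function handledOrderIds(): array
    {
        return $this->handledOrderIds;
    }
}

// Stand-in event so the example runs on its own.
$event = new stdClass();
$event->orderId = 1042;

$handler = new OrderSavedHandler();
$handler->handle($event);

// Registration would then pass a callable instead of a closure:
// \Bitrix\Main\EventManager::getInstance()->addEventHandler(
//     'sale', 'OnSaleOrderSaved', [$handler, 'handle']
// );
```

The handler can now be instantiated and exercised in a test without touching the event manager, which is the property the inline closure gives up.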
In another session, I required return types everywhere, PHP 8.1, declare(strict_types=1) in every file. By file twelve, two methods had no return type declarations at all — not the wrong type, just absent.
This happens across different models and different sessions. When the task is long and the context is dense, drift is the rule, not the exception.
Why long free-form sessions are a production risk
The problem isn't that AI writes obviously bad code. It writes code that looks fine. Passes tests. Passes first review.
The issue is violated invariants that aren't immediately visible. A mixed type where int should be doesn't break a unit test. It breaks business logic three call levels deep in a runtime edge case. An anonymous handler instead of a class doesn't fail in development. It creates a problem during refactoring or test coverage work three months later.
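A small illustration of how a drifted `mixed` parameter can hide a wrong answer that a strict `int` would reject at the call site. The function names and the stock check are hypothetical, chosen only to show the mechanism.

```php
<?php
declare(strict_types=1);

// Drifted signature: accepts anything, so the string "0" slips through.
function isOutOfStockLoose(mixed $quantity): bool
{
    return $quantity === 0;
}

// Constrained signature: the same bad value is rejected where it enters.
function isOutOfStockStrict(int $quantity): bool
{
    return $quantity === 0;
}

/** Calls the strict version and reports what happened, for demonstration. */
function describeStrictCall(mixed $rawQuantity): string
{
    try {
        return isOutOfStockStrict($rawQuantity) ? 'out of stock' : 'in stock';
    } catch (TypeError $e) {
        return 'rejected: ' . get_debug_type($rawQuantity);
    }
}

$fromRequest = '0'; // e.g. an unparsed form or API value

var_dump(isOutOfStockLoose($fromRequest));   // bool(false): silently wrong
echo describeStrictCall($fromRequest), "\n"; // rejected: string
```

The loose version returns a plausible boolean and no test fails unless it happens to feed in a string. The strict version turns the same mistake into an immediate TypeError.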
In 60+ minute sessions where I haven't applied re-anchoring (more on that below), I find violations of initial constraints in roughly 7 out of 10 cases. Not catastrophic violations — but things that need fixing before merge.
For production PHP and Bitrix code, the consequence is: a long unchecked AI session produces code with correct syntax and incorrect architecture. That's a harder category of bug to catch.
Task isolation: keep context chunks short
Task isolation is the practice of breaking large AI coding sessions into smaller sub-sessions, each with its own complete constraint specification, to prevent constraint decay from accumulating.
I break any task that will touch more than five files or generate more than 500 lines of new code into subtasks. Each subtask runs as a separate session with full constraints stated at the start.
"Refactor the entire sync module" becomes three tasks:
- Refactor SyncEventHandler (one class, one file)
- Refactor ProductMapper (constructor injection only)
- Integration test for both
Each session starts clean. Constraints have full weight because there's almost no competing context yet.
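As a rough sketch of that workflow, the snippet below applies the split thresholds and prefixes every subtask with the full constraint block. The helper names `needsSplitting` and `buildSubtaskPrompts` and the prompt layout are assumptions for illustration; the rules and subtasks are the ones used in this article.

```php
<?php
declare(strict_types=1);

// Thresholds from the text: more than five files or more than 500 new lines.
function needsSplitting(int $filesTouched, int $newLines): bool
{
    return $filesTouched > 5 || $newLines > 500;
}

/**
 * Prefix every subtask with the full constraint block so each session starts anchored.
 *
 * @param list<string> $subtasks
 * @return list<string>
 */
function buildSubtaskPrompts(string $constraints, array $subtasks): array
{
    return array_map(
        static fn (string $subtask): string => "Constraints:\n{$constraints}\n\nTask:\n{$subtask}",
        $subtasks
    );
}

$constraints = <<<TXT
- Strict PHP 8.1 typing everywhere. declare(strict_types=1) in each file.
- No mixed. All parameters must have explicit types.
- All event handlers: standalone classes implementing EventHandlerInterface.
- No global variables.
TXT;

$prompts = buildSubtaskPrompts($constraints, [
    'Refactor SyncEventHandler (one class, one file)',
    'Refactor ProductMapper (constructor injection only)',
    'Integration test for both',
]);

echo $prompts[0], "\n";
```

Repeating the constraints in every prompt is deliberate: no session relies on rules stated somewhere else.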
It's slower than one long prompt. The trade-off is clear: far fewer pre-merge corrections. On the 28k SKU project, I switched to this approach around month four, and "why is this typed mixed?" comments in review dropped to roughly a third of their earlier frequency.
Re-anchoring: restoring constraints mid-session
Re-anchoring is the technique of reinserting your original constraint block into an ongoing AI session to counteract constraint decay without restarting the session. It works by repositioning the constraints at the end of the context, where they carry statistical weight again.
Sometimes a task genuinely can't be split — complex refactoring where the full context needs to stay in view. For these cases, re-anchoring is the alternative.
At regular intervals, I add a constraint reminder block to the chat:
Constraint reminder for this session:
- Strict PHP 8.1 typing everywhere. declare(strict_types=1) in each file.
- No mixed. All parameters must have explicit types.
- All event handlers: standalone classes implementing EventHandlerInterface.
- No global variables.
Continue with these rules in effect.
Not an explanation, not motivation. Just the rules, restated at the end of the context.
I do this every 20–30 minutes in long sessions, or after every three to four generated files, whichever comes first. Takes 30 seconds. Saves 20 minutes of corrections.
Re-anchoring works not because it "reminds" the model of anything. It works because it moves the constraint block back to the part of the context that has the strongest influence on what gets generated next.
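The "whichever comes first" rule is easy to encode if you track it in tooling. This is a minimal sketch: the `ReanchorPolicy` class is hypothetical, and its defaults use the lower bounds of the intervals given above (20 minutes, three files).

```php
<?php
declare(strict_types=1);

// Defaults are the lower bounds from the text; tune them to your own sessions.
final class ReanchorPolicy
{
    public function __construct(
        private readonly int $maxMinutes = 20,
        private readonly int $maxFiles = 3,
    ) {
    }

    // True when either limit is reached, i.e. "whichever comes first".
    public function isDue(int $minutesSinceLast, int $filesSinceLast): bool
    {
        return $minutesSinceLast >= $this->maxMinutes
            || $filesSinceLast >= $this->maxFiles;
    }
}

$policy = new ReanchorPolicy();

var_dump($policy->isDue(12, 1)); // bool(false)
var_dump($policy->isDue(12, 3)); // bool(true): file limit hit first
var_dump($policy->isDue(25, 0)); // bool(true): time limit hit first
```

Both counters reset each time the reminder block is sent.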
What this means if you're just starting with AI in backend work
Don't judge quality by the first two or three files. They're almost always good. Look at file ten, twelve, fifteen. That's where your workflow either holds or doesn't.
Add architectural rule compliance to your review checklist. If you have a CLAUDE.md or equivalent, verify the agent is actually following it throughout — not just at the start. Automated checks for return types, interface implementation, and pattern adherence catch what visual review misses.
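One possible shape for such a check, using PHP's built-in reflection API. This is a sketch, not a replacement for a static analyser such as PHPStan. The function `findTypingViolations` and the `DriftedMapper` class are hypothetical, and the check covers only missing return types and untyped or mixed parameters.

```php
<?php
declare(strict_types=1);

/**
 * List typing violations in a class: missing return types, untyped or mixed parameters.
 *
 * @param class-string $className
 * @return list<string>
 */
function findTypingViolations(string $className): array
{
    $violations = [];
    $class = new ReflectionClass($className);

    foreach ($class->getMethods() as $method) {
        $name = $class->getShortName() . '::' . $method->getName();

        // Constructors and destructors cannot declare a return type.
        if (!$method->isConstructor() && !$method->isDestructor() && !$method->hasReturnType()) {
            $violations[] = $name . ': missing return type';
        }

        foreach ($method->getParameters() as $parameter) {
            $type = $parameter->getType();
            if ($type === null) {
                $violations[] = $name . ': untyped $' . $parameter->getName();
            } elseif ($type instanceof ReflectionNamedType && $type->getName() === 'mixed') {
                $violations[] = $name . ': mixed $' . $parameter->getName();
            }
        }
    }

    return $violations;
}

// Hypothetical class showing the kind of drift described in this article.
final class DriftedMapper
{
    public function map(array $row): array
    {
        return $row;
    }

    public function send(mixed $response, $config)
    {
    }
}

print_r(findTypingViolations(DriftedMapper::class));
```

Run against every class generated in a session, a check like this flags file twelve as readily as file one.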
Keep the line between "AI writes code" and "AI designs architecture" explicit. Constraint decay is an argument for using AI where it is reliable: generating code inside tight constraints. It doesn't disqualify the tool. It defines the conditions under which the tool is worth trusting.
The constraints have to come from you. And you have to make sure they don't dissolve before the task is done.
I've written more about managing session context and the broader question of what to delegate to AI in production — both of which this connects to directly.
*Constraint decay doesn't make AI unusable in production. It makes unchecked AI unusable. The difference matters.*