Claude Code Context Management: 3 Rules for Long Sessions

I spent an hour on an AI session only to discover at the end that the agent had rewritten a file we'd already fixed at the start. Not because it's broken — the context window filled up, and the model lost track of what happened early in the conversation. Since then I treat the context window like RAM: when it's full, you get swaps. So I built a session hygiene around that.

Why 200K tokens isn't the same as 200K tokens of memory

Claude Code's context window is 200,000 tokens. That sounds like a lot. But autoregressive models don't work like a database. Every next token is predicted based on the entire preceding context, and as that context grows, attention gets spread thinner. It's not a bug — it's the physics of how transformers work.

In practice: a file I opened at the start of a session isn't in "focus" by exchange number 50. The model still references it, but with increasing inaccuracy. Early architectural decisions fade. Agreements about what not to touch blur.

Claude Code context management isn't about working around a model limitation — it's an engineering discipline, the same way managing memory in a long-running process is an engineering discipline.

Three symptoms of context drift

Context drift is what happens when a model's effective attention on early context degrades as the session grows. Three symptoms indicate it's happening:

Repeats. The agent proposes a change that was already made 10-15 exchanges back. Not identical — slightly different, because context degrades unevenly.

Regressions. Code that worked after an early-session fix is broken again. The agent reverted it, or applied a patch on top of an already-corrected version.

Confident wrong answers. The model answers without caveats, but the answer contradicts what was established earlier. This is the most expensive signal — I've burned 30-40 minutes here more than once before understanding what was happening.

Any of these three means: restart the session. Continuing costs more than starting fresh.

Rule 1: one session, one scope

The best tool for context management isn't something you do inside a session — it's decomposition before the session starts.

When I built my SSI automation system with 13 cron tasks, I could have done it in one long session. Instead, each task got its own isolated session with its own TASK.md, its own set of open files, and its own definition of "done." Result: zero incidents of "agent rewrote something already finished." Zero cross-task regressions.

The signal that a task needs splitting: it touches more than three or four files simultaneously. Not because the AI can't handle it — but because the context isn't infinite, and paying for that limit at the end of a session, when everything's done, is the most expensive way to learn it.

Less scope per session. More reliable results.

Rule 2: external memory instead of long context

The second tool is moving permanent context out of the conversation and into files.

CLAUDE.md in the project root is the anchor. It holds: what can't be touched, naming conventions, critical dependencies. Not documentation for humans — instructions for the agent that load at the start of every session.

For each task, a separate TASK.md. Minimum three sections: the task in one sentence (no "and also"), an explicit list of what not to touch, and a concrete completion criterion — not "it works" but "endpoint /api/products returns 200 with no errors in the logs."

This file replaces 5-10 setup exchanges at the start of the session, and doesn't occupy context window space each time — the agent reads it once at the start.

In Bitrix projects this matters especially: there are a lot of implicit relationships — event handlers, AddEventHandler calls scattered across modules, D7 vs old API usage patterns. Without a written constraint, the agent will eventually touch something it shouldn't.

Rule 3: restart without losing progress

A structured handoff is a minimal prompt that carries forward only current state — no history, no decisions, no explanations. When I see drift signals, I restart using one.

I have a handoff prompt template I reuse:

You're continuing the task: [name]
What's done: [2-3 lines, facts only]
Current file: [path]
What's left: [one action]
What not to touch: [list]

No explanations of why, no history of decisions — just facts about the current state. The agent reads this and continues from the restart point without reinterpreting earlier agreements.

A restart takes under a minute. Progress lost: zero. Frustration from a degraded session: also zero, because I restarted early.

When this overhead isn't worth it

For short tasks — fixing one file, adding a method, correcting a typo — all of this is overkill. A session under 20 minutes with 10-15 exchanges doesn't degrade enough to matter.

These three rules are for tasks that run an hour or more, involve multiple files with dependencies between them. That's where context becomes the bottleneck, not the model's capability.

Context window is a resource. Treat it like RAM: you can open 40 tabs, but the browser will start swapping. Session hygiene isn't a constraint on AI — it's how you get reliable results from it on long tasks.