How I trust AI agents in production: 3 patterns from 13 cron jobs

When I handed 7 of 13 cron jobs to Claude, the first thing I built wasn't a prompt. It was a STOP file and an Inbox folder.

Not monitoring. Not eval metrics. A kill switch and a review queue.

Because AI systems don't fail in production when the LLM hallucinates. They fail when you have no kill switch and no buffer before irreversible actions.

"Do you trust your AI agent?" is the wrong question

Most articles about AI in production focus on the model: how accurate is it, does it hallucinate, which benchmark does it top. None of that is the operational question.

The real question: how is the system around the agent engineered? What happens if it fires an extra request? Who decides before the agent takes an action with consequences you can't undo?

I built an automation system of 13 cron jobs. Seven run with an approval gate, five are fully autonomous, one is manual-only. The difference isn't task complexity — it's whether the consequences are reversible.

Three patterns came out of that build. I won't skip any of them for any AI agent touching production.

Pattern 1: Stateful drafts — agent proposes, you decide

The agent never commits directly. It writes a draft to an Inbox folder, sets status: draft in frontmatter, and stops. You review. If it looks right, you move it to Queue, then Published. Each step is atomic.

This sounds like overhead. In practice, it's what makes overnight automation feel safe. The agent works while you sleep. In the morning you check the Inbox — you're reviewing, not guessing.

There's a less obvious benefit: the frontmatter status is a contract for the next run. Without explicit state, the agent reruns the same task on every tick. One field in a file prevents that.

Stateful drafts create a deliberate gap between "propose" and "commit." That gap is where meaningful oversight actually happens.
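A minimal sketch of how the draft-and-status contract could look in Python. The folder names follow the Inbox, Queue, Published flow described above; the function names, the `.md` extension, and the slug-based file naming are assumptions for illustration, not the actual implementation behind this system.

```python
from pathlib import Path
from typing import Optional

# Hypothetical folder names, following the Inbox -> Queue -> Published flow.
STAGES = ("Inbox", "Queue", "Published")


def read_status(path: Path) -> Optional[str]:
    """Return the `status` value from a `---`-delimited frontmatter block, if any."""
    lines = path.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        if key.strip() == "status":
            return value.strip()
    return None


def find_existing(root: Path, slug: str) -> Optional[Path]:
    """Look for this task's file in any stage folder."""
    for stage in STAGES:
        candidate = root / stage / f"{slug}.md"
        if candidate.exists():
            return candidate
    return None


def propose_draft(root: Path, slug: str, body: str) -> bool:
    """Write a draft to Inbox unless the task already has a file. True if written."""
    if find_existing(root, slug) is not None:
        return False  # state already recorded: this tick has nothing to do
    inbox = root / "Inbox"
    inbox.mkdir(parents=True, exist_ok=True)
    (inbox / f"{slug}.md").write_text(
        f"---\nstatus: draft\n---\n\n{body}\n", encoding="utf-8"
    )
    return True
```

The agent only ever calls something like `propose_draft`. Moving a file from Inbox to Queue stays a human action, so the gap between "propose" and "commit" is enforced by who can call what, not by what the prompt asks for.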

For more on what to delegate to AI and what to keep, see "I let Claude write tests. I don't let it choose architecture."

Pattern 2: The stop flag — kill everything in 3 seconds

The stop flag is a file at cron/STOP, relative to the working directory. If it exists, no cron agent runs. No requests, no actions, nothing.

touch "cron/STOP"   # disable everything
rm "cron/STOP"      # re-enable

This isn't a fallback. It's the first thing I implement before any automation.

Why? Because context shifts in ways you can't predict. You leave unexpectedly. The platform behaves oddly. It's late Friday and you'll sort it Monday. Typing touch cron/STOP takes three seconds.

Without this kill switch, I wouldn't run anything that acts on my behalf in public. The agent can be perfectly calibrated — context still changes, and you need the ability to stop the system before it does something visible.
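On the job side, the check can be a few lines at the top of every entry point. This is a sketch under the assumption that each cron job is a Python callable; `run_job` and `stop_requested` are hypothetical names.

```python
import sys
from pathlib import Path

STOP_FILE = Path("cron/STOP")  # same path the touch/rm commands use


def stop_requested(stop_file: Path = STOP_FILE) -> bool:
    """True if the stop flag file exists."""
    return stop_file.exists()


def run_job(job, stop_file: Path = STOP_FILE) -> bool:
    """Run `job` unless the stop flag exists. Returns True if the job ran."""
    if stop_requested(stop_file):
        print(f"{stop_file} present, skipping run", file=sys.stderr)
        return False
    job()
    return True
```

For a job that performs several actions in one run, it is probably worth calling `stop_requested()` again before each side-effecting step, so the flag also halts a run that has already started.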

Pattern 3: Hard rate limits in code, not in prompts

Limits belong in code, not in prompt instructions.

You can write "don't comment more than 3 times per day" in a prompt. In code, it looks different:

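import logging
import sys

MAX_COMMENTS_PER_DAY = 3  # hard daily cap, matching the "3 times per day" rule
# comments_today is assumed to be loaded earlier from the job's own state
if comments_today >= MAX_COMMENTS_PER_DAY:
    logging.warning("rate limit hit, skipping")
    sys.exit(0)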

The difference matters. A prompt is an instruction the model interprets. Code is a boundary it can't cross.

Rate limits aren't just protection against aggressive behavior. They keep the system legible to you. If the agent does at most N actions per run, you understand what happened without opening a single log file.
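The cap only holds if the count survives between cron runs. One way to do that, sketched here with a small JSON file, is to store the date next to the count so the counter resets itself each day. The file path and function names are assumptions, and the sketch does not guard against two jobs writing at the same moment.

```python
import json
from datetime import date
from pathlib import Path
from typing import Optional

MAX_COMMENTS_PER_DAY = 3
COUNTER_FILE = Path("cron/comments_count.json")  # hypothetical location


def comments_today(counter_file: Path = COUNTER_FILE, today: Optional[str] = None) -> int:
    """Return how many comments were recorded for `today` (ISO date string)."""
    today = today or date.today().isoformat()
    try:
        data = json.loads(counter_file.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return 0
    if not isinstance(data, dict) or data.get("date") != today:
        return 0  # stale or malformed state counts as a fresh day
    return int(data.get("count", 0))


def try_record_comment(counter_file: Path = COUNTER_FILE, today: Optional[str] = None) -> bool:
    """Reserve one comment slot. Returns False once the daily cap is reached."""
    today = today or date.today().isoformat()
    count = comments_today(counter_file, today)
    if count >= MAX_COMMENTS_PER_DAY:
        return False
    counter_file.parent.mkdir(parents=True, exist_ok=True)
    counter_file.write_text(
        json.dumps({"date": today, "count": count + 1}), encoding="utf-8"
    )
    return True
```

Calling `try_record_comment()` before posting, and posting only when it returns True, keeps the boundary in code regardless of what the model decides to do.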

One more thing: randomize delays between actions. Not "about 30 seconds" — time.sleep(random.randint(30, 90)). A little variance removes the pattern.
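A small sketch of that jitter wrapped around a list of actions. `run_with_jitter` is a hypothetical helper; the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import random
import time


def run_with_jitter(actions, min_delay=30, max_delay=90, sleep=time.sleep):
    """Run each action, pausing a random whole number of seconds between them."""
    for index, action in enumerate(actions):
        if index > 0:
            # randint is inclusive on both ends: 30..90 by default
            sleep(random.randint(min_delay, max_delay))
        action()
```

There is no delay before the first action or after the last one, so a run with N actions sleeps N - 1 times.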

What I measured

Three months. Seven approval-gated jobs, five autonomous. Not a single case of the agent "doing something extra." I triggered STOP once when I had to leave unexpectedly. The system stopped in 3 seconds.

Reviewing drafts costs me about 10-15 minutes per day. That's not overhead, it's deliberate oversight. I see what the system did without keeping its full state in my head.

What doesn't work

These patterns don't help if the task is badly scoped to begin with. If an agent takes an action that can't be stopped or rolled back, the STOP flag helps you next time, not this time.

Before any new task I ask one question: "If the agent does this twice, is that a problem?" If yes, either the task doesn't get automated, or step one is automatic and step two needs confirmation.
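That question can be written down as a tiny decision function, so every new job gets classified the same way. This is only an illustration of the rule as stated here; the enum, the function, and the parameter names are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    AUTONOMOUS = "autonomous"    # runs unattended
    APPROVAL_GATED = "approval"  # step one automatic, step two needs confirmation
    MANUAL = "manual"            # not automated


def choose_mode(harmless_if_repeated: bool, can_split_into_two_steps: bool) -> Mode:
    """Pick an execution mode from the 'what if it runs twice?' question."""
    if harmless_if_repeated:
        return Mode.AUTONOMOUS
    if can_split_into_two_steps:
        return Mode.APPROVAL_GATED
    return Mode.MANUAL
```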

The other trap: trying to encode all safety logic in prompts. A prompt describes intent; it isn't a contract. Safety invariants belong in code.

For what happens when AI-generated code ships to production without enough guardrails, see the case study "The bug Claude wrote. Seven days in production."

The short version

The question isn't whether to trust an AI agent. It's whether the engineering around it is sound.

Three patterns I won't skip for any autonomous AI agent in production:

  1. Stateful drafts — agent proposes, you decide.
  2. Stop flag — kill everything in 3 seconds.
  3. Hard rate limits in code, not in prompts.

None of this is complicated. But without it, even a well-calibrated agent is a black box acting on your behalf.