Why I keep human approval gates even when I could remove them
I run 15 automated tasks. Five of them will never run without my sign-off, even though I could technically remove the approval gate in an afternoon.
It's not that I don't trust the system. It's that judgment about a specific person isn't something I delegate.
I figured this out after week one with an autonomous LinkedIn agent. I'd assumed approval gates were temporary — training wheels until the agent "learned." Turned out they're a permanent architectural decision. One of the few I got right before paying for the lesson.
"Remove humans from the loop" is a reflex, not a decision
This was the dominant advice throughout 2024. Remove routine from your schedule, free your attention, hand the repeatable work to AI. I agree with 80% of it. I genuinely removed myself from ten workflows — from JSONL log formatting to digest assembly. It works.
But I noticed "remove humans" often means "remove any checkpoint that slows you down." That's different. Some slowdowns are engineering problems to fix. Others are deliberate judgment points. Conflating them is how you end up with an agent doing things you'd never have approved manually.
When I went through my 15 tasks systematically, two categories emerged. Structural tasks — where the output is fully determined by the input. And judgment tasks — where the right answer depends on context the system doesn't have access to.
Two categories: structural and judgment
A structural task looks like this: "Take yesterday's JSONL log, format it as Markdown, drop it in Inbox." The algorithm is deterministic. The error is reversible — re-run it, get the same output. A human here is overhead.
A judgment task looks like this: "Write a comment on Maxim's post about React Native." The system knows templates. It doesn't know Maxim sent me a complaint three days ago. It doesn't know the tone of that specific post calls for a different register. It doesn't know the timing is off.
The tricky part: both tasks look identical from a code perspective. Both take input and return output. The difference only shows up when you think about consequences.
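To make the structural example concrete, here is a minimal sketch of the JSONL-to-Markdown task. The field names ("ts", "task", "status") and the sample records are assumptions for illustration, not the real log schema. The point it shows is the one above: the function is pure, so re-running it on the same input gives the same output.

```python
import json


def jsonl_to_markdown(jsonl_text: str) -> str:
    """Render one Markdown bullet per JSONL record, in input order."""
    bullets = []
    for raw in jsonl_text.splitlines():
        if not raw.strip():
            continue  # ignore blank lines between records
        record = json.loads(raw)
        # Field names are hypothetical; adapt them to the actual log schema.
        bullets.append(f"- {record['ts']} **{record['task']}**: {record['status']}")
    return "\n".join(bullets)


# Made-up sample log, two records.
sample = (
    '{"ts": "09:00", "task": "digest", "status": "ok"}\n'
    '{"ts": "09:05", "task": "backup", "status": "failed"}\n'
)

first_run = jsonl_to_markdown(sample)
second_run = jsonl_to_markdown(sample)  # a re-run is harmless and identical
```

A judgment task can have exactly the same signature (text in, text out). Nothing in the type tells you that a wrong output lands on a real person, which is why the classification has to live outside the code.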
Five gates I keep — and why they stay
First — outgoing LinkedIn connection requests. The system can find relevant people, draft a personalized message. Technically ready. But I press send. The reason: I watched LinkedIn ban two accounts in my network within three days of automating their connection requests. 47 requests in a week instead of 5-7 per day — the platform flags that pattern. Since reinstating the gate: zero account issues in six months.
Second: final publish on long articles. The agent writes the draft, runs it through a humanizer pass, checks SEO. I read the final output with my own eyes. I'm not hunting for factual errors; the agent handles those. I'm looking for tone drift. Over three months, the gate caught four pieces I shouldn't have published, including one where I'd slipped into analyst voice instead of practitioner, and one with a paragraph that could have hurt a specific person.
Third — replies to non-standard comments. Standard comments ("Thanks!", "Interesting, how did you...") the agent handles solo. Complex ones — disputes, complaints, ambiguous irony — get tagged needs-review and land in my Inbox. I deal with those by hand. The failure mode here is reputational, not technical.
Fourth — references to real client cases. When the agent pulls facts from past projects, it doesn't know whether the client agreed to public disclosure. The gate catches this. Without it, I'd have shipped at least one post with metrics a client considered confidential.
Fifth — changes to the automation system itself. Tasks that modify the logic of other tasks never run autonomously. The agent produces a diff, I review it, and I confirm the change. One bug in a cron task can cascade through eight others.
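The fifth gate (agent produces a diff, a human confirms) can be sketched with Python's standard difflib. This is an illustration under assumptions, not the actual setup: the config contents, the file name digest.yaml, and the function names are all hypothetical.

```python
import difflib


def render_diff(current: str, proposed: str, name: str) -> str:
    """Build a unified diff for a human to read before anything changes."""
    return "".join(
        difflib.unified_diff(
            current.splitlines(keepends=True),
            proposed.splitlines(keepends=True),
            fromfile=f"{name} (current)",
            tofile=f"{name} (proposed)",
        )
    )


def apply_change(current: str, proposed: str, approved: bool) -> str:
    """Return the proposed text only if the reviewer explicitly approved."""
    return proposed if approved else current


# Hypothetical task config and a change the agent wants to make.
current_cfg = "schedule: 0 6 * * *\nretries: 1\n"
proposed_cfg = "schedule: 0 6 * * *\nretries: 3\n"

diff_text = render_diff(current_cfg, proposed_cfg, "digest.yaml")
kept = apply_change(current_cfg, proposed_cfg, approved=False)
changed = apply_change(current_cfg, proposed_cfg, approved=True)
```

The design choice is that approval defaults to nothing happening: without an explicit yes, the current config is what stays in place.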
What breaks when you remove a gate too early
Week one without the LinkedIn connection gate: three days suspended, one cold outreach to a potential client completely derailed. The math: 47 requests in seven days. That total falls inside the 35-49 I had planned for the week (5-7 per day), so the count alone doesn't explain the flag; the sending pattern does. LinkedIn read it as automation. It was automation, just without oversight.
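The arithmetic is easy to check in code: 5-7 requests per day works out to 35-49 per week, so a weekly total of 47 doesn't stand out by itself. A per-day cap is one way to catch what a weekly total hides. The sketch below is an assumption about how such a check could look; the send dates are invented to illustrate a burst and are not the real log, and how LinkedIn actually detects automation is not something this models.

```python
from collections import Counter
from datetime import date

DAILY_MIN, DAILY_MAX = 5, 7  # manual pace quoted in the text


def weekly_range(days: int = 7) -> tuple[int, int]:
    """Weekly totals implied by the daily pace."""
    return DAILY_MIN * days, DAILY_MAX * days


def days_over_cap(send_dates: list[date], daily_cap: int = DAILY_MAX) -> list[date]:
    """Return the days on which more requests went out than the cap allows."""
    per_day = Counter(send_dates)
    return sorted(day for day, count in per_day.items() if count > daily_cap)


# Hypothetical burst: 47 requests in one week, 40 of them on a single day.
burst = [date(2025, 1, 6)] * 40 + [date(2025, 1, 7)] * 7

low, high = weekly_range()
within_weekly_plan = low <= len(burst) <= high
flagged_days = days_over_cap(burst)
```

In this made-up case the weekly total passes while one day fails the cap, which is the kind of pattern a human pressing send would never produce.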
There's a subtler failure mode. One draft would have gone live with "we at the studio do X." Not catastrophic. But that's not my voice — I write in first person, from personal experience. The agent didn't notice. I did.
A gate isn't about distrust. It's about recognizing that some categories of error cost more than the time it takes to review.
When a gate earns its removal
Three conditions, all three at once:
First — the task is purely structural. Identical inputs always produce the same type of output. Formatting, logging, template assembly — yes. Publishing on my behalf — no.
Second — the error is reversible. If the agent puts a file in the wrong folder, I can move it. If the agent publishes a post with wrong tone, the reputational damage is already done.
Third: I've run the task manually twenty-plus times and never found a case where context changed the decision. The number is only a floor. What matters is the variety of situations those runs covered.
All three present — the gate comes down. Any one missing — it stays.
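The rule above is a plain conjunction, which makes it easy to write down. A minimal sketch, assuming each task carries a small hand-filled profile; the class, field names, and example tasks are hypothetical, and the booleans are human judgments recorded as data, not something the code can infer.

```python
from dataclasses import dataclass

MIN_MANUAL_RUNS = 20  # "twenty-plus" from the third condition; a floor only


@dataclass(frozen=True)
class TaskProfile:
    name: str
    structural: bool                # output fully determined by input
    reversible: bool                # an error can be undone after the fact
    manual_runs: int                # times the task was run by hand
    context_changed_decision: bool  # did context ever alter the outcome?


def can_remove_gate(task: TaskProfile) -> bool:
    """All three conditions must hold; any one missing keeps the gate."""
    enough_evidence = (
        task.manual_runs >= MIN_MANUAL_RUNS
        and not task.context_changed_decision
    )
    return task.structural and task.reversible and enough_evidence


# Hypothetical profiles for two tasks mentioned in the text.
log_formatting = TaskProfile("jsonl-to-markdown", True, True, 25, False)
post_comment = TaskProfile("comment-on-post", False, False, 40, True)

decisions = {t.name: can_remove_gate(t) for t in (log_formatting, post_comment)}
```

Note that a high run count cannot rescue a task that fails either of the first two conditions: forty manual runs of the comment task still leave the gate in place.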
HITL as architecture, not training wheels
When I say I've kept approval gates, I sometimes hear: "You just don't trust your system." That's not it. I trust it exactly in the parts where it's deterministic. Where it isn't — I keep the judgment.
This doesn't make the system slow. Structural tasks run without me around the clock. Gates exist only at the points where the cost of an error exceeds the cost of five minutes of my time.
Human-in-the-loop isn't a phase you graduate from. It's a deliberate architectural choice about where the boundary between machine and human sits in your specific system.
Related reading:
On how I structure production patterns for AI agents: three trust patterns from 13 cron jobs.
On what to log and alert so you actually know what's happening: monitoring 13 autonomous agents.
On writing task specs that work without human clarification: a separate piece on autonomous agent specs.