The Non-Vibecoder's Workbench: How I Structured AI Assistance Without Losing Engineering Control

Vibecoding is when you describe a task, the AI generates code, you check that it looks right, and you deploy. I don't work that way. Not because I'm afraid of AI — Claude Code is open on my screen every working day. It's because without structure, AI in a legacy PHP project isn't acceleration. It's debt with a nice interface.

Here's what my actual workbench looks like in a Bitrix/Next.js e-commerce project. Not philosophy — a concrete harness I've been running for six months.

Why vibecoding doesn't work in legacy PHP

Vibecoding rests on one assumption: the AI already knows everything it needs to know about the project, so accepting its output with minimal review gives a roughly correct result. In a greenfield project on a clean stack, that assumption sometimes holds.

It doesn't hold for a Bitrix store with 28,000 SKUs, infoblocks (Bitrix's internal content structure), composite page caching, and a decade of business logic in PHP. The AI doesn't know that your PROPERTY_VENDOR_ID iblock property feeds three microservices through a queue. It doesn't know that the composite cache is invalidated only when specific fields change, not on every change. It'll generate working code and, with it, a bug that shows up two weeks later when the catalog updates.

I've been through this. Not catastrophic — but a week of root-cause analysis and three hours of reverts.

A harness isn't a limitation on AI. It's the line where my responsibility begins.

The three-zone delegation model

A structured AI delegation model means deciding in advance — before touching any task — which category it falls into: full autopilot, mandatory human review, or never delegate. I use three zones.

Zone one is full autopilot. Integration tests for Bitrix REST API endpoints, unit tests on business logic, SQL EXPLAIN analysis, boilerplate code transforms, regex, TypeScript types built from existing API responses, documentation strings. I give Claude full context and review the result the same way I'd review a junior's pull request: read it, understand it, merge it.

Zone two requires mandatory review. New PHP functions in a Bitrix context, React components with business logic, changes to existing API endpoints, data migrations. AI writes the draft — I rewrite the critical parts. Usually about 60% stays as-is, 40% I change.

Zone three is never. Iblock structure changes (adding properties, changing field types), composite page architecture, Elasticsearch index mapping, pricing and discount logic, session handling. If AI suggests something in this zone, I stop and think about why I even asked.

The line between zones isn't about the tool. It's about this question: if this breaks on Saturday night, who figures it out? If the answer is just me — it's zone three.
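One way to make "deciding in advance" concrete is to write the zones down as data. The following is a minimal sketch, not part of the original setup: the task-kind labels are hypothetical shorthand for the examples listed above, and sending unknown task kinds to review is an assumption.

```typescript
type Zone = "autopilot" | "review" | "never";

// Hypothetical labels for the task kinds named in the three zones above.
const ZONES: Record<Zone, readonly string[]> = {
  autopilot: [
    "integration-test",
    "unit-test",
    "sql-explain",
    "boilerplate-transform",
    "regex",
    "ts-types-from-api",
    "docstring",
  ],
  review: [
    "new-php-function",
    "react-component-with-business-logic",
    "api-endpoint-change",
    "data-migration",
  ],
  never: [
    "iblock-structure",
    "composite-page-architecture",
    "elasticsearch-mapping",
    "pricing-discount-logic",
    "session-handling",
  ],
};

function zoneFor(taskKind: string): Zone {
  // Check the strictest zone first so a task can never slip into a looser one.
  for (const zone of ["never", "review", "autopilot"] as const) {
    if (ZONES[zone].includes(taskKind)) return zone;
  }
  // Assumption: anything not classified in advance gets a human review.
  return "review";
}

console.log(zoneFor("unit-test"));              // "autopilot"
console.log(zoneFor("pricing-discount-logic")); // "never"
```

The lookup itself is trivial. The point is that the classification exists before the task starts, so the zone is not negotiated mid-session.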

What a harness looks like in a real PHP project

Task: write an integration test for the Bitrix REST API endpoint catalog.product.list with warehouse filtering.

What I give Claude:

  • The existing test file (three examples of the same test type)
  • Internal documentation for this specific endpoint, including our exact parameter structure
  • A narrow task: "Write a test that verifies that when the filter WAREHOUSE_ID=5 is applied, only products with stock > 0 at warehouse 5 are returned"

What I don't give:

  • General project context
  • The database schema
  • A list of all endpoints

Why? Because the more context the AI gets, the more it tries to help with architecture. Narrow task, narrow answer.

Claude generates the test. I read it. Fix one thing — namespace is different in our test suite. Save. Run. Green.

Total time: 12 minutes instead of 35. But I read every line.
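The core of a test like this is a single check on the returned rows. Here is a sketch of that check in TypeScript, kept separate from any HTTP call. The `ProductRow` shape and the helper name are hypothetical and do not reflect the real Bitrix response format or the project's PHP test suite.

```typescript
// Hypothetical, simplified row shape: NOT the actual Bitrix REST response.
interface ProductRow {
  id: number;
  // Stock quantity keyed by warehouse ID.
  stockByWarehouse: Record<number, number>;
}

// Returns the IDs of rows that should not be in a response filtered by
// warehouseId: products with no stock, or no stock entry, at that warehouse.
function productsViolatingFilter(rows: ProductRow[], warehouseId: number): number[] {
  return rows
    .filter((row) => (row.stockByWarehouse[warehouseId] ?? 0) <= 0)
    .map((row) => row.id);
}

// Example fixture: only product 101 legitimately belongs in a WAREHOUSE_ID=5 result.
const response: ProductRow[] = [
  { id: 101, stockByWarehouse: { 5: 12 } },
  { id: 102, stockByWarehouse: { 5: 0, 7: 3 } },
  { id: 103, stockByWarehouse: { 7: 9 } },
];

const offenders = productsViolatingFilter(response, 5);
console.log(offenders); // [102, 103]
```

A passing test asserts that the list of offenders is empty for the real response. A check this small is easy to read line by line, which is what makes it a safe thing to delegate.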

CLAUDE.md as a harness

CLAUDE.md is a file in the repository root that Claude Code reads at the start of every session. It's my main control mechanism in any AI-assisted PHP development workflow.

Three blocks in it.

First: project context. What the project is, the stack, PHP and Bitrix versions, key constraints. "Bitrix composite cache is active — don't suggest changes to page templates without an explicit request."

Second: delegation rules. Written literally: "Don't suggest iblock structure changes without an explicit request. Don't suggest Elasticsearch configuration changes without a schema review." Claude follows this — not perfectly, but in about 90% of cases.

Third: what's forbidden. "Don't add new composer dependencies without explicit discussion. Don't create new database tables." Not because it's technically prevented — because I want to see those decisions explicitly, not get them as a side effect of some other task.

I update CLAUDE.md at the start of each major project. And after every incident where AI did something unexpected. That part matters more than the initial setup.
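To show how the three blocks might fit together, here is a sketch that keeps the rules as data and renders them into CLAUDE.md text. The rule texts are adapted from the examples quoted above. The section titles, the `HarnessRules` type, and the rendering helper are assumptions for illustration: CLAUDE.md is free-form text, and nothing about this layout is required by Claude Code.

```typescript
// Hypothetical structure mirroring the three blocks described above.
interface HarnessRules {
  context: string[];
  delegation: string[];
  forbidden: string[];
}

const rules: HarnessRules = {
  context: [
    "Bitrix composite cache is active. Don't suggest changes to page templates without an explicit request.",
  ],
  delegation: [
    "Don't suggest iblock structure changes without an explicit request.",
    "Don't suggest Elasticsearch configuration changes without a schema review.",
  ],
  forbidden: [
    "Don't add new composer dependencies without explicit discussion.",
    "Don't create new database tables.",
  ],
};

// Renders one titled section as a heading followed by a bullet list.
function renderSection(title: string, items: string[]): string {
  return [`## ${title}`, ...items.map((item) => `- ${item}`)].join("\n");
}

// Section titles are illustrative, not a required format.
function renderClaudeMd(r: HarnessRules): string {
  return [
    renderSection("Project context", r.context),
    renderSection("Delegation rules", r.delegation),
    renderSection("Forbidden", r.forbidden),
  ].join("\n\n") + "\n";
}

console.log(renderClaudeMd(rules));
```

Keeping the rules as a list makes the post-incident update a one-line change: append the new rule to the relevant block and regenerate the file.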

Six months in: what actually changed

Honest numbers: about 65% of code in recent projects was AI-assisted. Of that, roughly a third I didn't touch after review. The rest I changed, anywhere from slightly to significantly.
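Spelled out as shares of all code, those figures look roughly like this. The sketch assumes "roughly a third" means exactly one third of the AI-assisted portion, so the results are approximations, not measurements.

```typescript
const aiAssistedPct = 65;                        // share of all code that was AI-assisted
const untouchedPct = aiAssistedPct / 3;          // assumption: exactly one third left as-is
const editedPct = aiAssistedPct - untouchedPct;  // AI draft, then changed by hand
const handWrittenPct = 100 - aiAssistedPct;      // no AI involvement

const shares = {
  aiUntouched: Math.round(untouchedPct),
  aiEdited: Math.round(editedPct),
  handWritten: handWrittenPct,
};

console.log(shares); // { aiUntouched: 22, aiEdited: 43, handWritten: 35 }
```

So only about one line in five shipped exactly as the AI wrote it, and even those lines went through review.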

Architectural decisions made by AI: zero. Not because AI didn't try. It did. Several times I caught Claude starting to recommend restructuring an iblock "because it would be more efficient." I stopped it there.

Red flags I watch for:

  • AI suggests renaming a table or adding a new iblock property
  • AI starts explaining "why this architecture is suboptimal" without being asked
  • AI generates code that works but I can't explain why

That last one is the most important. If I read the code and don't understand it — that's not "smart AI." That's debt I haven't recognized yet.
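The first two red flags can be roughly approximated with a text scan over the AI's reply. This is a crude heuristic sketch: the patterns and names are hypothetical, they will miss most phrasings, and the third flag (code you can't explain) can't be automated at all.

```typescript
interface RedFlag {
  name: string;
  pattern: RegExp;
}

// Hypothetical patterns: a coarse approximation of the first two red flags.
const RED_FLAGS: RedFlag[] = [
  {
    name: "schema change",
    pattern: /\brenam\w*\b[^.]*\btable\b|\b(add|new)\b[^.]*\biblock property\b/i,
  },
  {
    name: "unrequested architecture critique",
    pattern: /\b(architecture|structure)\b[^.]*\b(suboptimal|inefficient|more efficient)\b/i,
  },
];

// Returns the names of all red flags whose pattern matches the message.
function findRedFlags(aiMessage: string): string[] {
  return RED_FLAGS.filter((flag) => flag.pattern.test(aiMessage)).map((flag) => flag.name);
}

console.log(findRedFlags("I would rename the orders table first."));
// ["schema change"]
console.log(findRedFlags("Added a test for WAREHOUSE_ID=5 filtering."));
// []
```

A scan like this is at best a tripwire. It doesn't replace reading the reply.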

Speed on routine tasks changed most. Tests that used to get deferred ("I'll write those later") now get written immediately — because 12 minutes instead of 35 changes the psychology. That adds up.

Time on architectural decisions didn't budge. AI doesn't help here. Experience does, documentation does, talking to a colleague does.

The harness doesn't make you faster everywhere. It makes you faster where it's safe.

More on the specific delegation rules (what to hand off, what to keep) in "What I Never Give AI". The short version: I let Claude write tests. I don't let it choose architecture.