We Spent Three Days on AI Tooling Setup. Here's the Engineering Decision I Should Have Made First.
A colleague spent three days getting GitHub Copilot running for five developers. Everyone's VPN config differs. Billing fails through certain payment methods. The corporate account needs a foreign card. After three days: two developers work with AI, three don't. The productivity gap shows up within a week.
I went through exactly this in 2024. There's one decision I should've made on day one.
Why this is an engineering problem, not an HR problem
The typical response is administrative: find a corporate card, negotiate with the vendor, send everyone a "how to connect" guide. This works for one tool in one location. For a distributed team it creates a zoo — each developer on a different Cursor version, different Copilot tier, different prompting style, different limits.
The problem isn't billing friction. It's the absence of an architectural decision. Who decided which toolchain to standardize? Which API key goes into CI/CD? What does a junior do when their subscription hasn't been provisioned yet?
These are engineering questions. They deserve an engineering answer — made once, not revisited every time someone joins the team.
Three approaches and why I chose one
The first option is per-person GUI subscriptions: Copilot, Cursor, Windsurf, each developer picks their own. Everyone's happy with their tool. There's no shared standard, costs scale linearly with headcount, and none of it integrates into CI/CD.
The second is self-hosted via Ollama: Llama 3.3, Mistral, no external requests, no billing. I tested this on real Bitrix PHP tasks. There's genuine uplift — about 20-30% on routine work. Claude 3.5 Sonnet via API delivers 60-70%. The gap is visible on real code, not on hello-world benchmarks.
The third is API-first via proxy: one API key for the team, a proxy server, all developers connect through it. Any editor works — Continue.dev in VSCode or JetBrains, Cursor in API mode.
I chose the third. Here's why.
API-first in practice: Continue.dev + Claude API
Continue.dev is an open-source IDE extension that works with any LLM through an OpenAI-compatible API. One config.json in the repo — all developers get the same configuration on git pull.
The setup is straightforward:
- An Nginx proxy on the server forwards
/v1/requests toapi.anthropic.comwith an injected key - Developers point
config.jsonathttp://your-proxy.com/v1/— no token needed locally - CI/CD runners use the same key directly
Cost for five developers in active daily use: roughly $50-70/month on Claude API. Five Cursor Pro subscriptions: $100-150/month, with no shared standard.
The more important thing than cost: Context Files in Continue.dev map to a CLAUDE.md in the repository. One file defines what the model knows about the project — Bitrix architecture, PHP style guide, forbidden patterns. Every developer who clones the repo gets this automatically. Prompt drift is minimal.
What I ruled out
Self-hosted LLMs (running locally via Ollama) eliminate billing friction entirely but trade it for quality and maintenance overhead. I didn't drop self-hosted entirely — it runs locally for experiments. But for production PHP/Bitrix work where I need consistent quality, local Llama doesn't keep up. Specifically: reviewing a typical Bitrix component, generating test coverage, refactoring while preserving the public interface. Claude via API is more reliable on complex tasks. Ollama's genuinely useful for fast autocomplete on simple constructs — latency is lower there.
Corporate GitHub Enterprise with complex account arrangements: tried it. Three weeks of administrative overhead, limited context customization, no direct API access from CI. The API-first path was working in two days.
AI in CI/CD without a GUI
API-first AI tooling integrates into CI/CD pipelines without requiring developer GUI subscriptions. This is the argument for this approach that rarely gets mentioned.
AI code review in GitHub Actions is a curl call in a CI script. Pass the diff, get comments back. It works regardless of where each developer lives, which editor they use, or whether their subscription is active. The CI runner runs in Germany, the developer works from Moscow — the setup is identical.
Same with test generation: a pre-commit hook script calls the API, generates a test class skeleton for any new public method. No GUI required at all.
I currently have three such scripts in the workflow: diff review on PR, baseline test generation, brief changelog entry. All via API, all idempotent, all logging to JSONL.
What didn't work
Continue.dev with JetBrains IDEs was unstable for a few weeks — it crashed the indexer under large context. Fixed by updating the plugin and capping maxTokens in config. Not critical, but cost two days to diagnose.
Routing different models through one proxy — Claude for complex tasks, GPT-4o for faster responses, local Ollama as fallback — sounds reasonable in theory. In practice, developers started choosing models manually, prompting styles diverged, and the shared standard dissolved. I removed the choice and fixed the team on one model.
One decision instead of three weeks of firefighting
An engineering decision made at the start — API-first, Continue.dev, shared Context Files in the repository — is cheaper and more reliable than re-solving the same problem for every developer who joins.
This isn't about navigating payment friction. It's about the team having a shared AI environment, the same way you have a shared .editorconfig or a shared phpstan.neon. AI tooling is no different.
The question "how do we set up AI tools for the team" is now closed. It works for five developers, it scales without administrative overhead, and it shows up in CI/CD without anyone having to think about it.
*Related: I Write a Spec Before Asking AI to Write Code · What I Never Give AI: A Framework From 13 Production Tasks*