I shipped a production SaaS with zero hand-written code. Here's the workflow.

Not a demo, not a prototype — a deployed SaaS where AI typed every line and I made every decision. The exact spec-generate-review-gate loop, and where it breaks.

I set myself an honest test: take real products — auth, billing, tenancy, the boring hard parts — from empty repository to production without hand-writing a single line of code. GitHub Copilot and Claude did all the typing. I did everything else.

It worked, twice. StampBD is a multi-database SaaS for Bangladesh's stamp vendors — stock tracking, sales, government reporting. Amaz is a complete Amazon seller management platform — inventory, purchase orders, forecasting, returns, multi-warehouse, SP-API integration. But "AI wrote it" is the least interesting sentence about either, because the workflow that made it work is 80% engineering judgment and 20% prompting.

Direction, not absence

The phrase I use with clients: this is AI-accelerated delivery directed by engineering judgment. I architected, reviewed, and shipped; AI typed. Nine years of building production systems the manual way — micro-finance platforms to multi-tenant ERPs — is what makes the directing possible: I know what a correct diff looks like, so I can reject an incorrect one in seconds.

The loop has five gates:

01 spec      (human)  architecture, acceptance criteria, what NOT to build
02 generate  (ai)     Copilot + Claude write the implementation
03 review    (human)  every diff, read like a tech lead reads a team PR
04 gate      (ci)     types, tests, lint — no green, no merge
05 ship      (human)  deploy, monitor, own the incident

The spec is where the leverage lives. A feature starts as a written decision — invitations are workspace-scoped tokens, single-use, 7-day expiry, no silent account creation — plus acceptance criteria. The instruction that changed everything: "flag anything underspecified instead of guessing." AI's failure mode isn't bad code; it's confident code built on a guess you didn't know it made.

What breaks, honestly

Long-range consistency. AI happily creates a second way to do something that already has a first way. The fix is architectural: strong conventions documented in the repo, and review that treats "this duplicates existing structure" as a hard reject.
Schema design. I don't delegate data models. Every table is a human decision, because schema mistakes are the ones you pay for over years.
The last 10% of a hairy bug. When generation loops on a fix, I stop generating and start reading. The judgment to know when to stop prompting is — again — the actual skill.

What this means if you're hiring

Speed without a lowered bar, but only when someone senior owns the gates. An AI workflow directed by someone who couldn't build the system manually produces the same thing fast typing always produced: impressive piles of wrong.

I now run this loop on client work daily. The honest pitch isn't "AI makes it cheap" — it's that the senior engineer you're paying for stops spending their hours typing and spends all of them on the decisions that were always the expensive part.

Direction, not absence

What breaks, honestly

What this means if you're hiring

Dealing with this exact problem?