Building Software Like a Builder

Stop vibe coding. Merge a 'preconstruction mindset' with spec-driven development and harness agentic AI to ship software that actually works. Templates and checklists inside.

October 17, 2025 • Updated November 4, 2025 • 9 min read

The $548 Million Lesson

I learned to ship software on a muddy construction site in Silverdale, Washington.

Hanging the WSU flag on a tower crane at St. Michael Medical Center in Silverdale, WA

That medical campus was built on a budget of $548 million. My job was coordinating the invisible systems (reverse osmosis pipes, pneumatic tubes, redundant power) that would keep people alive. You learn fast that "we'll figure it out later" kills budgets. And patients.

Today, as a Chief of Product who ships production code using AI, I watch developers make the exact mistake that would've gotten me fired from that jobsite: starting work without a plan.

Andrej Karpathy has a name for it: "vibe coding." You kinda sorta know what you want, so you prompt an AI, get 800 lines of code in seconds, then spend two days debugging code you didn't design and don't understand.

It's the coding equivalent of pouring a foundation before checking whether you're building on sand.

The fix isn't avoiding AI. It's bringing the preconstruction mindset to software. In construction, preconstruction is sacred—it's where you catch the $10 problems before they become $10 million lawsuits. It's where every trade aligns on exactly what gets built, in what order, with what materials.

This approach is formally referred to as Spec-Driven Development. Anthropic built it into Claude Code. GitHub calls it "the golden workflow." Martin Fowler's site dedicated 5,000 words to it.

The Vibe Coding Epidemic

Last month, a founder messaged me in a panic. His Next.js app, built in two weeks with Cursor's help, had just generated a $3,100 Vercel bill. For a todo app with 200 users.

The culprit? His AI-generated code:

Image optimization on every request (not cached)
API routes fetching entire tables into memory
No pagination—loading all records for every user
A sitemap.xml regenerating dynamically (378,000 requests = massive bandwidth)
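Two of those items, whole-table fetches and missing pagination, usually share one fix: ask the database for a bounded page instead of everything. Here is a minimal sketch of keyset pagination using Python's built-in sqlite3 module. The `tasks` table, its columns, and the `fetch_tasks_page` helper are assumptions for illustration, not the founder's actual schema or code.

```python
import sqlite3

def fetch_tasks_page(conn, user_id, after_id=0, page_size=50):
    """Return one page of a user's tasks plus the cursor for the next page."""
    rows = conn.execute(
        "SELECT id, title FROM tasks "
        "WHERE user_id = ? AND id > ? ORDER BY id LIMIT ?",
        (user_id, after_id, page_size),
    ).fetchall()
    # A full page suggests there may be more rows; a short page means we are done.
    next_cursor = rows[-1][0] if len(rows) == page_size else None
    return rows, next_cursor

# Hypothetical data: one user with 120 tasks in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (user_id, title) VALUES (?, ?)",
    [(1, f"task {i}") for i in range(120)],
)

first_page, cursor = fetch_tasks_page(conn, user_id=1)
print(len(first_page), cursor)
```

Each request now touches at most `page_size` rows, however large the table grows. The caller passes the returned cursor back as `after_id` to get the next page.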

"But it works perfectly on localhost!" he said.

This is vibe coding's illusion of velocity. McKinsey claims AI makes developers 2x faster at "coding tasks." But speed was never the bottleneck. Gergely Orosz nails it: the real bottlenecks are clarifying requirements and reviewing code.

Vibe coding makes both worse.

When you prompt-paste-and-pray, you're not building—you're accumulating. Every prompt creates new code that might work but doesn't fit. GitLab found AI-assisted projects have 7-8x more code duplication. Google's research shows AI reproduces known vulnerabilities in 16% of generated functions.

The scariest part? It feels productive. You're "shipping features." You're "moving fast." But you're building on quicksand. That founder moved to a $5/month VPS. Same app, roughly 600x cheaper.

He rebuilt with a spec. Took three days.

On a construction site, we have a saying: "Measure twice, cut once." In software, we've inverted it: "Generate infinitely, debug forever."

There's a better way.

Think Like a Project Manager

In construction, preconstruction isn't optional—it's where 80% of the project's success is determined. Before anyone touches a hammer, we:

Define success (blueprints, not vibes)
Identify dependencies (concrete before framing)
Anticipate failures (what if it rains for two weeks?)
Lock the scope (change orders cost millions)

This is exactly what your AI needs.

Birgitta Böckeler, writing for Martin Fowler, calls it Spec-Driven Development: "The spec becomes the source of truth for the human and the AI." Not prompts. Not chat history. A spec.

Here's the insight construction taught me: specs aren't about perfection—they're about permissions. When every trade has the same blueprint, the electrician knows exactly where they can drill without hitting plumbing. The spec gives permission to work independently without coordination overhead.

Same with AI. A good spec separates the stable "what" from the flexible "how":

The "what": User needs auth. Tasks sync across devices.
The "how": NextAuth vs Clerk. Postgres vs MySQL. REST vs GraphQL.

Lock the "what" in a spec. Let AI iterate on the "how."
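In code, that split can look like an interface (the "what") with swappable implementations (the "how"). The sketch below is one way to express it in Python. `AuthAdapter`, `InMemoryAuth`, and `handle_me` are hypothetical names invented for this example, and the in-memory provider stands in for whichever vendor you end up choosing.

```python
from dataclasses import dataclass
from typing import Optional, Protocol

@dataclass(frozen=True)
class User:
    id: str
    email: str

class AuthAdapter(Protocol):
    """The stable 'what': given a session token, tell me who the user is."""
    def user_for_session(self, session_token: str) -> Optional[User]: ...

class InMemoryAuth:
    """One possible 'how'. A real adapter would wrap the chosen auth vendor."""
    def __init__(self, sessions: dict[str, User]) -> None:
        self._sessions = sessions

    def user_for_session(self, session_token: str) -> Optional[User]:
        return self._sessions.get(session_token)

def handle_me(auth: AuthAdapter, session_token: str) -> tuple[int, dict]:
    """Application code depends only on the adapter, never on the vendor."""
    user = auth.user_for_session(session_token)
    if user is None:
        return 401, {"error": "not signed in"}
    return 200, {"id": user.id, "email": user.email}

auth = InMemoryAuth({"tok-123": User(id="u1", email="ada@example.com")})
print(handle_me(auth, "tok-123"))
```

Swapping providers then means writing one new class that satisfies `AuthAdapter`. Nothing that calls `handle_me` has to change.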

Instead of "build me an app," you give AI a bounded problem: "Implement M1 from the spec: Google auth that persists sessions and returns user data at /api/me."

The AI can't wander. Can't over-engineer. Can't create eight different auth patterns. It has one job, clearly defined, with acceptance criteria.

The construction industry has learned this lesson the hard way. The Sydney Opera House was supposed to cost $7 million and take four years. Without proper preconstruction, it cost $102 million and took 14 years.

Your zero-to-production journey doesn't need to be your Sydney Opera House.

The Spec Template That Ships

docs/SPEC.md

Project Spec

Goal

For: Busy people who lose track of tasks
Problem: Tasks scattered across devices; no fast, reliable sync
Success: TTI < 2s, task ops < 150ms, p50 sync < 800ms, DAU > 100 by week 2

System Components

Entities: User { id, email }, Task { id, userId, title, status, dueAt }, List { id, userId, name }
Actions: auth:signInWithGoogle, task:create|edit|complete, list:create|share
Boundaries: Web (Next.js) / API (FastAPI) / DB (Postgres) / Cache (Redis)

Constraints

Stack: Next.js + FastAPI + Postgres + Redis; Auth via OAuth (Google)
Libraries: Auth (NextAuth), ORM (Prisma/SQLModel), Logging (Pino), Telemetry (OTLP)
Non-negotiables: API contracts in openapi.yaml; migrations versioned

Acceptance Criteria (examples)

Complete Task: Click moves task to Done in <1s; API returns 200; UI reconciles
Sync: New task on device A shows on device B within 800ms (p50)

Work Packages (1–2 days each)

M1: Auth + User model (Google sign-in, session persists, /api/me endpoint)
M2: Tasks CRUD (create/edit/complete works; unit tests pass)
M3: Realtime sync (2-device test meets p50 < 800ms)
M4: Share links (public route guarded, no edit access)

Risks

Auth vendor lock-in → Wrap auth in adapter; keep schema portable

Open Questions

Roles needed? Offline-first required?

The magic isn't the template. It's the discipline. When you force yourself to write "API returns 200 with user object within 150ms," you're forced to think about performance before writing code. When you document "OAuth only, no password storage," you keep the AI from inventing its own password handling, which is one less place for a security hole to hide.
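A numeric criterion also has a practical payoff: you can check it mechanically. As a rough sketch, here is how the template's "p50 sync < 800ms" target could become a pass/fail check. The latency samples are made up for illustration, and `meets_p50_budget` is a hypothetical helper, not part of any test framework.

```python
import statistics

def meets_p50_budget(latencies_ms, budget_ms):
    """True when the median (p50) of the samples is under the budget."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return statistics.median(latencies_ms) < budget_ms

# Hypothetical sync latencies from a two-device test, in milliseconds.
samples_ms = [420, 610, 750, 790, 1300]
print(meets_p50_budget(samples_ms, 800))  # median is 750, so this prints True
```

Notice that the 1300ms outlier does not fail the check, because p50 only looks at the median. If slow tails matter to your users, that is a reason to add a second criterion to the spec, not to trust this one more than it deserves.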

Most importantly, when your spec says "M1: Auth" and your code does auth, you can ship it. No scope creep. No "while we're at it." No vibe-driven drift.

The Three-Phase Loop That Works

Phase 1: Spec It (Preconstruction)

Don't touch your IDE. Open a markdown file. Write what success looks like:

Who's the user?
What's their problem?
What does "working" mean? (Specific. Measurable. "Task creates in <150ms" not "fast")
What are the work packages? (1-2 day chunks)
What breaks the build? (Security, performance, compatibility)
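The "specific, measurable" rule from that checklist is easy to break without noticing, so a crude lint pass can help. The sketch below flags criteria that use a vague word with no number attached. The word list and the `vague_criteria` helper are assumptions for illustration. It is a heuristic that will miss plenty, not a real spec validator.

```python
import re

# Hypothetical list of words that tend to hide a missing measurement.
VAGUE_WORDS = {"fast", "quick", "quickly", "scalable", "responsive", "reliable"}
HAS_NUMBER = re.compile(r"\d")

def vague_criteria(criteria):
    """Return the criteria that use a vague word without any number."""
    flagged = []
    for line in criteria:
        words = set(re.findall(r"[a-z]+", line.lower()))
        if words & VAGUE_WORDS and not HAS_NUMBER.search(line):
            flagged.append(line)
    return flagged

criteria = [
    "Task creates in <150ms",
    "Sync should be fast",
    "New task on device A shows on device B within 800ms (p50)",
]
print(vague_criteria(criteria))  # ['Sync should be fast']
```

Anything it flags is a prompt to go back and ask "how fast, measured how?" before a single line of code is generated.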

Tip

Pro tip for non-technical founders: Use AI against itself. Get Claude to write a spec, then ask GPT-4 to tear it apart like a cynical staff engineer. The fight between AIs surfaces your unknown unknowns.

Phase 2: Build It (Construction)

This is where discipline pays off:

Branch per package

git checkout -b feature/M1-auth

Give AI the spec + only that package

"Implement M1 from SPEC.md: Google auth with session persistence"

Review every file (you're the building inspector)

Test the acceptance criteria

npm run test:auth

Commit small, commit often

git add . && git commit -m "feat: implement M1 auth per spec"
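The "spec + only that package" step can be partly automated so the prompt never drifts from the spec. Below is a minimal sketch that pulls one work package out of spec text and builds a bounded prompt from it. It assumes package lines follow the template's `M1: Title (criteria)` shape. The parsing regex, the function names, and the prompt wording are all illustrative choices, not a format any AI tool requires.

```python
import re

# A few work-package lines in the template's format.
SPEC_TEXT = """\
M1: Auth + User model (Google sign-in, session persists, /api/me endpoint)
M2: Tasks CRUD (create/edit/complete works; unit tests pass)
M3: Realtime sync (2-device test meets p50 < 800ms)
"""

# Matches "M<number>: <title> (<criteria>)".
PACKAGE_LINE = re.compile(r"^(M\d+):\s*(.+?)\s*\((.+)\)\s*$")

def parse_work_packages(spec_text):
    """Map each package id to its title and acceptance criteria."""
    packages = {}
    for line in spec_text.splitlines():
        match = PACKAGE_LINE.match(line.strip())
        if match:
            package_id, title, criteria = match.groups()
            packages[package_id] = {"title": title, "criteria": criteria}
    return packages

def build_prompt(spec_text, package_id):
    """Build a prompt scoped to exactly one work package."""
    package = parse_work_packages(spec_text)[package_id]
    return (
        f"Implement {package_id} from SPEC.md: {package['title']}.\n"
        f"Acceptance criteria: {package['criteria']}.\n"
        "Do not implement anything outside this work package."
    )

print(build_prompt(SPEC_TEXT, "M1"))
```

Asking for a package that is not in the spec raises a `KeyError`, which is the point: if it is not in the spec, it does not get built.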

The risk management here is deliberate:

Lower probability: Clear specs = less AI confusion
Lower impact: Small packages = isolated failures
Higher detectability: Acceptance criteria = immediate feedback
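To see how those three levers compound, it can help to put rough numbers on them. The sketch below multiplies 1 to 5 ratings into a single score, loosely in the spirit of the risk priority number used in FMEA (severity × occurrence × detection). The scale, the inversion of detectability, and the example ratings are all assumptions made for illustration, not measurements.

```python
def risk_score(probability, impact, detectability):
    """Combine 1-5 ratings; easier-to-detect risks score lower."""
    ratings = {
        "probability": probability,
        "impact": impact,
        "detectability": detectability,
    }
    for name, value in ratings.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5")
    # Invert detectability so that 5 (caught immediately) contributes the least.
    return probability * impact * (6 - detectability)

# Hypothetical ratings for the same feature built two ways.
vibe_coded = risk_score(probability=4, impact=5, detectability=1)
spec_driven = risk_score(probability=2, impact=2, detectability=4)
print(vibe_coded, spec_driven)  # 100 8
```

The exact numbers do not matter. What matters is that the factors multiply, so a modest improvement on each lever adds up to a large drop in overall risk.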

When building that medical campus, we didn't install all the plumbing and then all the electrical. We built in vertical slices: one floor, all trades, fully functional. Same principle here.

Phase 3: Ship It (Post-Construction)

The "punch list" phase. Real users find real problems:

That button that's invisible on mobile
The query that takes 3 seconds with real data
The feature nobody actually uses

This is victory. Users who care enough to complain are users who want you to succeed. Their annoyance is your roadmap.

Note

Critical insight: Don't debug vibe code. You'll never understand it well enough. If user feedback reveals fundamental problems, spec the fix and rebuild that slice. It's faster than archaeology.

The Sydney Opera House metaphor breaks down here. Software isn't concrete—you can rebuild a component in hours, not years. But only if you built it with a spec the first time.

What Actually Changes

I learned this the hard way with my first Cursor project. The tool had just launched, and I was intoxicated—full-stack apps were finally within reach. No plan beyond "build a chat app." Just weeks of frantic vibe coding.

The first days felt euphoric. Firebase auth? Working. Database? Connected. WebSockets? Somehow functioning. I was 99% done.

Then deployment hit.

What worked perfectly on localhost became an impossible puzzle in production. I burned through hundreds of dollars in Cursor credits, spent another month untangling routes, fixing what vibe coding had created. The app that took two weeks to "build" took six weeks to actually ship.

The breakthrough came with Claude Code's plan-then-execute architecture. Suddenly, the pattern clicked: specs aren't bureaucracy—they're blueprints. Now when I build:

Deployment works first time (requirements defined upfront)
Features ship in days, not months (work packages = predictable scope)
Every line has purpose (spec requirement = code existence)

But here's the real change: you stop being afraid of your codebase. Even if you don't technically "understand" every line the AI generates, you understand why it exists. When changes are needed, you update the spec first, then the code follows.

The AI becomes a master craftsman working from your blueprints, not a chaos generator following vibes.

The construction industry figured this out decades ago. Now they're applying our agile methods to their preconstruction phase. We should return the favor and apply their planning rigor to our AI workflows.

Your domain expertise—the problems you understand better than any Silicon Valley engineer—is your advantage. With the right workflow, you can build solutions that matter.

Stop vibe coding. Start building.

By David Leer