Public decision log

Every meaningful engineering, business, and product decision we make — with the reasoning, the alternatives we rejected, and what we'd do differently. Updated when something interesting happens.

Why this exists

Most agencies show you the polished case study AFTER the work. We show you the messy middle: the trade-offs, the dead ends, the moments we got it wrong. Public decision logs are the highest-trust form of credibility for technical buyers. They're also a forcing function for better thinking: writing down 'why we picked X' forces you to actually pick, and then defend the pick.

Why Next.js 16 (and not 15 or 14) for ai-whisperers.org

Decision

Used Next.js 16.2.4 for the new build, with Turbopack + App Router.

Reasoning

We needed a 4-locale SSG site with sub-millisecond TTFB that is content-driven (not CMS-driven) and deployable on Docker Swarm. Next 16's `params: Promise<{...}>` async API is the right tradeoff: it forces correct async handling and unlocks streaming SSR later. Next 15 has the same shape but still tolerates the old synchronous access with a deprecation warning. Going back to 14 would have meant missing 2 years of perf improvements.
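A minimal sketch of what the async `params` shape means in practice, assuming a `[lang]` route segment and the four locale codes used for the content folders in this log. The names `LOCALES`, `isLocale`, and `resolveLocale` are illustrative, not taken from the actual codebase, and the fallback to `en` is an assumption. In a real route file, `generateStaticParams` would be exported next to the page component.

```typescript
// Hypothetical locale list; the codes mirror content/{en,es,nl,pt}.
const LOCALES = ["en", "es", "nl", "pt"] as const;
type Locale = (typeof LOCALES)[number];

// Async shape: params arrives as a Promise and must be awaited.
type PageProps = { params: Promise<{ lang: string }> };

// One entry per locale, so every locale is prerendered at build time.
function generateStaticParams(): Array<{ lang: Locale }> {
  return LOCALES.map((lang) => ({ lang }));
}

// Narrow an arbitrary string to one of the known locales.
function isLocale(value: string): value is Locale {
  return (LOCALES as readonly string[]).indexOf(value) !== -1;
}

// Await params first, then validate. Unknown locales fall back to "en" (assumption).
async function resolveLocale({ params }: PageProps): Promise<Locale> {
  const { lang } = await params;
  return isLocale(lang) ? lang : "en";
}
```

A page component would call `resolveLocale(props)` at the top before reading any locale-specific content.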

Alternatives considered
Considered: Astro 5, SvelteKit, Remix, Next.js 14 (Pages Router), plain Vite + React Router

Astro is great for content sites but our dynamic pricing + case-study pages would have needed client islands anyway. SvelteKit is faster but the team has 5 years of React muscle memory. Remix is React 18 only. Vite + React Router is 3x more code to maintain. Pages Router is dead.

What went wrong

First deploy had 4 pages using the old sync `params: { lang: string }` shape. They 500'd in the prerender pass. Took a manual sweep across all 22 apps in the monorepo to find them all.

What we learned

A grep one-liner (GNU grep) saved hours: `grep -rE 'params\s*:\s*\{\s*\w+\s*:\s*string' apps/*/app/`. We should have run that before the first build.
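The same check can be sketched as a small function for a pre-build script, assuming the caller reads each file and passes its contents as a string. `findSyncParams` is a hypothetical helper, not part of the repo.

```typescript
// Matches the old synchronous shape, e.g. `params: { lang: string }`.
// The async shape `params: Promise<{ lang: string }>` does not match,
// because `Promise<` sits between the colon and the opening brace.
const SYNC_PARAMS = /params\s*:\s*\{\s*\w+\s*:\s*string/;

// Returns the 1-based line numbers in `source` that still use the sync shape.
function findSyncParams(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    if (SYNC_PARAMS.test(line)) hits.push(index + 1);
  });
  return hits;
}
```

Failing the build when any file returns a non-empty list would have caught the 4 broken pages before the prerender pass did.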

Why Docker Swarm (and not Kubernetes) for the Paragu-ai fleet

Decision

All 42+ client sites run on a single VPS with Docker Swarm, Traefik v3.5, and CF orange-cloud proxy in front.

Reasoning

We have 1 engineer (me) running 42 client sites. K8s would mean a cluster, control plane upgrades, RBAC, ingress controllers, cert-manager, and etcd backups: that's 3-5 days/month of yak-shaving. Swarm has the same declarative model with one binary, one `docker stack deploy`, and zero cluster management. The downside is no auto-scaling, but we don't need it: the entire fleet peaks at ~30% CPU.

Alternatives considered
Considered: K3s, Nomad, plain docker-compose, Vercel/Netlify per site, Cloudflare Pages per site

K3s still needs cluster maintenance. Nomad has a smaller community. docker-compose can't do zero-downtime deploys. Vercel/Netlify would mean 42 separate bills and 42 separate build pipelines. CF Pages requires the CF Workers build system, which is less flexible.

What went wrong

The first 12 sites I deployed had a static Traefik config in /opt/traefik/dynamic/ that took precedence over the Swarm labels. It caused 502s for 6 hours before I caught it. Now I have a rule: routing lives in labels at .Spec.Labels (service-level), NEVER in static files in the dynamic directory unless the file documents why.
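A sketch of how the labels rule could be checked, assuming the input is one entry of the JSON printed by `docker service inspect <service>`, already parsed. The field paths `Spec.Labels` and `Spec.TaskTemplate.ContainerSpec.Labels` are Docker's; the function names and the `traefik.` prefix filter are illustrative.

```typescript
// Subset of one entry from `docker service inspect <service>`.
interface ServiceInspect {
  Spec: {
    Name: string;
    Labels?: Record<string, string>; // service-level labels
    TaskTemplate?: {
      ContainerSpec?: { Labels?: Record<string, string> }; // container-level labels
    };
  };
}

interface LabelReport {
  service: string;
  serviceLevel: string[];
  containerLevel: string[];
  ok: boolean;
}

// Collect only the Traefik routing keys from a label map.
function traefikKeys(labels: Record<string, string> | undefined): string[] {
  return Object.keys(labels ?? {})
    .filter((key) => key.indexOf("traefik.") === 0)
    .sort();
}

// ok is true only when routing labels exist at service level and nowhere else.
function checkTraefikLabels(inspect: ServiceInspect): LabelReport {
  const serviceLevel = traefikKeys(inspect.Spec.Labels);
  const containerLevel = traefikKeys(inspect.Spec.TaskTemplate?.ContainerSpec?.Labels);
  return {
    service: inspect.Spec.Name,
    serviceLevel,
    containerLevel,
    ok: serviceLevel.length > 0 && containerLevel.length === 0,
  };
}
```

In a stack file, service-level labels are the ones declared under `deploy.labels` rather than the top-level `labels` key of the service.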

What we learned

Write the failure mode into the deploy script itself. The `paragu-ai-platform-maintenance` skill captures the exact deploy flow plus the static-dynamic file trap, so future-me (or anyone else) can't repeat the mistake.
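One way to write the failure mode into a deploy script is a preflight guard. This is a sketch, not the actual skill. It assumes the caller has listed the files in /opt/traefik/dynamic/ and passes their names and contents, and it uses a hypothetical convention where an allowed static file carries a `# why:` comment line.

```typescript
interface DynamicFile {
  name: string;
  content: string;
}

// Hypothetical convention: a static file is allowed only if one of its
// lines starts with "# why:", explaining why it is not a Swarm label.
function undocumentedDynamicFiles(files: DynamicFile[]): string[] {
  return files
    .filter((file) => !file.content.split("\n").some((line) => line.trim().indexOf("# why:") === 0))
    .map((file) => file.name);
}

// Abort the deploy before anything is pushed if an undocumented file exists.
function preflight(files: DynamicFile[]): void {
  const offenders = undocumentedDynamicFiles(files);
  if (offenders.length > 0) {
    throw new Error(
      "Static Traefik config can take precedence over Swarm labels. Undocumented files: " +
        offenders.join(", ")
    );
  }
}
```

Calling `preflight` as the first step of the deploy turns a 6-hour debugging session into an immediate, named error.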

Why we publish pricing publicly (when 95% of Paraguayan agencies don't)

Decision

We publish all 28 services with reference prices, market low/high, and internal proof, both on a public Google Sheet and on the company site.

Reasoning

When I started AI Whisperers, the #1 buyer objection was 'how much?'. The #1 buyer complaint after hiring us was 'I had no idea it would cost that'. Both come from the same root: hidden pricing. Publishing rates up-front filters out clients who can't afford us (good — wrong fit) and earns trust from clients who can (good — long-term ROI). The risk is competitors undercutting us. The counter-risk is looking like every other opaque agency in Paraguay.

Alternatives considered
Considered: Quote-only (industry default), Tier-only ("Starter/Pro/Enterprise"), Hourly-only with discovery call

Quote-only means a 30-min call for every prospect, even the ones who can't afford us. Tier-only is the worst — it's hard to compare and lets us hide the actual rates. Hourly-only is technically transparent but psychologically worse: a $5K quote looks like 50 hours × $100, which is intimidating.

What went wrong

Initial pricing page had 'starting from' qualifiers everywhere, which is the same problem as no pricing. The current page shows: (1) the actual reference price, (2) market low/high for context, (3) what's included, (4) what changes the price up or down. Total transparency.

What we learned

Price opacity is a defensive posture from agencies that compete on charm, not competence. If your work is good enough, publishing the price is a *selling point*, not a risk.

Why content lives in JSON files in the repo (and not in Sanity / Contentful / Strapi)

Decision

All site copy is in `content/{en,es,nl,pt}/site.json` + per-section JSON files. No CMS. Git is the source of truth.

Reasoning

For a 4-locale, 44-page site with 28 service items, the CMS overhead doesn't pay back. Git gives us: (1) versioned content, (2) PR review on copy changes, (3) Claude/Cursor can edit JSON in seconds, (4) no vendor lock-in, (5) no separate deploy pipeline. The downside is non-technical users can't edit — but we have 2 people, both technical, and the client doesn't need to edit copy (they sign off on it before deployment).

Alternatives considered
Considered: Sanity, Contentful, Strapi (self-hosted), Decap CMS (git-backed), Markdown files in repo

Sanity/Contentful cost $200+/mo and add a separate API to debug. Strapi is just more YAML config to maintain. Decap CMS is good but adds a runtime. Markdown would work but JSON is more structured for our use case (pricing tables, testimonials, etc.).

What went wrong

First version had nested JSON 6 levels deep that was a nightmare to grep. The v2 schema flattens arrays and uses consistent keys. Now any copy edit is a 5-second PR.
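To illustrate why flat keys are easier to search than deep nesting, here is a sketch that flattens a nested object into dotted keys. The real v2 schema is not shown in this log, so the function name, the dotted-key convention, and any sample keys are assumptions.

```typescript
type Leaf = string | number | boolean | null;
type Json = Leaf | Json[] | { [key: string]: Json };

// Walk the tree and record every leaf under its full dotted path.
function flatten(value: Json, prefix = "", out: Record<string, Leaf> = {}): Record<string, Leaf> {
  if (value === null || typeof value !== "object") {
    out[prefix] = value; // leaf: store it under the accumulated path
    return out;
  }
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      flatten(item, prefix ? `${prefix}.${index}` : String(index), out);
    });
  } else {
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (child !== undefined) {
        flatten(child, prefix ? `${prefix}.${key}` : key, out);
      }
    }
  }
  return out;
}
```

With every string addressable by one full path, a copy change shows up as a one-line diff in the PR.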

What we learned

For 2-person teams, the CMS is a layer that exists to solve problems you don't have. Add it when you have a content team that isn't engineering. Until then, JSON + Git is faster, free, and reviewable.

Why we rebuilt ai-whisperers.org from scratch (and didn't iterate on the Vercel build)

Decision

Built a fresh Next.js 16 + Turbopack app, deployed to our own VPS, killed the Vercel deployment. Same DNS, different backend.

Reasoning

The old Vercel build was built around the 'Machines for the math. Humans for the magic.' tagline, circa 2025. By 2026 we'd moved past the AI-training pitch into the AI-engineering pitch, and the old tagline was hurting us in sales calls. Iterating on the Vercel build would have meant editing React components, hitting Vercel's API to redeploy, and dealing with the Vercel-only edge functions. Starting fresh on our own VPS gave us full control, no per-function billing, security headers, Traefik middlewares, and a content-driven rebuild. It took 4 days end-to-end.

Alternatives considered
Considered: Iterate on Vercel build, Astro on Vercel, Hugo static site

Iterating on Vercel would have kept the wrong architecture. Astro would have meant rewriting every interactive component. Hugo would have meant no SSG with per-locale content. Fresh Next on VPS won on all axes.

What went wrong

The DNS cutover (Squarespace NS flip) is still pending — the old Vercel build is still serving public traffic. We're ready, just waiting on a 5-min browser task. Until then, this whole audit-and-fix cycle is invisible to the public.

What we learned

The build is the easy part. The DNS delegation is the friction. Next time, do the DNS migration FIRST, even before the build is ready.

Want to dig deeper?

Every decision links to its source material: PRs, audits, GitHub commits, customer conversations. Build in public means the receipts are public too.

View the build changelog