Why I Run Keaz on Docker Swarm, Not Kubernetes

Why I picked Docker Swarm over Kubernetes for Keaz's twelve services in production — the three signals that say switch, and the budget math that says "not yet."

Keaz is a WhatsApp marketing platform with about twelve services in production today. It runs on Docker Swarm, not Kubernetes, and that is a deliberate choice I would make again. The current consensus says Kubernetes is the default for any "real" SaaS — Portainer's 2026 review estimates it powers over 70% of enterprise container workloads. That number is real for enterprises. It is misleading for small SaaS. For most teams under thirty services and under thirty engineers, Docker Swarm gives you the orchestration features you actually use, with a fraction of the operational overhead, and Kubernetes is a tax you pay before you have the problem it solves.

Why this matters now

Four founders have asked me about container orchestration in the last quarter. Three already had Kubernetes. None of them needed it. The conversation always starts the same way: "We picked it because we thought we'd grow into it." That decision quietly burns about ten hours of engineering time per month in a small team, sometimes more, long before any of the things Kubernetes is good at start to matter.

There is also a budget shift this year that makes the wrong default more expensive. Anthropic's Claude Agent SDK is moving to a separate billing pool on June 15, 2026 — a public sign that AI compute is going to keep eating the line item that used to fund a platform engineer. Founders weighing whether to hire someone to babysit Kubernetes are going to look at that line item harder this quarter than they did last year.

The technical case for boring orchestration

The argument is not that Kubernetes is bad. The argument is that for a small SaaS, the orchestration features you actually use are the ones Docker Swarm already gives you for free.

Swarm mode is built into the Docker Engine. There is no separate installer, no extra binaries, no etcd cluster to operate. The official Swarm docs describe it as a declarative service model that maintains desired state, performs rolling updates with configurable parallelism and delay, enforces TLS mutual authentication between all nodes, assigns a stable DNS name to each service, and load-balances internally across replicas. Those are exactly the primitives a small SaaS needs.
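To make the rolling-update knobs concrete, here is a minimal Python sketch of how parallelism and delay shape an update. It is a toy model, not Swarm's scheduler: the function name rolling_update_plan and its return shape are hypothetical, it assumes a parallelism of at least 1, and it ignores task start-up time, health checks, and failure handling.

```python
from typing import Dict, List, Union


def rolling_update_plan(
    replicas: int, parallelism: int, delay_s: float
) -> Dict[str, Union[List[List[int]], float]]:
    """Toy model of a rolling update (hypothetical helper, not a Docker API).

    replicas:    number of tasks in the service
    parallelism: how many tasks are replaced in one batch (assumed >= 1)
    delay_s:     pause between one batch and the next, in seconds
    """
    if replicas < 1 or parallelism < 1 or delay_s < 0:
        raise ValueError("replicas and parallelism must be >= 1, delay_s >= 0")

    # Split task indices 0..replicas-1 into consecutive batches.
    batches = [
        list(range(start, min(start + parallelism, replicas)))
        for start in range(0, replicas, parallelism)
    ]

    # The delay applies between batches, so N batches wait N - 1 times.
    # Real updates take longer: each task also needs time to start.
    min_wait_s = (len(batches) - 1) * delay_s

    return {"batches": batches, "min_wait_s": min_wait_s}


if __name__ == "__main__":
    plan = rolling_update_plan(replicas=6, parallelism=2, delay_s=10.0)
    print(plan["batches"])
    print(plan["min_wait_s"])
```

With 6 replicas, a parallelism of 2, and a 10-second delay, the model gives three batches and at least 20 seconds of waiting between them. Lower parallelism means a slower, safer rollout; higher parallelism means a faster one with more capacity offline at once.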

On top of that, you get a compose-file format that the same engineers already write for local development. The mental gap between "running locally" and "running in production" is small enough that one engineer can hold the whole topology in their head. That single property — the entire production topology fits in one engineer's head — is more valuable to an early SaaS than any individual feature Kubernetes ships.

Kubernetes makes a different trade. Its core thesis is that the control plane is worth running because, at scale, you stop wanting any single engineer to hold the whole topology in their head — you want declarative manifests, controllers, operators, and abstractions so many teams can ship into the same cluster without stepping on each other. That thesis is correct for scale. At ten services and four engineers, it is overkill in a way that costs you real money.

Speaking of money: a managed Kubernetes control plane on EKS is $0.10 per hour during standard support and $0.60 per hour during extended support. By AWS's own documentation, a cluster that stays on one Kubernetes version for the full 26-month lifecycle pays an average of $0.33 per hour, about $240 per month. That is before nodes, before load balancers, before the operational time tax of running upgrades, and before the engineer-hours of keeping CRDs and operators current. None of that buys you anything Swarm cannot do at zero extra cost for a small fleet.
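The blended rate above can be checked with a few lines of arithmetic. This sketch assumes the 26 months split into 14 months of standard support followed by 12 months of extended support, and it uses 730 hours as an average month. Both are assumptions, not figures from the text, and the function names are illustrative.

```python
STANDARD_RATE = 0.10  # USD per cluster-hour, standard support (from the text)
EXTENDED_RATE = 0.60  # USD per cluster-hour, extended support (from the text)

STANDARD_MONTHS = 14  # assumption: standard-support window
EXTENDED_MONTHS = 12  # assumption: extended-support window (14 + 12 = 26)
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def blended_hourly_rate() -> float:
    """Average hourly control-plane price over the whole version lifecycle."""
    total_months = STANDARD_MONTHS + EXTENDED_MONTHS
    weighted = STANDARD_RATE * STANDARD_MONTHS + EXTENDED_RATE * EXTENDED_MONTHS
    return weighted / total_months


def blended_monthly_cost() -> float:
    """Average monthly control-plane cost for one cluster, nodes excluded."""
    return blended_hourly_rate() * HOURS_PER_MONTH


if __name__ == "__main__":
    print(f"blended hourly rate: ${blended_hourly_rate():.2f}")
    print(f"blended monthly cost: ${blended_monthly_cost():.0f}")
```

The same arithmetic shows why upgrade discipline matters: a cluster that never leaves standard support pays the $0.10 rate throughout, which is roughly $73 per month under the same 730-hour assumption.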

The business cost of picking the wrong default

The Kubernetes tax is paid in three places, and only one of them is the AWS invoice. The bigger two are headcount and time-to-ship.

Headcount. Kubernetes is a platform you operate, not a tool you use. The honest version of "we run on Kubernetes" is "we have at least one engineer whose primary job is keeping Kubernetes working." For a Series A SaaS, that is half a hire — sometimes a full one — that you could otherwise spend on a third backend engineer.

Time-to-ship. Ramp time on Kubernetes for an engineer who has not seen it before is measured in weeks. Ramp time on Swarm for the same engineer is measured in hours. A founder who tells me they want to ship faster and also defends a Kubernetes choice they cannot tie to a concrete need is contradicting themselves, and I tell them so.

The implication is simple. If your goal is shipping product faster than your competition, your container orchestrator should be the boring thing you stop thinking about. For most early SaaS, Swarm gets out of your way. For most early SaaS that adopted Kubernetes by default, Kubernetes becomes the thing they spend a disproportionate amount of time talking about.

A three-signal rule for switching

When I'm advising a founder, I use a three-signal rule. Switch from Docker Swarm to Kubernetes when, and only when, at least one of these is true.

  1. You have hired — or have budget approved for — a dedicated platform or SRE engineer. Without that role, you are using your product engineers to run Kubernetes, and they will resent it. With that role, the abstractions Kubernetes provides start to pay for themselves.
  2. You need autoscaling driven by something other than HTTP request volume. Swarm has no built-in autoscaler; scaling is a manual or scripted docker service scale call, which is enough for the common cases. The moment you need to scale on queue depth, custom metrics, or a Prometheus signal, Kubernetes' ecosystem (KEDA, HorizontalPodAutoscalers on custom metrics, the metrics APIs) genuinely becomes the easier path.
  3. You have a hard multi-region requirement that crosses cloud providers. Swarm can do multi-region, but the multi-cloud, multi-region story — federation, service mesh, cross-cluster identity — is something the Kubernetes ecosystem has invested in and Swarm has not.
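The rule is small enough to write down as code, which also forces the conditions to be explicit. This is a minimal sketch: the class and field names are hypothetical, and the boolean inputs are judgment calls you supply, not something measured automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrchestrationSignals:
    """The three signals from the list above (field names are illustrative)."""

    has_platform_engineer: bool  # hired, or budget approved, for platform/SRE
    needs_custom_autoscaling: bool  # queue depth, custom metrics, Prometheus
    needs_multi_cloud_multi_region: bool  # hard requirement across providers


def recommend(signals: OrchestrationSignals) -> str:
    """Return 'kubernetes' if at least one signal is true, else 'swarm'."""
    if any(
        (
            signals.has_platform_engineer,
            signals.needs_custom_autoscaling,
            signals.needs_multi_cloud_multi_region,
        )
    ):
        return "kubernetes"
    return "swarm"


if __name__ == "__main__":
    # A small team with no platform hire and no special scaling needs.
    today = OrchestrationSignals(False, False, False)
    print(recommend(today))
```

Keeping the check this blunt is the point. Each input is a yes-or-no question a founder can answer in a planning meeting, and re-running it at each review makes a change of answer visible.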

None of those signals is true at Keaz today. Two of them were partially true at Lyska, where I was employee #2 and built the multi-tenant foundation for a B2B e-commerce platform — and we still ran on Docker for years before any signal was strong enough to justify a rewrite. The default should be "the simpler thing that already runs your service in production."

My perspective

I have shipped on both sides of this. At Klimado I run ESG and CSRD compliance SaaS in production: three apps in nine months, orchestration deliberately boring. At breathing.ai I shipped roughly 220,000 lines of code across three products in ten months, with the same approach. The pattern I keep seeing is that the teams who picked the boring orchestrator shipped more product per quarter than the teams who picked the prestigious one.

The argument against Swarm is usually that you will "outgrow" it. In practice, the teams I have watched outgrow Swarm did so because their business worked. The migration to Kubernetes at that point is meaningful work, but it is meaningful work in service of a real problem. Doing it preemptively is solving for a problem you may never have, with money and time you definitely do.

The contrarian thing I will say plainly: Kubernetes is the wrong default for an early-stage SaaS in 2026. Pick it on the date one of those three signals goes green, not before.

Recommended action this quarter

If you are pre-Series A or just past it and considering orchestration, default to Docker Swarm and revisit it every six months against the three-signal rule.

If you already adopted Kubernetes and none of the three signals is true, you do not need to migrate back. You do need to take honest stock of how much engineering time you are spending on it, and consider whether a managed offering can shrink the operational surface enough to make the cost worth it.

Either way, write down the conditions under which you would change orchestrators. The biggest cost does not come from picking either tool. It comes from picking one without ever stating what would make you reconsider. If you want a second opinion from someone who has run both in production, a fractional CTO engagement is the fastest way to get one.
