Hetzner CAX: six months of Ampere in production

Six months ago I migrated several production workloads from Hetzner’s x86 CX line to the ARM64 CAX instances. Here is what I have learned.

What works well:

The CAX instances run Ampere Altra processors. For workloads that are not compute-bound — web servers, API proxies, background jobs, small databases — the performance is excellent and the price is roughly 30% lower than equivalent x86 instances.

Docker images built for linux/amd64 do run under QEMU user-mode emulation, provided the binfmt handlers are registered on the host, but the performance penalty is significant for CPU-intensive tasks. Anything you run regularly should be built for linux/arm64. The ecosystem has mostly caught up: most popular images now publish multi-arch manifests, so the same tag pulls the native variant on a CAX host.
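
To check whether an image ships a native arm64 variant before deploying it, you can inspect its manifest list. What follows is a minimal sketch, assuming you have captured the JSON printed by `docker manifest inspect <image>` and that it uses the usual manifest-list layout (a top-level `manifests` array whose entries carry a `platform` object). The function names and the sample data are invented for illustration.

```python
import json


def supported_platforms(manifest_json: str) -> set[str]:
    """Return the "os/architecture" pairs listed in a manifest list.

    Assumes the manifest-list / OCI image index layout. A single-arch
    image has no "manifests" array, so the result is an empty set.
    """
    doc = json.loads(manifest_json)
    platforms = set()
    for entry in doc.get("manifests", []):
        platform = entry.get("platform", {})
        os_name = platform.get("os")
        arch = platform.get("architecture")
        # Some builders add attestation entries marked unknown/unknown;
        # they are not runnable images, so leave them out.
        if os_name and arch and "unknown" not in (os_name, arch):
            platforms.add(f"{os_name}/{arch}")
    return platforms


def has_native_arm64(manifest_json: str) -> bool:
    """True when the manifest list includes a linux/arm64 image."""
    return "linux/arm64" in supported_platforms(manifest_json)


# Hypothetical sample, trimmed to the fields the functions read.
SAMPLE = json.dumps({
    "schemaVersion": 2,
    "manifests": [
        {"platform": {"architecture": "amd64", "os": "linux"}},
        {"platform": {"architecture": "arm64", "os": "linux", "variant": "v8"}},
    ],
})
```

Calling `has_native_arm64(SAMPLE)` on the sample above reports that an arm64 variant exists. An image that returns an empty set is single-arch and will need either a rebuild or emulation.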

Caddy, PostgreSQL, Redis, n8n, and Coolify all run natively on ARM64 without issues.

What doesn’t work:

Ollama on CAX is disappointing. The Ampere Altra does have SIMD, but only 128-bit NEON: it has no SVE and nothing comparable to the wide AVX2 and AVX-512 units that make LLM inference fast on modern x86 CPUs. For inference workloads, stick to x86 or use Hetzner’s GPU instances.

Some older Python packages with compiled extensions don’t have ARM64 wheels. You end up compiling from source, which is slow and occasionally fails. This is improving but is still a friction point.
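
You can often tell in advance whether a package will need compiling by looking at the wheel filenames it publishes. The sketch below assumes the standard wheel naming scheme, `{name}-{version}(-{build})?-{python}-{abi}-{platform}.whl`, and a glibc-based distribution, so musllinux wheels are deliberately not counted. The helper names and `examplepkg` are hypothetical.

```python
def wheel_platform_tags(filename: str) -> list[str]:
    """Extract the platform tags from a wheel filename.

    The platform field is the last dash-separated component and may
    hold several tags joined by dots (a compressed tag set).
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    stem = filename[: -len(".whl")]
    platform_field = stem.split("-")[-1]
    return platform_field.split(".")


def installs_without_compiling_on_arm64(filename: str) -> bool:
    """True for pure-Python wheels and for glibc Linux aarch64 wheels."""
    for tag in wheel_platform_tags(filename):
        if tag == "any":
            return True  # pure Python, architecture independent
        if tag.startswith(("manylinux", "linux_")) and tag.endswith("_aarch64"):
            return True
    return False


# Hypothetical filenames for an imaginary package.
WHEELS = [
    "examplepkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "examplepkg-1.0.0-cp311-cp311-macosx_11_0_arm64.whl",
]

needs_source_build = not any(installs_without_compiling_on_arm64(w) for w in WHEELS)
```

With only the x86_64 Linux and macOS wheels in `WHEELS`, `needs_source_build` comes out true. On a CAX host that means pip would fall back to the source distribution, so you would want a compiler toolchain and the relevant headers in the build image.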

The price-to-performance verdict:

For the right workloads — stateless services, background processing, databases under moderate load — the CAX instances are the best value in Hetzner’s lineup. For inference or compute-intensive ML workloads, they are not the right tool.

The practical recommendation: run your web tier and application logic on CAX, and run your inference on CCX (x86) or GPU instances. The cost savings on the web tier more than offset the slightly higher cost of keeping inference on x86.
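
To put rough numbers on the split, here is a small sketch of the arithmetic. The only figure taken from this post is the approximate 30% discount. The function name is made up, and the monthly costs in the example are placeholders rather than Hetzner prices, so substitute your own bill.

```python
def split_fleet_cost(web_x86_cost: float, inference_x86_cost: float,
                     cax_discount: float = 0.30) -> dict[str, float]:
    """Compare an all-x86 fleet with web on CAX and inference on x86.

    cax_discount is the fractional price reduction of CAX relative to
    the equivalent x86 instance (roughly 0.30 per this post).
    """
    web_on_cax = web_x86_cost * (1 - cax_discount)
    return {
        "all_x86": web_x86_cost + inference_x86_cost,
        "split": web_on_cax + inference_x86_cost,
        "monthly_saving": web_x86_cost - web_on_cax,
    }


# Placeholder monthly costs in an arbitrary currency unit.
result = split_fleet_cost(web_x86_cost=100.0, inference_x86_cost=50.0)
```

The saving scales with the size of the web tier only, so the larger the share of your spend that is web and application logic, the more the split pays off.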