The mid-market migration scramble
In the observability space, we spend our days monitoring query latency, CPU spikes, and memory leaks. But as organisations scale, the most critical bottleneck is often not hardware — it is the fine print of a vendor's Service Level Agreement.
A recurring pattern in 2026: a business builds on a developer-friendly cloud, scales successfully, then hits a wall when enterprise clients demand a strict 99.99% (“Four Nines”) availability guarantee. The team discovers — too late — that their current provider's managed database is contractually capped at 99.95%.
Moving from 99.95% to 99.99% is the boundary where simple architecture meets enterprise compliance. On a provider whose managed database is capped below four nines, it is not a configuration change; it is a cloud provider switch.
The math behind the “nines”
When designing resilient systems, you are designing a strict time budget for failures. The table below shows what each tier actually allows in practice.
| SLA Level | Uptime % | Max downtime / year | Max downtime / month |
|---|---|---|---|
| Three Nines | 99.9% | 8.77 hours | 43.83 minutes |
| Three & a Half Nines | 99.95% | 4.38 hours | 21.92 minutes |
| Four Nines | 99.99% | 52.60 minutes | 4.38 minutes |
| Five Nines | 99.999% | 5.26 minutes | 26.30 seconds |
Notice the leap from 99.95% to 99.99%. A 99.95% SLA allows over 4 hours of database downtime a year. A 99.99% SLA gives you less than an hour. If a single failover event takes 5 minutes to resolve, it already exceeds the 4.38-minute monthly budget of a 4-nines architecture, and the 52.60-minute yearly budget covers only about ten such events.
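The arithmetic behind the table can be sketched in a few lines of Python. This is a minimal illustration, assuming an average year of 365.25 days (which is what the table's figures imply); the function names are hypothetical and not part of any vendor tooling.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year, assumed to match the table


def downtime_budget_minutes(sla_percent: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Maximum downtime (in minutes) an SLA allows over the given period."""
    return period_minutes * (1 - sla_percent / 100)


def failovers_within_budget(sla_percent: float,
                            failover_minutes: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> int:
    """How many failovers of a fixed length fit entirely inside the budget."""
    budget = downtime_budget_minutes(sla_percent, period_minutes)
    return int(budget // failover_minutes)


if __name__ == "__main__":
    for sla in (99.9, 99.95, 99.99, 99.999):
        yearly = downtime_budget_minutes(sla)
        monthly = downtime_budget_minutes(sla, MINUTES_PER_YEAR / 12)
        print(f"{sla}%: {yearly:.2f} min/year, {monthly:.2f} min/month")
```

Under these assumptions, a 5-minute failover fits ten times into the yearly 99.99% budget, but not once into the monthly one.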
The cloud platform showdown
When orchestrating a multi-cloud or migration strategy, you must look closely at how different providers architect their fault tolerance — and crucially, which configuration options are required to unlock the 99.99% contractual guarantee.
Disclaimer: All provider examples and SLA figures in this article are based on publicly available documentation as of May 2026 and may change over time. MonitorGiant is an independent vendor — this comparison is not sponsored by, affiliated with, or endorsed by any cloud provider mentioned, and is intended purely as engineering guidance, not legal or commercial advice. Always review the latest contracts with your own legal and vendor representatives.
Value and sovereign clouds
These platforms excel in simplicity and raw compute-to-cost ratios, but require careful evaluation when SLAs become legally binding.
Hetzner and Hostinger: standard cloud SLAs sit at 99.9%, allowing roughly 8.8 hours of downtime per year. Neither offers a fully managed 4-nines DBaaS. Achieving true HA here means manually orchestrating your own clusters (e.g., Patroni for Postgres), shifting the operational burden onto your team.
OVHcloud: a strong European sovereign cloud option. Managed Databases on Enterprise tiers offer up to 99.99%, making OVHcloud a viable path if you need EU data sovereignty without sacrificing SLA.
The hyperscalers
If you need a guaranteed 99.99% managed database, the Big Three are your primary targets — but your specific configuration choices dictate your contractual SLA, not just the provider.
| Provider | Product | SLA | Requirement to reach 99.99% | Notes |
|---|---|---|---|---|
| AWS | Aurora (Multi-AZ) | 99.99% | 3-AZ replication by design | Standard RDS Multi-AZ tops out at 99.95% |
| GCP | Cloud SQL Enterprise Plus | 99.99% | Multi-zone HA must be enabled | Standard edition only yields 99.95% |
| Azure | SQL DB / Flexible Server | 99.99% | Zone-Redundant HA (cross-zone) | Same-Zone HA only yields 99.95% |
| OVHcloud | Managed Databases Enterprise | 99.99% | Enterprise tier plan required | Best EU data-sovereignty option at 4-nines |
| DigitalOcean | Managed Databases (HA) | 99.95% | HA cluster enabled | Hard ceiling — cannot reach 99.99% |
| Hetzner / Hostinger | Cloud VMs / DBaaS | 99.9% | Manual Patroni cluster for HA | No managed 4-nines DBaaS offering |
The mid-market database trap
The most common catalyst for an emergency cloud migration is a team realising that their managed database provider has a hard ceiling.
DigitalOcean is an excellent platform for rapid deployment and simple scaling. However, its Managed Databases SLA caps HA clusters at 99.95%. For the web tier — Droplets and Load Balancers — you can architect around failures. But if the persistence layer is contractually bound to 99.95%, your entire application is mathematically incapable of offering a 4-nines guarantee to enterprise customers.
When you hit this ceiling, a cloud migration is no longer optional — it becomes a business imperative driven by a sales deal, not an engineering preference.
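The SLA of a distributed system is at best as strong as its weakest link, and usually weaker, because the availabilities of components that must all be up multiply together. If the database ceiling is 99.95%, your 99.99% SLA promise is void regardless of what the rest of your stack achieves.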
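To make the weakest-link argument concrete, here is a rough sketch of how availabilities combine. It assumes component failures are independent, which real systems only approximate, and the function names and example figures are illustrative rather than taken from any provider's SLA.

```python
from math import prod


def serial_availability(*components: float) -> float:
    """Availability of a chain in which every component must be up.

    Inputs are fractions (0.9995 for 99.95%). Assumes independent failures.
    """
    return prod(components)


def parallel_availability(*replicas: float) -> float:
    """Availability of redundant replicas where any one being up is enough."""
    return 1 - prod(1 - r for r in replicas)


if __name__ == "__main__":
    # Hypothetical stack: load balancer, web tier, managed database.
    stack = serial_availability(0.9999, 0.9999, 0.9995)
    print(f"Composite availability: {stack * 100:.3f}%")
```

With a 99.95% database in the chain, the composite figure lands below 99.95% no matter how well the other tiers perform. Redundancy can lift the web tier, but it cannot lift a managed database above the ceiling the contract states.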
Architecting the cutover: zero-downtime migration
Migrating a live, mission-critical database to a 4-nines provider is routine when orchestrated correctly. You cannot take the application offline for hours. Instead, rely on asynchronous logical replication as the bridge.
Logical Replication — The Bridge
Set up logical replication (Postgres pglogical, AWS DMS, or pgoutput) between your source database and the new target. The target stays in read-only mode, trailing the primary by milliseconds. No application downtime at this stage.
pglogical · AWS DMS · pgoutput · Debezium
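One way to quantify how far the target trails the primary is to compare Postgres log sequence numbers (LSNs). The sketch below assumes you have already read two LSN strings, for example from `pg_current_wal_lsn()` on the source and the `confirmed_flush_lsn` column of `pg_replication_slots`; how you fetch them is left out, and the helper names are hypothetical.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' to an absolute byte position.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits and the low 32 bits of a 64-bit position in the WAL.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(source_lsn: str, target_lsn: str) -> int:
    """Bytes of WAL the target has not yet confirmed (never negative)."""
    return max(0, parse_lsn(source_lsn) - parse_lsn(target_lsn))


if __name__ == "__main__":
    # Illustrative values only, not taken from a real cluster.
    print(replication_lag_bytes("0/3000060", "0/3000000"), "bytes behind")
```

Byte lag is only one view of replication health. A time-based measure is often easier to alert on, but it needs a heartbeat or timestamp source that this sketch does not cover.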
Dual-Writes — The Safety Net
For hyper-critical write paths, temporarily update the application tier to write to both the source and the target simultaneously. Validate that checksums and row counts match before proceeding.
Application-layer write fanout · Checksum validation · Row-count reconciliation
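The checksum and row-count validation might look something like the following sketch. It assumes both sides return rows as plain tuples with identical Python types and column order, which depends on your driver and is not guaranteed; `table_fingerprint` is a hypothetical helper, not a library function.

```python
import hashlib
from typing import Iterable, Sequence, Tuple

_MODULUS = 2 ** 256  # keeps the combined checksum at a fixed width


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, checksum) for a set of rows, independent of row order.

    Each row is hashed with SHA-256 and the digests are summed modulo 2**256,
    so the result does not depend on the order in which rows are returned.
    """
    count = 0
    combined = 0
    for row in rows:
        digest = hashlib.sha256(repr(tuple(row)).encode("utf-8")).digest()
        combined = (combined + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, f"{combined:064x}"


if __name__ == "__main__":
    # Illustrative rows; in practice these would come from each database.
    source_rows = [(1, "alice"), (2, "bob"), (3, "carol")]
    target_rows = [(3, "carol"), (1, "alice"), (2, "bob")]
    print(table_fingerprint(source_rows) == table_fingerprint(target_rows))
```

For large tables, computing the fingerprint per primary-key range rather than over the whole table makes it easier to locate where the two sides diverge.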
The DNS Flip — Cutting Over
Once replication lag hits zero and data integrity is confirmed, stop writes to the source (keep it intact as a rollback target rather than dropping it), flip the application connection string to the new target, and remove the dual-write logic. Lower the DNS record's TTL to 60 s at least 24 hours in advance so the short TTL has propagated before the flip.
Connection string swap · Low-TTL DNS · Blue/green deployment
Upgrading your cloud provider is step one. Ensuring your observability stack and routing architecture are ready for the transition is what actually keeps the lights on. During cutover, you need real-time replication lag monitoring, automatic alerting on write failures to the dual-write target, and a rollback plan that still uses the source as primary.
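The cutover checks described above can be folded into a simple go/no-go gate. The following is a sketch under stated assumptions: the thresholds, the field names, and the three outcomes are illustrative choices, and the metrics are assumed to come from whatever monitoring you already run.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CutoverSnapshot:
    lag_samples_seconds: List[float]  # recent replication lag readings, oldest first
    dual_write_failures: int          # failed writes to the target since the last check
    checksums_match: bool             # result of the latest data validation


def cutover_decision(snapshot: CutoverSnapshot,
                     max_lag_seconds: float = 1.0,
                     min_samples: int = 5) -> str:
    """Return 'rollback', 'wait', or 'proceed' for the current snapshot."""
    # Any write failure or data mismatch means the source stays primary.
    if snapshot.dual_write_failures > 0 or not snapshot.checksums_match:
        return "rollback"
    # Require several consecutive low-lag readings, not a single lucky one.
    recent = snapshot.lag_samples_seconds[-min_samples:]
    if len(recent) < min_samples or max(recent) > max_lag_seconds:
        return "wait"
    return "proceed"


if __name__ == "__main__":
    healthy = CutoverSnapshot([0.4, 0.2, 0.1, 0.0, 0.0], 0, True)
    print(cutover_decision(healthy))
```

Requiring a run of low readings rather than a single sample is a deliberate choice here: lag can dip briefly during quiet periods and climb again under write load.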
Frequently asked questions
What is 4-nines (99.99%) uptime in real terms?
A 99.99% SLA allows a maximum of 52.60 minutes of downtime per year, or 4.38 minutes per month. A single 5-minute database failover therefore exceeds the monthly budget on its own; across a year, the budget covers roughly ten such events.
Which cloud providers offer a managed database SLA of 99.99%?
AWS Aurora (Multi-AZ, 3-zone replication by design), Google Cloud SQL Enterprise Plus (multi-zone HA enabled), Azure SQL Database with Zone-Redundant HA, and OVHcloud Managed Databases on Enterprise tiers. Standard configurations on DigitalOcean, Hetzner, and Hostinger are contractually capped at 99.9%–99.95%.
How do you migrate a database to a 4-nines provider with zero downtime?
Use logical replication (pglogical, AWS DMS) to stream changes to the new target while it trails the primary in read-only mode. Optionally enable dual-writes for critical paths. Once replication lag is zero and data is validated, flip the connection string and low-TTL DNS record to the new target.
The takeaway
The 4-nines threshold is not a number you tune your way to — it is an architectural decision that starts with choosing a cloud provider whose managed database product contractually supports it. Identify the ceiling early, plan the migration before an enterprise deal forces your hand, and instrument the cutover with real-time replication lag monitoring so you have a safe rollback path.
The observability layer is not an afterthought in this process. It is the mechanism that tells you when replication lag is low enough to flip safely, when dual-writes are diverging, and whether your new 99.99% provider actually delivers on its contract after go-live.
Written by
Dileep KK, MonitorGiant
21+ years in IT infrastructure management and observability. Built monitoring dashboards, custom alerting pipelines, and AI token-tracking systems across cloud platforms (AWS, GCP, and Azure) and for organisations spanning defence IT, IoT manufacturing, digital marketing, SaaS email, insurance broking, parliamentary digital services, and educational ERP. Active Directory, SIEM, WAF, Cloudflare, MSSQL, Linux, Windows, Entra ID: operated at every layer of the stack.