AWS NAT Gateway Goes Regional: High Availability Without the AZ Juggling


For as long as I can remember, NAT Gateways in AWS have been a necessary piece in VPC design. We’ve built them per Availability Zone (AZ), placed them in public subnets, managed route tables per AZ, and repeated this pattern every time our workloads grew or spread out. It worked, but it wasn’t exactly elegant, especially in environments where elasticity and speed trump manual plumbing.
With the new regional availability mode for NAT Gateways, AWS has taken a big swing at simplifying this story. You can now create a single NAT Gateway that automatically expands and contracts across AZs in your VPC based on where your workloads actually run. That means high availability is handled for you, and the setup is significantly easier.
Below is my take on why this matters, how it works conceptually, what patterns it unlocks, and a practical migration checklist from someone who’s spent years stitching together cloud networks for real-world architectures.

The Pain We’ve Lived With (And Why This Changes Things)

Yesterday’s model:

  • NAT Gateways were zonal. You typically deployed one per AZ for resilience and performance locality.
  • Each NAT Gateway lived in a public subnet, and private subnets in that AZ routed outbound traffic to the local NAT Gateway.
  • If your workloads expanded into a new AZ (think Auto Scaling, spot capacity, EKS nodes, or DR drills), you had to:
    • create another NAT Gateway
    • add/manage a new public subnet
    • update the private route tables
    • maintain scaling and lifecycle consistency across zones.

Today’s regional mode:

  • You create one NAT Gateway per VPC (regional availability).
  • No public subnet required: the NAT Gateway handles multi-AZ presence dynamically.
  • As your workloads appear (or disappear) across AZs, the gateway automatically expands and contracts to maintain resilience and locality.
  • It supports both AWS-provided IP addresses and Bring Your Own IP (BYOIP), which is a big deal for enterprises and regulated environments.
  • It’s available in all commercial AWS Regions, but not in AWS GovCloud (US) or the China Regions (at the time of this writing).

This release finally aligns NAT with how we expect regional services to behave: one logical object, resilient across zones, minimal operational overhead.

How It Works (Conceptually)

  • You choose regional when creating the NAT Gateway for your VPC.
  • The gateway identifies which AZs are “active” for your workloads and extends its presence accordingly.
  • Your private subnets route outbound traffic to the regional NAT Gateway, and the service ensures egress stays highly available and local to your workloads, minimizing cross-AZ hops.
  • If your workloads expand into a new AZ, the NAT Gateway automatically adapts: you don’t have to provision additional gateway instances or tweak routing logic.

To make the difference tangible, here’s a side-by-side comparison of the traditional zonal pattern and the new regional mode:

[Figure: side-by-side diagram comparing Zonal NAT (one NAT per AZ in public subnets) with Regional NAT (one gateway expanding across AZs), including private subnet route tables and IGW egress.]

Diagram A (left): A VPC (10.20.0.0/16) with per‑AZ NAT Gateways hosted in public subnets; each AZ’s private subnet uses its local NAT for default routes, and NATs egress via an Internet Gateway (igw‑1234).

Diagram B (right): A VPC (10.20.0.0/16) with a single Regional NAT Gateway that automatically presents across AZs; private subnets in each AZ point default routes to the regional NAT (nat-1234), and the gateway’s managed route table egresses via the same Internet Gateway (igw‑1234).
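
To put the two diagrams in data form, here is a minimal Python sketch of the private route tables in each pattern. Only 10.20.0.0/16 and nat-1234 come from the diagrams; the AZ names, the per-AZ NAT IDs (nat-aaaa, nat-bbbb), and the helper functions are hypothetical names I picked for illustration.

```python
VPC_CIDR = "10.20.0.0/16"      # from the diagrams
DEFAULT_ROUTE = "0.0.0.0/0"


def zonal_private_routes(nat_by_az):
    """One private route table per AZ, each pointing at its local NAT."""
    return {
        az: {VPC_CIDR: "local", DEFAULT_ROUTE: nat_id}
        for az, nat_id in nat_by_az.items()
    }


def regional_private_routes(azs, regional_nat_id):
    """Every AZ's private route table points at the same regional NAT."""
    return {
        az: {VPC_CIDR: "local", DEFAULT_ROUTE: regional_nat_id}
        for az in azs
    }


# Hypothetical AZ labels and zonal NAT IDs.
zonal = zonal_private_routes({"az-a": "nat-aaaa", "az-b": "nat-bbbb"})
regional = regional_private_routes(["az-a", "az-b", "az-c"], "nat-1234")

# Distinct default-route targets an operator has to track in each pattern.
zonal_targets = {table[DEFAULT_ROUTE] for table in zonal.values()}
regional_targets = {table[DEFAULT_ROUTE] for table in regional.values()}
print(len(zonal_targets), len(regional_targets))
```

The point of the sketch is the last line: in the zonal pattern the number of default-route targets grows with the number of AZs, while in the regional pattern it stays at one.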

Why It’s a Big Win

1. Operational Simplicity
Fewer NAT Gateways, fewer subnets with special roles, fewer conditional routes. Your IaC templates get simpler, your runbooks get shorter, and your time to deliver shrinks.

2. Elasticity Without Friction
Auto Scaling across AZs stops being a networking event. Your scaling policies can do their job without a ticket to “add NAT in AZ C”.

3. High Availability by Design
No more manual HA orchestration: resilience is embedded in the service’s regional presence.

4. BYOIP Support
If you have IP management requirements (policy, compliance, allow-listing with external partners), BYOIP lets you preserve your IP identity while gaining simplicity.

5. Cost Model Alignment
While specifics depend on your Region and traffic profile, using a single regional gateway may reduce the number of hourly-billed NAT Gateway resources you run (data processing charges still apply, as always). The more NAT Gateways you previously ran across AZs, the more likely you are to see a benefit, certainly operational and potentially in cost.
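
For a rough feel of the arithmetic, here is a small Python sketch that separates the fixed hourly charge from the data processing charge. The rates are placeholders, not AWS prices, and the number of hourly units a regional gateway accrues is deliberately left as an input: confirm how regional mode is metered on the pricing page for your Region before plugging in numbers.

```python
HOURS_PER_MONTH = 730  # common approximation for one month


def monthly_fixed_cost(billable_units, hourly_rate_usd):
    """Hourly NAT charge for one month, excluding data processing."""
    return billable_units * hourly_rate_usd * HOURS_PER_MONTH


def monthly_data_cost(gb_processed, per_gb_rate_usd):
    """Data processing charge; this part does not depend on gateway count."""
    return gb_processed * per_gb_rate_usd


# Placeholder rates and traffic volume, for illustration only.
HOURLY_RATE = 0.05
PER_GB_RATE = 0.05
TRAFFIC_GB = 2000

for units in (1, 2, 3):
    total = monthly_fixed_cost(units, HOURLY_RATE) + monthly_data_cost(TRAFFIC_GB, PER_GB_RATE)
    print(f"{units} hourly unit(s): {total:.2f} USD/month")
```

With heavy traffic the data processing term dominates, so the fixed-charge difference matters most for VPCs with many AZs and modest egress volume.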

Migration: A Practical Checklist

If you’ve already deployed per-AZ NAT Gateways, here’s a pragmatic approach you can use with clients:

1. Inventory Your Current Footprint

  • List existing NAT Gateways, their AZs, associated public subnets, and private route tables.
  • Map which private subnets route to which NAT.
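
The mapping step can be sketched in Python as a pure function over data shaped like the "RouteTables" list that EC2 DescribeRouteTables returns. The sample IDs below are made up, and the function name is my own; it also ignores subnets that implicitly use the main route table, which you would need to resolve separately.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def map_subnets_to_nat(route_tables):
    """Return {subnet_id: nat_gateway_id} for explicitly associated subnets
    whose default route targets a NAT Gateway."""
    mapping = {}
    for table in route_tables:
        nat_id = next(
            (
                route["NatGatewayId"]
                for route in table.get("Routes", [])
                if route.get("DestinationCidrBlock") == DEFAULT_ROUTE
                and "NatGatewayId" in route
            ),
            None,
        )
        if nat_id is None:
            continue  # e.g. a public route table that targets the IGW
        for assoc in table.get("Associations", []):
            if "SubnetId" in assoc:
                mapping[assoc["SubnetId"]] = nat_id
    return mapping


# Hypothetical sample in the DescribeRouteTables response shape.
sample = [
    {
        "RouteTableId": "rtb-private-a",
        "Routes": [
            {"DestinationCidrBlock": "10.20.0.0/16", "GatewayId": "local"},
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-aaaa"},
        ],
        "Associations": [{"SubnetId": "subnet-priv-a"}],
    },
    {
        "RouteTableId": "rtb-public",
        "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-1234"}],
        "Associations": [{"SubnetId": "subnet-pub-a"}],
    },
]
print(map_subnets_to_nat(sample))  # {'subnet-priv-a': 'nat-aaaa'}
```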

2. Decide Where Regional NAT Belongs

  • Create a plan per VPC: one regional NAT Gateway per VPC that needs outbound internet connectivity from private subnets.

3. Design Routing Normalization

  • Update private subnet route tables to point to the regional NAT Gateway (replacing zonal gateways).
  • Confirm there’s no dependency on NAT GW IDs embedded in scripts or health checks.
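
One way to keep the route update reviewable is to build the change list first and apply it later. The Python sketch below produces keyword arguments in the shape EC2 ReplaceRoute expects (RouteTableId, DestinationCidrBlock, NatGatewayId) without sending anything. I am assuming the regional gateway is targeted through the same NatGatewayId field as a zonal one; the function name and sample IDs are hypothetical.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def plan_default_route_swaps(route_tables, old_nat_ids, regional_nat_id):
    """Build ReplaceRoute-style arguments for every default route that still
    points at one of the zonal NAT Gateways. Nothing is applied here."""
    plan = []
    for table in route_tables:
        for route in table.get("Routes", []):
            if (
                route.get("DestinationCidrBlock") == DEFAULT_ROUTE
                and route.get("NatGatewayId") in old_nat_ids
            ):
                plan.append(
                    {
                        "RouteTableId": table["RouteTableId"],
                        "DestinationCidrBlock": DEFAULT_ROUTE,
                        "NatGatewayId": regional_nat_id,
                    }
                )
    return plan


# Hypothetical input in the DescribeRouteTables response shape.
tables = [
    {"RouteTableId": "rtb-a", "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-aaaa"}]},
    {"RouteTableId": "rtb-b", "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-bbbb"}]},
    {"RouteTableId": "rtb-pub", "Routes": [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-1234"}]},
]
plan = plan_default_route_swaps(tables, {"nat-aaaa", "nat-bbbb"}, "nat-1234")
for change in plan:
    print(change)
```

Each dictionary could then be passed to boto3's `replace_route(**change)` on an EC2 client once the plan has been reviewed.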

4. IP Strategy (Optional, But Important)

  • Choose between AWS-provided addresses or BYOIP, depending on compliance and external allow-lists.
  • If you use BYOIP, plan IP assignment and inventory those ranges so teams know what identity they’re egressing with.
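
To make the inventory idea concrete, here is a minimal Python check that an observed egress address falls inside the ranges you have recorded. The CIDR shown is a documentation range (RFC 5737) standing in for a real BYOIP pool, and the function name is hypothetical.

```python
import ipaddress


def egress_ip_is_inventoried(ip, inventoried_cidrs):
    """True if `ip` belongs to any CIDR in the recorded egress inventory."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in inventoried_cidrs)


BYOIP_RANGES = ["203.0.113.0/24"]  # placeholder, not a real allocation

print(egress_ip_is_inventoried("203.0.113.10", BYOIP_RANGES))  # True
print(egress_ip_is_inventoried("198.51.100.7", BYOIP_RANGES))  # False
```

A check like this is handy in post-cutover validation: fetch the address an external endpoint sees and confirm it matches what partners have allow-listed.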

5. Phased Cutover

  • Update one AZ at a time to validate connectivity.
  • Observe NAT data processing and route behavior during peak traffic windows.
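
A phased cutover is easier to reason about when the batches are explicit. This Python sketch groups route tables by AZ so each batch can be applied and validated before the next one starts; the input mapping and all IDs are hypothetical.

```python
from collections import defaultdict


def batches_by_az(route_table_az):
    """Yield (az, [route_table_ids]) pairs, one cutover batch per AZ,
    in a stable sorted order."""
    grouped = defaultdict(list)
    for route_table_id, az in route_table_az.items():
        grouped[az].append(route_table_id)
    for az in sorted(grouped):
        yield az, sorted(grouped[az])


# Hypothetical mapping of private route tables to the AZ they serve.
route_table_az = {"rtb-2": "az-b", "rtb-1": "az-a", "rtb-3": "az-a"}

for az, tables in batches_by_az(route_table_az):
    print(f"cut over {az}: {tables}, then validate before continuing")
```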

6. Decommission Zonal NAT Gateways

  • Remove old NAT Gateways and their public subnets (if they existed only for NAT).
  • Clean up stale route table entries and any security group or NACL rules that existed only to support the old NAT setup (if you used helper constructs).
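
Routes whose target has been deleted show up with a state of "blackhole" in DescribeRouteTables output, which makes stale entries easy to find. The Python sketch below scans data in that response shape; the function name and sample IDs are hypothetical.

```python
def find_blackhole_routes(route_tables):
    """Return (route_table_id, destination) pairs for routes in the
    'blackhole' state, i.e. routes whose target no longer exists."""
    stale = []
    for table in route_tables:
        for route in table.get("Routes", []):
            if route.get("State") == "blackhole":
                stale.append((table["RouteTableId"], route.get("DestinationCidrBlock")))
    return stale


# Hypothetical sample: rtb-old still points at a deleted zonal NAT.
tables = [
    {"RouteTableId": "rtb-old", "Routes": [
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-aaaa", "State": "blackhole"},
    ]},
    {"RouteTableId": "rtb-new", "Routes": [
        {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-1234", "State": "active"},
    ]},
]
print(find_blackhole_routes(tables))  # [('rtb-old', '0.0.0.0/0')]
```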

7. Observability & Cost Tracking

  • Tag the regional NAT Gateway and set up dashboards/alarms for data processing volume and error rates.
  • After 1-2 billing cycles, compare costs to ensure the new pattern matches expectations.
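
As a starting point for alarms, the Python sketch below builds parameters in the shape CloudWatch PutMetricAlarm expects, using the AWS/NATGateway namespace and the NatGatewayId dimension that zonal gateways publish under. I am assuming regional gateways report the same metrics and dimension; the thresholds, alarm naming scheme, and function name are illustrative.

```python
def nat_alarm_params(nat_gateway_id, metric_name, threshold, period_seconds=300):
    """Build PutMetricAlarm-style arguments for one NAT Gateway metric."""
    return {
        "AlarmName": f"{nat_gateway_id}-{metric_name}",
        "Namespace": "AWS/NATGateway",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "NatGatewayId", "Value": nat_gateway_id}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Illustrative thresholds: any port allocation error, or 50 GB out in 5 minutes.
alarms = [
    nat_alarm_params("nat-1234", "ErrorPortAllocation", 0),
    nat_alarm_params("nat-1234", "BytesOutToDestination", 50 * 1024**3),
]
for params in alarms:
    print(params["AlarmName"], params["Threshold"])
```

Each dictionary could be passed to boto3's `put_metric_alarm(**params)` on a CloudWatch client, with alarm actions added to suit your paging setup.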

8. Update IaC & Runbooks

  • Simplify templates (CloudFormation/Terraform) and operational playbooks to reflect regional NAT as the default.
  • Document the exceptions (e.g., regions not supported; special routing cases).

Design Tips from the Field

  • Avoid Cross-AZ Hairpinning: Verify subnet routing so there’s no unexpected cross-AZ detour, especially if you use custom routing or niche egress requirements.
  • Boundary Clarity: Keep the NAT Gateway’s role focused on outbound internet. Don’t overload it with hybrid traffic patterns that require NAT-specific policies beyond its scope; use dedicated appliances or services where appropriate.
  • BYOIP Governance: If you adopt BYOIP, integrate it with your IPAM processes. Treat egress IP identity as a product with consumers: security, partners, and application owners.
  • Testing in “Edge” Scenarios: Validate behavior during AZ impairments, scale-outs, and blue/green deployments. Measure how quickly the regional gateway’s presence adapts, and confirm that your routes remain deterministic.
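
The hairpinning check from the first tip can be sketched in Python as a comparison between each subnet's AZ and the AZ of the zonal NAT its default route targets. This is mostly useful before and during migration, while zonal gateways still exist; a regional gateway has no single AZ, so it is simply skipped. All names and IDs are hypothetical.

```python
def cross_az_subnets(subnet_az, subnet_nat, zonal_nat_az):
    """Return subnets whose default route targets a zonal NAT in another AZ.

    NAT IDs missing from `zonal_nat_az` (for example a regional gateway)
    are ignored.
    """
    return sorted(
        subnet
        for subnet, nat_id in subnet_nat.items()
        if nat_id in zonal_nat_az and zonal_nat_az[nat_id] != subnet_az[subnet]
    )


# Hypothetical mid-migration state.
subnet_az = {"subnet-a": "az-a", "subnet-b": "az-b", "subnet-c": "az-c"}
subnet_nat = {"subnet-a": "nat-aaaa", "subnet-b": "nat-aaaa", "subnet-c": "nat-1234"}
zonal_nat_az = {"nat-aaaa": "az-a"}

print(cross_az_subnets(subnet_az, subnet_nat, zonal_nat_az))  # ['subnet-b']
```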

Security and Observability Considerations

  • Security Groups vs NACLs: NAT Gateway is a managed service and doesn’t attach SGs like EC2 instances. Your subnet-level controls (NACLs) and instance SG configurations remain the core levers.
  • Threat Modeling: Centralizing egress identity simplifies allow-listing, but it also concentrates egress visibility in one place. Ensure your inspection layers (VPC Flow Logs, Traffic Mirroring, egress filters) capture the full picture.
  • Tagging and Ownership: Tag your regional NAT Gateways with clear ownership to reduce ambiguity in shared services models.
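
A tagging rule is only useful if something checks it. Here is a minimal Python sketch that reports which required tags are missing from a resource, using the Key/Value list shape that EC2 returns for tags. The required tag set is a made-up policy, not an AWS convention.

```python
REQUIRED_TAGS = frozenset({"Owner", "CostCenter", "Environment"})  # hypothetical policy


def missing_tags(resource_tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent, in sorted order."""
    present = {tag["Key"] for tag in resource_tags}
    return sorted(required - present)


# Hypothetical tags on a regional NAT Gateway.
nat_tags = [
    {"Key": "Owner", "Value": "platform-networking"},
    {"Key": "Environment", "Value": "prod"},
]
print(missing_tags(nat_tags))  # ['CostCenter']
```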

Closing thought

After two decades of helping teams design resilient, scalable networks, I see regional NAT Gateway mode as a natural evolution: it elevates egress from AZ plumbing to a regional capability that matches how modern platforms operate. The payoff is straightforward: fewer moving parts, faster delivery, and resilience by default. If your private subnets rely on outbound internet, make the regional NAT Gateway your new baseline. Then invest your saved energy where it matters most: policy, segmentation, observability, and cost governance.