Inter‑hub connectivity in Azure: why I now reach for Route Server first

If you’ve ever tried to scale a multi‑region hub‑and‑spoke in Azure and keep the routing tidy, you know the feeling: UDR sprawl, brittle inter‑hub paths, and a never‑ending game of “which route did I break this time?”. Over the last few years, Azure Route Server (ARS) has become my go‑to tool to make inter‑hub connectivity boringly reliable (and that’s a compliment).
ARS lets your NVAs and gateways speak BGP with the fabric, so routes show up where they should without hand‑editing dozens (or hundreds) of tables.
Below I’ll share how I design inter‑hub connectivity with ARS, the trade‑offs between two common patterns, and a few pitfalls I wish someone had warned me about.

The baseline I start from
When I say “inter-hub”, I typically mean two regional hubs (each with their local spokes), globally peered for east-west traffic, and each hub connected to on-prem via two ExpressRoute circuits in a bow-tie for HA. I prefer the local circuit (via connection Weight or BGP attributes) and I don’t use ExpressRoute for VNet-to-VNet traffic: global VNet peering rides Microsoft’s backbone, has lower latency, and removes a dependency on peering locations. Also note: creating or deleting ARS in a VNet that already has a VPN/ER gateway causes up to 10 minutes of downtime, so schedule a maintenance window.
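If it helps to see the “prefer the local circuit” part in code, here’s a minimal sketch with the Python SDK (azure-mgmt-network) that bumps the routing weight on existing ExpressRoute gateway connections. The subscription ID, resource group and connection names are placeholders; the portal or CLI get you to the same place.

```python
# Sketch: prefer the local ER circuit by giving its gateway connection a
# higher routing weight than the bow-tie (remote) leg. Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

def set_connection_weight(rg: str, conn_name: str, weight: int) -> None:
    """Update the routing weight on an existing ER gateway connection."""
    conn = client.virtual_network_gateway_connections.get(rg, conn_name)
    conn.routing_weight = weight  # higher weight wins
    client.virtual_network_gateway_connections.begin_create_or_update(
        rg, conn_name, conn
    ).result()

set_connection_weight("rg-hub-weu", "con-er-local", 100)   # local circuit preferred
set_connection_weight("rg-hub-weu", "con-er-remote", 0)    # bow-tie leg as backup
```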

A couple of platform guardrails worth keeping in your head as you design:
– Route advertisement to ER: A VNet gateway can advertise up to 1,000 IPv4 prefixes to an ExpressRoute circuit. If you plan to leak many spokes to on-prem, you’ll want summarization and/or filtering (a quick sanity check is sketched right after this list).
– ARS scale limits: One ARS supports 8 BGP peers and up to 4,000 routes per peer (and 10,000 total across sources). Plan your peering strategy accordingly.
– VNet peering scale: A VNet can peer with 500 VNets by default; with Azure Virtual Network Manager (AVNM) connectivity configs, you can push that to 1,000. This matters if you place ARS in a hub that many spokes must peer with.
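None of those limits bite if you summarize early. As a quick, purely illustrative check (Python standard library, made-up prefixes), you can collapse your spoke CIDRs and see how far summarization gets you before you leak them toward the 1,000-prefix ER cap:

```python
# Collapse spoke prefixes into the smallest set of advertisements and compare
# against the ER export limit. The prefixes below are invented examples.
import ipaddress

spoke_prefixes = [
    "10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/23",   # region A spokes
    "10.2.0.0/24", "10.2.1.0/24",                  # region B spokes
]

collapsed = list(
    ipaddress.collapse_addresses(ipaddress.ip_network(p) for p in spoke_prefixes)
)
print(f"{len(spoke_prefixes)} spoke prefixes collapse to {len(collapsed)}:")
for net in collapsed:
    print(" ", net)

ER_EXPORT_LIMIT = 1000  # IPv4 prefixes a VNet gateway can advertise to an ER circuit
assert len(collapsed) <= ER_EXPORT_LIMIT, "summarize or filter before advertising to ER"
```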

Pattern #1: ARS and NVAs co‑located in each hub (the “classic” approach)

When I use it:
You want full inspection for all east‑west and north‑south flows, and NVAs already sit in the hubs.

How it works in practice:
– Each regional NVA peers with its local ARS to learn local spokes and with the remote ARS to exchange remote spokes. This lets ARS program next hops directly into NIC‑level routes on the other side (no UDRs on NVA subnets just to reach the far hub).
– In the spokes, I still use a route table with a single 0.0.0.0/0 → local NVA and set Propagate gateway routes = Disabled (a sketch of that route table follows this list). That keeps inspection deterministic and prevents spokes from learning “creative” paths via remote NVAs.
– Every ARS speaks BGP as ASN 65515 and that can’t be changed, so when an NVA re-advertises prefixes learned from one ARS towards the other, the receiving ARS sees its own ASN in the AS path and drops them. Configure as-override (or equivalent AS-path rewriting) on the NVA’s BGP session towards the remote ARS so loop prevention doesn’t eat your spoke prefixes.
– For symmetry to and from on‑prem in this full‑inspection variant, I add specific UDRs in GatewaySubnet pointing local spoke prefixes to the local NVA. Keep an eye on route table entry limits as your spoke count grows.
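To make the spoke side concrete, here’s a minimal sketch of that route table with the Python SDK (azure-mgmt-network). The names, region and NVA IP are placeholders for whatever your environment uses.

```python
# Sketch: one spoke route table with a single default route to the local NVA
# and gateway route propagation disabled, then associate it with spoke subnets.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import Route, RouteTable

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

route_table = RouteTable(
    location="westeurope",
    disable_bgp_route_propagation=True,  # "Propagate gateway routes = Disabled"
    routes=[
        Route(
            name="default-to-local-nva",
            address_prefix="0.0.0.0/0",
            next_hop_type="VirtualAppliance",
            next_hop_ip_address="10.10.0.4",  # local NVA (or its ILB frontend)
        )
    ],
)

client.route_tables.begin_create_or_update(
    "rg-hub-weu", "rt-spoke-default", route_table
).result()
# Then associate rt-spoke-default with each spoke workload subnet.
```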

Why it scales (and where it doesn’t):
This design dramatically reduces bespoke inter‑hub UDRs because ARS injects the right next hops for you (that is the big operational win). But you still concentrate control in the hub: ARS has an 8‑peer limit, and GatewaySubnet UDRs can become numerous if every spoke must be enumerated for symmetry. It’s workable at medium scale; at very large scale, I often switch to the “Transit VNet” variant.

Notes from the field:
– Path selection to on‑prem: In a bow‑tie ER setup, on‑prem edge devices will see the same Azure spokes from both regions. They typically prefer the shorter AS‑path, which is what you want; keep local traffic local. You can tune this with AS‑prepend if needed.
– Branch-to-branch: If you need ARS to exchange the routes it learns from the NVAs with your ER/VPN gateway (for true “branch-to-branch”), turn that on explicitly in the ARS configuration; it’s disabled by default (a sketch follows this list).
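For reference, flipping that switch looks like this with the Python SDK (under the hood a Route Server is a virtualHubs resource; the CLI equivalent is az network routeserver update --allow-b2b-traffic true). Names are placeholders, and the property name reflects my reading of the SDK models.

```python
# Sketch: enable branch-to-branch on an existing Route Server so it exchanges
# routes between its NVA peers and the ER/VPN gateway in the same VNet.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

hub = client.virtual_hubs.get("rg-hub-weu", "ars-hub-weu")
hub.allow_branch_to_branch_traffic = True
client.virtual_hubs.begin_create_or_update("rg-hub-weu", "ars-hub-weu", hub).result()
```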

Pattern #1B: Reduce on‑prem inspection (selective bypass)

When I use it:
You want full inter‑hub inspection but don’t require sending spoke→on‑prem through the NVAs.

What changes:
The remote NVA advertises a supernet covering its spokes to the local ARS. Spokes still send inter-hub traffic to the local NVA via their default route, but traffic to on-prem prefixes follows the routes learned through the local ER gateway, which are more specific than the 0.0.0.0/0 UDR and therefore win, so it bypasses the NVA. This cuts down on BGP peerings and GatewaySubnet UDRs.

Trade‑off:
You keep inter‑hub inspection, simplify ARS peering counts, and avoid GatewaySubnet route bloat—at the cost of not inspecting Azure→on‑prem in the hub NVAs. Be intentional about that.

Pattern #2: ARS in the hub, NVAs in a Transit VNet (my scaling favorite)

When I use it:
You want full inspection and you’re pushing hub limits (peering counts, UDRs), or you want more control over what gets advertised to ER.

How it works:
– Place your NVAs in a Transit VNet. Peer that Transit VNet with the local spokes, with the local hub (to use the ER gateway), and with the remote Transit/Hub as needed. The NVAs then peer BGP with both the local and the remote ARS (the Azure side of those peerings is sketched after this list).
– Spokes keep the single default-to-NVA UDR with Propagate gateway routes = Disabled for deterministic inspection. With branch-to-branch enabled, ARS hands the spoke prefixes it learns from the NVAs to the ER gateway, which advertises them to on-prem (with the local circuit preferred via Weight/AS-path).
– Because both hubs will advertise the same spokes to on‑prem, use AS‑prepend on remote advertisements to ensure inbound traffic prefers the local region; it also behaves well during ER failovers.
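As a sketch of the Azure side of those BGP sessions (the NVA’s own FRR/vendor config, including as-override towards the remote ARS, is a separate exercise), this is how the peerings could be registered with the Python SDK; resource names, the ASN and the IPs are placeholders.

```python
# Sketch: register a Transit-VNet NVA as a BGP peer on both the local and the
# remote Route Server. Reachability comes from the VNet peerings described above.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import BgpConnection

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

NVA_ASN = 65010  # the NVA's ASN; 65515 is reserved by ARS itself

peerings = [
    # (resource group, route server, peering name, NVA IP)
    ("rg-hub-weu", "ars-hub-weu", "nva-weu-1", "10.20.1.4"),   # local ARS
    ("rg-hub-neu", "ars-hub-neu", "nva-weu-1", "10.20.1.4"),   # remote ARS, same NVA
]

for rg, ars_name, peer_name, peer_ip in peerings:
    client.virtual_hub_bgp_connection.begin_create_or_update(
        rg, ars_name, peer_name,
        BgpConnection(peer_asn=NVA_ASN, peer_ip=peer_ip),
    ).result()
```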

Why I like it at scale:
– You no longer need UDRs in GatewaySubnet to preserve symmetry, which removes an operational hotspot.
– You’re not forced to peer every spoke directly with the ARS VNet. Spokes can peer with the Transit VNet instead (see the peering sketch after this list), which helps when you’re approaching the 500-peerings-per-VNet ceiling, even if AVNM can stretch it to 1,000.
– You gain better control over which prefixes are advertised to ER, which helps you stay under the 1,000‑prefix export limit from Azure to on‑prem.
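Here’s a sketch of one such peering (spoke side) with the Python SDK. IDs and names are placeholders, and the mirror peering from the Transit VNet back to the spoke still has to be created separately.

```python
# Sketch: peer a spoke with the Transit VNet. allow_forwarded_traffic is the
# flag that matters for NVA transit (the spoke accepts packets the NVA forwards).
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SubResource, VirtualNetworkPeering

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

transit_vnet_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-transit-weu"
    "/providers/Microsoft.Network/virtualNetworks/vnet-transit-weu"
)

client.virtual_network_peerings.begin_create_or_update(
    "rg-spoke-01", "vnet-spoke-01", "peer-to-transit-weu",
    VirtualNetworkPeering(
        remote_virtual_network=SubResource(id=transit_vnet_id),
        allow_virtual_network_access=True,
        allow_forwarded_traffic=True,   # accept traffic forwarded by the NVA
        allow_gateway_transit=False,
        use_remote_gateways=False,      # egress goes via the NVA default route, not a gateway here
    ),
).result()
```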

Implementation tips that save me time
– Disable VNet‑to‑VNet via ExpressRoute: Use the platform toggles to prevent ER from becoming a transit between VNets. Keep inter‑VNet on Microsoft’s backbone via peering.
– Turn on ARS “next‑hop IP” when you run HA NVAs behind an internal load balancer: ARS can program the ILB’s frontend as the next hop, which preserves symmetry in active/passive and simplifies scaling in active/active.
– Mind where ARS injects routes: ARS programs the routes it learns onto NICs in its VNet and in peered spokes, so validate with the effective routes of a VM NIC when troubleshooting (a quick check is sketched after this list); don’t rely only on the NVA’s OS routing table.
– Respect ARS/gateway interactions and maintenance: Creating/deleting ARS in a VNet with a gateway causes a short outage; plan a window and communicate.
– Know your limits (literally): ARS peers (8), per‑peer route cap (4,000), total prefixes (10,000), and ER advertisement caps (1,000 from Azure side). Design your summarization and peering fan‑out with those numbers in mind.
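And because “check the effective routes” is the advice I hand out most often, here is that check with the Python SDK; the resource group and NIC names are placeholders.

```python
# Sketch: dump what a spoke VM's NIC actually learned (UDRs, ARS-injected routes,
# gateway routes) instead of trusting the NVA's own routing table.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

client = NetworkManagementClient(DefaultAzureCredential(), "<subscription-id>")

routes = client.network_interfaces.begin_get_effective_route_table(
    "rg-spoke-01", "vm-spoke-01-nic"
).result()

for r in routes.value:
    prefixes = ",".join(r.address_prefix or [])
    next_hops = ",".join(r.next_hop_ip_address or [])
    print(f"{str(r.source):<20} {prefixes:<24} -> {r.next_hop_type} {next_hops}")
```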

How I choose between the patterns
– Need full inspection for everything, medium scale, and minimal re-plumbing? Keep NVAs in the hub with ARS (Pattern #1).
– Want inter‑hub inspection but skip Azure→on‑prem inspection? Use the supernet “bypass on‑prem” variant (Pattern #1B).
– Pushing limits or want cleaner ops at scale? Move NVAs to a Transit VNet and peer them to local/remote ARS (Pattern #2). You’ll reduce GatewaySubnet complexity, keep control of ER advertisements, and avoid hitting peering ceilings in the hub.

Final thought
What I like about ARS-based inter-hub designs isn’t just the dynamic routing; it’s the predictability. Once you wire the BGP relationships correctly and keep ExpressRoute out of VNet-to-VNet, the network behaves like you’d expect across regions, even as you add spokes. It’s one of those rare cases where adding a component (ARS) actually reduces complexity.