We had 2 ISPs.
Primary went down. Secondary was UP. Still, internet was not working.
On paper, this looked like a perfect design:
2 ISPs
SD-WAN profile for path selection
Automatic failover expected
Reality: When Primary failed, traffic had nowhere to go.
Why?
Because the Secondary ISP was never advertising a default route to the firewall.
So even though the link was UP,
the firewall had no route to the internet.
What went wrong (design mistake):
We assumed:
“If SD-WAN is configured, failover will just work.”
But SD-WAN only selects between existing routes.
It does NOT create routes.
No route = no forwarding = no internet.
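To make the "selects, does not create" point concrete, here is a toy model in Python. It is a sketch under stated assumptions, not how any vendor's SD-WAN engine is implemented: the names Route, sdwan_select, wan1 and wan2 are all hypothetical, and real firewalls also apply longest-prefix match, health checks and policy rules that are left out here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    prefix: str     # destination prefix, e.g. "0.0.0.0/0" for a default route
    interface: str  # egress interface name (hypothetical: "wan1", "wan2")


def sdwan_select(table, dest, link_up, preference):
    """Pick an egress interface for dest, or None if nothing is routable.

    The selector only chooses among routes that already exist in the table
    and whose link is up. It never adds a route of its own.
    """
    addr = ip_address(dest)
    candidates = [
        r for r in table
        if addr in ip_network(r.prefix) and link_up.get(r.interface, False)
    ]
    if not candidates:
        return None  # no route = no forwarding
    # Lower rank means more preferred; interfaces not listed sort last.
    rank = {name: i for i, name in enumerate(preference)}
    candidates.sort(key=lambda r: rank.get(r.interface, len(rank)))
    return candidates[0].interface


# The outage: only Primary (wan1) has a default route, and wan1 is down.
table = [Route("0.0.0.0/0", "wan1")]
links = {"wan1": False, "wan2": True}
print(sdwan_select(table, "8.8.8.8", links, ["wan1", "wan2"]))  # None

# The fix: Secondary (wan2) also has a default route in the table.
table.append(Route("0.0.0.0/0", "wan2"))
print(sdwan_select(table, "8.8.8.8", links, ["wan1", "wan2"]))  # wan2
```

In the first call wan2 is UP, yet the result is None, because the link state alone gives the selector nothing to choose from.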
The real root cause: We never tested Secondary in isolation.
Primary was always healthy,
so Secondary stayed “theoretical HA”.
Until the day it became production.
Architect takeaway:
High Availability is not about having backup links.
It’s about proving backup paths actually work.
If you’ve never:
Pulled the primary cable
Or disabled the primary route
Then your secondary is not HA.
It’s just hope with an interface.
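Pulling the cable is the real test, but a cheap pre-check is to confirm that every WAN interface actually holds a default route. The sketch below assumes the route table is available as text in the Linux iproute2 format (the output of `ip route show default`). Firewall vendors print routes differently, so the parsing would need adapting. The function names and the interfaces wan1 and wan2 are hypothetical.

```python
def default_route_interfaces(route_output):
    """Return the set of interfaces that carry a default route.

    Assumes iproute2-style lines such as:
        default via 192.0.2.1 dev wan1 proto static metric 10
    """
    found = set()
    for line in route_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "default" and "dev" in fields:
            position = fields.index("dev")
            if position + 1 < len(fields):
                found.add(fields[position + 1])
    return found


def missing_default_routes(route_output, wan_interfaces):
    """List the WAN interfaces that have no default route installed."""
    present = default_route_interfaces(route_output)
    return [name for name in wan_interfaces if name not in present]


# Sample text resembling the situation in this post: only Primary has a default route.
sample = "default via 192.0.2.1 dev wan1 proto static metric 10\n"
print(missing_default_routes(sample, ["wan1", "wan2"]))  # ['wan2']
```

A non-empty result means a link can be UP and still be useless for failover. An empty result only proves the route is installed, not that traffic flows through it, so it does not replace the isolation test.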