BGP Troubleshooting — Step-by-Step Practical Guide
When BGP fails, traffic stops, routes disappear, and users feel it immediately. Here is a structured method I follow in production networks.
1️⃣ Verify Basic Reachability
BGP runs over TCP port 179. If IP connectivity fails, BGP cannot work.
✔ Ping the neighbor's loopback or interface address, sourcing the ping from your own peering address
✔ Check the routing table for a route to the neighbor IP
✔ Verify that no ACL or firewall is blocking TCP 179
Key idea: No IP reachability = No BGP session.
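When I cannot log in to the router itself, a quick TCP probe from a host on the peering subnet helps separate "nothing answers" from "something answers but refuses". The sketch below is a minimal illustration, not a BGP client: `probe_tcp` is a name I made up, and it only attempts the TCP handshake. Keep in mind that many BGP implementations refuse connections from addresses that are not configured as neighbors, so a "refused" result from a test host still proves IP reachability but says nothing about the session itself.

```python
import socket


def probe_tcp(host, port=179, timeout=3.0):
    """Attempt a TCP handshake and classify the outcome.

    Returns "open", "refused", or "no-response".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A reset came back: the host is reachable, but nothing accepted us.
        return "refused"
    except OSError:
        # Timeout, no route, or host unreachable: check routing, ACLs, firewalls.
        return "no-response"
```

Call it as `probe_tcp("192.0.2.1")`, replacing the documentation address with the neighbor IP. A "no-response" result points back at routing or filtering, which is exactly what this step is meant to rule out.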
2️⃣ Check BGP Neighbor State
Run:
show ip bgp summary
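Look for neighbor state:
• Idle → BGP is not attempting a connection (neighbor shut down, no route to the peer, or backing off after an error)
• Connect / Active → TCP connection is being attempted but not completing (reachability, ACL, or wrong peer address)
• OpenSent / OpenConfirm → TCP is up and OPEN messages are being exchanged (check AS number and capabilities)
• Established → Session OK (the State/PfxRcd column shows a prefix count instead of a state name)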
If not Established, focus on configuration mismatch or network reachability.
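With many peers, scanning the summary by eye gets slow. Here is a small sketch that pulls out every neighbor whose last column is a state name rather than a prefix count. The sample text is illustrative and only modeled on the Cisco IOS column layout; column positions differ between platforms and releases, so treat the parsing as an assumption to adapt.

```python
# Illustrative output only, modeled on the IOS "show ip bgp summary" layout.
SAMPLE = """\
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.0.2        4        65002    1234    1230       45    0    0 1d02h         120
10.0.0.3        4        65003       0       0        1    0    0 never    Active
10.0.0.4        4        65004       0       0        1    0    0 never    Idle (Admin)
"""


def neighbors_not_established(summary):
    """Map neighbor IP -> state for every neighbor that is not Established."""
    problems = {}
    in_table = False
    for line in summary.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Neighbor":
            in_table = True  # rows after the header are neighbor entries
            continue
        if not in_table or len(fields) < 10:
            continue
        # Everything after the Up/Down column is the state ("Idle (Admin)" is two tokens).
        state = " ".join(fields[9:])
        # A bare number is a received-prefix count, shown only when Established.
        if not state.isdigit():
            problems[fields[0]] = state
    return problems


print(neighbors_not_established(SAMPLE))
```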
3️⃣ Validate Neighbor Configuration
Most BGP issues are configuration mistakes.
✔ Correct neighbor IP
✔ Correct remote-AS
✔ Update-source configured (for loopback peering)
✔ Proper multihop setting (if not directly connected)
Small typo = session down.
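The checks above are really a mirror test: what one side configures must match what the other side actually is. A minimal sketch of that idea follows, using a hypothetical `PeeringSide` record that I made up for illustration (it is not a vendor data model). You would fill it in by hand or from your own config parser.

```python
from dataclasses import dataclass


@dataclass
class PeeringSide:
    """One router's view of the session (hypothetical model for illustration)."""
    name: str
    local_as: int
    source_ip: str     # address this router sources the session from
    neighbor_ip: str   # address in its neighbor statement
    remote_as: int     # AS configured for that neighbor


def find_mismatches(a, b):
    """Return human-readable mismatches between the two sides of a peering."""
    issues = []
    for me, peer in ((a, b), (b, a)):
        if me.remote_as != peer.local_as:
            issues.append(
                f"{me.name} expects AS {me.remote_as}, but {peer.name} is AS {peer.local_as}"
            )
        if me.neighbor_ip != peer.source_ip:
            issues.append(
                f"{me.name} peers with {me.neighbor_ip}, but {peer.name} sources from {peer.source_ip}"
            )
    return issues


# R2 has a remote-AS typo and no update-source, so it sources from a physical interface.
r1 = PeeringSide("R1", 65001, "10.255.0.1", "10.255.0.2", 65002)
r2 = PeeringSide("R2", 65002, "10.1.1.2", "10.255.0.1", 65010)
for issue in find_mismatches(r1, r2):
    print(issue)
```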
4️⃣ Authentication Problems
If MD5 authentication is configured:
✔ Password must match on both sides
✔ Check logs for authentication failure
Mismatch = TCP segments are rejected, so the session never reaches Established.
5️⃣ Check Route Advertisement
Session up but routes missing? Then check policy.
✔ Network statements present (and an exact matching route exists in the routing table)
✔ Route redistribution configured
✔ Route-map / prefix-list not blocking routes
✔ Next-hop reachable
Command:
show ip bgp neighbors x.x.x.x advertised-routes
6️⃣ Investigate Route Filtering
Many missing-route problems trace back to filtering policies.
✔ Prefix-list direction (in / out)
✔ Route-map deny statements
✔ Maximum-prefix limit reached (by default this usually tears the session down rather than filtering)
Policy errors silently drop routes.
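To see why a route disappears without any error, it helps to walk a prefix through the filter by hand. The sketch below models IOS-style prefix-list matching as I understand it: first match wins, `ge`/`le` bound the prefix length, an entry without either matches the exact prefix only, and anything unmatched hits the implicit deny. The `Entry` type and `evaluate` function are names I introduced for illustration; check your platform's documentation before relying on the details.

```python
import ipaddress
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    action: str                # "permit" or "deny"
    prefix: str                # e.g. "10.0.0.0/8"
    ge: Optional[int] = None   # minimum prefix length, if set
    le: Optional[int] = None   # maximum prefix length, if set


def evaluate(entries, route):
    """Return the action of the first matching entry, or "deny" if none match."""
    net = ipaddress.ip_network(route)
    for e in entries:
        base = ipaddress.ip_network(e.prefix)
        if e.ge is None and e.le is None:
            lo = hi = base.prefixlen  # exact match only
        else:
            lo = e.ge if e.ge is not None else base.prefixlen
            hi = e.le if e.le is not None else base.max_prefixlen
        if (net.version == base.version
                and net.subnet_of(base)
                and lo <= net.prefixlen <= hi):
            return e.action
    return "deny"  # implicit deny at the end of the list


plist = [
    Entry("deny", "10.1.0.0/16", le=32),   # block everything inside 10.1.0.0/16
    Entry("permit", "10.0.0.0/8", le=24),  # allow the rest of 10/8 up to /24
]
for route in ("10.2.3.0/24", "10.1.5.0/24", "10.2.3.128/25", "192.168.0.0/24"):
    print(route, "->", evaluate(plist, route))
```

The last two routes are the interesting ones: neither matches a deny entry, yet both are dropped by the implicit deny. That is the "silent" part.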
7️⃣ Check BGP Attributes and Path Selection
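If a route is received but not used, check in this order:
✔ Next-hop reachability (an unreachable next hop disqualifies the path)
✔ Weight (Cisco-specific, higher wins)
✔ Local Preference (higher wins)
✔ AS Path length (shorter wins)
✔ MED value (lower wins)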
Best path selection determines traffic flow.
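A deliberately simplified sketch of the comparison makes the ordering concrete. It covers only weight, local preference, AS path length, and MED, and it skips several real tie-breakers (locally originated routes, origin code, eBGP over iBGP, IGP metric, router ID). It also compares MED across all paths, whereas routers normally compare it only between paths from the same neighboring AS. The `Path` class and its defaults are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Path:
    via: str
    weight: int = 0
    local_pref: int = 100
    as_path: tuple = ()
    med: int = 0
    next_hop_reachable: bool = True


def best_path(paths):
    """Pick a best path using a simplified subset of the decision process."""
    # Paths with an unreachable next hop are not candidates at all.
    candidates = [p for p in paths if p.next_hop_reachable]
    if not candidates:
        return None
    # Higher weight and local preference win; shorter AS path and lower MED win.
    return min(
        candidates,
        key=lambda p: (-p.weight, -p.local_pref, len(p.as_path), p.med),
    )


paths = [
    Path("ISP-A", local_pref=100, as_path=(65010,)),
    Path("ISP-B", local_pref=200, as_path=(65020, 65030, 65040)),
    Path("ISP-C", local_pref=300, next_hop_reachable=False),
]
print(best_path(paths).via)
```

ISP-B wins here despite its longer AS path, because local preference is evaluated first and ISP-C is excluded for its unreachable next hop.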
8️⃣ Monitor Logs and Debug Carefully
Logs give the real story.
✔ Neighbor reset reason
✔ Hold timer expiry
✔ Policy rejection
Use debug only in a maintenance window.
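Before reaching for debug, I prefer to bucket the syslog lines that are already there. The sketch below counts a few common causes by keyword. The sample lines are illustrative, loosely modeled on Cisco IOS-style messages, and the patterns are my own assumptions; message wording varies by platform and release, so adjust them to your logs.

```python
import re
from collections import Counter

# (label, pattern) pairs, checked in order; the first match wins.
PATTERNS = [
    ("hold timer expired", re.compile(r"hold time expired", re.I)),
    ("authentication failure", re.compile(r"BADAUTH|MD5", re.I)),
    ("maximum-prefix exceeded", re.compile(r"MAXPFX", re.I)),
    ("neighbor down", re.compile(r"ADJCHANGE.*\bDown\b")),
]


def classify(line):
    """Return a cause label for a log line, or None if nothing matches."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines):
    """Count how often each cause appears."""
    return Counter(label for label in map(classify, lines) if label)


# Illustrative lines only; real messages differ between platforms.
SAMPLE_LOGS = [
    "%BGP-5-ADJCHANGE: neighbor 10.0.0.2 Down BGP Notification sent",
    "%BGP-3-NOTIFICATION: sent to neighbor 10.0.0.2 4/0 (hold time expired) 0 bytes",
    "%TCP-6-BADAUTH: Invalid MD5 digest from 10.0.0.3(179) to 10.0.0.1(40211)",
    "%BGP-5-ADJCHANGE: neighbor 10.0.0.2 Up",
]
for cause, count in summarize(SAMPLE_LOGS).items():
    print(cause, count)
```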
9️⃣ Check Physical and L2 Issues
Sometimes the problem is not BGP.
✔ Interface flapping
✔ Duplex mismatch
✔ VLAN or trunk issue
✔ High CPU or memory
Transport instability breaks BGP.
🔟 Compare With Working Peer
Best practical trick:
👉 Compare working neighbor vs failing neighbor
👉 Spot configuration differences quickly
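This comparison is easy to script. A minimal sketch, assuming IOS-style `neighbor <ip> ...` lines in a saved config: it collects the lines for each neighbor, replaces the IP with a placeholder so only the settings differ, and diffs the two sets with Python's `difflib`. The function names and the sample config are mine, for illustration only.

```python
import difflib


def neighbor_lines(config, neighbor_ip):
    """Collect 'neighbor <ip> ...' lines, replacing the IP with a placeholder."""
    prefix = f"neighbor {neighbor_ip} "  # trailing space avoids matching 10.0.0.2 in 10.0.0.20
    return sorted(
        "neighbor <peer> " + line.strip()[len(prefix):]
        for line in config.splitlines()
        if line.strip().startswith(prefix)
    )


def compare_neighbors(config, working_ip, failing_ip):
    """Return lines only on the working side ('- ') or only on the failing side ('+ ')."""
    diff = difflib.ndiff(
        neighbor_lines(config, working_ip),
        neighbor_lines(config, failing_ip),
    )
    return [d for d in diff if d.startswith(("- ", "+ "))]


# Illustrative config fragment.
CONFIG = """\
router bgp 65001
 neighbor 10.255.0.2 remote-as 65002
 neighbor 10.255.0.2 update-source Loopback0
 neighbor 10.255.0.2 ebgp-multihop 2
 neighbor 10.255.0.3 remote-as 65003
 neighbor 10.255.0.3 ebgp-multihop 2
"""
for line in compare_neighbors(CONFIG, "10.255.0.2", "10.255.0.3"):
    print(line)
```

The differing remote-as is expected. The missing update-source on the failing neighbor is the kind of difference worth chasing.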
Final Tip:
Always troubleshoot in layers: Physical → IP → TCP → BGP → Policy.
This structured approach reduces MTTR and keeps the network stable.
