Sharing an IPSec troubleshooting session on a Palo Alto firewall that took longer than expected because of multiple overlapping issues.
The issue was reported as “tunnel up but application traffic not passing”. At first glance, both IKE and IPSec SAs were up and stable, with no immediate red flags on the dashboard.

Started with basic checks: Verified proxy IDs on both sides; they matched. Checked Phase 1 and Phase 2 crypto profiles; encryption and hashing were aligned. SA lifetime values were identical.

Moved to traffic validation: Traffic was matching the correct security policy. No drops were observed at the pre-NAT or post-NAT stages. Sessions were being created, but packets were not being encapsulated into ESP.

Next step was to check routing behavior: Found that the return path for some subnets was resolving via a different virtual router because of a broader static route. As a result, traffic was asymmetric and ESP packets were exiting via the wrong interface.

Fixed routing and revalidated: The tunnel passed traffic for a few minutes, then broke again. Checked system logs and noticed intermittent CHILD_SA delete messages during rekey. Comparing detailed IKE debug logs showed a PFS mismatch: the local firewall was configured with PFS group20, while the peer was proposing group14. The mismatch appeared only during rekey, not during initial negotiation, which is why the tunnel came up cleanly and failed later. Aligned PFS settings on both ends and cleared the IKE and IPSec SAs.

Post-fix validation: The tunnel remained stable across rekey intervals. Traffic was encapsulating correctly. No further delete messages or negotiation failures were observed in the logs.

Final takeaway from this session: When multiple issues exist simultaneously, fixing one can temporarily mask the other. IPSec problems are rarely isolated to a single configuration line. Routing, rekey behavior, and tunnel monitoring all need to be evaluated together.

My name is Rakesh, and saying I am a huge nerd would probably be an understatement. I love technology and getting my hands into the CLI or trying something new. I started this page because a lot of people have asked for help with things I've deployed, either in my professional career or at clients.
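As a rough illustration of the PFS check from the session above, here is a minimal Python sketch that compares Phase 2 settings from both peers field by field. It assumes the values have been copied by hand from each side's configuration. The field names, the `find_mismatches` helper, and every value other than the two PFS groups are made up for illustration and do not correspond to any PAN-OS API or output.

```python
# Hypothetical Phase 2 settings, typed in by hand from each side's config.
# Only the PFS groups (group20 vs group14) come from the session above;
# the other values and all field names are illustrative assumptions.
LOCAL = {
    "encryption": "aes-256-cbc",
    "authentication": "sha256",
    "pfs_group": "group20",
    "lifetime_seconds": 3600,
}
PEER = {
    "encryption": "aes-256-cbc",
    "authentication": "sha256",
    "pfs_group": "group14",
    "lifetime_seconds": 3600,
}


def find_mismatches(local, peer):
    """Return {field: (local_value, peer_value)} for every field that differs.

    A field present on only one side is reported with None for the other.
    """
    mismatches = {}
    for field in sorted(set(local) | set(peer)):
        local_value = local.get(field)
        peer_value = peer.get(field)
        if local_value != peer_value:
            mismatches[field] = (local_value, peer_value)
    return mismatches


if __name__ == "__main__":
    for field, (local_value, peer_value) in find_mismatches(LOCAL, PEER).items():
        print(f"{field}: local={local_value} peer={peer_value}")
```

Listing PFS explicitly next to encryption, hashing, and lifetime is the point here: in this session the first three were aligned, so a check that skips the PFS group would have reported a clean match.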
Wednesday, 4 February 2026