
Wednesday 1 September 2021

uRPF mode in NSX-T not configured automatically by VMware Cloud Foundation

I am writing this post for everyone who is planning to use VMware Cloud Foundation with an NSX-T Tier-0 Active-Active configuration.

It's a short post and will not take long to read.

In a recent engagement, we completed the VMware Cloud Foundation management domain bring-up with AVN but without BGP.

I have covered bring-up without BGP in a separate post.

The image below shows the connectivity diagram (dummy values).

After completing bring-up we started the link failover testing, but before we could even begin we noticed that when pinging the physical IPs of our uplink switches from a VM hosted on an NSX-T overlay segment, only one of them responded. That made me realize something wasn't deployed correctly.
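If you want to repeat that check quickly, the sketch below pings each uplink switch IP from a test VM on the overlay segment. It is a minimal Python example only; the addresses are placeholders and must be replaced with the uplink switch interface IPs from your own diagram, and the ping flags assume a Linux guest.

import subprocess

# Placeholder uplink switch interface IPs; replace with your own values.
UPLINK_IPS = ["172.27.12.2", "172.27.12.3", "172.27.13.2", "172.27.13.3"]

for ip in UPLINK_IPS:
    # Three ICMP echo requests per address, 2-second timeout (Linux ping syntax).
    result = subprocess.run(
        ["ping", "-c", "3", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    print(f"{ip}: {'reachable' if result.returncode == 0 else 'NOT reachable'}")

With strict uRPF and ECMP return paths, some of these pings will typically fail, which matches the symptom described above.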

As we all know, VCF does everything on its own, and making changes directly to the product may break VCF, so we didn't take any chances and opened a case with VMware GSS.

As expected, VMware GSS confirmed that the uRPF mode is not set to None by the VMware Cloud Foundation workflows; it needs to be changed manually.

So, to save you some time, I am writing this post. When you deploy VCF, make a point of setting the uRPF mode to None on each edge node interface using the NSX-T Manager console.

If you are wondering what uRPF is, please read the article Understanding Unicast Reverse Path Forwarding.

The steps to set the uRPF mode to None are listed below; an API-based alternative follows the list.

1) Log in to the NSX-T Manager console.

2) Navigate to Networking and select Tier-0 Gateways.

3) Click the three dots next to the Tier-0 gateway.

4) Click Edit and navigate to Interfaces.

5) Click the interface count (for example, 4) to open the interfaces screen, then click the three dots next to the first interface and select Edit.

6) Set URPF Mode to None, scroll down, and save.

7) Repeat the same steps for all remaining interfaces.
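If you would rather script the change than click through each interface, the same setting can be applied with the NSX-T Policy API. The sketch below is a minimal example only: the manager address, credentials, and Tier-0 gateway ID are placeholders, and the exact object paths can vary slightly between NSX-T versions, so verify them against your own environment first.

import requests
import urllib3
from requests.auth import HTTPBasicAuth

urllib3.disable_warnings()  # lab only: silence self-signed certificate warnings

# Placeholder connection details; replace with your environment's values.
NSX_MANAGER = "https://nsx-manager.example.local"
AUTH = HTTPBasicAuth("admin", "changeme")
TIER0_ID = "vcf-tier0-gw"  # assumed Tier-0 gateway ID

session = requests.Session()
session.auth = AUTH
session.verify = False  # lab only: validate certificates in production

base = f"{NSX_MANAGER}/policy/api/v1/infra/tier-0s/{TIER0_ID}/locale-services"

# Walk every locale service and every interface under the Tier-0 gateway
# and patch the uRPF mode to NONE where it is not already set.
for ls in session.get(base).json().get("results", []):
    if_url = f"{base}/{ls['id']}/interfaces"
    for intf in session.get(if_url).json().get("results", []):
        if intf.get("urpf_mode") == "NONE":
            continue
        resp = session.patch(f"{if_url}/{intf['id']}", json={"urpf_mode": "NONE"})
        resp.raise_for_status()
        print(f"Set urpf_mode=NONE on {ls['id']}/{intf['id']}")

After the script runs, the interfaces screen from the steps above should show URPF Mode as None for every interface.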


I hope this post added some value; if it did, please don't forget to share and follow. 😊

If you would like me to write about a specific topic, or if you have any feedback on this post, kindly comment below.

You can also connect with me on LinkedIn, and please like and subscribe to my YouTube channel VMwareNSXCloud for step-by-step technical videos.

3 comments:

  1. I would be careful providing a blanket statement suggesting to disable uRPF. It has its place, and is generally preferred to be kept on.

    Unless you have a true asymmetric routing issue, this should be left to default = strict. Rather than simply disabling uRPF, the basic checks and balances should be performed to ensure that uRPF is indeed discarding frames due to an asymmetric routing topology.

    You may be masking bad config by disabling uRPF!

    Replies
    1. You are correct; however, that's why I mentioned the scenario where you have ECMP and the uplink switches are configured with HSRP or VRRP: packets get discarded if the return traffic doesn't follow the same path. The GSS engineer I worked with said he has already submitted a request for a KB article on this; how soon it will appear, I am not sure. But trust me, it is not a blanket statement; it is based on this use case and the exact issue description, where you only get a response from one switch.

  2. Hi, I believe I can see what your issue is here (and I see this post is several years old), but if you'd indulge me with just a bit more info.

    From your diagram, it looks like you've configured two instances of HSRP, one for the 172.27.12.x/24 network and another for 172.27.13.x/24. Based on this, I presume that you have static routes on each TOR that point to the T0 SR interfaces directly, correct? Also, I would presume that these static routes are of equal cost, so you can utilize ECMP N-to-S for traffic flows from the TORs to the T0s, right?

    If this is correct, I think you have bumped into what I'd consider a "corner case" for uRPF. When you ping, say, 172.27.12.2 from your VM inside NSX, it obviously routes up to one of the T0s, and then is routed out that particular T0's interface that is attached to the 172.27.12.x network. We know it will use this interface because this is a connected network; it would trump your default route on the T0 (I'm guessing you are using a default static route as you have HSRP configured on your TORs).

    So, 172.27.12.2 gets the ICMP echo request, and now wishes to send an ICMP echo reply. Here's where things get interesting. If you have ECMP routes configured on the TOR to return traffic to the VM in NSX, the TOR could use the 172.27.12.x interface on the attached T0 SR *or* it could use the 172.27.13.x interface on the other SR; both are valid paths back as far as the TOR is concerned.

    However, if the packet does come back in on a T0 SR's 172.27.13.x interface, we have a problem; a uRPF check will result in the T0 dropping this packet. It would do so because it has a connected route for the 172.27.12.x network (its 172.27.12.x uplink interface), yet from its perspective, it has received an ICMP echo reply sourced from 172.27.12.2 (the TOR) on its 172.27.13.x uplink interface. This is what would violate the uRPF lookup and cause the drop.

    You won't/shouldn't experience this issue with any test other than pinging IP addresses on those local 172.27.12.x and/or 172.27.13.x networks, as traffic from other networks (let's pretend you have physical servers on the 172.30.10.0/24 network) won't violate this; the T0 SR will see these packets, and presuming you have ECMP default routes on your T0, you'll never trip uRPF. It's only pinging your TOR interfaces, given this static routing configuration, that has this potential (barring devices on these same two networks that use the TOR as their gateway; in that case, the same potential issue applies).
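    To make that concrete, here is a tiny, simplified sketch of a strict uRPF decision using the example addresses above (the interface names are made up, and the real check of course uses the full routing table):

    import ipaddress

    # Connected routes on one T0 SR: network -> egress interface (illustrative names).
    ROUTES = {
        ipaddress.ip_network("172.27.12.0/24"): "uplink-12",
        ipaddress.ip_network("172.27.13.0/24"): "uplink-13",
    }

    def strict_urpf_pass(src_ip: str, ingress_if: str) -> bool:
        # Strict uRPF: accept only if the route back to the source points
        # out of the same interface the packet arrived on.
        src = ipaddress.ip_address(src_ip)
        for net, egress_if in ROUTES.items():
            if src in net:
                return egress_if == ingress_if
        # Anything else falls back to the ECMP default routes, which cover
        # both uplinks, so it passes on either interface.
        return True

    # The TOR's echo reply, sourced from 172.27.12.2:
    print(strict_urpf_pass("172.27.12.2", "uplink-12"))  # True  -> accepted
    print(strict_urpf_pass("172.27.12.2", "uplink-13"))  # False -> dropped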

    I hope this is of use; if I'm correct, with this in mind, you should be able to test/vet this out and verify that it is your problem. I think a few packet captures would likely demonstrate this issue.

    Regards,

    Michael

