While working on a cool VMware NSX project, we discovered a bug in the Edge Firewall when using a “Logical Switch” as a “Destination”.
Let’s start by making troubleshooting easier and adding the “Rule Tag” and “Log” information to the Firewall view.
Now let’s look at the Edge Firewall config itself. We’ve got two rules:
- A rule that allows RDP from server “srv” towards logical switch “Web” (so all VMs attached to logical switch “Web” should be reachable via RDP)
- A rule that blocks all other traffic, the default any any any deny rule.
When we created this rule, we had one VM (VM-A) attached to logical switch “Web”.
RDPing to VM-A works fine and results in the log entry displayed below.
Note that the ID corresponds to the “Rule Tag” number within the Edge Firewall, which is why we enabled this view in the first place. Also note that this is being logged because we selected “Log” on the corresponding firewall rule.
When we deployed a second VM (VM-B) to logical switch “Web” and tried RDPing to VM-B, we noticed that this didn’t work: the log displays a DROP that corresponds to Rule Tag 135201, which in our case is the default any any any deny rule. This shows that the Edge Firewall skips our RDP accept rule.
It looks like VMware is using a static VM-to-Logical-Switch table: whenever you make an arbitrary change to the firewall and publish the settings, RDPing to VM-B starts to work. Consequently, it doesn’t work for VM-C if VM-C gets connected after the firewall has been published.
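To make the suspected behaviour concrete, here is a small toy model in Python. It is only a sketch of our hypothesis (membership of the logical switch is resolved once at publish time and not refreshed when VMs attach), not how NSX is actually implemented. The class names, the `publish()` and `evaluate()` methods, and the accept rule's tag `131074` are made up for illustration; only the deny tag 135201 comes from our log.

```python
from dataclasses import dataclass, field

RDP_PORT = 3389
ACCEPT_RULE_TAG = 131074   # hypothetical tag for our RDP accept rule
DEFAULT_DENY_TAG = 135201  # tag of the default deny rule seen in our log


@dataclass
class LogicalSwitch:
    name: str
    members: set = field(default_factory=set)  # names of attached VMs


@dataclass
class EdgeFirewall:
    """Toy model: the switch is resolved to VMs only when publishing."""
    switch: LogicalSwitch
    # Snapshot taken at the last publish; assumed NOT to follow new attachments.
    published_members: frozenset = frozenset()

    def publish(self):
        # Re-resolve the logical switch into its current member VMs.
        self.published_members = frozenset(self.switch.members)

    def evaluate(self, src, dst, port):
        # First matching rule wins; the default deny rule is last.
        if src == "srv" and dst in self.published_members and port == RDP_PORT:
            return ("ACCEPT", ACCEPT_RULE_TAG)
        return ("DROP", DEFAULT_DENY_TAG)


web = LogicalSwitch("Web", {"VM-A"})
fw = EdgeFirewall(web)
fw.publish()  # rule created while only VM-A is attached

results = {}
results["VM-A"] = fw.evaluate("srv", "VM-A", RDP_PORT)

web.members.add("VM-B")  # VM-B attached after the publish
results["VM-B before republish"] = fw.evaluate("srv", "VM-B", RDP_PORT)

fw.publish()  # any change plus publish refreshes the snapshot
results["VM-B after republish"] = fw.evaluate("srv", "VM-B", RDP_PORT)

web.members.add("VM-C")  # VM-C attached after the second publish
results["VM-C"] = fw.evaluate("srv", "VM-C", RDP_PORT)

for case, outcome in results.items():
    print(case, outcome)
```

In this model VM-A is accepted, VM-B is dropped by the default deny rule until the firewall is published again, and VM-C is dropped afterwards, which matches what we saw in the logs.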
After we reached out to VMware Support, it appears that they weren’t aware of this behaviour, and they can actually reproduce it in their own environment. For now we need to wait and see what VMware comes up with.
I’ll keep you posted on this.
Update 22-09-2015: VMware Engineering reported that this is a bug in version 6.1.4 and that, while not described in the release notes, it is solved in NSX 6.2.