These are the IP configurations of the interfaces on the servers.
Server A: eth0 (192.168.1.100, running Bind9 and listening on this IP)
Server B: eth0 (192.168.2.101)
Server B issues a DNS lookup to Server A and gets a timeout. I didn’t have this problem with another multi-homed machine running RHEL5.
I ran tcpdump -i any host 192.168.2.101 and port 53 on Server A and saw that packets were indeed arriving from Server B, but there were no return packets. Bind was running fine, so the problem had to be specific to RHEL 6 and was most likely caused by asymmetric routing.
A Google search for asymmetric routing issues on RHEL6 gave me the answer immediately.
In RHEL5, rp_filter is disabled, so a packet is accepted even when it arrives on an interface other than the one the routing table would use to reach its source. Server A therefore accepts Server B's request on eth0, in another Layer 3 domain, even though eth1 sits in the same broadcast domain as Server B.
In RHEL6, rp_filter is enabled, so Server A checks the routing table and finds that eth1 is the optimal route back to Server B. The trouble is that the request from Server B arrived on eth0, so rp_filter kicks in and silently drops the packet!
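To see which mode is actually in force on an interface, it helps to know that the kernel's ip-sysctl documentation says source validation uses the maximum of conf/all/rp_filter and conf/<interface>/rp_filter. The following is a minimal sketch of that lookup, not a definitive tool. The function name effective_rp_filter and the CONF_DIR override are my own assumptions, added so the function can be tried against a scratch directory instead of the real /proc tree.

```shell
#!/bin/sh
# Print the rp_filter value the kernel applies to one interface:
# the larger of conf/all/rp_filter and conf/<interface>/rp_filter.
# CONF_DIR is a hypothetical override for trying this outside /proc.
effective_rp_filter() {
    iface=$1
    conf_dir=${CONF_DIR:-/proc/sys/net/ipv4/conf}

    # Fail early if either file is missing or unreadable.
    all_val=$(cat "$conf_dir/all/rp_filter") || return 1
    if_val=$(cat "$conf_dir/$iface/rp_filter") || return 1

    if [ "$all_val" -gt "$if_val" ]; then
        echo "$all_val"
    else
        echo "$if_val"
    fi
}
```

Calling effective_rp_filter eth0 on Server A prints 0 (no checking), 1 (strict) or 2 (loose) for that interface.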
The immediate solution is to set rp_filter to 2 on Server A, which is loose checking mode: a packet is accepted as long as its source address is reachable through any interface. I edited /etc/sysctl.conf and changed net.ipv4.conf.default.rp_filter = 1 to net.ipv4.conf.default.rp_filter = 2.
I like to be very explicit when defining configurations, so I also added the following:
net.ipv4.conf.eth0.rp_filter = 2
net.ipv4.conf.eth1.rp_filter = 2
The configuration in /etc/sysctl.conf makes the setting persist across reboots. To apply the change immediately, run:
echo 2 > /proc/sys/net/ipv4/conf/eth0/rp_filter
echo 2 > /proc/sys/net/ipv4/conf/eth1/rp_filter
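The two echo commands can be wrapped so that the value is read back and checked after each write. This is only a sketch under my own assumptions: the function name set_rp_filter and the CONF_DIR override are hypothetical, and running it against the real /proc tree needs root.

```shell
#!/bin/sh
# Write an rp_filter mode to each named interface and confirm the readback.
# CONF_DIR is a hypothetical override for trying this outside /proc.
set_rp_filter() {
    mode=$1
    shift
    conf_dir=${CONF_DIR:-/proc/sys/net/ipv4/conf}

    for iface in "$@"; do
        target="$conf_dir/$iface/rp_filter"
        # Stop at the first interface that cannot be written or read.
        echo "$mode" > "$target" || return 1
        current=$(cat "$target") || return 1
        if [ "$current" != "$mode" ]; then
            echo "rp_filter on $iface is $current, expected $mode" >&2
            return 1
        fi
        echo "$iface rp_filter = $current"
    done
}
```

For the setup described here, the call would be set_rp_filter 2 eth0 eth1. Running sysctl -p instead reloads everything in /etc/sysctl.conf in one go.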
More information on rp_filter or Reverse Path Filter:
https://www.redhat.com/archives/rhelv6-list/2011-January/msg00080.html (Google: “rhel 6 asymmetric routing”)