why is it taking so long to change interfaces?

From: Christopher Friesen (cfriesen@nortelnetworks.com)
Date: Fri Oct 13 2000 - 11:43:17 EST


I'm trying to figure out why swapping from one NIC to another is taking
so long (on the order of a few seconds), and I was hoping one of you
could help me out.

I'm running a bit of an odd setup, so bear with me for a minute while I
explain. Machine A has two NICs each set to a different IP address but
on the same subnet (192.168.1.0). (Note: I've also tried it with them
on different subnets to start with, and it makes no difference.) They
each go to the same switch. Also on that switch is machine B with one
NIC, set to 192.168.0.10. Initially, eth0:0 on machine A is aliased to
192.168.0.20. (I know this is an odd setup with two subnets on the same
wires, but there is a reason for it and it seems to work. Also, I do
realize the issues with routing for dual nics on the same subnet, so
don't concentrate on that, please.) Both machines are running linux
2.2.17.

Anyways, I then run ping from 0.10 to 0.20. This works just fine. I
then execute the following commands in a script on machine A:

ifconfig eth0:0 down (removes the alias, but leaves eth0 running fine)
ifconfig eth1:0 192.168.0.20 (set up a new alias on the other NIC)
send_arp eth1 [apropriate arguments to send out a gratuitous arp]
route

This returns immediately, tcpdump shows the arp being sent out
immediately, route shows the new route to 192.168.0.0 going through
eth1, and tcpdump shows the ping requests come in on eth1, but the first
few ping replies still go out on eth0 and it takes a few seconds before
the ping replies go out on eth1. This baffles me--the route to the
192.168.0.0 network is through eth1 now, so why would they go out on
eth0?

Going back to eth0 its the same thing--packets come in on the new alias,
but still go out the old interface for a few seconds.

Does anyone have any idea why it is taking so long for the packets to
get sent out the new route? I've got the cards set to 1.0 by default but
in this scenario there is no traffic on 1.0, and the active NIC is
aliased to 0.0 to ensure that the 0.0 packets only go out on that card.

Here is what my routing tables look like initially, then after a change
to eth1, then after the change back to eth0:

# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.1.0 * 255.255.255.0 U 0 0 0 eth0
192.168.1.0 * 255.255.255.0 U 0 0 0 eth1
192.168.0.0 * 255.255.255.0 U 0 0 0 eth0
127.0.0.0 * 255.0.0.0 U 0 0 0 lo
# ./change
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.1.0 * 255.255.255.0 U 0 0 0 eth0
192.168.1.0 * 255.255.255.0 U 0 0 0 eth1
192.168.0.0 * 255.255.255.0 U 0 0 0 eth1
127.0.0.0 * 255.0.0.0 U 0 0 0 lo
# ./change2
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.1.0 * 255.255.255.0 U 0 0 0 eth0
192.168.1.0 * 255.255.255.0 U 0 0 0 eth1
192.168.0.0 * 255.255.255.0 U 0 0 0 eth0
127.0.0.0 * 255.0.0.0 U 0 0 0 lo
#

So, in all cases there is only a single interface that should be used to
access the .0.0 subnet from machine A. But what I'm seeing is that for
a few seconds after the change or change2 command, the incoming packets
come in the new interface (as they should) but outgoing packets go out
the old interface.

Anyways, if anyone could suggest where in the routing code this delay is
coming from and maybe how to avoid it, I would greatly appreciate it.

Chris
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:25 EST