[rfc | patch 0/6] netpoll: add support for the bonding driver
From: Jeff Moyer
Date: Fri Jul 01 2005 - 18:08:14 EST
The following patch series provides netpoll support for the bonding driver.
The best way to describe the approach taken is to walk through how it
works. We'll use netconsole as our example.
Netconsole registers a console, and so release_console_sem will trigger the
write_msg implementation in netconsole.ko. This, in turn, calls
netpoll_send_udp. netpoll_send_udp will then call netpoll_send_skb, which
then calls into the network drivers hard_start_xmit routine. It is
important to note that netpoll_send_skb takes care to not call the
hard_start_xmit routine if netif_queue_stopped(dev) returns true. In this
case, it instead calls netpoll_poll, to hopefully free up some tx
descriptors. After this is done, and the queue is woken up, the
hard_start_xmit routine can and will be called.
For all network drivers up to this point, this was the end of the line.
Netpoll would no longer have to be concerned with the skb. It will be
sent off to the network by the driver. However, this is not the case with
the bonding driver.
In the case of the bonding driver, we have now called its hard_start_xmit
routine, which can be one of many, depending on the configured bonging_mode
(specified at module load time). For all of my testing, I have used
dynamic link aggregation (802.3ad). So, in this case, we will call
bond_3ad_xmit_xor. In all cases, the hard_start_xmit routine selects the
proper slave device over which to send the packet, and fills in the
skb->dev pointer to point at that device. Then, they all send the skb off
bond_dev_queue_xmit will then send the skb off to the networking layer via
dev_queue_xmit. dev_queue_xmit is then responsible for calling the
underlying network device's hard_start_xmit routine. Note that
dev_queue_xmit cannot be called with interrupts disabled.
And so, we bring our discussion back to netpoll. In the case of netpoll,
we cannot have the bonding driver call into dev_queue_xmit. For
netconsole, we have disabled irqs before calling netpoll_send_skb (and we
may even be called from interrupt context, in the case of sysrq-X). This
is not the only problem with allowing the bonding driver to call
dev_queue_xmit. We also have to deal with the fact that the underlying
network device may not be ready to send. As in the case described above,
we may be short on tx descriptors. In such a case, we need to be able to
call the driver's poll_controller routine to clear them out. And so, we
have to modify bond_dev_queue_xmit to call *back* into the netpoll code.
The call chain will look something like this:
netpoll_send_udp <----- skb->dev points to bond0
netpoll_send_skb <------ skb->dev now points to real_dev
(may have to call dev->poll_controller, and perform a
napi poll here)
I wrote a test module to exercise the receive code path, as well. The
module registers a struct netpoll with an rx_hook. It then responds to
each packet it receives with a "PONG". For this case, our call path now
looks as follows:
netpoll_send_skb <-- skb->dev points at bond0
netpoll_send_skb <-- skb->dev points at real_dev
And here it gets tricky. If we are sending out over the same interface
on which the packet arrived, then we will have to either queue the packet,
or drop it. This functionality is already implemented in netpoll, so if we
filled in our queue routine to point at netpoll_queue, then at this point
we will call that via the ->drop pointer in our netpoll structure. So,
continuing along, we have:
and we are done. If the packet was schedule to head out a different
device, then we would continue with the send path:
New netpoll function implemented by the network drivers:
This is required, since the bonding device has to walk through each slave
and point its slave_dev->npinfo at the npinfo for the master device. The
reason for this is so that when we're doing the napi polling, we can set
the rx_flags appropriately.
This routine is required since, otherwise, there is no way to intercept
packets bound for interfaces that are not ready for them. Of course, it
requires further logic in the bonding driver to then call into the
netpoll_send_skb routine (which is a new export).
Note that neither of these pointers has to be filled in by the driver.
These functions should only be implemented where needed, and to date, that
is only in the bonding driver.
Newly exported are:
This is exported so that the bonding driver can queue a packet to be sent
via the real ethernet device it has chosen.
This is a new routine that was created and exported so that the
poll_controller implementation in the bonding driver could poll each of
the underlying real devices without duplicating all of the logic that
exists internally to netpoll already.
To test this, as I mentioned above, I wrote a simple module which, upon
receipt of any packet, sends out a packet with the message "PONG". I fired
up netcat to send test packets, and receive the responses. I also loaded
the netconsole module for the very same interface, bond0, and issue a
series of sysrq-X's, both via sysrq-trigger and via the keyboard. I did
this while simultaneously testing the PING server on an SMP machine. As
things stand, it is very stable in my environment.
And so, the patch set follows. Any and all comments are appreciated.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/