Re: [RFC PATCH net-next V2 0/6] XDP rx handler

From: Jason Wang
Date: Thu Sep 06 2018 - 01:12:48 EST




On 2018-09-06 01:20, David Ahern wrote:
[ sorry for the delay; focused on the nexthop RFC ]

No problem. Your comments are appreciated.

On 8/20/18 12:34 AM, Jason Wang wrote:

On 2018-08-18 05:15, David Ahern wrote:
On 8/15/18 9:34 PM, Jason Wang wrote:
I may be missing something, but BPF forbids loops. Without a loop, how
can we make sure all stacked devices are enumerated correctly without
knowing the topology in advance?
netdev_for_each_upper_dev_rcu

BPF helpers allow programs to do lookups in kernel tables, in this case
the ability to find an upper device that would receive the packet.
So if I understand correctly, you mean using
netdev_for_each_upper_dev_rcu() inside a BPF helper? If yes, I think we
may still need device-specific logic. E.g. for macvlan,
netdev_for_each_upper_dev_rcu() enumerates all macvlan devices on top of
a lower device. But what we need is the one macvlan that matches the
dst mac address, which is similar to what the XDP rx handler does. And
it would become more complicated with multiple layers of devices.
My device lookup helper takes the base port index (starting device),
vlan protocol, vlan tag and dest mac. So, yes, the mac address is used
to uniquely identify the stacked device.
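
The lookup described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the actual helper: `struct upper_dev`, `lookup_upper_dev()` and the flat table are hypothetical names, and a real helper would walk the kernel's upper-device lists under RCU rather than scan an array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical model of one stacked (upper) device; not a kernel struct. */
struct upper_dev {
    int ifindex;            /* index of the upper device itself */
    int lower_ifindex;      /* index of the device it is stacked on */
    uint16_t vlan_proto;    /* 0 if the device is not a VLAN device */
    uint16_t vlan_id;       /* 0 if the device is not a VLAN device */
    uint8_t mac[ETH_ALEN];  /* MAC address owned by the upper device */
};

/*
 * Return the ifindex of the upper device that matches the key
 * (base port index, vlan protocol, vlan tag, destination mac),
 * or -1 if no stacked device matches.
 */
static int lookup_upper_dev(const struct upper_dev *tbl, size_t n,
                            int base_ifindex, uint16_t vlan_proto,
                            uint16_t vlan_id, const uint8_t *dmac)
{
    for (size_t i = 0; i < n; i++) {
        const struct upper_dev *d = &tbl[i];

        if (d->lower_ifindex != base_ifindex)
            continue;
        if (d->vlan_proto != vlan_proto || d->vlan_id != vlan_id)
            continue;
        if (memcmp(d->mac, dmac, ETH_ALEN) != 0)
            continue;
        return d->ifindex;
    }
    return -1;
}
```

With two macvlans stacked on port 2, a frame whose destination mac belongs to the second macvlan resolves to that device's ifindex, while an unknown mac or a different base port yields -1.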

Ok.


So let's consider a simple case. Suppose we have 5 macvlan devices:

macvlan0: doing some packet filtering before passing packets to TCP/IP
stack
macvlan1: modify packets and redirect to another interface
macvlan2: modify packets and transmit packet back through XDP_TX
macvlan3: deliver packets to AF_XDP
macvtap0: deliver raw XDP packets to a VM

So, with the XDP rx handler, all we need to do is attach five different
XDP programs, one to each macvlan device. Your idea is to do everything
in the root device's XDP program. This looks complicated and inflexible
since it has to take care of a lot of things, e.g. adding/removing
actions/policies. And the XDP program needs to call a BPF helper that
uses netdev_for_each_upper_dev_rcu() to work correctly with stacked
devices.

Stacking on top of a nic port can have all kinds of combinations of
vlans, bonds, bridges, vlans on bonds and bridges, macvlans, etc. I
suspect trying to install a program for layer 3 forwarding on each one
and iteratively running the programs would kill the performance gained
from forwarding with xdp.

Yes, the performance may drop, but it's still much faster than the XDP generic path.

One reason for the drop is device-specific logic such as mac address matching, which is also needed when a single XDP program runs on the root device. For macvlan, if we allow attaching XDP to macvlan, we can offload the mac address lookup to hardware through L2 forwarding offload; I believe this would give us no performance drop. The only overhead introduced by the XDP rx handler itself is probably the indirect calls. We can try to amortize them by introducing some kind of batching on top.

As for the issue of multiple XDP program iterations: with this RFC, if we have N stacked devices, there's no need to attach an XDP program on each layer. The only thing needed is the XDP_PASS action in the root device; then you can attach an XDP program on any one or several of the stacked devices on top.
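
The flow of a packet through stacked devices can be modelled in userspace as below. This is a rough sketch under assumptions, not the RFC's implementation: `struct dev`, `run_rx()`, the `V_*` verdicts and the sample programs are hypothetical stand-ins, and the macvlan-style destination mac match stands in for whatever device-specific logic the rx handler applies.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for XDP verdicts; not the kernel's enum xdp_action. */
enum verdict { V_DROP, V_PASS, V_TX, V_REDIRECT };

typedef enum verdict (*xdp_prog_t)(const uint8_t *pkt, size_t len);

/* Hypothetical model of a device with optional program and upper devices. */
struct dev {
    uint8_t mac[6];
    xdp_prog_t prog;            /* NULL when no program is attached */
    const struct dev *uppers;   /* devices stacked directly on top */
    size_t n_uppers;
};

/* Sample programs used only for illustration. */
static enum verdict prog_pass(const uint8_t *pkt, size_t len)
{
    (void)pkt; (void)len;
    return V_PASS;
}

static enum verdict prog_drop(const uint8_t *pkt, size_t len)
{
    (void)pkt; (void)len;
    return V_DROP;
}

/*
 * Run the program on each layer that has one. A verdict other than
 * V_PASS ends processing; V_PASS hands the packet to the upper device
 * whose mac matches the destination mac (first 6 bytes of the frame).
 * A final V_PASS means the packet goes to the normal stack.
 */
static enum verdict run_rx(const struct dev *dev, const uint8_t *pkt,
                           size_t len)
{
    if (len < 6)
        return V_DROP;  /* too short to carry a destination mac */

    while (dev) {
        const struct dev *next = NULL;

        if (dev->prog) {
            enum verdict v = dev->prog(pkt, len);
            if (v != V_PASS)
                return v;
        }
        for (size_t i = 0; i < dev->n_uppers; i++) {
            if (memcmp(dev->uppers[i].mac, pkt, 6) == 0) {
                next = &dev->uppers[i];
                break;
            }
        }
        dev = next;
    }
    return V_PASS;
}
```

In this model a layer without a program costs only the mac match, which mirrors the point that programs need to be attached only where they are wanted.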

So the RFC is not intended to replace any existing solution. It just provides some flexibility for having native XDP on stacked devices (based on the rx handler) and lets users benefit from existing tools to do the configuration. If a user wants to do everything in the root device, that should work well without any issues.

Thanks