Re: [RFC PATCH net-next V2 0/6] XDP rx handler

From: Jason Wang
Date: Wed Aug 15 2018 - 03:05:01 EST




On 2018/08/15 13:35, Alexei Starovoitov wrote:
On Wed, Aug 15, 2018 at 08:29:45AM +0800, Jason Wang wrote:
Looks less flexible since the topology is hard coded in the XDP program
itself and this requires all logic to be implemented in the program on the
root netdev.

I have L3 forwarding working for vlan devices and bonds. I had not
considered macvlans specifically yet, but it should be straightforward
to add.

Yes, and all of these could be done through the XDP rx handler as well, and it
can do even more with rather simple logic:

1 macvlan has its own namespace, and wants its own bpf logic.
2 Reuse the existing topology information for dealing with more complex setups
like macvlan on top of bond or team. There's no need for the bpf program to care
about topology. If you look at the code, there's not even a need to attach XDP
on each stacked device. The call to xdp_do_pass() can try to pass the XDP
buff to the upper device even if there's no XDP program attached to the current
layer.
3 Deliver the XDP buff to userspace through macvtap.

I think I'm getting what you're trying to achieve.
You actually don't want any bpf programs in there at all.
You want macvlan builtin logic to act on raw packet frames.

The built-in logic is just used to find the destination macvlan device. It could also be done through another bpf program. But instead of inventing lots of generic infrastructure in the kernel with a specific userspace API, the built-in logic has its own advantages (a rough sketch of the lookup follows after this list):

- supports hundreds or even thousands of macvlans
- uses existing tools to configure the network
- is immune to topology changes
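
To make that concrete, here is a minimal sketch of how the built-in logic could pick the destination device directly from an XDP buffer. It is modeled on the existing skb-path helper macvlan_hash_lookup(); the name macvlan_xdp_find_dest() is illustrative only, not the actual patch code:

/* Illustrative sketch only: built-in macvlan logic picking the
 * destination device from a raw XDP buffer, mirroring what
 * macvlan_hash_lookup() does on the skb path. struct macvlan_port
 * is private to drivers/net/macvlan.c, so this would live there.
 */
#include <linux/etherdevice.h>
#include <linux/if_macvlan.h>
#include <net/xdp.h>

static struct macvlan_dev *macvlan_xdp_find_dest(struct macvlan_port *port,
						 struct xdp_buff *xdp)
{
	struct ethhdr *eth = xdp->data;

	/* Bound check: the Ethernet header must fit in the buffer. */
	if (xdp->data + sizeof(*eth) > xdp->data_end)
		return NULL;

	/* Reuse the per-port MAC hash; no bpf program has to know
	 * the topology to get the frame to the right macvlan.
	 */
	return macvlan_hash_lookup(port, eth->h_dest);
}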

It would have been less confusing if you said so from the beginning.

The name "XDP rx handler" is probably not good. Something like "stacked device XDP" might be better.

I think there is little value in such work, since something still
needs to process these raw frames eventually. If it's XDP with BPF progs
then they can maintain the speed, but in that case there is no need
for macvlan. The first layer can be normal xdp+bpf+xdp_redirect just fine.

I'm a little bit confused. We allow a per-veth XDP program, so I believe a per-macvlan XDP program makes sense as well. This allows great flexibility, the bpf program does not need to care about topology, and the configuration is greatly simplified. The only difference is that xdp_redirect works for veth because it is a paired device: we can transmit XDP frames to one veth and run XDP on its peer. This does not work for macvlan, which is based on an rx handler.
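
And the per-macvlan program can stay trivial, since it only sees frames already demuxed to that device. A minimal sketch (nothing here is from the patchset, just an ordinary XDP program built with clang -target bpf):

/* Minimal per-macvlan XDP program: it needs no topology knowledge
 * because the rx handler has already demuxed the frame to this
 * device. Example policy: only accept IPv4.
 */
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_endian.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int macvlan_filter(struct xdp_md *ctx)
{
	void *data = (void *)(long)ctx->data;
	void *data_end = (void *)(long)ctx->data_end;
	struct ethhdr *eth = data;

	if ((void *)(eth + 1) > data_end)
		return XDP_DROP;

	if (eth->h_proto != bpf_htons(ETH_P_IP))
		return XDP_DROP;

	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";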

Actually, for the case of veth, if we implement an XDP rx handler for the bridge, it can work seamlessly with veth, like:

eth0(XDP_PASS) -> [bridge XDP rx handler and ndo_xdp_xmit()] -> veth --- veth (XDP).
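
A rough sketch of what such a bridge rx handler could look like; br_fdb_find_port_xdp() and the return convention are purely illustrative, while convert_to_xdp_frame() and ndo_xdp_xmit() are the existing kernel interfaces:

/* Illustrative bridge XDP rx handler: after eth0's program returns
 * XDP_PASS, look up the egress port by destination MAC and forward
 * the frame with ndo_xdp_xmit() toward the veth.
 */
#include <linux/etherdevice.h>
#include <linux/netdevice.h>
#include <net/xdp.h>

/* Hypothetical FDB lookup: egress port for a dest MAC, or NULL. */
struct net_device *br_fdb_find_port_xdp(struct net_device *br_dev,
					const u8 *dest);

static int br_xdp_rx_handler(struct net_device *br_dev, struct xdp_buff *xdp)
{
	struct ethhdr *eth = xdp->data;
	struct net_device *port;
	struct xdp_frame *frame;

	if (xdp->data + sizeof(*eth) > xdp->data_end)
		return XDP_DROP;

	port = br_fdb_find_port_xdp(br_dev, eth->h_dest);
	if (!port || !port->netdev_ops->ndo_xdp_xmit)
		return XDP_PASS;	/* fall back to the skb path */

	frame = convert_to_xdp_frame(xdp);
	if (!frame)
		return XDP_DROP;

	/* One frame, flush the egress queue immediately. */
	if (port->netdev_ops->ndo_xdp_xmit(port, 1, &frame, XDP_XMIT_FLUSH) < 0)
		return XDP_DROP;

	return XDP_REDIRECT;
}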

Besides the usage for containers, we can implement a macvtap rx handler which allows fast packet forwarding to userspace.

In the case where there is no xdp+bpf in the final processing, the frames are
converted to skbs and performance is lost, so in such cases there is no
need for builtin macvlan acting on raw xdp frames either. Just keep the
existing macvlan acting on skbs.


Yes, this is how veth works as well.

Actually, the idea is not limited to macvlan but applies to any device that is based on an rx handler. Consider the case of bonding: this allows setting a very simple XDP program on the slaves and keeping a single XDP program with the main logic on the bond, instead of duplicating it on all slaves (see the sketch below).
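
For instance (just a sketch, not code from this series), the slave programs can be nearly empty while the bond carries the real logic:

/* Sketch of the bond case: slaves run a trivial program (or none at
 * all, since xdp_do_pass() can hand the buffer up anyway) and the
 * single program with the main logic attaches to the bond device.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("xdp")
int slave_prog(struct xdp_md *ctx)
{
	/* Per-slave tweaks could go here; XDP_PASS lets the rx
	 * handler deliver the buffer up to the bond.
	 */
	return XDP_PASS;
}

SEC("xdp")
int bond_prog(struct xdp_md *ctx)
{
	/* The main logic lives here once instead of being
	 * duplicated on every slave.
	 */
	return XDP_PASS;
}

char _license[] SEC("license") = "GPL";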

Thanks