Re: [RFC V2 PATCH 17/25] net/netpolicy: introduce netpolicy_pick_queue

From: Daniel Borkmann
Date: Thu Aug 04 2016 - 18:39:24 EST


On 08/04/2016 10:21 PM, John Fastabend wrote:
On 16-08-04 12:36 PM, kan.liang@xxxxxxxxx wrote:
From: Kan Liang <kan.liang@xxxxxxxxx>

To achieve better network performance, the key step is to distribute the
packets to dedicated queues according to policy and system run time
status.

This patch provides an interface which can return the proper dedicated
queue for socket/task. Then the packets of the socket/task will be
redirect to the dedicated queue for better network performance.

For selecting the proper queue, currently it uses round-robin algorithm
to find the available object from the given policy object list. The
algorithm is good enough for now. But it could be improved by some
adaptive algorithm later.

The selected object will be stored in hashtable. So it does not need to
go through the whole object list every time.

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxx>
---
include/linux/netpolicy.h | 5 ++
net/core/netpolicy.c | 136 ++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 141 insertions(+)

There is a hook in the tx path now (recently added)

# ifdef CONFIG_NET_EGRESS
if (static_key_false(&egress_needed)) {
skb = sch_handle_egress(skb, &rc, dev);
if (!skb)
goto out;
}
# endif

that allows pushing any policy you like for picking tx queues. It would
be better to use this mechanism. The hook runs 'tc' classifiers so
either write a new ./net/sch/cls_*.c for this or just use ebpf to stick
your policy in at runtime.

I'm out of the office for a few days but when I get pack I can test that
it actually picks the selected queue in all cases I know there was an
issue with some of the drivers using select_queue awhile back.

+1, I tried to bring this up here [1] in the last spin. I think only very
few changes would be needed, f.e. on eBPF side to add a queue setting
helper function which is probably straight forward ~10loc patch; and with
regards to actually picking it up after clsact egress, we'd need to adapt
__netdev_pick_tx() slightly when CONFIG_XPS so it doesn't override it.

[1] http://www.spinics.net/lists/netdev/msg386953.html