[PATCH net-next v4 0/3] make skip_sw actually skip software
From: Asbjørn Sloth Tønnesen
Date: Mon Mar 25 2024 - 16:55:08 EST
Hi,
During development of flower-route[1], which I
recently presented at FOSDEM[2], I noticed that
CPU usage would increase the more rules I installed
into the hardware for IP forwarding offloading.
We use TC flower offload for the hottest
prefixes, and leave the long tail to the normal (non-TC)
Linux network stack for slow-path IP forwarding.
We therefore need both the hardware and software
datapath to perform well.
I found that skip_sw rules are quite expensive
in the kernel datapath, since they must be evaluated
and matched upon before the kernel checks the
skip_sw flag.
This patchset optimizes the case where all rules
are skip_sw, by implementing a TC bypass for these
cases, where TC is only used as a control plane
for the hardware path.
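As a rough illustration of the idea (a userspace sketch, not the code in this series), the bypass decision can be modelled with two counters per block: one for all filters and one for filters carrying skip_sw. The struct, field, and function names below are hypothetical, and treating an empty block as "no bypass" is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a TC block; names are illustrative only. */
struct tc_block_model {
	unsigned int filter_cnt;  /* all filters installed on the block */
	unsigned int skipsw_cnt;  /* subset of those filters with skip_sw set */
};

/* Account for a newly installed filter. */
static void model_add_filter(struct tc_block_model *b, bool skip_sw)
{
	b->filter_cnt++;
	if (skip_sw)
		b->skipsw_cnt++;
}

/* Account for a removed filter; the caller passes the flag it was added with. */
static void model_del_filter(struct tc_block_model *b, bool skip_sw)
{
	b->filter_cnt--;
	if (skip_sw)
		b->skipsw_cnt--;
}

/*
 * Software classification can be skipped only when the block has at least
 * one filter and every filter is skip_sw (assumption: empty blocks are not
 * treated as bypass candidates in this model).
 */
static bool model_can_bypass(const struct tc_block_model *b)
{
	return b->filter_cnt > 0 && b->filter_cnt == b->skipsw_cnt;
}
```

In this model, adding a single filter without skip_sw makes the counters differ, so the software datapath is consulted again until that filter is removed.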
v4:
- Rebased onto net-next, now that net-next is open again
v3: https://lore.kernel.org/netdev/20240306165813.656931-1-ast@xxxxxxxxxxx/
- Patch 3:
  - Fix source_inline
  - Fix build failure when CONFIG_NET_CLS is set without CONFIG_NET_CLS_ACT.
v2: https://lore.kernel.org/netdev/20240305144404.569632-1-ast@xxxxxxxxxxx/
- Patch 1:
  - Add Reviewed-by from Jiri Pirko
- Patch 2:
  - Move code to avoid forward declaration (Jiri).
- Patch 3:
  - Refactor to use a static key.
  - Add performance data for trapping, or sending
    a packet to a non-existent chain (as suggested by Marcelo).
v1: https://lore.kernel.org/netdev/20240215160458.1727237-1-ast@xxxxxxxxxxx/
[1] flower-route
https://github.com/fiberby-dk/flower-route
[2] FOSDEM talk
https://fosdem.org/2024/schedule/event/fosdem-2024-3337-flying-higher-hardware-offloading-with-bird/
Asbjørn Sloth Tønnesen (3):
net: sched: cls_api: add skip_sw counter
net: sched: cls_api: add filter counter
net: sched: make skip_sw actually skip software
include/net/pkt_cls.h | 9 +++++++++
include/net/sch_generic.h | 4 ++++
net/core/dev.c | 10 ++++++++++
net/sched/cls_api.c | 41 +++++++++++++++++++++++++++++++++++++++
4 files changed, 64 insertions(+)
--
2.43.0