Re: [Patch net-next 0/3] Add support for mdb offload failure notification

From: Joseph Huang
Date: Thu Mar 20 2025 - 17:14:34 EST


On 3/20/2025 2:17 AM, Nikolay Aleksandrov wrote:
On 3/19/25 00:42, Joseph Huang wrote:
Currently the bridge does not provide real-time feedback to user space
on whether or not an attempt to offload an mdb entry was successful.

This patch set adds support to notify user space about successful and
failed offload attempts, and the behavior is controlled by a new knob
mdb_notify_on_flag_change:

0 - the bridge will not notify user space about MDB flag change
1 - the bridge will notify user space about flag change if either
MDB_PG_FLAGS_OFFLOAD or MDB_PG_FLAGS_OFFLOAD_FAILED has changed
2 - the bridge will notify user space about flag change only if
MDB_PG_FLAGS_OFFLOAD_FAILED has changed

The default value is 0.

A break-down of the patches in the series:

Patch 1 adds offload failed flag to indicate that the offload attempt
has failed. The flag is reflected in netlink mdb entry flags.

Patch 2 adds the knob mdb_notify_on_flag_change, and notify user space
accordingly in br_switchdev_mdb_complete() when the result is known.

Patch 3 adds netlink interface to manipulate mdb_notify_on_flag_change
knob.

This patch set was inspired by the patch series "Add support for route
offload failure notifications" discussed here:
https://lore.kernel.org/all/20210207082258.3872086-1-idosch@xxxxxxxxxx/

Joseph Huang (3):
net: bridge: mcast: Add offload failed mdb flag
net: bridge: mcast: Notify on offload flag change
net: bridge: Add notify on flag change netlink i/f

include/uapi/linux/if_bridge.h | 9 +++++----
include/uapi/linux/if_link.h | 14 ++++++++++++++
net/bridge/br_mdb.c | 30 +++++++++++++++++++++++++-----
net/bridge/br_multicast.c | 25 +++++++++++++++++++++++++
net/bridge/br_netlink.c | 21 +++++++++++++++++++++
net/bridge/br_private.h | 26 +++++++++++++++++++++-----
net/bridge/br_switchdev.c | 31 ++++++++++++++++++++++++++-----
7 files changed, 137 insertions(+), 19 deletions(-)


Hi,
Could you please share more about the motivation - why do you need this and
what will be using it?

Hi Nik,

The API for a user space application to join a multicast group is write-only (and really best-efforts only), meaning that after an application calls setsockopt(), the application has no way to know whether the operation actually succeeded or not. Normally for soft bridges this is not an issue; however for switchdev-backed bridges, due to limited hardware resources, the failure rate is meaningfully higher.

With this patch set, the user space application will now get a notification about a failed attempt to join a multicast group. The user space application can then have the opportunity to mitigate the failure [1][2].

Also why do you need an option with 3 different modes
instead of just an on/off switch for these notifications?

Thanks,
Nik


Some user space application might be interested in both successful and failed offload attempts (for example the application might want to keep an mdb database which is perfectly in sync with the hardware), while some other user space application might only be interested in failed attempts (so that it can retry the operation or choose a different group for example).

This knob is modeled after fib_notify_on_flag_change knob on route offload failure notification (see https://lore.kernel.org/all/20210207082258.3872086-4-idosch@xxxxxxxxxx/). The rationale is that "Separate value (read: 2) is added for such notifications because there are less of them, so they do not impact performance and some users will find them more important."

Thanks,
Joseph

--

[1] https://datatracker.ietf.org/doc/draft-ietf-pim-zeroconf-mcast-addr-alloc-ps/, section 2, the last paragraph
[2] https://datatracker.ietf.org/doc/draft-ietf-pim-ipv6-zeroconf-assignment/, section 2.1, the first paragraph