Re: [net-next PATCH v3 7/8] net: dsa: realtek: rtl8365mb: add FDB support

From: Jakub Kicinski

Date: Sun May 10 2026 - 13:41:00 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: realtek: rtl8365mb: add FDB support

Implement support for FDB and MDB management for the RTL8365MB series
switches. Introduce a mutex lock (l2_lock) to protect concurrent L2 table
updates. Add support for forwarding database operations, including unicast
and multicast entry handling as well as fast aging support.

> diff --git a/drivers/net/dsa/realtek/rtl8365mb_l2.c b/drivers/net/dsa/realtek/rtl8365mb_l2.c
> --- /dev/null
> +++ b/drivers/net/dsa/realtek/rtl8365mb_l2.c

[ ... ]

> +static void rtl8365mb_l2_data_to_uc(const u16 *data, struct rtl8365mb_l2_uc *uc)
> +{

[ ... ]

> + uc->key.efid = FIELD_GET(RTL8365MB_L2_UC_D4_EFID_MASK, data[4]);
> + uc->key.vid = FIELD_GET(RTL8365MB_L2_UC_D3_VID_MASK, data[3]);
> + uc->key.ivl = FIELD_GET(RTL8365MB_L2_UC_D3_IVL_MASK, data[3]);
> + uc->key.fid = FIELD_GET(RTL8365MB_L2_UC_D4_FID_MASK, data[4]);

Because vid and fid share the same union memory in struct rtl8365mb_l2_uc_key,
does setting fid immediately after vid overwrite the 12-bit vid value with
the 4-bit fid value?

When rtl83xx_port_fdb_dump() later passes entry.vid to the DSA core, it
appears this sends the corrupted fid value instead, breaking VLAN visibility
in userspace tools like bridge fdb show.

[ ... ]

> +static void rtl8365mb_l2_data_to_mc(const u16 *data, struct rtl8365mb_l2_mc *mc)
> +{

[ ... ]

> + mc->is_static = FIELD_GET(RTL8365MB_L2_MC_STATIC_MASK, data[5]);
> + mc->member = FIELD_GET(RTL8365MB_L2_MC_MBR_MASK, data[4]) |
> + (FIELD_GET(RTL8365MB_L2_MC_MBR_EXT1_MASK, data[3]) << 8) |
> + (FIELD_GET(RTL8365MB_L2_MC_MBR_EXT2_MASK, data[5]) << 8);

Could this shift for MBR_EXT2_MASK be incorrect?

It seems this shifts MBR_EXT2_MASK by 8 instead of 10. This causes the
extended bit for port 10 to collide with the EXT1 bits (which are correctly
shifted by 8) and leaves bit 10 perpetually unset.

In rtl8365mb_l2_mc_to_data(), the shift used is 10:

data[5] |= FIELD_PREP(RTL8365MB_L2_MC_MBR_EXT2_MASK, mc->member >> 10);

When the driver performs a read-modify-write operation on an existing
multicast entry, this shift mismatch might cause port 10 to be silently
dropped and port 8 to be erroneously altered.

[ ... ]

> +int rtl8365mb_l2_add_mc(struct realtek_priv *priv, int port,
> + const unsigned char mac_addr[static ETH_ALEN],
> + u16 vid)
> +{

[ ... ]

> + mc.member |= BIT(port);
> + mc.is_static = 1;
> + rtl8365mb_l2_mc_to_data(&mc, data);
> +
> + /* First look for an existing entry (to get existing port members) */
> + ret = rtl8365mb_table_query(priv, RTL8365MB_TABLE_L2,
> + RTL8365MB_TABLE_OP_READ, &addr,
> + RTL8365MB_TABLE_L2_METHOD_MAC, 0,
> + data, RTL8365MB_L2_ENTRY_SIZE);
> + if (!ret) {
> + /* There is already an entry... */
> + rtl8365mb_l2_data_to_mc(data, &mc);
> + /* the port must be added as a member */
> + mc.member |= BIT(port);
> + rtl8365mb_l2_mc_to_data(&mc, data);

If an entry already exists in hardware, rtl8365mb_l2_data_to_mc() overwrites
the mc struct with the hardware state. If the existing hardware entry was
dynamically learned (e.g., via IGMP snooping), is_static is overwritten to 0.

Would this cause the software-requested static MDB entry to be subject to
hardware aging?

> diff --git a/drivers/net/dsa/realtek/rtl83xx.c b/drivers/net/dsa/realtek/rtl83xx.c
> --- a/drivers/net/dsa/realtek/rtl83xx.c
> +++ b/drivers/net/dsa/realtek/rtl83xx.c

[ ... ]

> +int rtl83xx_port_fdb_dump(struct dsa_switch *ds, int port,
> + dsa_fdb_dump_cb_t *cb, void *data)
> +{

[ ... ]

> + mutex_lock(&priv->l2_lock);
> + while (true) {
> + start_addr = addr;
> +
> + dev_dbg(priv->dev, "l2_get_next_uc, addr:%d, port:%d\n",
> + addr, port);
> + ret = priv->ops->l2_get_next_uc(priv, &addr, port, &entry);
> + dev_dbg(priv->dev,
> + "l2_get_next_uc addr:%d mac:%pM vid:%d static:%d ret:%pe\n",
> + addr, entry.mac_addr, entry.vid, entry.is_static,
> + ERR_PTR(ret));
> +
> + if (ret == -ENOENT)
> + break;
> + if (ret)
> + break;
> +
> + if (addr < start_addr)
> + break;
> +
> + cb(entry.mac_addr, entry.vid, entry.is_static, data);

If the netlink socket buffer fills up, the callback returns -EMSGSIZE.
Since the return value of cb() is ignored here, does the loop fail to abort
and continue iterating the hardware table?

This would result in the netlink subsystem receiving an incorrect success
(or -ENOENT) instead of -EMSGSIZE, preventing it from resuming the dump later
and silently truncating FDB lists.

> +
> + addr++;
> +
> + /* Avoid querying beyond the valid L2 table range. */
> + if (addr > l2_table_size)
> + break;
> + }
> + mutex_unlock(&priv->l2_lock);
> +
> + return ret;

When l2_get_next_uc() reaches the end of the hardware table, it returns
-ENOENT.

Because ret is not reset to 0, this returns -ENOENT to the DSA core upon
successful completion. Should this return 0 on success to adhere to the
expected API contract?