Re: [net-next,v3] net: dsa: mv88e6xxx: Add partial support for TCAM entries

From: Simon Horman

Date: Wed Feb 25 2026 - 05:28:57 EST


This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: dsa: mv88e6xxx: Add partial support for TCAM entries

This patch adds partial TCAM support for mv88e6390 and mv88e6393 family
switches, allowing the switch to match frame headers and take actions on
matched frames via tc flower offload. It supports matching on IP
addresses and protocol, and trapping packets to the CPU.

> diff --git a/drivers/net/dsa/mv88e6xxx/tcam.c b/drivers/net/dsa/mv88e6xxx/tcam.c
> new file mode 100644
> index 000000000000..ed02c8798200
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/tcam.c

[ ... ]

> +/* insert tcam entry in ordered list and move existing entries if necessary */
> +static int mv88e6xxx_tcam_insert_entry(struct mv88e6xxx_chip *chip,
> +				       struct mv88e6xxx_tcam_entry *entry)
> +{
> +	struct mv88e6xxx_tcam_entry *elem;
> +	struct list_head *hpos;
> +	int err;
> +
> +	list_for_each_prev(hpos, &chip->tcam.entries) {
> +		u8 move_idx;
> +
> +		elem = list_entry(hpos, struct mv88e6xxx_tcam_entry, list);
> +		if (entry->prio >= elem->prio)
> +			break;
> +
> +		move_idx = elem->hw_idx + 1;
> +
> +		err = mv88e6xxx_tcam_flush_entry(chip, move_idx);
> +		if (err)
> +			return err;
> +
> +		err = chip->info->ops->tcam_ops->entry_add(chip, elem,
> +							   move_idx);
> +		if (err)
> +			return err;
> +
> +		elem->hw_idx = move_idx;
> +	}

If mv88e6xxx_tcam_flush_entry() or entry_add() fails after some entries
have already been moved to new hardware positions (with their hw_idx
updated), doesn't the function return without rolling back those partial
moves? The software list would then have some entries pointing to their
new positions while the gap for the new entry is only partially created,
leaving the TCAM in an inconsistent state for subsequent operations.
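
To make the concern concrete, here is a minimal userspace sketch of one
possible rollback scheme. It is not driver code: struct entry, hw[],
hw_write(), writes_until_fail and make_gap() are hypothetical stand-ins
for the entry list, the TCAM and the flush/entry_add pair, and rollback
writes are treated as best effort.

```c
#define SLOTS 8

/* Hypothetical stand-ins for the driver's entry list and the switch TCAM. */
struct entry {
	int prio;
	int hw_idx;
};

static int hw[SLOTS];              /* hw[i] = prio stored in slot i, 0 = empty */
static int writes_until_fail = -1; /* test hook: -1 means never fail */

/* Simulated hardware write that can be made to fail exactly once. */
static int hw_write(int idx, int prio)
{
	if (writes_until_fail == 0) {
		writes_until_fail = -1;
		return -5; /* stands in for -EIO */
	}
	if (writes_until_fail > 0)
		writes_until_fail--;
	hw[idx] = prio;
	return 0;
}

/*
 * Shift every entry with a priority above 'prio' up by one slot, starting
 * from the tail, so that a gap opens for the new entry. On failure, move
 * the already-shifted entries back and return the original error.
 * Assumes e[] is sorted by ascending prio and that a free slot exists
 * above the last entry.
 */
static int make_gap(struct entry *e, int n, int prio)
{
	int i, j, err = 0;

	for (i = n - 1; i >= 0 && e[i].prio > prio; i--) {
		err = hw_write(e[i].hw_idx + 1, e[i].prio);
		if (err)
			break;
		e[i].hw_idx++;
	}
	if (!err || i == n - 1)
		return err; /* success, or nothing was moved yet */

	/* Entries i+1..n-1 were moved; restore them lowest slot first. */
	for (j = i + 1; j < n; j++) {
		e[j].hw_idx--;
		hw_write(e[j].hw_idx, e[j].prio); /* best effort in this sketch */
	}
	hw_write(e[n - 1].hw_idx + 1, 0); /* clear the vacated top slot */
	return err;
}
```

Whether the driver should roll back, or instead stage the moves and
commit hw_idx only once all writes succeeded, is a design choice for the
author; the sketch only shows that the software and hardware views can
be kept in step on the error path.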

[ ... ]

> +int mv88e6xxx_tcam_entry_del(struct mv88e6xxx_chip *chip,
> +			     struct mv88e6xxx_tcam_entry *entry)
> +{
> +	struct mv88e6xxx_tcam_entry *elem = entry;
> +	u8 move_idx = entry->hw_idx;
> +	int err;
> +
> +	err = mv88e6xxx_tcam_flush_entry(chip, entry->hw_idx);
> +
> +	/* move entries that come after the deleted entry forward */
> +	list_for_each_entry_continue(elem, &chip->tcam.entries, list) {
> +		u8 tmp_idx = elem->hw_idx;
> +
> +		err = chip->info->ops->tcam_ops->entry_add(chip, elem,
> +							   move_idx);
> +
> +		elem->hw_idx = move_idx;
> +		move_idx = tmp_idx;
> +
> +		/* flush the last entry after moving entries */
> +		if (list_is_last(&elem->list, &chip->tcam.entries))
> +			err = mv88e6xxx_tcam_flush_entry(chip, tmp_idx);
> +	}
> +
> +	list_del(&entry->list);
> +	return err;
> +}

Can errors be silently lost in mv88e6xxx_tcam_entry_del()? The initial
mv88e6xxx_tcam_flush_entry() result is stored in err, but err is then
unconditionally overwritten by each entry_add() call and by the final
mv88e6xxx_tcam_flush_entry() in the loop.

If the initial flush fails but subsequent operations succeed, the error
is lost and success is returned. An entry_add() failure is lost the same
way whenever a later call succeeds. Additionally, elem->hw_idx is
updated even when entry_add() fails, and list_del() runs regardless of
err, so the software model diverges from the hardware state.
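
As an illustration only, the following userspace sketch shows a delete
path that checks every result and updates hw_idx only after the matching
write succeeded. All names (struct entry, hw[], hw_write(),
writes_until_fail, del_entry()) are hypothetical stand-ins, and the
sketch simply stops at the first error; how the driver should recover
from a failure halfway through the moves is left open.

```c
#define SLOTS 8

/* Hypothetical stand-ins for the driver's entry list and the switch TCAM. */
struct entry {
	int prio;
	int hw_idx;
};

static int hw[SLOTS];              /* hw[i] = prio stored in slot i, 0 = empty */
static int writes_until_fail = -1; /* test hook: -1 means never fail */

/* Simulated hardware write that can be made to fail exactly once. */
static int hw_write(int idx, int prio)
{
	if (writes_until_fail == 0) {
		writes_until_fail = -1;
		return -5; /* stands in for -EIO */
	}
	if (writes_until_fail > 0)
		writes_until_fail--;
	hw[idx] = prio;
	return 0;
}

/*
 * Delete e[pos] and move the entries behind it forward by one slot.
 * Every write is checked, and hw_idx changes only after its write
 * succeeded, so no error is overwritten by a later success.
 */
static int del_entry(struct entry *e, int *n, int pos)
{
	int move_idx = e[pos].hw_idx;
	int i, err;

	err = hw_write(move_idx, 0); /* flush the deleted slot */
	if (err)
		return err;

	for (i = pos + 1; i < *n; i++) {
		int old_idx = e[i].hw_idx;

		err = hw_write(move_idx, e[i].prio);
		if (err)
			return err; /* e[i] was not moved, hw_idx untouched */
		e[i].hw_idx = move_idx;
		move_idx = old_idx;
	}

	if (pos + 1 < *n) { /* flush the slot the last entry left behind */
		err = hw_write(move_idx, 0);
		if (err)
			return err;
	}

	for (i = pos; i + 1 < *n; i++) /* drop e[pos] from the software list */
		e[i] = e[i + 1];
	(*n)--;
	return 0;
}
```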

> diff --git a/drivers/net/dsa/mv88e6xxx/tcflower.c b/drivers/net/dsa/mv88e6xxx/tcflower.c
> new file mode 100644
> index 000000000000..693f4ad8f32f
> --- /dev/null
> +++ b/drivers/net/dsa/mv88e6xxx/tcflower.c

[ ... ]

> +static int mv88e6xx_flower_parse_key(struct mv88e6xxx_chip *chip,
> +				     struct netlink_ext_ack *extack,
> +				     struct flow_cls_offload *cls,
> +				     struct mv88e6xxx_tcam_key *key)
> +{
> +	struct flow_rule *rule = flow_cls_offload_flow_rule(cls);
> +	struct flow_dissector *dissector = rule->match.dissector;
> +	u16 addr_type = 0;
> +
> +	if (dissector->used_keys &
> +	    ~(BIT_ULL(FLOW_DISSECTOR_KEY_BASIC) |
> +	      BIT_ULL(FLOW_DISSECTOR_KEY_CONTROL) |
> +	      BIT_ULL(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
> +	      BIT_ULL(FLOW_DISSECTOR_KEY_IPV6_ADDRS))) {
> +		NL_SET_ERR_MSG_MOD(extack,
> +				   "Unsupported keys used");
> +		return -EOPNOTSUPP;
> +	}

FLOW_DISSECTOR_KEY_IPV6_ADDRS is included in the allowed keys bitmask
but there is no code below to parse IPv6 addresses into the TCAM key.
If a user creates a tc flower rule with IPv6 addresses, the rule passes
this validation check but the addresses are never programmed into the
TCAM, causing the hardware entry to match broader traffic than intended.

Should FLOW_DISSECTOR_KEY_IPV6_ADDRS be removed from the allowed keys
until IPv6 parsing is implemented?
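
A small userspace model of the check with the IPv6 bit dropped from the
allowed mask might look like the following. The KEY_* enum and
check_used_keys() are local stand-ins invented for this sketch; the
numeric values are arbitrary and do not match the kernel's
flow_dissector_key_id enum.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the FLOW_DISSECTOR_KEY_* ids; values are arbitrary. */
enum {
	KEY_CONTROL,
	KEY_BASIC,
	KEY_IPV4_ADDRS,
	KEY_IPV6_ADDRS,
};

#define BIT_ULL(n) (1ULL << (n))

/* Reject any rule that uses a key the parser does not translate. */
static int check_used_keys(uint64_t used_keys)
{
	const uint64_t supported = BIT_ULL(KEY_BASIC) |
				   BIT_ULL(KEY_CONTROL) |
				   BIT_ULL(KEY_IPV4_ADDRS);

	if (used_keys & ~supported)
		return -EOPNOTSUPP;
	return 0;
}
```

With this mask an IPv6 address match is refused up front, so the rule
stays in software rather than being offloaded with a wider match than
the user asked for.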

[ ... ]

> +int mv88e6xxx_cls_flower_add(struct dsa_switch *ds, int port,
> +			     struct flow_cls_offload *cls, bool ingress)
> +{

[ ... ]

> +	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry) {
> +		err = -EEXIST;
> +		goto err_unlock;
> +	}

Should this be -ENOMEM instead of -EEXIST? When kzalloc() fails, the
error code tells callers the entry already exists rather than reporting
the actual memory allocation failure. This appears to be a copy-paste
from the cookie-exists check immediately above. Note that this was
correct (-ENOMEM) in v2 of this patch and changed to -EEXIST in v3.

[ ... ]

> +	err = mv88e6xxx_tcam_entry_add(chip, entry);
> +	mv88e6xxx_reg_unlock(chip);
> +	if (err)
> +		goto err_free_entry;
> +
> +	mv88e6xxx_reg_unlock(chip);
> +	return 0;
> +
> +err_free_entry:
> +	kfree(entry);
> +err_unlock:
> +	mv88e6xxx_reg_unlock(chip);
> +	return err;
> +}

Does this double unlock chip->reg_lock? On the success path,
mv88e6xxx_reg_unlock(chip) is called after mv88e6xxx_tcam_entry_add(),
and then called again before returning 0.

On the error path from mv88e6xxx_tcam_entry_add(), the unlock fires,
then goto err_free_entry falls through to err_unlock which unlocks a
second time.

mv88e6xxx_reg_lock/unlock wrap mutex_lock/mutex_unlock. Unlocking a
mutex that is not held is a bug under the kernel mutex API and can
corrupt the mutex state.
chip->reg_lock is the central serialization mutex used by the entire
mv88e6xxx driver, so this corruption would affect all subsequent
register access operations. Andrew Lunn raised the locking pattern
as suspicious during the v2 review:
https://lore.kernel.org/netdev/ab9b4278-2c0a-4ed5-9da0-2a7a3f928ed7@xxxxxxx/

This bug appears to have been introduced in v3 when restructuring the
lock/unlock calls.
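
For reference, a userspace model of the function's tail with exactly one
unlock on every path could look like this. It is a sketch under stated
assumptions: struct chip, reg_lock(), reg_unlock(), tcam_entry_add() and
flower_add() are hypothetical stand-ins, and the lock is modelled as a
counter so that an unbalanced unlock can be detected. It also returns
-ENOMEM when the allocation fails.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical model of the chip: the lock is a depth counter. */
struct chip {
	int lock_depth;
	int unbalanced; /* set when unlock is called without a held lock */
};

static void reg_lock(struct chip *c)
{
	c->lock_depth++;
}

static void reg_unlock(struct chip *c)
{
	if (c->lock_depth == 0)
		c->unbalanced = 1;
	else
		c->lock_depth--;
}

/* Stand-in for the hardware add; returns the injected error code. */
static int tcam_entry_add(struct chip *c, int *entry, int hw_err)
{
	(void)c;
	(void)entry;
	return hw_err;
}

/* alloc_fails and hw_err are test hooks that force each error path. */
static int flower_add(struct chip *c, int alloc_fails, int hw_err, int **out)
{
	int *entry;
	int err;

	reg_lock(c);

	entry = alloc_fails ? NULL : calloc(1, sizeof(*entry));
	if (!entry) {
		err = -ENOMEM;
		goto err_unlock;
	}

	err = tcam_entry_add(c, entry, hw_err);
	if (err)
		goto err_free_entry;

	reg_unlock(c); /* the only unlock on the success path */
	*out = entry;
	return 0;

err_free_entry:
	free(entry);
err_unlock:
	reg_unlock(c); /* the only unlock on the error paths */
	return err;
}
```

The point of the layout is that the success path and the error labels
each own one unlock, and no unlock sits between the call and the error
check.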