Re: [PATCH RFC net-next 3/3] net: dsa: deny 8021q uppers on vlan unaware bridged ports
From: Vladimir Oltean
Date: Tue Nov 11 2025 - 09:57:01 EST

On Tue, Nov 11, 2025 at 03:09:08PM +0100, Jonas Gorski wrote:
> On Tue, Nov 11, 2025 at 12:56 PM Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
> >
> > On Tue, Nov 11, 2025 at 11:06:48AM +0100, Jonas Gorski wrote:
> > > But I noticed while testing that apparently b53 in filtering=0 mode
> > > does not forward any tagged traffic (and I think I know why ...).
> > >
> > > Is there a way to ask for a replay of the fdb (static) entries? To fix
> > > this for older switches, we need to disable 802.1q mode, but this also
> > > switches the ARL from IVL to SVL, which changes the hashing, and would
> > > break any existing entries. So we need to flush the ARL before
> > > toggling 802.1q mode, and then reprogram any static entries.
> >
> > I'm not clear on what happens. "Broken" FDB entries in the incorrect
> > bridge vlan_filtering mode sounds like normal behaviour (FDB entries
> > with VID=0 while vlan_filtering=1, or FDB entries with VID!=0 while
> > vlan_filtering=0). They should just sit idle in the ARL until the VLAN
> > filtering mode makes them active.
>
> When in SVL mode (vlan disabled), the ARL switches from mac+vid to
> just mac for hashing ARL entries. And I don't know if mac+vid=0 yields
> the same hash as only mac. It would if the switch uses vid=0 when not
> vlan aware, but if it skips the vid then it wouldn't.
>
> And we automatically install static entries for the MAC addresses of
> ports (and maybe other non-dsa bridged devices), so we may need to
> have these twice in the ARL table (once for non-filtering, once for
> filtering).
>
> If the hash is not the same, this can happen:
>
> vlan_enabled=1, ARL hashing uses mac+vid
> add static entry mac=abc,vid=0 for port 1 => hash(mac, vid) -> entry 123
> vlan_enabled => 0, ARL hashing uses only mac
> packet received on port 2 for mac=abc => hash(mac) => entry 456 => no
> entry found => flood (which may not include port 1).
>
> when trying to delete the static entry => lookup for mac=abc,vid=0 =>
> hash(mac) => entry 456 => no such entry.
>
> Then maybe we ignore the error, but the moment we enable vlan again,
> the hashing changes back to mac+vid, so the "deleted" static entry
> becomes active again (despite the linux fdb not knowing about it
> anymore).
>
> And even if the hash is the same, it would mean we cannot interact
> with any preexisting entries for vid!=1 that were added with vlan
> filtering = 1. So we cannot delete them when e.g. removing a port from
> the bridge, or deleting the bridge.
>
> So the safest would be to remove all static entries before changing
> vlan filtering, and then re-adding them afterwards.
>
> Best regards,
> Jonas

If you just want to debug whether this is the case or not, then as I
understand, for the moment you only care about static FDB entries on the
CPU port, not on user ports added with 'bridge fdb add ... master static'.
If so, these FDB entries are available in the cpu_dp->fdbs list. For
user ports we don't bother keeping track.

Regarding switchdev FDB replay, it's possible but has very high
complexity. The base call would be to switchdev_bridge_port_replay(),
then you'd need to set up two parallel notifier blocks through which
you're informed of the existing objects (not the usual dsa_user_switchdev_notifier
and dsa_user_switchdev_blocking_notifier), whose internal processing is
partly similar (the event filtering and replication) and partly different:
instead of calling dsa_schedule_work() to program the FDB entries to
hardware, you just add them to a list that is kept in a context
structure, which is passed to the caller once the replay is over and the
list is complete.
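
To illustrate only the "add them to a list kept in a context structure"
part, here is a minimal userspace sketch. Every name in it (replay_ctx,
replay_fdb_entry, replay_collect_fdb, ...) is hypothetical and does not
exist in the kernel; a real version would be called from the replay
notifier blocks and would use the kernel list helpers rather than a
hand-rolled list.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: one collected FDB entry and the replay context. */
struct replay_fdb_entry {
	uint8_t addr[6];
	uint16_t vid;
	struct replay_fdb_entry *next;
};

struct replay_ctx {
	struct replay_fdb_entry *head;
	struct replay_fdb_entry **tail;	/* append point, keeps replay order */
	size_t count;
};

static void replay_ctx_init(struct replay_ctx *ctx)
{
	ctx->head = NULL;
	ctx->tail = &ctx->head;
	ctx->count = 0;
}

/*
 * Stand-in for the notifier's FDB handler: instead of scheduling work to
 * program hardware, just remember the entry. Returns 0 or -1 on OOM.
 */
static int replay_collect_fdb(struct replay_ctx *ctx, const uint8_t addr[6],
			      uint16_t vid)
{
	struct replay_fdb_entry *e = calloc(1, sizeof(*e));

	if (!e)
		return -1;

	memcpy(e->addr, addr, sizeof(e->addr));
	e->vid = vid;
	*ctx->tail = e;
	ctx->tail = &e->next;
	ctx->count++;
	return 0;
}

/* Release everything once the caller has consumed the list. */
static void replay_ctx_free(struct replay_ctx *ctx)
{
	struct replay_fdb_entry *e = ctx->head, *next;

	while (e) {
		next = e->next;
		free(e);
		e = next;
	}
	replay_ctx_init(ctx);
}
```

Once the replay is over, the caller would walk ctx->head, reprogram each
entry after the mode change, and then call replay_ctx_free().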

For the moment, dp->fdbs should be sufficient to prove/disprove a point.
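
For reference, the failure mode described in the quoted scenario can be
reproduced with a toy userspace model. This is only a sketch under loud
assumptions: the hash (FNV-1a), the table size and the one-entry-per-bucket
layout are all made up, and say nothing about what the b53 ARL really
does. The model simply assumes the VID is left out of the hash when VLAN
mode is off, which is the case in question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ARL_BUCKETS 1024	/* made-up size, not a hardware value */

struct arl_entry {
	bool valid;
	uint8_t mac[6];
	uint16_t vid;
	int port;
};

struct arl {
	bool vlan_enabled;	/* true: IVL (mac+vid), false: SVL (mac only) */
	struct arl_entry bucket[ARL_BUCKETS];
};

/* FNV-1a, used purely as a placeholder hash. */
static uint32_t fnv1a(uint32_t h, const uint8_t *p, size_t n)
{
	while (n--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/* Assumption: the VID is hashed only while VLAN mode is enabled. */
static unsigned int arl_index(const struct arl *t, const uint8_t mac[6],
			      uint16_t vid)
{
	uint32_t h = fnv1a(2166136261u, mac, 6);

	if (t->vlan_enabled) {
		uint8_t v[2] = { (uint8_t)(vid >> 8), (uint8_t)(vid & 0xff) };

		h = fnv1a(h, v, sizeof(v));
	}
	return h % ARL_BUCKETS;
}

static struct arl_entry *arl_find(struct arl *t, const uint8_t mac[6],
				  uint16_t vid)
{
	struct arl_entry *e = &t->bucket[arl_index(t, mac, vid)];

	if (!e->valid || memcmp(e->mac, mac, 6) != 0)
		return NULL;
	if (t->vlan_enabled && e->vid != vid)
		return NULL;
	return e;
}

/* Toy model: no collision handling, an occupied bucket is an error. */
static int arl_add(struct arl *t, const uint8_t mac[6], uint16_t vid, int port)
{
	struct arl_entry *e = &t->bucket[arl_index(t, mac, vid)];

	if (e->valid)
		return -1;
	e->valid = true;
	memcpy(e->mac, mac, 6);
	e->vid = vid;
	e->port = port;
	return 0;
}

/* Returns the destination port, or -1 for a miss (i.e. flood). */
static int arl_lookup(struct arl *t, const uint8_t mac[6], uint16_t vid)
{
	struct arl_entry *e = arl_find(t, mac, vid);

	return e ? e->port : -1;
}

/* Returns 0 on success, -1 if the entry is not found in the current mode. */
static int arl_del(struct arl *t, const uint8_t mac[6], uint16_t vid)
{
	struct arl_entry *e = arl_find(t, mac, vid);

	if (!e)
		return -1;
	e->valid = false;
	return 0;
}
```

Adding mac=00:11:22:33:44:55, vid=0 with vlan_enabled=true, then clearing
vlan_enabled, makes both arl_lookup() and arl_del() miss in this model,
and setting vlan_enabled again brings the stale entry back. Whether the
real hardware behaves like this is exactly what needs testing.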