Re: [PATCH RFC net 2/2] net: dsa: propagate brentry flag changes
From: Vladimir Oltean
Date: Mon Apr 14 2025 - 11:16:52 EST
On Mon, Apr 14, 2025 at 03:49:27PM +0200, Jonas Gorski wrote:
> I was just in the progress of writing down some thoughts myself I had
> while thinking about this.
>
> So to me passing on the original flags but then no updates of these
> flags feels wrong. I would expect them to be either never passed on,
> or always sync it (and let the driver choose to handle or ignore
> them).
Maybe, but right now it seems like the wrong problem to tackle, and it
is probably so for every driver for which the VLAN information in the
receive traffic depends on the bridge VLAN configuration, and is not
statically configured by the driver.
> Having the cpu port as egress untagged I can easily find one case
> where this breaks stuff (at least with b53):
>
> With lan1, lan2 being dsa ports of the same switch:
>
> # bridge with lan1
> $ ip link add swbridge type bridge vlan_filtering 1
> $ ip link set lan1 master swbridge
> $ bridge vlan add dev swbridge vid 10 self pvid untagged
>
> # lan2 stand alone
> $ ip link add lan2.10 link lan2 type vlan id 10
>
> as then lan2.10 would never receive any packets, as the VLAN 10
> packets received by the CPU ports never carry any vlan tags.
>
> The core issue here is that b53 switches do not provide any way of
> knowing the original tagged state of received packets, as the dsa
> header has no field for that (bcm56* switches do, but these are a
> different beast).
I see, and indeed, this is yet another angle. The flags of the host
bridge VLAN do not match with the flags of the flags of the RX filtering
VLAN, the latter having this comment: "This API only allows programming
tagged, non-PVID VIDs". The update of flags would not be propagated to
the driver, neither with your patch nor without it, because VID=10
already exists on the CPU port, and this isn't a "changed" VLAN (because
it is an artificial switchdev event emitted by DSA, unbeknownst to the
bridge). So DSA would still decide to bump the refcount rather than
notify the driver.
You'd have to ask yourself how do you even expect DSA to react and sort
this out, between the bridge direction wanting the VLAN untagged and the
8021q direction wanting it tagged.
> I guess the proper fix for b53 is probably to always have a vlan tag
> on the cpu port (except for the special vlan 0 for untagged traffic on
> ports with no PVID), and enable untag_vlan_aware_bridge_pvid.
What's the story with the ports with no PVID, and VID 0?
In Documentation/networking/switchdev.rst, it is said that VLAN
filtering bridge ports should drop untagged and VID 0 tagged RX packets
when there is no pvid.
> To continue the stream of consciousness, it probably does not make
> sense to pass on the untagged flag for the bridge/cpu port, because it
> will affect all ports of the switch, regardless of them being member
> of the bridge.
Though it needs to be said, usually standalone ports are VLAN-unaware,
thus, the VLAN ID on RX from their direction is a discardable quantity.
b53 is one of the special drivers, for setting ds->vlan_filtering_is_global = true.
That makes standalone ports become VLAN filtering even when not under a
bridge, and is what ultimately causes DSA to program RX filtering VLANs
to hardware in the first place. Normally, 8021q uppers aren't programmed
to hardware - see the comments above dsa_user_manage_vlan_filtering().
> Looking through drivers in net/drivers/dsa, I don't see
> anyone checking if egress untag is applied to the cpu port, so I
> wonder if not most, maybe even all (dsa) switch drivers have the same
> issue and would actually need to keep the cpu port always tagged.
What check do you expect to see exactly? Many drivers treat VLANs on the
CPU port in special ways, sja1105, felix/ocelot, mv88e6xxx, mt7530, ksz8, maybe others.
Some of them are subtle and not easy to spot, because they are not from
the .port_vlan_add() call path (like felix_update_tag_8021q_rx_rule()).
> And looking through the tag_* handlers, only ocelot looks like it may
> have the information available whether a packet was originally tagged
> or not, so would also need to have untag_vlan_aware_bridge_pvid enabled.
ocelot can select between 2 different tagging protocols, "ocelot" (which
leaves the VLAN header unmodified) and "ocelot-8021q" (which does not),
and the latter indeed does set untag_vlan_aware_bridge_pvid. It's an
option from which more than one driver could benefit, though, for sure.
mv88e6xxx uses MV88E6XXX_G1_VTU_DATA_MEMBER_TAG_UNMODIFIED which
provides essentially that information, neither "tag" nor "untag", but
"keep".
> Makes the think the cpu port always tagged should be the default
> behavior. It's easy to strip a vlan tag that shouldn't be there, but
> hard to add one that's missing, especially since in contrast to PVID
> you can have more than one vlan as egress untagged.
I agree and I would like to see b53 converge towards that. But changing
the default by unsetting this flag in DSA could be a breaking change, we
should be careful, and definitely only consider that for net-next.
b53 already sets a form (the deprecated form) of ds->untag_bridge_pvid,
someone with hardware should compare its behavior to the issues
documented in dsa_software_untag_vlan_unaware_bridge(), and, if
necessary, transition it to ds->untag_vlan_aware_bridge_pvid or perhaps
something else.
There is a relevant selftest at tools/testing/selftests/net/forwarding/local_termination.sh
which captures many (though not all) gotchas, it might be interesting to
run it.