Re: [PATCH RFC net 0/2] net: dsa: felix: fix VLAN-unaware reception

From: Robert Hodaszi
Date: Mon Dec 16 2024 - 09:39:51 EST



On Monday, 16.12.2024 at 14:51 +0100, Vladimir Oltean <vladimir.oltean@xxxxxxx> wrote:
>
> The memory is starting to come back :-|
>
> Ok, so the good news is that you aren't seeing things, I can reproduce
> with tools/testing/selftests/drivers/net/dsa/https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flocal_termination.sh&c=E,1,Ez0FDT_USqvD0083KxZU7x7ffGuJDoeCC6xMtetczJrwErBfCEyO1pnImOnY_ifhHDMKhhPtJGv8MpKk1zKoqa6Gm1JP-zTkotP2AOShxr3N&typo=1
>
> Another good thing is that the fix is easier than your posted attempt.
> You've correctly identified the previous VLAN stripping logic, and that
> is what we should go forward with. I don't agree with your analysis that
> it wouldn't work, because if you look at the implementation of
> skb_vlan_untag(), it strips the VLAN header from the skb head, but still
> keeps it in the hwaccel area, so packets are still VLAN-tagged.
>
> This does not have a functional impact upon reception, it is just done
> to have unified handling later on in the function:
> skb_vlan_tag_present() and skb_vlan_tag_get_id(). This side effect is
> also mentioned as a comment on dsa_software_vlan_untag().
>
> The stripping itself will only take place in dsa_software_untag_vlan_unaware_bridge()
> if the switch driver sets dp->ds->untag_bridge_pvid. The felix driver
> does not set this.
>
> What is not so good is that I'm seriously starting to doubt my sanity.
> You'd think that I ran the selftests that I had posted together with the
> patch introducing the bug, but somehow they fail :-| And not only that,
> but thoughts about this problem itself have since passed through my head,
> and I failed to correctly identify where the problem applies and where
> it does not. I'm sorry for that.
>
> I've just posted a fix to this bug, which I would like you to double-check
> and respond with review and test tags, or let me know if it doesn't work.
> https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flore.kernel.org%2fnetdev%2f20241216135059.1258266-1-vladimir.oltean%40nxp.com%2f&c=E,1,iMsl_DfLMdZXF3FfFIT1CISQcjOL417WIsr7z01GodEy-1vyX9d_6X-8hFJih2CA2zAax4kx2mFdtftzn-ELRkGDBCa9lxIWU_wEN8dtO2aVO7NS7ck,&typo=1
> I posted it myself because I don't expect you to have the full context
> (it's a bug that I introduced), and with yours there are still a lot of
> unanswered "why"s, as well as not the simplest solution.

Actually, what you did is exactly what I did first to fix the issue, but it broke my setup when I sent VLAN-tagged messages to the device. Now I tested again, and it is working fine. That made me think it's happening because it is stripping incorrectly the VLAN tag. Probably it was just an incorrect setup, maybe something remained set either on my PC or on the unit from the previous test.

One thing is different to my change though: you're calling the br_vlan_get_proto() twice. You can tweak performance a bit probably, if you rather pass 'proto' to both dsa_software_untag_vlan_aware_bridge and dsa_software_untag_vlan_unaware_bridge instead. So something like this:

diff --git a/net/dsa/tag.h b/net/dsa/tag.h
index d5707870906b..3d790d8e16cd 100644
--- a/net/dsa/tag.h
+++ b/net/dsa/tag.h
@@ -57,15 +57,11 @@ static inline struct net_device *dsa_conduit_find_user(struct net_device *dev,
*/
static inline void dsa_software_untag_vlan_aware_bridge(struct sk_buff *skb,
struct net_device *br,
- u16 vid)
+ u16 vid, u16 proto)
{
- u16 pvid, proto;
+ u16 pvid;
int err;

- err = br_vlan_get_proto(br, &proto);
- if (err)
- return;
-
err = br_vlan_get_pvid_rcu(skb->dev, &pvid);
if (err)
return;
@@ -103,16 +99,12 @@ static inline void dsa_software_untag_vlan_aware_bridge(struct sk_buff *skb,
*/
static inline void dsa_software_untag_vlan_unaware_bridge(struct sk_buff *skb,
struct net_device *br,
- u16 vid)
+ u16 vid, u16 proto)
{
struct net_device *upper_dev;
- u16 pvid, proto;
+ u16 pvid;
int err;

- err = br_vlan_get_proto(br, &proto);
- if (err)
- return;
-
err = br_vlan_get_pvid_rcu(skb->dev, &pvid);
if (err)
return;
@@ -149,14 +141,19 @@ static inline struct sk_buff *dsa_software_vlan_untag(struct sk_buff *skb)
{
struct dsa_port *dp = dsa_user_to_port(skb->dev);
struct net_device *br = dsa_port_bridge_dev_get(dp);
- u16 vid;
+ u16 vid, proto;
+ int err;

/* software untagging for standalone ports not yet necessary */
if (!br)
return skb;

+ err = br_vlan_get_proto(br, &proto);
+ if (err)
+ return skb;
+
/* Move VLAN tag from data to hwaccel */
- if (!skb_vlan_tag_present(skb)) {
+ if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
skb = skb_vlan_untag(skb);
if (!skb)
return NULL;
@@ -169,10 +166,12 @@ static inline struct sk_buff *dsa_software_vlan_untag(struct sk_buff *skb)

if (br_vlan_enabled(br)) {
if (dp->ds->untag_vlan_aware_bridge_pvid)
- dsa_software_untag_vlan_aware_bridge(skb, br, vid);
+ dsa_software_untag_vlan_aware_bridge(skb, br, vid,
+ proto);
} else {
if (dp->ds->untag_bridge_pvid)
- dsa_software_untag_vlan_unaware_bridge(skb, br, vid);
+ dsa_software_untag_vlan_unaware_bridge(skb, br, vid,
+ proto);
}

return skb;


Robert