Re: [PATCH v3 RFC 2/4] net: dsa: Extend ksz9477 TAG setup to support HSR frames duplication

From: Vladimir Oltean
Date: Tue Sep 05 2023 - 12:20:50 EST


On Tue, Sep 05, 2023 at 12:44:09PM +0200, Lukasz Majewski wrote:
> > Not to mention that there are other problems with the "dev->hsr_ports"
> > concept. For example, having a hsr0 over lan0 and lan1, and a hsr1
> > over lan2 and lan3, would set dev->hsr_ports to GENMASK(3, 0).
>
> I doubt that having two hsr{01} interfaces is possible with the current
> kernel.

Do you mean that 2 hsr{01} interfaces cannot coexist in general,
or just that 2 "offloaded" ones cannot?

> The KSZ9477 allows only 2 of its 5 available ports to be used as HSR
> ones.
>
> The same applies to the earlier xrs700x chip (but this one has an even
> bigger constraint - there, only ports 1 and 2 can support HSR).

> > > +	if (dev->features & NETIF_F_HW_HSR_DUP) {
> > > +		val &= ~KSZ9477_TAIL_TAG_LOOKUP;
> >
> > No need to unset a bit which was never set.
>
> I've explicitly followed the vendor's guidelines - the TAG_LOOKUP needs
> to be cleared.
>
> But if we can assure that it is not set here I can remove it.

Let's look at ksz9477_xmit(), filtering only for changes to "u16 val".

static struct sk_buff *ksz9477_xmit(struct sk_buff *skb,
				    struct net_device *dev)
{
	u16 val;

	val = BIT(dp->index);

	val |= FIELD_PREP(KSZ9477_TAIL_TAG_PRIO, prio);

	if (is_link_local_ether_addr(hdr->h_dest))
		val |= KSZ9477_TAIL_TAG_OVERRIDE;

	if (dev->features & NETIF_F_HW_HSR_DUP) {
		val &= ~KSZ9477_TAIL_TAG_LOOKUP;
		val |= ksz_hsr_get_ports(dp->ds);
	}
}

Is KSZ9477_TAIL_TAG_LOOKUP ever set in "val", or am I missing something?
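
To make the point concrete, here is a minimal user-space sketch (not
driver code). It assumes the bit layout from tag_ksz.c as I remember it
(PRIO in bits 8:7, OVERRIDE in bit 9, LOOKUP in bit 10); all EX_*/ex_*
names are made up for illustration. It builds "val" the same way as the
filtered excerpt above, minus the HSR branch, and checks that clearing
LOOKUP leaves it unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout, for illustration only; check tag_ksz.c for the real one. */
#define BIT(n)                  (1U << (n))
#define EX_TAIL_TAG_PRIO_SHIFT  7
#define EX_TAIL_TAG_OVERRIDE    BIT(9)
#define EX_TAIL_TAG_LOOKUP      BIT(10)

/* Build the tail tag like the filtered excerpt does, without the HSR branch. */
static uint16_t ex_build_tag(unsigned int port, unsigned int prio, int link_local)
{
	uint16_t val;

	/* Port bits must stay below the PRIO field; PRIO is 2 bits wide. */
	assert(port < 7 && prio < 4);

	val = BIT(port);
	val |= (uint16_t)(prio << EX_TAIL_TAG_PRIO_SHIFT);
	if (link_local)
		val |= EX_TAIL_TAG_OVERRIDE;

	return val;
}

/* Returns 1 if clearing the LOOKUP bit leaves "val" unchanged. */
static int ex_clear_is_noop(uint16_t val)
{
	return (uint16_t)(val & ~EX_TAIL_TAG_LOOKUP) == val;
}
```

None of the three assignments touches bit 10, so ex_clear_is_noop()
holds for every value ex_build_tag() can return.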

> > > +		val |= ksz_hsr_get_ports(dp->ds);
> > > +	}
> >
> > Would this work instead?
> >
> >	struct net_device *hsr_dev = dp->hsr_dev;
> >	struct dsa_port *other_dp;
> >
> >	dsa_hsr_foreach_port(other_dp, dp->ds, hsr_dev)
> >		val |= BIT(other_dp->index);
> >
>
> I thought about this solution as well, but I've been afraid, that going
> through the loop of all 5 ports each time we want to send single packet
> will reduce the performance.
>
> Hence, the idea with having the "hsr_ports" set once during join
> function and then use this cached value afterwards.

There was a quote about "premature optimization" which I can't quite remember...

If you can see a measurable performance difference, then the list
traversal can be converted to something more efficient.

In this case, struct dsa_port :: hsr_dev can be converted to a larger
struct dsa_hsr structure, similar to struct dsa_port :: bridge.
That structure could look like this:

struct dsa_hsr {
	struct net_device *dev;
	unsigned long port_mask;
	refcount_t refcount;
};

and you could replace the list traversal with "val |= dp->hsr->port_mask".
But a more complex solution requires a justification, which in this case
is performance-related. So performance data must be gathered.
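
For comparison, a rough user-space model of the two options (a sketch
under assumptions, not a proposal for the actual DSA API). struct ex_hsr
and struct ex_port are hypothetical stand-ins for struct dsa_hsr and
struct dsa_port, and the refcount is left out. The mask is kept per HSR
device, so hsr0 over ports 0-1 and hsr1 over ports 2-3 do not get merged
into a single GENMASK(3, 0).

```c
#include <assert.h>
#include <stddef.h>

#define EX_NUM_PORTS 5

/* Hypothetical stand-in for struct dsa_hsr (refcount omitted). */
struct ex_hsr {
	int id;                  /* stands in for struct net_device *dev */
	unsigned long port_mask; /* updated once, at join time */
};

/* Hypothetical stand-in for struct dsa_port. */
struct ex_port {
	unsigned int index;
	struct ex_hsr *hsr;      /* NULL when the port is not an HSR member */
};

/* Join: record membership and update the cached mask once. */
static void ex_hsr_join(struct ex_port *p, struct ex_hsr *hsr)
{
	assert(p->index < EX_NUM_PORTS && p->hsr == NULL);
	p->hsr = hsr;
	hsr->port_mask |= 1UL << p->index;
}

/* Option A: per-packet walk over all ports, in the spirit of dsa_hsr_foreach_port(). */
static unsigned long ex_mask_by_traversal(const struct ex_port *ports, size_t n,
					  const struct ex_port *dp)
{
	unsigned long mask = 0;
	size_t i;

	if (!dp->hsr)
		return 0;

	for (i = 0; i < n; i++)
		if (ports[i].hsr == dp->hsr)
			mask |= 1UL << ports[i].index;

	return mask;
}

/* Option B: per-packet read of the cached mask. */
static unsigned long ex_mask_cached(const struct ex_port *dp)
{
	return dp->hsr ? dp->hsr->port_mask : 0;
}
```

Both options return the same mask for any port. The only thing left to
decide between them is the per-packet cost, and that is what needs to be
measured.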

FWIW, dsa_master_find_slave() also performs a list traversal.
But similar discussions about performance improvements didn't lead anywhere.