Re: [PATCH net-next v2 9/9] net: dsa: lan9645x: add port statistics

From: Jens Emil Schulz Ostergaard

Date: Tue Apr 07 2026 - 04:30:34 EST


On Sun, 2026-03-29 at 12:56 -0700, Jakub Kicinski wrote:
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: dsa: lan9645x: add port statistics
>
> Add statistics support for the port counters. Chip registers are 32 bit,
> so this unit is responsible for maintaining a 64-bit software cache and
> updating it frequently to handle overflows in hardware.
>
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > index 24a60f40f6b8..9a46c08dad59 100644
> > --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> [ ... ]
> > @@ -262,6 +264,12 @@ static int lan9645x_setup(struct dsa_switch *ds)
> >  	if (!lan9645x->owq)
> >  		return -ENOMEM;
> >
> > +	err = lan9645x_stats_init(lan9645x);
> > +	if (err) {
> > +		dev_err(dev, "Failed to init stats.\n");
> > +		goto owq_destroy;
> > +	}
> > +
> [ ... ]
> > diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> > index 000000000000..3e82a859f173
> > --- /dev/null
> > +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> [ ... ]
> > +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> > +	{ "rx_uc", SCNT_RX_UC },
> > +	{ "rx_cat_drop", SCNT_RX_CAT_DROP },
> [ ... ]
> > +	{ "tx_uc", SCNT_TX_UC },
> > +	{ "tx_drop", SCNT_TX_DROP },
> [ ... ]
> > +};
>
> Is it necessary to duplicate standard networking statistics in the driver
> private ethtool output?
>
> The networking subsystem guidelines state that stats possessing a standard UAPI
> representation, such as tx_dropped and rx_packets in struct rtnl_link_stats64,
> should not be duplicated in the ethtool -S output.
>
> Could these redundant strings be removed since the driver natively supports
> them via lan9645x_stats_get_stats64?
>


I do not think they are redundant. rtnl_link_stats64 only exposes them in
aggregated form, not as individual counters, so information is lost if they
are removed.

> [ ... ]
> > +static const struct lan9645x_view_stats lan9645x_view_stat_cfgs[] = {
> > +	[LAN9645X_STAT_PORTS] = {
> > +		.name = "ports",
> > +		.type = LAN9645X_STAT_PORTS,
> > +		.layout = lan9645x_port_stats_layout,
> > +		.num_cnts = ARRAY_SIZE(lan9645x_port_stats_layout),
> > +		.num_indexes = NUM_PHYS_PORTS,
> > +	},
> > +};
> [ ... ]
> > +static u64 *lan9645x_stats_view_idx_update(struct lan9645x *lan9645x,
> > +					   enum lan9645x_view_stat_type vtype,
> > +					   int idx)
> > +{
> > +	struct lan9645x_stats *s = lan9645x->stats;
> > +
> > +	mutex_lock(&s->hw_lock);
> > +	__lan9645x_stats_view_idx_update(lan9645x, vtype, idx);
> > +	mutex_unlock(&s->hw_lock);
> > +
> > +	return STAT_COUNTERS(lan9645x, vtype, idx);
> > +}
>
> Does this code overflow the allocated buffer when called for the CPU port?
>
> The stats buffer cnts is allocated using num_indexes, which is NUM_PHYS_PORTS
> (9). This covers ports 0 to 8. However, the CPU port is defined as index 9
> (CPU_PORT = 9).
>
> When the DSA core invokes stat callbacks like get_ethtool_stats or get_stats64
> for the CPU port, the idx argument is 9. While the helper
> __lan9645x_stats_view_idx_update skips the hardware read for out-of-bounds
> indices, lan9645x_stats_view_idx_update ignores that failure and returns the
> result of STAT_COUNTERS(lan9645x, vtype, idx).
>
> This computes a pointer past the end of the cnts array, which the callers then
> blindly dereference, potentially leaking kernel memory or causing a crash. Can
> bounds checking be added to the callers, or should the buffer be sized to
> include the CPU port?
>

No, this is not called for the CPU port (index 9).

> [ ... ]
> > +void lan9645x_stats_get_ethtool_stats(struct lan9645x *lan9645x, int port,
> > +				      u64 *data)
> > +{
> > +	struct lan9645x_stats *stats = lan9645x->stats;
> > +	u64 *c;
> > +	int i;
> > +
> > +	c = lan9645x_stats_view_idx_update(lan9645x, LAN9645X_STAT_PORTS, port);
> > +
> > +	spin_lock(&stats->sw_lock);
> > +	for (i = 0; i < ARRAY_SIZE(lan9645x_port_ethtool_stats); i++)
> > +		*data++ = c[lan9645x_port_ethtool_stats[i].idx];
> > +	spin_unlock(&stats->sw_lock);
> > +}
> [ ... ]
> > +/* Called in atomic context */
> > +void lan9645x_stats_get_stats64(struct lan9645x *lan9645x, int port,
> > +				struct rtnl_link_stats64 *stats)
> > +{
> > +	struct lan9645x_stats *s = lan9645x->stats;
> > +	u64 *c;
> > +
> > +	c = STAT_COUNTERS(lan9645x, LAN9645X_STAT_PORTS, port);
> > +
> > +	spin_lock(&s->sw_lock);
>
> Can this code deadlock when a softirq interrupts a process holding the lock?
>
> The sw_lock spinlock protects the 64-bit software stats counters. In
> lan9645x_stats_get_stats64, which acts as the ndo_get_stats64 callback, this
> lock is acquired using a plain spin_lock. The code comment notes this function
> is called in atomic context (such as from NET_RX_SOFTIRQ via dev_get_stats).
>
> Concurrently, process context paths like lan9645x_stats_get_ethtool_stats and
> lan9645x_check_stats_work also acquire sw_lock using plain spin_lock. Because
> plain spin_lock does not disable softirqs, if a softirq preempts the execution
> on a CPU while process context holds sw_lock, the softirq will spin
> indefinitely waiting for the lock.
>
> Should the process-context acquisitions use spin_lock_bh instead?
>

As far as I can tell, ndo_get_stats64 is not called from softirq context.
Ocelot also uses a plain spin_lock here. Since no softirq path touches this
lock, there is no deadlock.

> [ ... ]
> > +int lan9645x_stats_init(struct lan9645x *lan9645x)
> > +{
> > +	const struct lan9645x_view_stats *vs;
> > +	struct lan9645x_stats *stats;
> > +	int err, i;
> > +
> > +	lan9645x->stats = devm_kzalloc(lan9645x->dev, sizeof(*stats),
> > +				       GFP_KERNEL);
>
> Will this memory be leaked when the DSA switch unbinds and rebinds?
>
> The lan9645x_stats_init function allocates lan9645x->stats, vstat->cnts, and
> vstat->buf using devm_kzalloc and devm_kcalloc. This function is called from
> lan9645x_setup, which is the DSA .setup callback.
>
> The DSA tree lifecycle is independent of the platform device lifecycle. If the
> master netdevice is unbound and bound again, the DSA switch's .teardown and
> .setup callbacks will be invoked multiple times. Because .teardown cannot free
> devres-managed memory, the memory tied to the platform device will be leaked
> on every setup cycle until the switch's platform device itself is destroyed.
>
> Could this use standard kzalloc allocations paired with kfree in teardown?

Same answer as for the devm_* comment on the .setup function. I do not think
this is a problem, since there cannot be multiple setup/teardown cycles
during the device lifetime. dsa_tree_teardown is only called from
dsa_switch_remove, which is called by the switch driver's .remove.


As far as I can tell, both felix and ksz use device-managed allocations in
their setup callbacks.