Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics

From: Jakub Kicinski

Date: Tue May 05 2026 - 21:49:39 EST

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: lan9645x: add port statistics

This patch wires up ethtool -S, standard ethtool group stats,
and ndo_get_stats64 for the lan9645x DSA driver. A delayed work
polls the 32-bit hardware counters every 3 seconds and accumulates
them into a 64-bit software cache to handle hardware wraparound.

> diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> index 15a493c1f215..cea46bfe729e 100644
> --- a/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c
> +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_main.c

[ ... ]

> @@ -255,6 +257,12 @@ static int lan9645x_setup(struct dsa_switch *ds)
> if (!lan9645x->owq)
> return -ENOMEM;
>
> + err = lan9645x_stats_init(lan9645x);
> + if (err) {
> + dev_err(dev, "Failed to init stats.\n");
> + goto owq_destroy;
> + }
> +
> ds->mtu_enforcement_ingress = true;
> ds->assisted_learning_on_cpu_port = true;
> ds->fdb_isolation = true;
> @@ -265,6 +273,10 @@ static int lan9645x_setup(struct dsa_switch *ds)
> lan9645x->num_phys_ports - lan9645x->num_port_dis);
>
> return 0;
> +
> +owq_destroy:
> + destroy_workqueue(lan9645x->owq);
> + return err;
> }

Is this owq_destroy error path reachable today?

Looking at lan9645x_stats_init():

int lan9645x_stats_init(struct lan9645x *lan9645x)
{
struct lan9645x_stats *stats = lan9645x->stats;

queue_delayed_work(stats->queue, &stats->work,
LAN9645X_STATS_CHECK_DELAY);

return 0;
}

it unconditionally returns 0, so the goto owq_destroy branch is dead.

If a future change either makes lan9645x_stats_init() return an error, or
adds any fallible step after it in lan9645x_setup(), the cleanup would
fall through owq_destroy without calling lan9645x_stats_deinit(), leaving
the self-requeuing delayed work armed on stats->queue.

Also, dsa_switch_setup() in net/dsa/dsa.c does not invoke ds->ops->teardown
when ops->setup() itself fails, so lan9645x_teardown (which is where
lan9645x_stats_deinit is called today) will not run in that scenario.

Then in lan9645x_remove() -> lan9645x_stats_free() -> destroy_workqueue(),
the queue would be drained with a chained self-requeue still live. Would
adding a dedicated label that calls lan9645x_stats_deinit() before falling
through to owq_destroy be safer for future edits?

[ ... ]

> diff --git a/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c
> new file mode 100644
> index 000000000000..46ac62b4016f
> --- /dev/null
> +++ b/drivers/net/dsa/microchip/lan9645x/lan9645x_stats.c

[ ... ]

> +static const struct lan9645x_ethtool_stat lan9645x_port_ethtool_stats[] = {
> + { "rx_cat_drop", SCNT_RX_CAT_DROP },
> + { "rx_red_prio_0", SCNT_RX_RED_PRIO_0 },

Per Documentation/networking/statistics.rst and the netdev stats hierarchy,
counters that have a standard uAPI must not be duplicated in ethtool -S.

The rx_cat_drop counter is summed into stats->rx_dropped in
lan9645x_stats_get_stats64():

stats->rx_dropped = c[SCNT_RX_LONG] +
c[SCNT_DR_LOCAL] +
c[SCNT_DR_TAIL] +
c[SCNT_RX_CAT_DROP] +
...

Since rtnl_link_stats64.rx_dropped is reachable via "ip -s link show",
does exposing rx_cat_drop here duplicate a value already covered by the
standard uAPI?

[ ... ]

> + { "tx_drop", SCNT_TX_DROP },

Similarly, tx_drop feeds stats->tx_dropped:

stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];

Should tx_drop be dropped from the ethtool -S list for the same reason?

[ ... ]

> + { "tx_aged", SCNT_TX_AGED },
> + { "tx_bufdrop", SCNT_TX_BUFDROP },

tx_aged is also summed into stats->tx_dropped in
lan9645x_stats_get_stats64(), so does it fall under the same duplicate-uAPI
concern as tx_drop?

A related question about tx_bufdrop: if SCNT_TX_BUFDROP represents transmit
frames dropped due to buffer exhaustion, why is it not included in
stats->tx_dropped alongside SCNT_TX_DROP and SCNT_TX_AGED?

stats->tx_dropped = c[SCNT_TX_DROP] + c[SCNT_TX_AGED];

If tx_bufdrop is in fact a transmit-drop cause, would users of
ip -s link show see an under-reported tx_dropped relative to the private
ethtool counter?

> + { "dr_local", SCNT_DR_LOCAL },
> + { "dr_tail", SCNT_DR_TAIL },

dr_local and dr_tail are also fed into stats->rx_dropped:

stats->rx_dropped = c[SCNT_RX_LONG] +
c[SCNT_DR_LOCAL] +
c[SCNT_DR_TAIL] +
...

Would it be appropriate to drop these from the ethtool -S list as well,
given they are already aggregated into the standard rx_dropped counter?

[ ... ]

> +};

[ ... ]

> +static void lan9645x_check_stats_work(struct work_struct *work)
> +{
> + struct delayed_work *del_work = to_delayed_work(work);
> + struct lan9645x_stats *stats;
> +
> + stats = container_of(del_work, struct lan9645x_stats, work);
> +
> + lan9645x_stats_update(stats->lan9645x);
> +
> + queue_delayed_work(stats->queue, &stats->work,
> + LAN9645X_STATS_CHECK_DELAY);
> +}

The commit message says this worker "update[s] it frequently to handle
overflows in hardware."

Looking at lan9645x_stats_add_cnt():

static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
{
if (val < (*cnt & U32_MAX))
*cnt += (u64)1 << 32; /* value has wrapped */

*cnt = (*cnt & ~(u64)U32_MAX) + val;
}

this compensates for at most one 32-bit wrap per polling interval. If two
or more wraps happen between reads, the low-half comparison no longer
detects the missed wrap and the 64-bit counter silently loses 2^32 per
missed wrap.

At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
but there is no upper bound on the actual elapsed time between reads:
delayed work can be stretched by system suspend/resume, heavy CPU load, or
regmap/SPI contention.

Would it be worthwhile to either bound the worst-case polling gap
(e.g. a suspend/resume hook that forces a read, or a timestamp-based
sanity check that detects a stretched interval and logs a warning) so
that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
silently undercount by multiples of 2^32?

[ ... ]