Re: [PATCH net-next v4 9/9] net: dsa: lan9645x: add port statistics
From: Jakub Kicinski
Date: Tue May 12 2026 - 19:39:53 EST
On Tue, 12 May 2026 10:47:22 +0200 Jens Emil Schulz Ostergaard wrote:
> > The commit message says this worker "update[s] it frequently to handle
> > overflows in hardware."
> >
> > Looking at lan9645x_stats_add_cnt():
> >
> > static inline void lan9645x_stats_add_cnt(u64 *cnt, u32 val)
> > {
> > 	if (val < (*cnt & U32_MAX))
> > 		*cnt += (u64)1 << 32; /* value has wrapped */
> >
> > 	*cnt = (*cnt & ~(u64)U32_MAX) + val;
> > }
> >
> > this compensates for at most one 32-bit wrap per polling interval. If two
> > or more wraps happen between reads, the low-half comparison no longer
> > detects the missed wrap and the 64-bit counter silently loses 2^32 per
> > missed wrap.
> >
> > At the chip's 2.5 Gbps line rate (LAN9645X_SPEED_2500 in lan9645x_main.h),
> > a 32-bit byte counter wraps roughly every 13.7 s, so two wraps happen in
> > about 27.5 s. The polling interval is LAN9645X_STATS_CHECK_DELAY = 3 * HZ,
> > but there is no upper bound on the actual elapsed time between reads:
> > delayed work can be stretched by system suspend/resume, heavy CPU load, or
> > regmap/SPI contention.
> >
> > Would it be worthwhile to either bound the worst-case polling gap
> > (e.g. a suspend/resume hook that forces a read, or a timestamp-based
> > sanity check that detects a stretched interval and logs a warning) so
> > that rx_bytes/tx_bytes reported to ndo_get_stats64 and ethtool cannot
> > silently undercount by multiples of 2^32?
> >
>
> It is true, and the timing is chosen so at most 1 wrap is supposed to occur.
> I believe this is the exact pattern already used by ocelot, lan966x, sparx5
> and lan969x.
>
> Maybe I misunderstand the suspend/resume comment. We do not implement
> support for suspend/resume, but if we did, then I assume you could
> suspend for an arbitrary amount of time, so doing an additional read on
> suspend/resume would not solve this problem?
>
> The hw counters are 32-bit. To stay synced they must never wrap more than
> once. I think the only way to avoid this problem is if we can make absolute
> guarantees about how often the polling code runs?

I think the AI is probably asking for too much here. You could stash
jiffies on each work run and detect a potential overflow, but all you
can do is print a warning. During suspend there should be no traffic,
so the suspend/resume concern is bogus.