[PATCH 6.7 342/713] ice: fix stats being updated by way too large values

From: Sasha Levin
Date: Mon Mar 25 2024 - 06:56:22 EST


From: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx>

[ Upstream commit 257310e998700e60382fbd3f4fd275fdbd9b2aaf ]

Simplify the stats accumulation logic to fix the case where we don't take
the previous stat value into account; we should always respect it.

Main netdev stats of our PF (Tx/Rx packets/bytes) were reported orders of
magnitude too big during OpenStack reconfiguration events, and possibly
during other reconfiguration cases too.

The regression was reported to be between 6.1 and 6.2, so I was almost
certain that one of the two "preserve stats over reset" commits was the
culprit. While reading the code, it was found that in some cases we will
increase the stats by an arbitrarily large number (thanks to ignoring the
"-prev" part of the condition, after zeroing it).
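
A minimal sketch of the failure mode described above, not the driver code
itself. It models only two of the four counters, and struct pair,
update_old() and update_new() are hypothetical names used for illustration.
update_old() mirrors the removed logic (zero every prev value if any packet
counter went backwards, then add cur - prev); update_new() mirrors the added
logic (add the difference only when prev is known to be valid, and always
record cur).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-counter stand-in for struct rtnl_link_stats64. */
struct pair {
	uint64_t tx_packets;
	uint64_t rx_packets;
};

/* Old behaviour: one counter going backwards zeroes *all* prev values,
 * so a counter that did not go backwards is added in full.
 */
static void update_old(struct pair *net, struct pair *prev,
		       const struct pair *cur)
{
	if (cur->tx_packets < prev->tx_packets ||
	    cur->rx_packets < prev->rx_packets) {
		prev->tx_packets = 0;
		prev->rx_packets = 0;
	}

	net->tx_packets += cur->tx_packets - prev->tx_packets;
	net->rx_packets += cur->rx_packets - prev->rx_packets;
	*prev = *cur;
}

/* New behaviour: skip the accumulation round while prev is not valid,
 * but still remember cur so the next round has a valid baseline.
 */
static void update_new(struct pair *net, struct pair *prev,
		       const struct pair *cur, bool prev_loaded)
{
	if (prev_loaded) {
		net->tx_packets += cur->tx_packets - prev->tx_packets;
		net->rx_packets += cur->rx_packets - prev->rx_packets;
	}
	*prev = *cur;
}
```

With prev = {5000000, 100} and cur = {5000010, 3}, update_old() adds the
whole 5000010 to the reported Tx count although only 10 packets were sent,
because the Rx counter going backwards wiped the Tx baseline as well.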

Note that this also fixes the case where we were near the limits of u64,
but that was not the regression reported.
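
To illustrate the u64 remark: unsigned subtraction in C is performed modulo
2^64, so cur - prev still yields the right increment when a counter wraps,
whereas a "cur < prev means reset" check mistakes the wrap for a reset. The
helper names below are hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* Plain unsigned difference: correct across a single wrap of the counter. */
static uint64_t delta_wrapping(uint64_t prev, uint64_t cur)
{
	return cur - prev;
}

/* Old-style handling: a smaller cur is treated as a reset, so prev is
 * discarded and the increment is miscounted when the counter merely wrapped.
 */
static uint64_t delta_with_reset_check(uint64_t prev, uint64_t cur)
{
	if (cur < prev)
		prev = 0;
	return cur - prev;
}
```

For prev = UINT64_MAX - 4 and cur = 5, the counter advanced by 10;
delta_wrapping() returns 10, while delta_with_reset_check() returns 5.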

Full disclosure: I remember suggesting this particular piece of code to
Ben a few years ago, so the blame is on me.

Fixes: 2fd5e433cd26 ("ice: Accumulate HW and Netdev statistics over reset")
Reported-by: Nebojsa Stevanovic <nebojsa.stevanovic@xxxxxxxxx>
Link: https://lore.kernel.org/intel-wired-lan/VI1PR02MB439744DEDAA7B59B9A2833FE912EA@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Reported-by: Christian Rohmann <christian.rohmann@xxxxxxxxx>
Link: https://lore.kernel.org/intel-wired-lan/f38a6ca4-af05-48b1-a3e6-17ef2054e525@xxxxxxxxx
Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx>
Reviewed-by: Simon Horman <horms@xxxxxxxxxx>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@xxxxxxxxx> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx>
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
drivers/net/ethernet/intel/ice/ice_main.c | 24 +++++++++++------------
1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index a9cca2d24120a..dabf33cec3e1b 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6572,6 +6572,7 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 {
 	struct rtnl_link_stats64 *net_stats, *stats_prev;
 	struct rtnl_link_stats64 *vsi_stats;
+	struct ice_pf *pf = vsi->back;
 	u64 pkts, bytes;
 	int i;

@@ -6617,21 +6618,18 @@ static void ice_update_vsi_ring_stats(struct ice_vsi *vsi)
 	net_stats = &vsi->net_stats;
 	stats_prev = &vsi->net_stats_prev;
 
-	/* clear prev counters after reset */
-	if (vsi_stats->tx_packets < stats_prev->tx_packets ||
-	    vsi_stats->rx_packets < stats_prev->rx_packets) {
-		stats_prev->tx_packets = 0;
-		stats_prev->tx_bytes = 0;
-		stats_prev->rx_packets = 0;
-		stats_prev->rx_bytes = 0;
+	/* Update netdev counters, but keep in mind that values could start at
+	 * random value after PF reset. And as we increase the reported stat by
+	 * diff of Prev-Cur, we need to be sure that Prev is valid. If it's not,
+	 * let's skip this round.
+	 */
+	if (likely(pf->stat_prev_loaded)) {
+		net_stats->tx_packets += vsi_stats->tx_packets - stats_prev->tx_packets;
+		net_stats->tx_bytes += vsi_stats->tx_bytes - stats_prev->tx_bytes;
+		net_stats->rx_packets += vsi_stats->rx_packets - stats_prev->rx_packets;
+		net_stats->rx_bytes += vsi_stats->rx_bytes - stats_prev->rx_bytes;
 	}
 
-	/* update netdev counters */
-	net_stats->tx_packets += vsi_stats->tx_packets - stats_prev->tx_packets;
-	net_stats->tx_bytes += vsi_stats->tx_bytes - stats_prev->tx_bytes;
-	net_stats->rx_packets += vsi_stats->rx_packets - stats_prev->rx_packets;
-	net_stats->rx_bytes += vsi_stats->rx_bytes - stats_prev->rx_bytes;
-
 	stats_prev->tx_packets = vsi_stats->tx_packets;
 	stats_prev->tx_bytes = vsi_stats->tx_bytes;
 	stats_prev->rx_packets = vsi_stats->rx_packets;
--
2.43.0