[RFC net-next v4 0/2] mlx5: Add netdev-genl queue stats

From: Joe Damato
Date: Mon Jun 03 2024 - 20:46:50 EST


Greetings:

Welcome to RFC v4.

Significant rewrite from v3 and hopefully getting closer to correctly
exporting per queue stats from mlx5. Please see changelog below for
detailed changes, especially regarding PTP stats.

Note that my NIC does not seem to support PTP and I couldn't get the
mlnx-tools mlnx_qos script to work, so I was only able to test the
following cases:

- device up at boot
- adjusting queue counts
- device down (e.g. ip link set dev eth4 down)

Please see the commit message of patch 2/2 for more details on output
and test cases.

v3 thread: https://lore.kernel.org/lkml/20240601113913.GA696607@xxxxxxxxxx/T/

Thanks,
Joe

rfcv3 -> rfcv4:
- Patch 1/2 now creates a mapping (priv->txq2sq_stats) which maps txq
  indices to sq_stats structures so stats can be accessed directly.
  This mapping is kept up to date alongside txq2sq.

- Patch 2/2:
  - All mutex_lock/unlock calls on state_lock have been dropped.
  - mlx5e_get_queue_stats_rx now uses ASSERT_RTNL() and has a special
    case for PTP. If PTP was ever opened, is currently opened, and the
    channel index matches, stats for PTP RX are output.
  - mlx5e_get_queue_stats_tx rewritten to use priv->txq2sq_stats. No
    corner cases are needed here because any txq idx (passed in as i)
    will have an up to date mapping in priv->txq2sq_stats.
  - mlx5e_get_base_stats:
    - in the RX case:
      - iterates from [params.num_channels, stats_nch) collecting
        stats.
      - if PTP was ever opened but is currently closed, add the PTP
        stats.
    - in the TX case:
      - handle 2 cases:
        - the channel is available, so sum only the unavailable TCs
          [mlx5e_get_dcb_num_tc, max_opened_tc).
        - the channel is unavailable, so sum all TCs [0, max_opened_tc).
      - if PTP was ever opened but is currently closed, add the PTP
        sq stats.
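
To illustrate the TX base-stats split described above, here is a minimal
userspace sketch. It is not the driver code. It models per-(channel, TC)
packet counters in a plain array; MAX_CH, MAX_TC, struct model and
base_tx_packets are hypothetical names, and the fields only stand in for
params.num_channels, mlx5e_get_dcb_num_tc(), stats_nch and max_opened_tc.
It also assumes a channel counts as available when its index is below
num_channels, and it leaves PTP out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CH 8   /* hypothetical bound on channels ever opened */
#define MAX_TC 4   /* hypothetical bound on traffic classes */

/* Toy model; field names only loosely mirror the changelog. */
struct model {
	unsigned int num_channels;   /* currently available channels */
	unsigned int num_tc;         /* currently available TCs */
	unsigned int stats_nch;      /* channels that ever had stats */
	unsigned int max_opened_tc;  /* highest TC count ever opened */
	uint64_t tx_packets[MAX_CH][MAX_TC];
};

/*
 * Sum the TX packets that no currently available queue reports:
 *  - available channel:   only TCs in [num_tc, max_opened_tc)
 *  - unavailable channel: all TCs in [0, max_opened_tc)
 */
uint64_t base_tx_packets(const struct model *m)
{
	uint64_t sum = 0;
	unsigned int ch, tc;

	assert(m->stats_nch <= MAX_CH && m->max_opened_tc <= MAX_TC);

	for (ch = 0; ch < m->stats_nch; ch++) {
		/* Assumption: a channel is available iff ch < num_channels. */
		unsigned int first_tc = ch < m->num_channels ? m->num_tc : 0;

		for (tc = first_tc; tc < m->max_opened_tc; tc++)
			sum += m->tx_packets[ch][tc];
	}
	return sum;
}
```

The point of the split is that per-queue stats already cover the
available (channel, TC) pairs, so the base stats should only pick up
what is left over.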

v2 -> rfcv3:
- Added patch 1/2 which creates some helpers for computing the txq_ix
  and ch_ix/tc_ix.

- Patch 2/2 modified in several ways:
  - Fixed variable declarations in mlx5e_get_queue_stats_rx to be at
    the start of the function.
  - mlx5e_get_queue_stats_tx rewritten to access sq stats directly by
    using the helpers added in the previous patch.
  - mlx5e_get_base_stats modified in several ways:
    - Took the state_lock when accessing priv->channels.
    - For the base RX stats, code was simplified to call
      mlx5e_get_queue_stats_rx instead of repeating the same code.
    - For the base TX stats, I attempted to implement what I think
      Tariq suggested in the previous thread:
      - for available channels, only unavailable TC stats are summed
      - for unavailable channels, all stats for TCs up to
        max_opened_tc are summed.
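
The rfcv3 helpers themselves are not shown in this cover letter. As a
rough sketch of the kind of index arithmetic they deal with, the code
below assumes a layout where txq_ix = ch_ix + tc_ix * num_channels. Both
that layout and the function names (txq_ix_from, txq_ix_split) are
assumptions made for illustration, not the helpers from the patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: combine a channel index and a TC index into a
 * txq index, assuming txq_ix = ch_ix + tc_ix * num_channels.
 */
unsigned int txq_ix_from(unsigned int ch_ix, unsigned int tc_ix,
			 unsigned int num_channels)
{
	assert(ch_ix < num_channels);
	return ch_ix + tc_ix * num_channels;
}

/* Hypothetical inverse: recover ch_ix and tc_ix from a txq index. */
void txq_ix_split(unsigned int txq_ix, unsigned int num_channels,
		  unsigned int *ch_ix, unsigned int *tc_ix)
{
	assert(num_channels > 0);
	*ch_ix = txq_ix % num_channels;
	*tc_ix = txq_ix / num_channels;
}
```

Under this assumed layout the two functions are inverses of each other
for any ch_ix below num_channels.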

v1 -> v2:
- Essentially a full rewrite after comments from Jakub, Tariq, and
  Zhu.

Joe Damato (2):
  net/mlx5e: Add txq to sq stats mapping
  net/mlx5e: Add per queue netdev-genl stats

 drivers/net/ethernet/mellanox/mlx5/core/en.h  |   2 +
 .../net/ethernet/mellanox/mlx5/core/en/qos.c  |  13 +-
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 149 +++++++++++++++++-
 3 files changed, 161 insertions(+), 3 deletions(-)

--
2.25.1