[RFC net-next v3 0/2] mlx5: Add netdev-genl queue stats

From: Joe Damato
Date: Tue May 28 2024 - 23:21:03 EST


Greetings:

Switching to an RFC instead of a PATCH because even though Tariq
patiently explained the code to me, I'm sure I probably still missed
something ;)

If this turns out to be right and Tariq agrees, I can send a PATCH
net-next v4.

This change adds support for the per queue netdev-genl API to mlx5,
which seems to output stats:

/cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
--dump qstats-get --json '{"scope": "queue"}'

..snip
{'ifindex': 7,
'queue-id': 28,
'queue-type': 'tx',
'tx-bytes': 399462,
'tx-packets': 3311},
..snip

I've used the suggested tooling to verify the per queue stats match
rtnl by doing this:

NETIF=eth0 tools/testing/selftests/drivers/net/stats.py

I've tested the following scenarios:
- The machine at boot (default queue configuration)
- Adjusting the queue configuration to various amounts via ethtool
- Adding mqprio TCs
- Removing the mqprio TCs

and in each scenario the stats script above reports that the stats match
rtnl.
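
To make "match rtnl" concrete, here is a minimal Python sketch of the kind
of comparison involved. It assumes the device-wide total for each counter
is the base stats plus the sum over the currently active queues. It is not
the code of stats.py: the function names are hypothetical, and every sample
number other than the queue 28 record quoted above is made up.

```python
# Illustrative only: not taken from stats.py or the mlx5 driver.
COUNTERS = ("rx-packets", "rx-bytes", "tx-packets", "tx-bytes")


def device_totals(base, queues):
    """Add the base stats and every per-queue record into one total per counter."""
    totals = {name: base.get(name, 0) for name in COUNTERS}
    for queue in queues:
        for name in COUNTERS:
            totals[name] += queue.get(name, 0)
    return totals


def mismatches(totals, rtnl):
    """Return {counter: (qstats_total, rtnl_value)} for counters that disagree."""
    return {
        name: (totals[name], rtnl.get(name, 0))
        for name in COUNTERS
        if totals[name] != rtnl.get(name, 0)
    }


# Hypothetical sample data; only the queue 28 record comes from the dump above.
base = {"tx-packets": 10, "tx-bytes": 1000}
queues = [
    {"ifindex": 7, "queue-id": 28, "queue-type": "tx",
     "tx-bytes": 399462, "tx-packets": 3311},
    {"ifindex": 7, "queue-id": 0, "queue-type": "rx",
     "rx-bytes": 2048, "rx-packets": 16},
]
rtnl = {"rx-packets": 16, "rx-bytes": 2048,
        "tx-packets": 3321, "tx-bytes": 400462}

totals = device_totals(base, queues)
print(mismatches(totals, rtnl) or "qstats match rtnl")
```

An empty result from mismatches() corresponds to the "stats match" outcome
reported for each scenario.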

Worth noting that Tariq suggested I also export HTB/QOS stats in
mlx5e_get_base_stats.

I am open to doing this, but I think if I were to do that, HTB/QOS queue
stats should also be exported by rtnl so that the script above will
continue to show that the output is correct.

I'd like to propose adding HTB/QOS stats to both rtnl *and* the netdev-genl
code at the same time, but at a later date, separate from this change.

Thanks,
Joe

v2 -> rfcv3:
  - Added patch 1/2 which creates some helpers for computing the txq_ix
    and ch_ix/tc_ix.

  - Patch 2/2 modified in several ways:
    - Fixed variable declarations in mlx5e_get_queue_stats_rx to be at
      the start of the function.
    - mlx5e_get_queue_stats_tx rewritten to access sq stats directly by
      using the helpers added in the previous patch.
    - mlx5e_get_base_stats modified in several ways:
      - Took the state_lock when accessing priv->channels.
      - For the base RX stats, code was simplified to call
        mlx5e_get_queue_stats_rx instead of repeating the same code.
      - For the base TX stats, I attempted to implement what I think
        Tariq suggested in the previous thread:
        - for available channels, only unavailable TC stats are summed
        - for unavailable channels, all stats for TCs up to
          max_opened_tc are summed.
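
The index helpers and the base TX summation described above can be modelled
roughly as follows. This is a Python sketch rather than the driver code. It
assumes TX queues are laid out TC-major, i.e. txq_ix = ch_ix + tc_ix *
num_channels. That layout, the function names, and the sq_stats[ch][tc]
table are assumptions made for illustration; the patches themselves remain
the reference.

```python
# Model only: names and the TC-major layout are assumptions, not driver code.

def txq_ix(ch_ix, tc_ix, num_channels):
    """Map a (channel, TC) pair to a TX queue index."""
    return ch_ix + tc_ix * num_channels


def ch_tc_ix(txq, num_channels):
    """Inverse of txq_ix(): return (ch_ix, tc_ix) for a TX queue index."""
    tc_ix, ch_ix = divmod(txq, num_channels)
    return ch_ix, tc_ix


def base_tx_packets(sq_stats, num_channels, num_tc, max_opened_tc):
    """Sum packets for SQs that are not exposed as active queues.

    sq_stats[ch][tc] holds the packet count of every SQ ever opened.
    - available channels (ch < num_channels): only TCs >= num_tc count
    - unavailable channels: every TC below max_opened_tc counts
    """
    total = 0
    for ch, per_tc in enumerate(sq_stats):
        first_tc = num_tc if ch < num_channels else 0
        for tc in range(first_tc, max_opened_tc):
            total += per_tc[tc]
    return total


def active_tx_packets(sq_stats, num_channels, num_tc):
    """Sum packets over the active TX queues via the index helpers."""
    total = 0
    for txq in range(num_channels * num_tc):
        ch, tc = ch_tc_ix(txq, num_channels)
        total += sq_stats[ch][tc]
    return total


# Hypothetical history: 3 channels and 2 TCs were opened at some point,
# and the device now runs with 2 channels and 1 TC.
sq_stats = [[10, 1], [20, 2], [30, 3]]
base = base_tx_packets(sq_stats, num_channels=2, num_tc=1, max_opened_tc=2)
active = active_tx_packets(sq_stats, num_channels=2, num_tc=1)
print(base, active, base + active)
```

In this model, base plus active equals the sum over every SQ in the table,
which is the property the rtnl comparison relies on.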

v1 -> v2:
  - Essentially a full rewrite after comments from Jakub, Tariq, and
    Zhu.

Joe Damato (2):
  net/mlx5e: Add helpers to calculate txq and ch idx
  net/mlx5e: Add per queue netdev-genl stats

.../net/ethernet/mellanox/mlx5/core/en_main.c | 150 +++++++++++++++++-
1 file changed, 149 insertions(+), 1 deletion(-)

--
2.25.1