Re: [RFC net-next v3 2/2] net/mlx5e: Add per queue netdev-genl stats

From: Tariq Toukan
Date: Mon Jun 03 2024 - 17:36:36 EST




On 03/06/2024 22:22, Joe Damato wrote:
On Mon, Jun 03, 2024 at 02:11:14PM +0300, Tariq Toukan wrote:


...


I still don't really like this design, so I gave some more thought on
this...

I think we should come up with a new mapping array under priv, that maps i
(from real_num_tx_queues) to the matching sq_stats struct.
This array would be maintained in the channels open/close functions,
similarly to priv->txq2sq.

Then, we would not calculate the mapping per call, but just get the proper
pointer from the array. This eases the handling of htb and ptp queues, which
were missed in your txq_ix_to_chtc_ix().

Maybe I am just getting way off track here, but I noticed this in
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c in
set_pflag_tx_port_ts:

if (!err)
priv->tx_ptp_opened = true;

but what if enable is false, meaning the user is disabling ptp via
ethtool?


This bool field under priv acts as a flag, means: ptp was ever opened.
It's the same idea as max_opened_tc, but with boolean instead of numeric.

In that case, shouldn't this be:

if (!err)
priv->tx_ptp_opened = enabled;

Because it seems like the user could pass 0 to disable ptp, and with
the current code then trigger mlx5e_close_channels which would call
mlx5e_ptp_close, but priv->tx_ptp_opened might still be true even
though ptp has been closed. Just a guess as I really don't know much
about this code at all, but it seems that the MLX5E_PFLAG_TX_PORT_TS
will be set in set_pflag_tx_port_ts based on enable, which then
affects how mlx5e_open_channels behaves.

Likewise in mlx5e_ptp_close_queues, maybe
rx_ptp_opened and tx_ptp_opened should be set to false there?

It seems like the user could get the driver into an inconsistent
state by enabling then disabling ptp, but maybe I'm just reading
this all wrong.

Is that a bug? If so, I can submit:

1. ethtool tx_ptp_opened = false
2. rx_ptp_opened = tx_ptp_opened = false in mlx5e_ptp_close_queues

to net as a Fixes ?

I am asking about this from the perspective of stats, because in the
qos stuff, the txq2sq_stats mapping is created or removed (set to
NULL) in mlx5e_activate_qos_sq and mlx5e_deactivate_qos_sq /
mlx5e_qos_deactivate_queues, respectively.

This means that priv->txq2sq_stats could be a valid sq_stats or
invalid for deactivated TCs and qos. It seems like ptp should work
the same way, but I have no idea.

If ptp did work the same way, then in base the code would check if
ptp was disabled and obtain the stats from it, if there are any from
it being previously enabled.

That seems like more consistent behavior, but sorry if
I'm totally off on all of this.