Re: [External] Re: [PATCH net] veth: fix ethtool statistical errors

From: Lorenzo Bianconi
Date: Mon Nov 20 2023 - 05:55:56 EST


> Lorenzo Bianconi <lorenzo@xxxxxxxxxx> wrote on Mon, Nov 20, 2023 at 17:52:
> >
> > > Lorenzo Bianconi <lorenzo@xxxxxxxxxx> wrote on Fri, Nov 17, 2023 at 17:26:
> > > >
> > > > > If peer->real_num_rx_queues > 1, the ethtool -S command for a
> > > > > veth network device displays incorrect statistics.
> > > > > The value of tx_idx is reset on each iteration, so even if
> > > > > peer->real_num_rx_queues is greater than 1, the starting value of
> > > > > tx_idx stays the same. This results in incorrect statistics.
> > > > > To fix this issue, assign the value of pp_idx to tx_idx.
> > > > >
> > > > > Fixes: 5fe6e56776ba ("veth: rely on peer veth_rq for ndo_xdp_xmit accounting")
> > > > > Signed-off-by: Albert Huang <huangjie.albert@xxxxxxxxxxxxx>
> > > > > ---
> > > > > drivers/net/veth.c | 2 +-
> > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > > > > index 0deefd1573cf..3a8e3fc5eeb5 100644
> > > > > --- a/drivers/net/veth.c
> > > > > +++ b/drivers/net/veth.c
> > > > > @@ -225,7 +225,7 @@ static void veth_get_ethtool_stats(struct net_device *dev,
> > > > > for (i = 0; i < peer->real_num_rx_queues; i++) {
> > > > > const struct veth_rq_stats *rq_stats = &rcv_priv->rq[i].stats;
> > > > > const void *base = (void *)&rq_stats->vs;
> > > > > - unsigned int start, tx_idx = idx;
> > > > > + unsigned int start, tx_idx = pp_idx;
> > > > > size_t offset;
> > > > >
> > > > > tx_idx += (i % dev->real_num_tx_queues) * VETH_TQ_STATS_LEN;
> > > > > --
> > > > > 2.20.1
> > > > >
> > > >
> > > > Hi Albert,
> > > >
> > > > Can you please provide more details about the issue you are facing?
> > > > In particular, what is the number of configured tx and rx queues for both
> > > > peers?
> > >
> > > Hi, Lorenzo
> > > I found this because I wanted to add more information to the ethtool
> > > output for veth, but I noticed that the reported values were incorrect.
> > > That's why I started looking at this code.
> >
> > ack. Could you please share the veth pair tx/rx queue configuration?
> >
>
> dev:  tx ---> 4, rx ---> 4
> peer: tx ---> 1, rx ---> 1
>
> Could the following code still be problematic? pp_idx is not updated correctly.
> page_pool_stats:
> veth_get_page_pool_stats(dev, &data[pp_idx]);

Thanks for pointing this out. This part is a bit tricky, but I think I can see the
issue now. Since we have just one peer rx queue, when we run the ndo_xdp_xmit
callback on dev, we squash all dev xmit queues onto the single peer rx queue
(where we do the accounting) [0].
The issue is that ethtool displays all dev xmit queues, so we need to set pp_idx
properly in veth_get_ethtool_stats().
Can you please take a look at the patch below?

Regards,
Lorenzo

[0] https://github.com/LorenzoBianconi/net-next/blob/master/drivers/net/veth.c#L417

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 9980517ed8b0..8607eb8cf458 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -236,8 +236,8 @@ static void veth_get_ethtool_stats(struct net_device *dev,
data[tx_idx + j] += *(u64 *)(base + offset);
}
} while (u64_stats_fetch_retry(&rq_stats->syncp, start));
- pp_idx = tx_idx + VETH_TQ_STATS_LEN;
}
+ pp_idx = idx + dev->real_num_tx_queues * VETH_TQ_STATS_LEN;

page_pool_stats:
veth_get_page_pool_stats(dev, &data[pp_idx]);

>
> BR
> Albert
>
> > Regards,
> > Lorenzo
> >
> > >
> > > > tx_idx is the index of the current (local) tx queue and it must restart from
> > > > idx in each iteration; otherwise we will have an issue when
> > > > peer->real_num_rx_queues is greater than dev->real_num_tx_queues.
> > > >
> > > OK. I don't know if this is a known issue.
> > >
> > > BR
> > > Albert
> > >
> > >
> > > > Regards,
> > > > Lorenzo
