RE: [EXTERNAL] Re: [PATCH net v4 1/4] octeon_ep: fix race conditions in ndo_get_stats64

From: Shinas Rasheed
Date: Mon Jan 06 2025 - 00:57:54 EST


Hi Jakub,

> -----Original Message-----
> From: Jakub Kicinski <kuba@xxxxxxxxxx>
> Sent: Saturday, January 4, 2025 10:31 PM
> To: Shinas Rasheed <srasheed@xxxxxxxxxxx>
> Cc: netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Haseeb Gani
> <hgani@xxxxxxxxxxx>; Sathesh B Edara <sedara@xxxxxxxxxxx>; Vimlesh
> Kumar <vimleshk@xxxxxxxxxxx>; thaller@xxxxxxxxxx; wizhao@xxxxxxxxxx;
> kheib@xxxxxxxxxx; konguyen@xxxxxxxxxx; horms@xxxxxxxxxx;
> einstein.xue@xxxxxxxxxx; Veerasenareddy Burru <vburru@xxxxxxxxxxx>;
> Andrew Lunn <andrew+netdev@xxxxxxx>; David S. Miller
> <davem@xxxxxxxxxxxxx>; Eric Dumazet <edumazet@xxxxxxxxxx>; Paolo
> Abeni <pabeni@xxxxxxxxxx>; Abhijit Ayarekar <aayarekar@xxxxxxxxxxx>;
> Satananda Burla <sburla@xxxxxxxxxxx>
> Subject: [EXTERNAL] Re: [PATCH net v4 1/4] octeon_ep: fix race conditions in
> ndo_get_stats64
>
> On Thu, 2 Jan 2025 03:22:43 -0800 Shinas Rasheed wrote:
> > diff --git a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> > index 549436efc204..a452ee3b9a98 100644
> > --- a/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> > +++ b/drivers/net/ethernet/marvell/octeon_ep/octep_main.c
> > @@ -995,16 +995,14 @@ static void octep_get_stats64(struct net_device *netdev,
> >          struct octep_device *oct = netdev_priv(netdev);
> >          int q;
> >
> > -        if (netif_running(netdev))
> > -                octep_ctrl_net_get_if_stats(oct,
> > -                                            OCTEP_CTRL_NET_INVALID_VFID,
> > -                                            &oct->iface_rx_stats,
> > -                                            &oct->iface_tx_stats);
> > -
> >          tx_packets = 0;
> >          tx_bytes = 0;
> >          rx_packets = 0;
> >          rx_bytes = 0;
> > +
> > +        if (!netif_running(netdev))
> > +                return;
>
> So we'll provide no stats when the device is down? That's not correct.
> The driver should save the stats from the freed queues (somewhere in
> the oct structure). Also please mention how this is synchronized
> against netif_running() changing its state, device may get closed while
> we're running..

I ACK the point about saving the stats from freed queues and still reporting stats when the device is down.
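
To make sure we mean the same thing, here is a minimal sketch of that idea as a
standalone userspace C model. It is not the driver code: every name in it
(model_dev, model_q_stats, model_dev_open, model_dev_stop, model_get_stats) is
hypothetical, only TX counters are shown, and it deliberately ignores locking,
which is the separate question below. The close path folds the per-queue
counters into a saved total before the queues go away, and the stats callback
always starts from that saved total.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_MAX_QUEUES 8

/* Hypothetical per-queue counters (stand-in for the driver's queue stats). */
struct model_q_stats {
	uint64_t packets;
	uint64_t bytes;
};

/* Hypothetical device state; not the real struct octep_device. */
struct model_dev {
	int running;
	int num_queues;
	struct model_q_stats txq[MODEL_MAX_QUEUES]; /* valid only while running */
	struct model_q_stats saved_tx;              /* totals from freed queues */
};

static void model_dev_open(struct model_dev *dev)
{
	assert(dev->num_queues <= MODEL_MAX_QUEUES);
	dev->running = 1;
}

/* Close path: save the counters before the queues are torn down. */
static void model_dev_stop(struct model_dev *dev)
{
	for (int q = 0; q < dev->num_queues; q++) {
		dev->saved_tx.packets += dev->txq[q].packets;
		dev->saved_tx.bytes += dev->txq[q].bytes;
		/* Models freeing the queue: its counters are gone afterwards. */
		dev->txq[q] = (struct model_q_stats){0};
	}
	dev->running = 0;
}

/* Stats callback: saved totals always, live queues only while running. */
static struct model_q_stats model_get_stats(const struct model_dev *dev)
{
	struct model_q_stats total = dev->saved_tx;

	if (dev->running) {
		for (int q = 0; q < dev->num_queues; q++) {
			total.packets += dev->txq[q].packets;
			total.bytes += dev->txq[q].bytes;
		}
	}
	return total;
}
```

With this shape the reported totals do not drop to zero across a down/up cycle;
they keep accumulating from the saved values.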

About the synchronization: I switched to a simple netif_running() check to avoid
locks (as per the comments on the previous patch version). Please correct me if I'm wrong, but isn't
the case you mentioned protected by the rtnl_lock that the netdev stack holds when it calls the ndo op?

> --
> pw-bot: cr