Re: [PATCH net] netdevsim: disable local BH when scheduling NAPI

From: Breno Leitao
Date: Fri Feb 14 2025 - 08:09:43 EST


Hello Eric,

On Wed, Feb 12, 2025 at 07:55:32PM +0100, Eric Dumazet wrote:
> On Wed, Feb 12, 2025 at 7:34 PM Breno Leitao <leitao@xxxxxxxxxx> wrote:
> >
> > --- a/drivers/net/netdevsim/netdev.c
> > +++ b/drivers/net/netdevsim/netdev.c
> > @@ -87,7 +87,9 @@ static netdev_tx_t nsim_start_xmit(struct sk_buff *skb, struct net_device *dev)
> >         if (unlikely(nsim_forward_skb(peer_dev, skb, rq) == NET_RX_DROP))
> >                 goto out_drop_cnt;
> >
> > +       local_bh_disable();
> >         napi_schedule(&rq->napi);
> > +       local_bh_enable();
> >
>
> I thought all ndo_start_xmit() were done under local_bh_disable()

I think it depends on the path?

> Could you give more details ?

There are several paths to ndo_start_xmit(); please correct me if
I am misreading the code here.

Common path:

    __dev_direct_xmit()
        local_bh_disable();
        netdev_start_xmit()
            __netdev_start_xmit()
                ops->ndo_start_xmit(skb, dev);


But, in some other cases, I see:

    netpoll_start_xmit()
        netdev_start_xmit()
            ....

My reading is that not every path calls local_bh_disable() before
calling ndo_start_xmit().

Question: Must BH be disabled before calling ndo_start_xmit()? If so,
the problem might be in the netpoll code!? Also, is it worth adding
a DEBUG_NET_WARN_ON_ONCE()?
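
To make the question concrete, here is a minimal userspace model of what
such a check would catch. It is not kernel code: every model_* name is
hypothetical, the counter only stands in for the real softirq-disable
state, and the netpoll path is modelled strictly according to the reading
above (no BH disable before the driver callback), which may be wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: a plain counter stands in for the kernel's BH-disable depth. */
static int model_bh_depth;

static void model_local_bh_disable(void)
{
	model_bh_depth++;
}

static void model_local_bh_enable(void)
{
	/* Unbalanced enable would be a bug in the model itself. */
	assert(model_bh_depth > 0);
	model_bh_depth--;
}

/*
 * Stand-in for a driver's ndo_start_xmit() with a warn-once style check
 * in front. Returns true when the caller had BH disabled.
 */
static bool model_ndo_start_xmit(void)
{
	static bool warned;
	bool bh_off = model_bh_depth > 0;

	if (!bh_off && !warned) {
		warned = true;	/* report only the first violation */
		fprintf(stderr, "model: xmit called with BH enabled\n");
	}
	return bh_off;
}

/* Models the __dev_direct_xmit() path: BH is disabled around the call. */
static bool model_direct_xmit_path(void)
{
	bool ok;

	model_local_bh_disable();
	ok = model_ndo_start_xmit();
	model_local_bh_enable();
	return ok;
}

/* Models the netpoll path as read above: no BH disable before the call. */
static bool model_netpoll_xmit_path(void)
{
	return model_ndo_start_xmit();
}
```

In this model model_direct_xmit_path() passes the check and
model_netpoll_xmit_path() trips it once. A real check would have to be
built from the kernel's actual softirq state, which the model does not
attempt to reproduce.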

Note: Jakub suggested another way to fix this, so I sent a v2 with
a different approach:

https://lore.kernel.org/all/20250213071426.01490615@xxxxxxxxxx/

Thanks for the review!
--breno