Re: [PATCH] net: raise RCU qs after each threaded NAPI poll

From: Yan Zhai
Date: Wed Feb 28 2024 - 10:57:43 EST


On Wed, Feb 28, 2024 at 9:35 AM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Wed, 28 Feb 2024 07:15:42 -0800 Paul E. McKenney wrote:
> > > > Another complication is that although CONFIG_PREEMPT_RT kernels are
> > > > built with CONFIG_PREEMPT_RCU, the reverse is not always the case.
> > > > And if we are not repolling, don't we have a high probability of doing
> > > > a voluntary context when we reach napi_thread_wait() at the beginning
> > > > of that loop?
> > >
> > > Very much so, which is why adding the cost of rcu_softirq_qs()
> > > for every NAPI run feels like an overkill.
> >
> > Would it be better to do the rcu_softirq_qs() only once every 1000 times
> > or some such? Or once every HZ jiffies?
> >
> > Or is there a better way?
>
> Right, we can do that. Yan Zhai, have you measured the performance
> impact / time spent in the call?

For the case that hits the problem, __napi_poll itself usually consumes
most of the cycles, so I didn't notice any difference in throughput. And
it is in fact repolling all the time, as the customer traffic might not
implement proper backoff. So using a loop counter or jiffies to cap the
number of invocations sounds like a decent improvement.

Let me also briefly check the overhead in the normal case.

Yan