Re: [PATCH] net: raise RCU qs after each threaded NAPI poll

From: Paul E. McKenney
Date: Wed Feb 28 2024 - 16:13:21 EST


On Wed, Feb 28, 2024 at 03:14:34PM -0500, Joel Fernandes wrote:
> On Wed, Feb 28, 2024 at 12:18 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > On Wed, Feb 28, 2024 at 10:37:51AM -0600, Yan Zhai wrote:
> > > On Wed, Feb 28, 2024 at 9:37 AM Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> > > > Also optionally, I wonder if calling rcu_tasks_qs() directly is better
> > > > (for documentation if anything) since the issue is Tasks RCU specific. Also
> > > > code comment above the rcu_softirq_qs() call about cond_resched() not taking
> > > > care of Tasks RCU would be great!
> > > >
> > > Yes it's quite surprising to me that cond_resched does not help here,
> >
> > In theory, it would be possible to make cond_resched() take care of
> > Tasks RCU. In practice, the lazy-preemption work is looking to get rid
> > of cond_resched(). But if for some reason cond_resched() needs to stay
> > around, doing that work might make sense.
>
> In my opinion, cond_resched() doing Tasks-RCU QS does not make sense
> (to me), because cond_resched() is to inform the scheduler to run
> something else possibly of higher priority while the current task is
> still runnable. On the other hand, what's not permitted in a Tasks RCU
> reader is a voluntary sleep. So IMO even though cond_resched() is a
> voluntary call, it is still not a sleep but rather a preemption point.

From the viewpoint of Tasks RCU's users, the point is to figure out
when it is OK to free an already-removed tracing trampoline. The
current Tasks RCU implementation relies on the fact that tracing
trampolines do not do voluntary context switches.
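
As a rough illustration of that reliance, here is a toy userspace model,
not the kernel implementation. Every name in it (toy_task, toy_gp_start(),
and so on) is made up for this sketch, and it ignores the other quiescent
states that a real implementation can use. It assumes only that a grace
period may end once each task has done at least one voluntary context
switch since the grace period started, and that preemption does not count.

```c
/* Toy userspace model only. All names are hypothetical, not kernel APIs. */
#include <stdbool.h>
#include <stddef.h>

struct toy_task {
	unsigned long nvcsw;	/* voluntary context switches so far */
	unsigned long nivcsw;	/* involuntary context switches (preemptions) */
	unsigned long gp_snap;	/* nvcsw snapshot taken at grace-period start */
};

/* Grace-period start: snapshot each task's voluntary-switch count. */
static void toy_gp_start(struct toy_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		t[i].gp_snap = t[i].nvcsw;
}

/* The grace period may end only after every task has voluntarily switched. */
static bool toy_gp_done(const struct toy_task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].nvcsw == t[i].gp_snap)
			return false;	/* might still be in a trampoline */
	return true;
}

/* A voluntary context switch: counts as a quiescent state in this model. */
static void toy_voluntary_switch(struct toy_task *t)
{
	t->nvcsw++;
}

/* A preemption: recorded, but never treated as a quiescent state here. */
static void toy_preempt(struct toy_task *t)
{
	t->nivcsw++;
}
```

In this model, a task that is only ever preempted holds the grace period
open indefinitely, which is why it matters whether cond_resched() is
counted as a voluntary context switch or as a preemption.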

> So a Tasks RCU reader should be perfectly able to be scheduled out in
> the middle of a read-side critical section (in current code) by
> calling cond_resched(). It is just like involuntary preemption in the
> middle of an RCU reader, in disguise, right?

You lost me on this one. This for example is not permitted:

	rcu_read_lock();
	cond_resched();
	rcu_read_unlock();

But in a CONFIG_PREEMPT=y kernel, that RCU reader could be preempted.

So cond_resched() looks like a voluntary context switch to me. Recall
that vanilla non-preemptible RCU will treat cond_resched() calls as
quiescent states if the grace period extends long enough.

What am I missing here?

Thanx, Paul