Re: [RFC-PATCH 1/2] mm: Add __GFP_NO_LOCKS flag

From: peterz
Date: Fri Aug 14 2020 - 12:19:21 EST


On Fri, Aug 14, 2020 at 07:14:25AM -0700, Paul E. McKenney wrote:

> Doing this to kfree_rcu() is the first step. We will also be doing this
> to call_rcu(), which has some long-standing invocations from various
> raw contexts, including hardirq handler.

Most hardirq handler are not raw on RT due to threaded interrupts.
Lockdep knows about this.

> > > 4) As kfree_rcu() can be used from any context except NMI and RT
> > > relies on it we ran into a circular dependency problem.
> >
> > Well, which actual usage sites are under a raw spinlock? Most of the
> > ones I could find are not.
>
> There are some on their way in, but this same optimization will be applied
> to call_rcu(), and there are no shortage of such call sites in that case.
> And these call sites have been around for a very long time.

Luckily there is lockdep to help you find the ones that need converting
to raw_call_rcu() :-)

> > > Clearly the simplest solution but not Pauls favourite and
> > > unfortunately he has a good reason.
> >
> > Which isn't actually stated anywhere I suppose ?
>
> Several times, but why not one more? ;-)
>
> In CONFIG_PREEMPT_NONE=y kernels, which are heavily used, and for which
> the proposed kfree_rcu() and later call_rcu() optimizations are quite
> important, there is no way to tell at runtime whether or you are in
> atomic raw context.

CONFIG_PREEMPT_NONE has nothing what so ever to do with any of this.

> > > > 2. Adding a GFP_ flag.
> > >
> > > Michal does not like the restriction for !RT kernels and tries to
> > > avoid the introduction of a new allocation mode.
> >
> > Like above, I tend to be with Michal on this, just wrap the actual
> > allocation in CONFIG_PREEMPT_RT, the code needs to handle a NULL pointer
> > there anyway.
>
> That disables the optimization in the CONFIG_PREEMPT_NONE=y case,
> where it really is needed.

No, it disables it for CONFIG_PREEMPT_RT.

> I would be OK with either. In CONFIG_PREEMPT_NONE=n kernels, the
> kfree_rcu() code could use preemptible() to determine whether it was safe
> to invoke the allocator. The code in kfree_rcu() might look like this:
>
> mem = NULL;
> if (IS_ENABLED(CONFIG_PREEMPT_NONE) || preemptible())
> mem = __get_free_page(...);
>
> Is your point is that the usual mistakes would then be caught by the
> usual testing on CONFIG_PREEMPT_NONE=n kernels?

mem = NULL;
#if !defined(CONFIG_PREEMPT_RT) && !defined(CONFIG_PROVE_LOCKING)
mem = __get_free_page(...)
#endif
if (!mem)

But I _really_ would much rather have raw_kfree_rcu() and raw_call_rcu()
variants for the few places that actually need it.

> > > These are not seperate of each other as #3 requires #4. The most
> > > horrible solution IMO from a technical POV as it proliferates
> > > inconsistency for no good reaosn.
> > >
> > > Aside of that it'd be solving a problem which does not exist simply
> > > because kfree_rcu() does not depend on it and we really don't want to
> > > set precedence and encourage the (ab)use of this in any way.
> >
> > My preferred solution is 1, if you want to use an allocator, you simply
> > don't get to play under raw_spinlock_t. And from a quick grep, most
> > kfree_rcu() users are not under raw_spinlock_t context.
>
> There is at least one on its way in, but more to the point, we will
> need to apply this same optimization to call_rcu(), which is used in

There is no need, call_rcu() works perfectly fine today, thank you.
You might want to, but that's an entirely different thing.

> raw atomic context, including from hardirq handlers.

Threaded IRQs. There really is very little code that is 'raw' on RT.