Re: [PATCH RFC v1 1/2] rcu/tree: Add basic support for kfree_rcu batching

From: Paul E. McKenney
Date: Sat Aug 10 2019 - 14:25:28 EST


On Sat, Aug 10, 2019 at 12:20:37AM -0400, Joel Fernandes wrote:
> On Fri, Aug 09, 2019 at 08:38:14PM -0700, Paul E. McKenney wrote:
> > On Fri, Aug 09, 2019 at 10:42:32PM -0400, Joel Fernandes wrote:
> > > On Wed, Aug 07, 2019 at 10:52:15AM -0700, Paul E. McKenney wrote:
> > > [snip]
> > > > > > > @@ -3459,6 +3645,8 @@ void __init rcu_init(void)
> > > > > > > {
> > > > > > > int cpu;
> > > > > > >
> > > > > > > + kfree_rcu_batch_init();
> > > > > >
> > > > > > What happens if someone does a kfree_rcu() before this point? It looks
> > > > > > like it should work, but have you tested it?
> > > > > >
> > > > > > > rcu_early_boot_tests();
> > > > > >
> > > > > > For example, by testing it in rcu_early_boot_tests() and moving the
> > > > > > call to kfree_rcu_batch_init() here.
> > > > >
> > > > > I have not tried to do the kfree_rcu() this early. I will try it out.
> > > >
> > > > Yeah, well, call_rcu() this early came as a surprise to me back in the
> > > > day, so... ;-)
> > >
> > > I actually did get surprised as well!
> > >
> > > It appears the timers are not fully initialized so the really early
> > > kfree_rcu() call from rcu_init() does cause a splat about an uninitialized
> > > timer spinlock (even though future kfree_rcu()s and the system are working
> > > fine all the way into the torture tests).
> > >
> > > I think to resolve this, we can just not do batching until early_initcall,
> > > during which I have an initialization function which switches batching on.
> > > From that point it is safe.
> >
> > Just go ahead and batch, but don't bother with the timer until
> > after single-threaded boot is done. For example, you could check
> > rcu_scheduler_active similar to how sync_rcu_exp_select_cpus() does.
> > (See kernel/rcu/tree_exp.h.)
>
> Cool, that works nicely and I tested it. Actually, I made it so that we
> don't even need to batch before the scheduler is up. I don't see any benefit
> in batching that early unless a kfree_rcu() flood can happen during boot,
> which seems highly doubtful as a real-world case.

The benefit is removing the kfree_rcu() special cases from the innards
of RCU, for example, in rcu_do_batch(). Another benefit is removing the
current restriction on the position of the rcu_head structure within the
enclosing data structure.

So it would be good to avoid the current kfree_rcu() special casing within
RCU itself.
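
To make the special casing concrete, here is a rough user-space sketch of
the offset trick. It assumes the current scheme, in which kfree_rcu()
stores the offset of the rcu_head within the enclosing structure in the
->func field and callback invocation treats any value below 4096 as an
offset rather than as a function. The names sketch_rcu_head,
sketch_reclaim(), struct sketch_foo, and sketch_reclaim_demo() are made up
for illustration and are not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct rcu_head (illustrative). */
struct sketch_rcu_head {
	struct sketch_rcu_head *next;
	void (*func)(struct sketch_rcu_head *head);
};

typedef void (*sketch_callback_t)(struct sketch_rcu_head *head);

/* Values below 4096 are taken to be offsets, not function addresses. */
#define SKETCH_IS_KFREE_OFFSET(offset) ((offset) < 4096)

static int sketch_nr_freed;	/* objects freed via the offset path */
static int sketch_nr_invoked;	/* ordinary callbacks invoked */

/* Hypothetical enclosing structure; the head must sit within 4096 bytes. */
struct sketch_foo {
	int data[4];
	struct sketch_rcu_head rh;
};

/* Ordinary callback: recover the enclosing object and free it. */
static void sketch_free_foo_cb(struct sketch_rcu_head *head)
{
	free((char *)head - offsetof(struct sketch_foo, rh));
}

/*
 * Model of the per-callback special case: every invocation pays for
 * this check. The integer/function-pointer casts are implementation
 * defined, but behave as expected on common platforms.
 */
void sketch_reclaim(struct sketch_rcu_head *head)
{
	uintptr_t offset = (uintptr_t)head->func;

	if (SKETCH_IS_KFREE_OFFSET(offset)) {
		free((char *)head - offset);	/* kfree_rcu()-style entry */
		sketch_nr_freed++;
	} else {
		head->func(head);		/* call_rcu()-style entry */
		sketch_nr_invoked++;
	}
}

/* Exercises both paths once; returns 0 if each was taken exactly once. */
int sketch_reclaim_demo(void)
{
	int freed0 = sketch_nr_freed, invoked0 = sketch_nr_invoked;
	struct sketch_foo *a = malloc(sizeof(*a));
	struct sketch_foo *b = malloc(sizeof(*b));

	assert(a && b);
	/* Encode the offset of the head in place of a function pointer. */
	a->rh.func = (sketch_callback_t)(uintptr_t)offsetof(struct sketch_foo, rh);
	b->rh.func = sketch_free_foo_cb;

	sketch_reclaim(&a->rh);
	sketch_reclaim(&b->rh);

	return (sketch_nr_freed - freed0 == 1 &&
		sketch_nr_invoked - invoked0 == 1) ? 0 : -1;
}
```

If every kfree_rcu() request went through the batching path instead,
neither the offset check nor the 4096-byte placement limit in this sketch
would be needed on the callback-invocation side.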

Or are you using some trick that avoids both the batching and the current
kfree_rcu() special casing?

> > If needed, use an early_initcall() to handle the case where early boot
> > kfree_rcu() calls needed to set the timer but could not.
>
> And it would also need this complexity of early_initcall.

It would, but that (1) should not be all that complex, (2) only executes
the one time at boot rather than being an extra check on each callback,
and (3) is in memory that can be reclaimed (though I confess that I am
not sure how many architectures still do this).
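
As a rough illustration of that trade-off, here is a user-space model of
"always batch, but defer the timer". It is only a sketch under stated
assumptions: model_scheduler_active stands in for a check of
rcu_scheduler_active, model_early_initcall() stands in for a function
registered with early_initcall(), and all other names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object queued for a deferred free. */
struct model_obj {
	struct model_obj *next;
};

static bool model_scheduler_active;	/* stand-in for rcu_scheduler_active */
static bool model_timer_armed;		/* stand-in for the batching timer */
static struct model_obj *model_batch_head;
static int model_batch_count;

static void model_arm_timer(void)
{
	model_timer_armed = true;
}

/* Always batch; arm the timer only once the scheduler is up. */
void model_kfree_rcu(struct model_obj *obj)
{
	obj->next = model_batch_head;
	model_batch_head = obj;
	model_batch_count++;

	if (model_scheduler_active && !model_timer_armed)
		model_arm_timer();
}

/* Stand-in for the point at which the scheduler becomes usable. */
void model_scheduler_starting(void)
{
	model_scheduler_active = true;
}

/*
 * Stand-in for the early_initcall() handler: runs once at boot and
 * arms the timer on behalf of any early requests that could not.
 */
int model_early_initcall(void)
{
	if (model_batch_head && !model_timer_armed)
		model_arm_timer();
	return 0;
}

/* Walks through the boot sequence; returns the final batch length. */
int model_demo(void)
{
	static struct model_obj a, b, c;

	model_kfree_rcu(&a);		/* early boot: batched, no timer */
	model_kfree_rcu(&b);
	assert(model_batch_count == 2 && !model_timer_armed);

	model_scheduler_starting();
	model_early_initcall();		/* one-time catch-up */
	assert(model_timer_armed);

	model_kfree_rcu(&c);		/* normal operation */
	return model_batch_count;
}
```

The catch-up work in this model lives in a single function that runs once,
whereas skipping batching during early boot keeps a second free path, and
with it the per-callback check, alive in callback invocation.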

> > > Below is the diff on top of this patch, I think this should be good but let
> > > me know if anything looks odd to you. I tested it and it works.
> >
> > Keep in mind that a call_rcu() callback can't possibly be invoked until
> > quite some time after the scheduler is up and running. So it will be
> > a lot simpler to just skip setting the timer during early boot.
>
> Sure. Skipping batching would skip the timer too :-D

Yes, but if I understand your approach correctly, it is unfortunately
not managing to get rid of the kfree_rcu() special casing.

> If, in the future, batching is needed this early, then I am happy to add an
> early_initcall to set up the timer for any batched calls that could not set
> it up themselves. Hope that is ok with you?

If you have some trick to nevertheless get rid of the current kfree_rcu()
special casing within RCU proper, sure!

Thanx, Paul