Re: [PATCH RFC v1 1/2] rcu/tree: Add basic support for kfree_rcu batching

From: Byungchul Park
Date: Thu Aug 08 2019 - 05:54:02 EST


On Wed, Aug 07, 2019 at 10:52:15AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 07, 2019 at 05:45:04AM -0400, Joel Fernandes wrote:
> > On Tue, Aug 06, 2019 at 04:56:31PM -0700, Paul E. McKenney wrote:
> > > On Tue, Aug 06, 2019 at 05:20:40PM -0400, Joel Fernandes (Google) wrote:
>
> [ . . . ]
>
> > > > + for (; head; head = next) {
> > > > + next = head->next;
> > > > + head->next = NULL;
> > > > + __call_rcu(head, head->func, -1, 1);
> > >
> > > We need at least a cond_resched() here. 200,000 times through this loop
> > > in a PREEMPT=n kernel might not always be pretty. Except that this is
> > > invoked directly from kfree_rcu() which might be invoked with interrupts
> > > disabled, which precludes calls to cond_resched(). So the realtime guys
> > > are not going to be at all happy with this loop.
> >
> > Ok, will add this here.
> >
> > > And this loop could be avoided entirely by having a third rcu_head list
> > > in the kfree_rcu_cpu structure. Yes, some of the batches would exceed
> > > KFREE_MAX_BATCH, but given that they are invoked from a workqueue, that
> > > should be OK, or at least more OK than queuing 200,000 callbacks with
> > > interrupts disabled. (If it turns out not to be OK, an array of rcu_head
> > > pointers can be used to reduce the probability of oversized batches.)
> > > This would also mean that the equality comparisons with KFREE_MAX_BATCH
> > > need to become greater-or-equal comparisons or some such.
> >
> > Yes, certainly we can do these kinds of improvements after this patch, and
> > then add more tests to validate the improvements.
>
> Out of pity for people bisecting, we need this fixed up front.
>
> My suggestion is to just allow ->head to grow until ->head_free becomes
> available. That way you are looping with interrupts and preemption
> enabled in workqueue context, which is much less damaging than doing so
> with interrupts disabled, and possibly even from hard-irq context.

Agree.

Or after introducing another limit like KFREE_MAX_BATCH_FORCE(>=
KFREE_MAX_BATCH):

1. Try to drain it on hitting KFREE_MAX_BATCH as it does.

On success: Same as now.
On fail: let ->head grow and drain if possible, until reaching to
KFREE_MAX_BATCH_FORCE.

3. On hitting KFREE_MAX_BATCH_FORCE, give up batching but handle one by
one from now on to prevent too many pending requests from being
queued for batching work.

This way, we can avoid both:

1. too many requests being queued and
2. __call_rcu() bunch of requests within a single kfree_rcu().

Thanks,
Byungchul

>
> But please feel free to come up with a better solution!
>
> [ . . . ]
>
> > > > @@ -3459,6 +3645,8 @@ void __init rcu_init(void)
> > > > {
> > > > int cpu;
> > > >
> > > > + kfree_rcu_batch_init();
> > >
> > > What happens if someone does a kfree_rcu() before this point? It looks
> > > like it should work, but have you tested it?
> > >
> > > > rcu_early_boot_tests();
> > >
> > > For example, by testing it in rcu_early_boot_tests() and moving the
> > > call to kfree_rcu_batch_init() here.
> >
> > I have not tried to do the kfree_rcu() this early. I will try it out.
>
> Yeah, well, call_rcu() this early came as a surprise to me back in the
> day, so... ;-)
>
> Thanx, Paul