Re: [RFC v2] rcu/tree: Try to invoke_rcu_core() if in_irq() during unlock

From: Joel Fernandes
Date: Mon Aug 19 2019 - 21:40:35 EST


On Mon, Aug 19, 2019 at 07:14:38PM -0500, Scott Wood wrote:
> On Sun, 2019-08-18 at 17:49 -0400, Joel Fernandes (Google) wrote:
> > When we're in hard interrupt context in rcu_read_unlock_special(), we
> > can still benefit from invoke_rcu_core() doing wake ups of rcuc
> > threads when the !use_softirq parameter is passed. This is safe
> > to do so because:
>
> What is the benefit, beyond skipping the irq work overhead? Is there some
> reason to specifically want the rcuc thread woken rather than just getting
> into the scheduler (and thus rcu_note_context_switch) as soon as possible?

Isn't skipping irq work overhead enough of a benefit?

Anyway, I think it is useful in this scenario:
Consider exp==true when the rcu_read_unlock() is done on a nohz_full CPU.

If you simply 'get into the scheduler' as you pointed out, that is not enough
to end the grace period. The quiescent state has to be reported up the tree
and propagated to the root node. This happens in only 2 places:
1. The scheduler tick, which ends up either running the RCU core from
softirq or doing the invoke_rcu_core().
2. The FQS loop, which needs to see a dyntick-idle transition on the
CPU (usermode/idle to kernel or vice versa).

Case 1 is unlikely since the tick may be turned off, though I worked with
Paul last week on turning it on and that is doing better.
Case 2 does not happen if we're looping in kernel mode.

In this scenario, calling invoke_rcu_core() directly is better than
scheduling the IRQ work. I don't think the IRQ work will do anything for a
nohz_full CPU, but I am not sure about that.
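
To make the proposed branch concrete, here is a rough user-space model of
the decision. It is only a sketch, not the kernel code: the real
rcu_read_unlock_special() checks more conditions than this, and every
name ending in _model is made up for illustration. The two inputs stand in
for in_irq() and the use_softirq parameter.

```c
/* User-space model only, not kernel code. All *_model names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

enum unlock_action_model {
	ACT_INVOKE_RCU_CORE_MODEL,	/* call invoke_rcu_core() to wake the rcuc thread */
	ACT_EXISTING_PATH_MODEL,	/* whatever the code did before (softirq or irq work) */
};

/*
 * Pick what the unlock slow path would do under the proposed change:
 * in hard interrupt context with !use_softirq, go straight to
 * invoke_rcu_core(); otherwise leave the existing behavior alone.
 */
enum unlock_action_model pick_action_model(bool in_hardirq, bool use_softirq)
{
	if (in_hardirq && !use_softirq)
		return ACT_INVOKE_RCU_CORE_MODEL;
	return ACT_EXISTING_PATH_MODEL;
}

/* Small self-check covering the three interesting input combinations. */
void self_check_model(void)
{
	assert(pick_action_model(true, false) == ACT_INVOKE_RCU_CORE_MODEL);
	assert(pick_action_model(true, true) == ACT_EXISTING_PATH_MODEL);
	assert(pick_action_model(false, false) == ACT_EXISTING_PATH_MODEL);
}
```

The point of the model is just that the new case costs one extra branch and
only fires for hardirq with !use_softirq.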

To give more background on how I arrived at this patch: I noticed that this
call to invoke_rcu_core() used to be there but was removed, and the commit
removing it said it was pointless because it does not do anything. But I
think it does do something, which is why I reintroduced it.
rcu_read_unlock_special() is a slow path anyway, so one more branch should
be harmless and could actually be beneficial. However, this is just an RFC,
please treat it as such. I am running more tests on it based on Paul's
suggestions and will look at it more closely tomorrow.

Thanks!

- Joel