Re: [PATCH] ksoftirqd: Enable IRQs and call cond_resched() before poking RCU

From: Paul E. McKenney
Date: Tue Jan 20 2015 - 15:31:27 EST


On Tue, Jan 20, 2015 at 02:21:51PM +0100, Thomas Gleixner wrote:
> On Tue, 13 Jan 2015, Calvin Owens wrote:
>
> > While debugging an issue with excessive softirq usage, I encountered the
> > following note in commit 3e339b5dae24a706 ("softirq: Use hotplug thread
> > infrastructure"):
> >
> > [ paulmck: Call rcu_note_context_switch() with interrupts enabled. ]
> >
> > ...but despite this note, the patch still calls RCU with IRQs disabled.
> >
> > This seemingly innocuous change caused a significant regression in softirq
> > CPU usage on the sending side of a large TCP transfer (~1 GB/s): when
> > introducing 0.01% packet loss, the softirq usage would jump to around 25%,
> > spiking as high as 50%. Before the change, the usage would never exceed 5%.
> >
> > Moving the call to rcu_note_context_switch() after the cond_sched() call,
> > as it was originally before the hotplug patch, completely eliminated this
> > problem.
> >
> > Signed-off-by: Calvin Owens <calvinowens@xxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > ---
> > This version includes the "cpu" argument to rcu_note_context_switch() in
> > order to apply cleanly to stable kernels. It will need to be removed to
> > apply to 3.18+ and 3.19 (upstream commit 38200cf2 removed the argument).
> >
> > kernel/softirq.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 501baa9..9e787d8 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -656,9 +656,13 @@ static void run_ksoftirqd(unsigned int cpu)
> > * in the task stack here.
> > */
> > __do_softirq();
> > - rcu_note_context_switch(cpu);
> > local_irq_enable();
> > cond_resched();
> > +
> > + preempt_disable();
> > + rcu_note_context_switch(cpu);
> > + preempt_enable();
> > +
>
> The whole rcu_note_context_switch() in run_ksoftirqd() is silly.
>
> cond_resched()
> __preempt_count_add(PREEMPT_ACTIVE);
>
> __schedule();
> preempt_disable();
> rcu_note_context_switch();
> ....
>
> __preempt_count_sub(PREEMPT_ACTIVE);

I agree that if should_resched() returns true as assumed above, then there
is no point to invoking rcu_note_context_switch(). However, the case that
this code applies to is when should_resched() returns false, but RCU is
waiting for a quiescent state from the current CPU. In that case,
cond_resched() won't do anything for RCU, and we do need the
rcu_note_context_switch().

Of course, it would be better to avoid the extra RCU work in the common
case where cond_resched() does inovke the scheduler. And that is the
point of the following patch, which uses cond_resched_rcu_qs().
However, this use of cond_resched_rcu_qs() doesn't work in older
kernels. So Calvin's patch is for backporting, and the follow-up
patch for future kernels.

Make sense, or am I missing something?

Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/