Re: [patch 0/4] timer/nohz: Fix timer/nohz woes

From: Paul E. McKenney
Date: Sat Dec 23 2017 - 20:29:30 EST


On Sat, Dec 23, 2017 at 05:21:20PM -0800, Paul E. McKenney wrote:
> On Fri, Dec 22, 2017 at 09:09:07AM -0800, Paul E. McKenney wrote:
> > On Fri, Dec 22, 2017 at 03:51:11PM +0100, Thomas Gleixner wrote:
> > > Paul was observing weird stalls which are hard to reproduce and decode. We
> > > were finally able to reproduce and decode the wreckage on RT.
> > >
> > > The following series addresses the issues and hopefully nails the root
> > > cause completely.
> > >
> > > Please review carefully and expose it to the dreaded rcu torture tests
> > > which seem to be the only way to trigger it.
> >
> > Best Christmas present ever, thank you!!!
> >
> > Just started up three concurrent 10-hour runs of the infamous rcutorture
> > TREE01 scenario, and will let you know how it goes!
>
> Well, I messed up the first test and then reran it, which had the benefit
> of giving me a baseline. The rerun (with all four patches) produced
> failures, so I ran it again with an additional patch of mine. I score
> these tests by recording the time at first failure, or, if there is no
> failure, the duration of the test. Summing the values gives the score.
> And here are the scores, where 30 is a perfect score:

Sigh. They were five-hour tests, not ten-hour tests, so a perfect score
is 15 rather than 30. Corrected scores:

1. Baseline: 3.0+2.5+5=10.5

2. Four patches from Anna-Maria and Thomas: 5+2.7+1.7=9.4

3. Ditto plus the patch below: 5+4.3+5=14.3
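
(For anyone reproducing the arithmetic, a minimal sketch of the scoring
in C. Times are in hours, the example numbers are the baseline set
above, and run_score() is my own helper for illustration, not anything
in the kernel tree or rcutorture.)

/* Each run contributes its time to first failure, or the full run
 * length if it never failed. A negative failure time means "no
 * failure observed".
 */
#include <stdio.h>

static double run_score(double first_failure, double run_len)
{
	if (first_failure < 0.0 || first_failure > run_len)
		return run_len;		/* no failure: credit the whole run */
	return first_failure;		/* failed: credit time survived */
}

int main(void)
{
	/* Baseline set above: failures at 3.0h and 2.5h, one clean 5h run. */
	double score = run_score(3.0, 5.0) + run_score(2.5, 5.0) +
		       run_score(-1.0, 5.0);

	printf("score = %.1f of a possible %.1f\n", score, 3 * 5.0);
	return 0;
}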

Oh, and the reason I suspect that #2 is actually an improvement over
#1, despite its lower score here, is that my patch by itself produced
only a very small improvement in reliability, yet #3 scores well above
#1. This leads to the hypothesis that #2 really is helping out in some
way or another.

Thanx, Paul

> 1. Baseline: 3.0+2.5+10=15.5
>
> 2. Four patches from Anna-Maria and Thomas: 10+2.7+1.7=14.4
>
> 3. Ditto plus the patch below: 10+4.3+10=24.3
>
> Please note that these are nowhere near anything even resembling
> statistical significance. However, they are encouraging. I will do
> more runs, but also do shorter five-hour runs to increase the amount
> of data per unit time. Please note also that my patch by itself never
> did provide that great of an improvement, so there might be some sort
> of combination effect going on here. Or maybe it is just luck, who knows?
>
> Please note that I have not yet ported my diagnostic patches on top of
> these; however, the stacks have the usual schedule_timeout() entries.
> This is not too surprising from a software-engineering viewpoint:
> Locating several bugs at a given point in time usually indicates that
> there are more to be found. So in a sense we are lucky that the
> same test triggers at least one of those additional bugs.
>
> Thanx, Paul
>
> ------------------------------------------------------------------------
>
> commit accb0edb85526a05b934eac49658d05ea0216fc4
> Author: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Date: Thu Dec 7 13:18:44 2017 -0800
>
> timers: Ensure that timer_base ->clk accounts for time offline
>
> The timer_base ->must_forward_clk flag indicates that the next timer
> operation on that timer_base must forward ->clk to account for time
> that has passed unnoticed. One such case is the timer wheel going idle,
> and another is the corresponding CPU being offline. Note that it is not
> appropriate to set ->is_idle for the offline case because that could
> result in IPIing an offline CPU. Therefore, this commit instead sets
> ->must_forward_clk at CPU-offline time.
>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index ffebcf878fba..94cce780c574 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1875,6 +1875,7 @@ int timers_dead_cpu(unsigned int cpu)
>  
>  		BUG_ON(old_base->running_timer);
>  
> +		old_base->must_forward_clk = true;
>  		for (i = 0; i < WHEEL_SIZE; i++)
>  			migrate_timer_list(new_base, old_base->vectors + i);
>  
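
For readers not staring at kernel/time/timer.c: the ->must_forward_clk
flag set by the patch above is consumed by forward_timer_base() before
timers are queued onto a base. A rough paraphrase of that era's logic,
from memory and simplified, not verbatim (the tree is authoritative):

	/* Rough paraphrase of forward_timer_base(); simplified, not verbatim. */
	static inline void forward_timer_base(struct timer_base *base)
	{
		unsigned long jnow = READ_ONCE(jiffies);

		/* Nothing to do unless the base was flagged as possibly stale. */
		if (likely(!base->must_forward_clk))
			return;

		/* Keep the flag armed only while the base remains idle. */
		base->must_forward_clk = base->is_idle;

		if ((long)(jnow - base->clk) < 2)
			return;

		/*
		 * Catch ->clk up: fast-forward to jiffies, or only to the
		 * next expiry if that comes first.
		 */
		if (time_after(base->next_expiry, jnow))
			base->clk = jnow;
		else
			base->clk = base->next_expiry;
	}

As the changelog above says, an offlined CPU's base stops advancing
->clk, so flagging it in timers_dead_cpu() forces the next timer
operation on that base to catch the clock up rather than queue relative
to a stale value.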