Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17

From: Mathieu Desnoyers
Date: Thu Feb 18 2016 - 18:14:33 EST


----- On Feb 18, 2016, at 6:51 AM, Ross Green rgkernel@xxxxxxxxx wrote:

> On Thu, Feb 18, 2016 at 10:19 AM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> On Wed, Feb 17, 2016 at 12:28:29PM -0800, Paul E. McKenney wrote:
>>> On Wed, Feb 17, 2016 at 08:45:54PM +0100, Peter Zijlstra wrote:
>>> > On Wed, Feb 17, 2016 at 11:28:17AM -0800, Paul E. McKenney wrote:
>>> > > On Tue, Feb 16, 2016 at 09:45:49PM -0800, Paul E. McKenney wrote:
>>> > > > On Tue, Feb 09, 2016 at 09:11:55PM +1100, Ross Green wrote:
>>> > > > > Continued testing with the latest linux-4.5-rc3 release.
>>> > > > >
>>> > > > > Please find attached a copy of traces from dmesg:
>>> > > > >
>>> > > > > There is a lot more debug and trace data so hopefully this will shed
>>> > > > > some light on what might be happening here.
>>> > > > >
>>> > > > > My testing remains run a series of simple benchmarks, let that run to
>>> > > > > completion and then leave the system idle away with just a few daemons
>>> > > > > running.
>>> > > > >
>>> > > > > the self detected stalls in this instance turned up after a days run time.
>>> > > > > There were NO heavy artificial computational loads on the machine.
>>> > > >
>>> > > > It does indeed look quiet on that dmesg for a good long time.
>>> > > >
>>> > > > The following insanely crude not-for-mainline hack -might- be producing
>>> > > > good results in my testing. It will take some time before I can claim
>>> > > > statistically different results. But please feel free to give it a go
>>> > > > in the meantime. (Thanks to Al Viro for pointing me in this direction.)
>>> >
>>> > Your case was special in that is was hotplug triggering it, right?
>>>
>>> Yes, it has thus far only shown up with CPU hotplug enabled.
>>>
>>> > I was auditing the hotplug paths involved when I fell ill two weeks ago,
>>> > and have not really made any progress on that because of that :/
>>>
>>> I have always said that being sick is bad for one's health, but I didn't
>>> realize that it could be bad for the kernel's health as well. ;-)
>>>
>>> > I'll go have another look, I had a vague feeling for a race back then,
>>> > lets see if I can still remember how..
>>>
>>> I believe that I can -finally- get an ftrace_dump() to happen within
>>> 10-20 milliseconds of the problem, which just might be soon enough
>>> after the problem to gather some useful information. I am currently
>>> testing this theory with "ftrace trace_event=sched_waking,sched_wakeup"
>>> boot arguments on a two-hour run.
>>
>> And apparently another way to greatly reduce the probability of this
>> bug occurring is to enable ftrace. :-/
>>
>> Will try longer runs.
>>
>> Thanx, Paul
>>
>>> If this works out, what would be a useful set of trace events for me
>>> to capture?
>>>
>>> Thanx, Paul
>>
>
> Well managed to catch this one on linux-4.5-rc4.
>
> Took over 3 days and 7 hours to hit.
>
> Same test as before, boot, run a series of simple benchmarks and then
> let the machine just idle away.
>
> As before, the reported stall, AND everything keeps on running as if
> nothing had happened.
>
> I notice in the task dump for both the cpus, the swapper is running on
> both cpus.
>
> does that make any sense?
> There is around 3% of memory actually used.
>
> Anyway, please find attached a copy of the dmesg output.
>
> Hope this helps a few people fill in the missing pieces here.

What seems weird here is that all code paths in the loop
perform a WRITE_ONCE(rsp->gp_activity, jiffies), which
implies progress in each case:

- rcu_gp_init() does it,
- both branches in the QS forcing loop do it, either
through rcu_gp_fqs(), or directly,

This means the thread is really stalled, and the backtrace
shows those threads are stalled on the

ret = wait_event_interruptible_timeout(rsp->gp_wq,
rcu_gp_fqs_check_wake(rsp, &gf), j);

Since this is a *_timeout wait, for which the timeout
is bounded by "j" jiffies which is bounded by "HZ" value,
we should really not stay there too long, even if we are
not awakened by whatever is supposed to awaken us.

So unless I'm missing something, it would look like
schedule_timeout() is missing its timeout there.

Perhaps we only experience this missed timeout here
because typically there is always a wakeup coming sooner
or later on relatively busy systems. This one is idle
for quite a while.

Thoughts ?

Thanks,

Mathieu


>
> Regards,
>
> Ross Green

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Attachment: dmesg-4.5-rc4
Description: Binary data