Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17

From: Ross Green
Date: Fri Feb 19 2016 - 00:59:17 EST

Next message: Eddie Huang: "[PATCH] ARM64: dts: Mediatek: mt8173-evb: fix access MMC fail issue"
Previous message: Sergey Senozhatsky: "Re: [RFC PATCH 3/3] mm/zsmalloc: change ZS_MAX_PAGES_PER_ZSPAGE"
In reply to: Paul E. McKenney: "Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, Feb 19, 2016 at 3:22 PM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Feb 18, 2016 at 11:14:21PM +0000, Mathieu Desnoyers wrote:
>> ----- On Feb 18, 2016, at 6:51 AM, Ross Green rgkernel@xxxxxxxxx wrote:
>>
>> > On Thu, Feb 18, 2016 at 10:19 AM, Paul E. McKenney
>> > <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>> >> On Wed, Feb 17, 2016 at 12:28:29PM -0800, Paul E. McKenney wrote:
>> >>> On Wed, Feb 17, 2016 at 08:45:54PM +0100, Peter Zijlstra wrote:
>> >>> > On Wed, Feb 17, 2016 at 11:28:17AM -0800, Paul E. McKenney wrote:
>> >>> > > On Tue, Feb 16, 2016 at 09:45:49PM -0800, Paul E. McKenney wrote:
>> >>> > > > On Tue, Feb 09, 2016 at 09:11:55PM +1100, Ross Green wrote:
>> >>> > > > > Continued testing with the latest linux-4.5-rc3 release.
>> >>> > > > >
>> >>> > > > > Please find attached a copy of traces from dmesg:
>> >>> > > > >
>> >>> > > > > There is a lot more debug and trace data so hopefully this will shed
>> >>> > > > > some light on what might be happening here.
>> >>> > > > >
>> >>> > > > > My testing remains run a series of simple benchmarks, let that run to
>> >>> > > > > completion and then leave the system idle away with just a few daemons
>> >>> > > > > running.
>> >>> > > > >
>> >>> > > > > the self detected stalls in this instance turned up after a days run time.
>> >>> > > > > There were NO heavy artificial computational loads on the machine.
>> >>> > > >
>> >>> > > > It does indeed look quiet on that dmesg for a good long time.
>> >>> > > >
>> >>> > > > The following insanely crude not-for-mainline hack -might- be producing
>> >>> > > > good results in my testing. It will take some time before I can claim
>> >>> > > > statistically different results. But please feel free to give it a go
>> >>> > > > in the meantime. (Thanks to Al Viro for pointing me in this direction.)
>> >>> >
>> >>> > Your case was special in that is was hotplug triggering it, right?
>> >>>
>> >>> Yes, it has thus far only shown up with CPU hotplug enabled.
>> >>>
>> >>> > I was auditing the hotplug paths involved when I fell ill two weeks ago,
>> >>> > and have not really made any progress on that because of that :/
>> >>>
>> >>> I have always said that being sick is bad for one's health, but I didn't
>> >>> realize that it could be bad for the kernel's health as well. ;-)
>> >>>
>> >>> > I'll go have another look, I had a vague feeling for a race back then,
>> >>> > lets see if I can still remember how..
>> >>>
>> >>> I believe that I can -finally- get an ftrace_dump() to happen within
>> >>> 10-20 milliseconds of the problem, which just might be soon enough
>> >>> after the problem to gather some useful information. I am currently
>> >>> testing this theory with "ftrace trace_event=sched_waking,sched_wakeup"
>> >>> boot arguments on a two-hour run.
>> >>
>> >> And apparently another way to greatly reduce the probability of this
>> >> bug occurring is to enable ftrace. :-/
>> >>
>> >> Will try longer runs.
>> >>
>> >> Thanx, Paul
>> >>
>> >>> If this works out, what would be a useful set of trace events for me
>> >>> to capture?
>> >>>
>> >>> Thanx, Paul
>> >>
>> >
>> > Well managed to catch this one on linux-4.5-rc4.
>> >
>> > Took over 3 days and 7 hours to hit.
>> >
>> > Same test as before, boot, run a series of simple benchmarks and then
>> > let the machine just idle away.
>> >
>> > As before, the reported stall, AND everything keeps on running as if
>> > nothing had happened.
>> >
>> > I notice in the task dump for both the cpus, the swapper is running on
>> > both cpus.
>> >
>> > does that make any sense?
>> > There is around 3% of memory actually used.
>> >
>> > Anyway, please find attached a copy of the dmesg output.
>> >
>> > Hope this helps a few people fill in the missing pieces here.
>>
>> What seems weird here is that all code paths in the loop
>> perform a WRITE_ONCE(rsp->gp_activity, jiffies), which
>> implies progress in each case:
>>
>> - rcu_gp_init() does it,
>> - both branches in the QS forcing loop do it, either
>> through rcu_gp_fqs(), or directly,
>>
>> This means the thread is really stalled, and the backtrace
>> shows those threads are stalled on the
>>
>> ret = wait_event_interruptible_timeout(rsp->gp_wq,
>> rcu_gp_fqs_check_wake(rsp, &gf), j);
>>
>> Since this is a *_timeout wait, for which the timeout
>> is bounded by "j" jiffies which is bounded by "HZ" value,
>> we should really not stay there too long, even if we are
>> not awakened by whatever is supposed to awaken us.
>
> Completely agreed on this seeming weird. ;-)
>
>> So unless I'm missing something, it would look like
>> schedule_timeout() is missing its timeout there.
>>
>> Perhaps we only experience this missed timeout here
>> because typically there is always a wakeup coming sooner
>> or later on relatively busy systems. This one is idle
>> for quite a while.
>>
>> Thoughts ?
>
> I can also make this happen (infrequently) on a busy system with
> rcutorture, but only with frequent CPU hotplugging. Ross is making
> it happen with pure idle.
>
> I did manage to make this fail with ftrace running, but thus far
> have not been able to get a trace that actually includes any
> activity for the grace-period kthread. Working on tightening
> up the tests...
>
> Thanx, Paul
>
>> Thanks,
>>
>> Mathieu
>>
>>
>> >
>> > Regards,
>> >
>> > Ross Green
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com
>
>

It is only happening on the pandaboard.
I always use this for testing a new kernel as it is a good generic
armv7 system with not too much exotic hardware attached to it.

Thinking about what else might be different with this system ...
The realtime clock is not battery backed up so I use ntpd to adjust the time.
just wondering if there might be something weird going on when there
is a time adjustment. Certainly sets time at startup once network is
established.
Gets the correct time.
I assumed over such short times the time differentials and drift would
not be that great.

I do have several other armv7 machines that are running usually with
faster and more hardware on board, but have not run these with latest
software.

the only other machine I have tested these later kernels on is intel
based laptops.
Never seen these sort of stalls there.
They are on the whole very active machines.
Loads more software on board and usually some gui graphics interface.
This pandaboard is running without any graphics.

I will include the kernel .config I have for the machine to see if
that sheds any light on things.

As Paul has said this pandaboard is really just idling along, with no
real work happening.
There are only a handful of daemons running in the background that
should not be doing very much, most of the time.

the only CPU intensive load I run is a series of crypto and graphics
bit blit operations that take around 10-20 minutes to complete. Then I
leave the machine alone.

Regards,

Ross Green

Attachment: kconfig-4.4
Description: Binary data

Next message: Eddie Huang: "[PATCH] ARM64: dts: Mediatek: mt8173-evb: fix access MMC fail issue"
Previous message: Sergey Senozhatsky: "Re: [RFC PATCH 3/3] mm/zsmalloc: change ZS_MAX_PAGES_PER_ZSPAGE"
In reply to: Paul E. McKenney: "Re: rcu_preempt self-detected stall on CPU from 4.5-rc3, since 3.17"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]