Re: Timer refuses to expire

From: Paul E. McKenney
Date: Thu Dec 07 2017 - 19:32:00 EST

Next message: Stephen Rothwell: "linux-next: manual merge of the net-next tree with the vfs tree"
Previous message: Andrew Morton: "Re: [PATCH -mm] mm, swap: Fix race between swapoff and some swap operations"
In reply to: Paul E. McKenney: "Re: Timer refuses to expire"
Next in thread: Boqun Feng: "Re: Timer refuses to expire"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Dec 07, 2017 at 01:45:14PM -0800, Paul E. McKenney wrote:
> On Thu, Dec 07, 2017 at 06:56:17AM -0800, Paul E. McKenney wrote:
> > On Thu, Dec 07, 2017 at 03:03:50PM +0800, Boqun Feng wrote:
>
> [ . . . ]
>
> > > > What I did instead was to dump out the state of the task that
> > > > __cpuhp_kick_ap() waits on, please see the patch at the very end of this
> > > > email. This triggered as shown below, and you guessed it, that task is
> > > > waiting on a grace period. Which I am guessing won't happen until the
> > > > outgoing CPU reaches CPUHP_TIMERS_DEAD state and calls timers_dead_cpu().
> > > > Which will prevent RCU's grace-period kthread from ever awakening, which
> > > > will prevent the task that __cpuhp_kick_ap() waits on from ever awakening,
> > > > which will prevent the outgoing CPU from reaching CPUHP_TIMERS_DEAD state.
> > > >
> > > > Deadlock.
> > >
> > > There is one thing I'm confused here. Sure, this is a deadlock, but the
> > > timer should still work in such a deadlock, right? I mean, the timer of
> > > schedule_timeout() should be able to wake up rcu_gp_kthread() even in
> > > this case? And yes, the gp kthread will continue to wait due to the
> > > deadlock, but the deadlock can not explain the "Waylayed timer", right?
> >
> > My belief is that the timer cannot fire because it is still on the
> > offlined CPU, and that CPU has not yet reached timers_dead_cpu().
> > But I might be missing something subtle in either the timers code or the
> > CPU-hotplug code, so please do check my reasoning here. (I am relying on
> > the "timer->flags: 0x40000007" and the "cpuhp/7" below, which I believe
> > means that the timer is on CPU 7 and that it is CPU 7 that is in the
> > process of going offline.)
> >
> > The "Waylayed timer" happens because the RCU CPU stall warning code
> > wakes up the grace-period kthread. This is driven out of the
> > scheduling-clock tick, so is unaffected by timers, though it does
> > rely on the jiffies counter continuing to be incremented.
> >
> > So what am I missing here?
>
> Well, last night's runs had situations where the ->flags CPU didn't
> match the CPU going offline, so I am clearly missing something or another.
>
> One thing I might have been missing was the CPU-online processing.
> What happens if a CPU goes offline, comes back online, but before ->clk
> gets adjusted there is a schedule_timeout()? Now, schedule_timeout()
> does compute the absolute ->expires time using jiffies, so the wakeup time
> should not be too far off of the desired time. Except that the timers
> now have something like 8% slop, and that slop will be calculated on the
> difference between the desired expiration time and the (way outdated)
> ->clk value. So the 8% might be a rather large number. For example,
> if the CPU was offline for 12 minutes (unlikely but entirely possible
> with rcutorture testing's random onlining and offlining), the slop on
> a 3-millisecond timer might be a full minute.
>
> To my timer-naive eyes, it looks like a simple fix is to set
> old_base->must_forward_clk to true in timers_dead_cpu() for each
> timer_base, as shown below. The other possibility that I considered
> was to instead set ->is_idle, but that looked like an engraved
> invitation to send IPIs to offline CPUs.
>
> I am giving it a spin. I still believe that the offline deadlock
> scenario can happen, but one thing at a time...
>
> Thoughts?

And it is hard to tell whether or not this is helping. Not too
surprising, given that most of the splats seem to be the deadlock
case instead.

Thanx, Paul

Next message: Stephen Rothwell: "linux-next: manual merge of the net-next tree with the vfs tree"
Previous message: Andrew Morton: "Re: [PATCH -mm] mm, swap: Fix race between swapoff and some swap operations"
In reply to: Paul E. McKenney: "Re: Timer refuses to expire"
Next in thread: Boqun Feng: "Re: Timer refuses to expire"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]