Re: Crashes with 874bbfe600a6 in 3.18.25

From: Petr Mladek
Date: Tue Jan 26 2016 - 06:14:48 EST


On Tue 2016-01-26 10:34:00, Jan Kara wrote:
> On Sat 23-01-16 17:11:54, Thomas Gleixner wrote:
> > On Sat, 23 Jan 2016, Ben Hutchings wrote:
> > > On Fri, 2016-01-22 at 11:09 -0500, Tejun Heo wrote:
> > > > > Looks like it requires more than trivial backport (I think). Tejun?
> > > >
> > > > The timer migration has changed quite a bit.  Given that we've never
> > > > seen vmstat work crashing in 3.18 era, I wonder whether the right
> > > > thing to do here is reverting 874bbfe600a6 from 3.18 stable?
> > >
> > > It's not just 3.18 that has this; 874bbfe600a6 was backported to all
> > > stable branches from 3.10 onward.  Only the 4.2-ckt branch has
> > > 22b886dd10180939.
> >
> > 22b886dd10180939 fixes a bug which was introduced with the timer wheel
> > overhaul in 4.2. So only 4.2/3 should have it backported.
>
> Thanks for explanation. So do I understand right that timers are always run
> on the calling CPU in kernels prior to 4.2 and thus commit 874bbfe600a6 (to
> run timer for delayed work on the calling CPU) doesn't make sense there? If
> that is true then reverting the commit from older stable kernels is
> probably the easiest way to resolve the crashes.

The commit 874bbfe600a6 ("workqueue: make sure delayed work run in
local cpu") forces the timer to run on the local CPU. That might be
correct for vmstat, but I wonder whether it might break some other
delayed work user that depends on running on a different CPU.

Best Regards,
Petr