Re: Crashes with 874bbfe600a6 in 3.18.25

From: Thomas Gleixner
Date: Wed Feb 03 2016 - 14:07:23 EST


On Wed, 3 Feb 2016, Tejun Heo wrote:
> On Wed, Feb 03, 2016 at 07:46:11PM +0100, Thomas Gleixner wrote:
> > > > So I think 874bbfe600a6 is really bogus. It should be reverted. We
> > > > already have a proper fix for vmstat 176bed1de5bf ("vmstat: explicitly
> > > > schedule per-cpu work on the CPU we need it to run on"), which
> > > > should be used for the stable trees as a replacement.
> > >
> > > It's not bogus. We can't flip a property that has been guaranteed
> > > without any provision for verification. Why do you think vmstat blew
> > > up in the first place? vmstat would be the canary case as it runs
> > > frequently on all systems. It's exactly the sign that we can't break
> > > this guarantee willy-nilly.
> >
> > You're in complete failure denial mode once again.
>
> Well, you're in an unnecessary escalation mode as usual. Was the
> attitude really necessary? Chill out and read the thread again.
> Michal is saying the dwork->cpu assignment was bogus and I was
> refuting that.

Right, but at the same time you could have admitted that the current state is
buggy and needs a sanity check in unbound_pwq_by_node().

> Michal brought it up here but there's a different thread where Mike
> reported NUMA_NO_NODE issue and I already posted the fix.
>
> http://lkml.kernel.org/g/20160203185425.GK14091@xxxxxxxxxxxxxxx

5 minutes ago, w/o cc'ing the people who participated in that discussion.

Thanks,

tglx