Re: Random panic in load_balance() with 3.16-rc

From: Peter Zijlstra
Date: Thu Jul 24 2014 - 03:52:09 EST


On Thu, Jul 24, 2014 at 04:18:48PM +0900, Michel Dänzer wrote:
> On 23.07.2014 18:31, Michel Dänzer wrote:
> > On 23.07.2014 18:25, Peter Zijlstra wrote:
> >> On Wed, Jul 23, 2014 at 10:28:19AM +0200, Peter Zijlstra wrote:
> >>
> >>> Of course, the other thing that patch did is clear sgp->power (now
> >>> sgc->capacity).
> >>
> >> Hmm, re-reading the thread there isn't a clear confirmation it's this
> >> patch at all. Could you perhaps bisect this to either verify it is
> >> indeed that patch we're talking about:
> >>
> >> caffcdd8d27b ("sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()")
> >>
> >> or find which patch is causing this.
> >
> > It can take a long time for the problem to occur, so I need to run at
> > least for one or two days to be at least somewhat sure a given kernel is
> > not affected.
> >
> > I'll try reproducing the problem with your previous suggestions first,
>
> Just happened again, with your robustness patch and setting
> sg->sgc->capacity = 0.

Yeah, that pretty much confirms it's not that patch :/

> > but if I manage to do that, I guess there's no alternative to bisecting...
>
> I hope the assembly output I sent earlier helps, I'm afraid bisecting
> this could be painful.

Yeah, lemme go have a look...