Re: Random panic in load_balance() with 3.16-rc

From: Nick Krause
Date: Fri Jul 25 2014 - 00:00:45 EST


On Thu, Jul 24, 2014 at 11:55 PM, Alexei Starovoitov <ast@xxxxxxxxxx> wrote:
> On Fri, Jul 25, 2014 at 10:25:03AM +0900, Michel DÃnzer wrote:
>> [ Adding the Debian kernel and gcc teams to Cc ]
>>
>> > movq $load_balance_mask, -136(%rbp) #, %sfp
>> > subq $184, %rsp #,
>> >
>> > Anyway, this is not a kernel bug. This is your compiler creating
>> > completely broken code. We may need to add a warning to make sure
>> > nobody compiles with gcc-4.9.0, and the Debian people should probably
>> > downgrate their shiny new compiler.
>>
>> Attached is fair.s from Debian gcc 4.8.3-5. Does that look better? I'm
>> going to try reproducing the problem with a kernel built by that now.
>
> 4.8 and 4.7 don't hit the problem on this test.
> 4.9 with -O2 compiles this file ok. 4.9 with -Os triggers it.
>
> -mno-red-zone only affected prologue emition in gcc. This part didn't
> change between the releases. So the bug is quite deep.
> What seems to be happening is that 2nd pass of instruction scheduler
> (after emit prologue and reg alloc) is ignoring barrier properties
> of 'subq $184, %rsp' and moving 'movq $.., -136(%rbp)' instruction
> ahead of it. afaik rtl sched was never aware of 'red-zone'.
> As an ex-compiler guy, I'm worried that this bug exists in earlier
> releases. rtl backend guys need to take a serious look at it.
> imo this is very serious bug, since broken red-zone is extremelly
> hard to debug.
> There are two weak test in gcc testsuite related to -mno-red-zone,
> but not a single test that actually check that it is doing
> the right thing. It is scary. I hope I'm wrong with this analysis.
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

Alexi,
Thanks for replying and sending this email to the Debian developers,
seems to me a very serious issue with gcc. I don't known much about
compilers but I trust your experience here.
Nick
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/