Re: [lkp-robot] [x86] 69218e4799: BUG:kernel_hang_in_boot_stage

From: Thomas Garnier
Date: Tue Mar 21 2017 - 16:25:49 EST


The issue seems to be related to exceptions taken on pages close to
the fixmap GDT remapping.

The original page fault happens in do_test_wp_bit, which sets a fixmap
entry to test the WP flag. If I increase the number of supported
processors, which increases the distance between the remapped GDT page
and the WP test page, the error does not reproduce.

I am still working out the exact distance that separates repro from
no-repro, as well as the exact root cause.

On Tue, Mar 21, 2017 at 12:23 PM, Thomas Garnier <thgarnie@xxxxxxxxxx> wrote:
> On Tue, Mar 21, 2017 at 12:20 PM, Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> On Tue, Mar 21, 2017 at 11:16 AM, Thomas Garnier <thgarnie@xxxxxxxxxx> wrote:
>> > This error happens even with Andy's TLS fix on 32-bit (GDT is on fixmap
>> > but not readonly). I am looking into it.
>> >
>> > KVM internal error. Suberror: 3
>> > extra data[0]: 80000b0e
>> > extra data[1]: 31
>>
>> If I read that right, it's extra data[1] 0x31, which is EXIT_REASON_EPT_MISCONFIG.
>>
>> I'm not seeing how the A bit in a GDT entry could have anything to do
>> with it. I'm assuming it happens even without Andy's patch?
>
> Correct.
>
>>
>> Linus
>
>
>
>
> --
> Thomas



--
Thomas