Re: mm: OS boot failed when set command-line kmemcheck=1

From: Vegard Nossum
Date: Wed Feb 26 2014 - 05:14:57 EST


On 26 February 2014 09:43, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Wed, Feb 19, 2014 at 02:24:41PM -0800, David Rientjes wrote:
>> On Wed, 19 Feb 2014, Xishi Qiu wrote:
>>
>> > Here is a warning; I don't know whether it is related to my hardware.
>> > If I set "kmemcheck=1 nowatchdog", it boots.
>> >
>> > code:
>> > 	...
>> > 	pte = kmemcheck_pte_lookup(address);
>> > 	if (!pte)
>> > 		return false;
>> >
>> > 	WARN_ON_ONCE(in_nmi());
>> >
>> > 	if (error_code & 2)
>> > 	...
>
> That code seems to assume NMI context cannot fault; this is false since
> a while back (v3.9 or thereabouts).
>
>> > [ 10.920757] [<ffffffff810452c1>] kmemcheck_fault+0xb1/0xc0
>> > [ 10.920760] [<ffffffff814d262b>] __do_page_fault+0x39b/0x4c0
>> > [ 10.920763] [<ffffffff814d2829>] do_page_fault+0x9/0x10
>> > [ 10.920765] [<ffffffff814cf222>] page_fault+0x22/0x30
>> > [ 10.920774] [<ffffffff8101eb02>] intel_pmu_handle_irq+0x142/0x3a0
>> > [ 10.920777] [<ffffffff814d0655>] perf_event_nmi_handler+0x35/0x60
>> > [ 10.920779] [<ffffffff814cfe83>] nmi_handle+0x63/0x150
>> > [ 10.920782] [<ffffffff814cffd3>] default_do_nmi+0x63/0x290
>> > [ 10.920784] [<ffffffff814d02a8>] do_nmi+0xa8/0xe0
>> > [ 10.920786] [<ffffffff814cf527>] end_repeat_nmi+0x1e/0x2e
>
> And this does indeed show a fault from NMI context; which is totally
> expected.
>
> kmemcheck needs to be fixed; but I've no clue how any of that works.

IIRC the reason we don't support page faults in NMI context is that we
may already be handling an existing fault (or trap) when the NMI hits.
So that would mess up kmemcheck's working state. I don't really see
that anything has changed in this respect lately, so it could always
have been broken.

I think the way we dealt with this before was just to make sure that
NMI handlers don't access any kmemcheck-tracked memory (i.e. to make
sure that all memory touched by NMI handlers has been marked NOTRACK).
And the purpose of this warning is just to tell us that something
inside an NMI triggered a page fault (in this specific case, it seems
to be intel_pmu_handle_irq).

I guess there are two ways forward:

 - create a stack of things that kmemcheck is working on, so that we
   handle recursive page faults
 - try to figure out why intel_pmu_handle_irq() faults and add a
   (kmemcheck-specific) workaround for it
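
To illustrate the first option, here is a minimal user-space sketch of a
stack of in-flight fault contexts. It is not existing kmemcheck code: every
name in it (kc_context, kc_stack, kc_push, kc_pop, KC_MAX_DEPTH) is
hypothetical, and the depth bound is an assumption. The idea is only that a
fault taken from an NMI pushes its own context instead of overwriting the
state of the fault that was interrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is existing kmemcheck code. */
#define KC_MAX_DEPTH 4	/* assumed bound: fault + trap + NMI, with slack */

struct kc_context {
	unsigned long addr;	/* faulting address being handled */
	bool in_nmi;		/* was this fault taken from NMI context? */
};

struct kc_stack {
	struct kc_context ctx[KC_MAX_DEPTH];
	size_t depth;		/* number of contexts currently in flight */
};

/* Enter a fault: save a fresh context above whatever is in flight. */
static struct kc_context *kc_push(struct kc_stack *s, unsigned long addr,
				  bool in_nmi)
{
	struct kc_context *c;

	if (s->depth == KC_MAX_DEPTH)
		return NULL;	/* nested too deeply: caller must bail out */

	c = &s->ctx[s->depth++];
	c->addr = addr;
	c->in_nmi = in_nmi;
	return c;
}

/*
 * Leave a fault: drop the top context and return the interrupted one,
 * or NULL if nothing else is in flight.
 */
static struct kc_context *kc_pop(struct kc_stack *s)
{
	if (s->depth == 0)
		return NULL;

	s->depth--;
	return s->depth ? &s->ctx[s->depth - 1] : NULL;
}
```

In a real implementation the stack would presumably have to be per-CPU, and
the push/pop would have to be safe against an NMI arriving in the middle of
either operation; the sketch ignores both points.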

Incidentally, do you remember what exactly changed wrt page faults in
NMI context?


Vegard