Re: [RFC PATCH] x86 NMI-safe INT3 and Page Fault (v3)
From: Andi Kleen
Date: Thu Apr 17 2008 - 12:45:57 EST
Mathieu Desnoyers wrote:
> * Jeremy Fitzhardinge (jeremy@xxxxxxxx) wrote:
>> Mathieu Desnoyers wrote:
>>> "This way lies madness. Don't go there."
>>>
>> It is a large amount of... stuff. This immediate values thing makes a big
>> improvement then?
>>
>
> As Ingo said: the NMI-safe traps and exceptions are not only useful to
> immediate values, but also to oprofile.
How is it useful to oprofile?
> On top of that, the LTTng kernel
> tracer has to write into vmalloc'd memory, so it's required there too.
All this effort changing really critical (and also fragile) code paths
used all the time is to handle setting markers into NMI functions. Or
actually the special case of setting markers in there that access
vmalloc() without calling vmalloc_sync().
NMI handlers are maybe 5-6 functions all over the kernel.
I just don't think it makes any sense to put markers in there.
It is a really small part of the kernel that is unlikely
to be really useful for anybody. You should rather first solve the
problem of tracing the other 99.999999% of the kernel properly.
And then you could actually set the markers in there if you're
crazy enough, just call vmalloc_sync().
Mathieu argued earlier that markers should be set everywhere but
that is also bogus because there is enough other code where
you cannot set them either (one example would be early boot code[1]).
And to do anything in NMI context you cannot use any locks, so you would
have to write all data structures used by the markers lockless. I did
that for the new mce code, but it's a really painful and bug-prone
experience that I cannot really recommend to anybody.
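To give a feel for what "lockless" means here, below is a minimal
user-space sketch, not kernel code and not taken from the mce code or
from the marker patches. It assumes C11 atomics, and the names nmi_log,
log_reserve, log_write and LOG_SIZE are all made up for illustration.
The point is that space is claimed with a compare-and-swap loop, so a
context that interrupts a writer halfway through can never spin on a
lock the interrupted writer still holds.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define LOG_SIZE 4096u          /* hypothetical buffer size */

/* Hypothetical append-only log shared by normal and NMI-like contexts. */
struct nmi_log {
    _Atomic size_t head;        /* next free byte; only ever advanced */
    char data[LOG_SIZE];
};

/*
 * Claim len bytes without taking a lock.
 * Returns a pointer to the claimed space, or NULL if it does not fit.
 */
static char *log_reserve(struct nmi_log *log, size_t len)
{
    size_t old = atomic_load_explicit(&log->head, memory_order_relaxed);

    do {
        /* head never exceeds LOG_SIZE, so this subtraction cannot wrap */
        if (len > LOG_SIZE - old)
            return NULL;
        /* on failure the CAS reloads old with the current head and we retry */
    } while (!atomic_compare_exchange_weak_explicit(
                 &log->head, &old, old + len,
                 memory_order_acq_rel, memory_order_relaxed));

    return &log->data[old];
}

/* Copy one record into the log. Returns 0 on success, -1 if the log is full. */
static int log_write(struct nmi_log *log, const void *rec, size_t len)
{
    char *slot = log_reserve(log, len);

    if (!slot)
        return -1;
    memcpy(slot, rec, len);
    return 0;
}
```

Even this toy only covers reservation. A reader still cannot tell
whether a claimed record has been completely written, and there is no
wraparound or reclaim, which is where the painful and bug-prone part
starts.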
And then NMIs (and machine checks) are a really obscure case, very
rarely used.
I think the right way is just to say that you cannot set markers
into NMI and machine check. Even with this patch it is highly unlikely
the resulting code will be correct anyways. Actually you could probably
set them without the patch with some effort (like calling vmalloc_sync),
but for the basic reasons mentioned above (lockless code is really
hard, NMI-type functions are fewer than a hundred lines in the millions
of kernel LOCs) it is just a very very bad idea.
-Andi
[1] Now that I mentioned it I still have enough faith to assume nobody
will be crazy enough to come up with some horrible hack to set markers
in early boot code too. But after seeing this patchkit ending up in a
git tree I'm not sure.