Re: [PATCH v2 1/2] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

From: Andy Lutomirski
Date: Fri Mar 11 2016 - 11:48:39 EST


On Thu, Oct 1, 2015 at 12:15 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
>> > These could still be open coded in an inlined fashion, like the scheduler usage.
>>
>> We could have a raw_rdmsr for those.
>>
>> OTOH, I'm still not 100% convinced that this warn-but-don't-die behavior is
>> worth the effort. This isn't a frequent source of bugs to my knowledge, and we
>> don't try to recover from incorrect cr writes, out-of-bounds MMIO, etc, so do we
>> really gain much by rigging a recovery mechanism for rdmsr and wrmsr failures
>> for code that doesn't use the _safe variants?
>
> It's just the general principle really: don't crash the kernel on bootup. There's
> few things more user hostile than that.
>
> Also, this would maintain the status quo: since we now (accidentally) don't crash
> the kernel on distro kernels (but silently and unsafely ignore the faulting
> instruction), we should not regress that behavior (by adding the chance to crash
> again), but improve upon it.

Just a heads up: the extable improvements in tip:ras/core make it
straightforward to get the best of all worlds: explicit failure
handling (written in C!), no fast path overhead whatsoever, and no new
garbage in the exception handlers.
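
For illustration only, here is a toy user-space model of that idea, not
the actual patches: each exception table entry names a C handler, the
lookup happens only after a fault, and the handler decides how to
recover. Every name below (toy_regs, toy_ex_entry, toy_fixup_exception,
toy_handler_rdmsr_unsafe) and the warning text are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the saved register state at fault time. */
struct toy_regs {
	uintptr_t ip;	/* address of the faulting instruction */
	uint64_t ax;	/* low half of the RDMSR result */
	uint64_t dx;	/* high half of the RDMSR result */
	uint64_t cx;	/* MSR index that was being accessed */
};

struct toy_ex_entry;
typedef int (*toy_ex_handler_t)(const struct toy_ex_entry *e,
				struct toy_regs *regs);

/* One table entry: faulting address, resume address, C handler. */
struct toy_ex_entry {
	uintptr_t insn;
	uintptr_t fixup;
	toy_ex_handler_t handler;
};

static int toy_warned;

/* Plain fixup: just resume at the fixup address. */
static int toy_handler_default(const struct toy_ex_entry *e,
			       struct toy_regs *regs)
{
	regs->ip = e->fixup;
	return 1;
}

/* Non-"safe" RDMSR: warn once, pretend the MSR read as zero, carry on. */
static int toy_handler_rdmsr_unsafe(const struct toy_ex_entry *e,
				    struct toy_regs *regs)
{
	if (!toy_warned) {
		fprintf(stderr, "unchecked RDMSR from 0x%x\n",
			(unsigned int)regs->cx);
		toy_warned = 1;
	}
	regs->ax = 0;
	regs->dx = 0;
	regs->ip = e->fixup;
	return 1;
}

/*
 * Runs only on the fault path, so the non-faulting path pays nothing.
 * Returns 1 if a handler fixed the fault up, 0 if nothing matched.
 */
static int toy_fixup_exception(const struct toy_ex_entry *table, size_t n,
			       struct toy_regs *regs)
{
	size_t i;

	assert(table != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (table[i].insn == regs->ip)
			return table[i].handler(&table[i], regs);
	}
	return 0;
}
```

In this model, a fault at an address listed in the table resumes at the
entry's fixup address with ax and dx zeroed, while a fault at an
unlisted address returns 0 so the caller can treat it as a real oops.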

Patches coming once I test them.

>
> Thanks,
>
> Ingo



--
Andy Lutomirski
AMA Capital Management, LLC