Re: [PATCH v3 2/5] x86/msr: Carry on after a non-"safe" MSR access fails without !panic_on_oops

From: Andy Lutomirski
Date: Sat Mar 12 2016 - 12:33:18 EST


On Sat, Mar 12, 2016 at 7:36 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
>> This demotes an OOPS and likely panic due to a failed non-"safe" MSR
>> access to a WARN and, for RDMSR, a return value of zero. If
>> panic_on_oops is set, then failed unsafe MSR accesses will still
>> oops and panic.
>>
>> To be clear, this type of failure should *not* happen. This patch
>> exists to minimize the chance of nasty undebuggable failures on
>> systems that used to work due to a now-fixed CONFIG_PARAVIRT=y bug.
>>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> ---
>> arch/x86/include/asm/msr.h | 10 ++++++++--
>> arch/x86/mm/extable.c | 33 +++++++++++++++++++++++++++++++++
>> 2 files changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
>> index 93fb7c1cffda..1487054a1a70 100644
>> --- a/arch/x86/include/asm/msr.h
>> +++ b/arch/x86/include/asm/msr.h
>> @@ -92,7 +92,10 @@ static inline unsigned long long native_read_msr(unsigned int msr)
>> {
>> DECLARE_ARGS(val, low, high);
>>
>> - asm volatile("rdmsr" : EAX_EDX_RET(val, low, high) : "c" (msr));
>> + asm volatile("1: rdmsr\n"
>> + "2:\n"
>> + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_rdmsr_unsafe)
>> + : EAX_EDX_RET(val, low, high) : "c" (msr));
>> if (msr_tracepoint_active(__tracepoint_read_msr))
>> do_trace_read_msr(msr, EAX_EDX_VAL(val, low, high), 0);
>> return EAX_EDX_VAL(val, low, high);
>> @@ -119,7 +122,10 @@ static inline unsigned long long native_read_msr_safe(unsigned int msr,
>> static inline void native_write_msr(unsigned int msr,
>> unsigned low, unsigned high)
>> {
>> - asm volatile("wrmsr" : : "c" (msr), "a"(low), "d" (high) : "memory");
>> + asm volatile("1: wrmsr\n"
>> + "2:\n"
>> + _ASM_EXTABLE_HANDLE(1b, 2b, ex_handler_wrmsr_unsafe)
>> + : : "c" (msr), "a"(low), "d" (high) : "memory");
>> if (msr_tracepoint_active(__tracepoint_read_msr))
>> do_trace_write_msr(msr, ((u64)high << 32 | low), 0);
>> }
>> diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
>> index 9dd7e4b7fcde..f310714e6e6d 100644
>> --- a/arch/x86/mm/extable.c
>> +++ b/arch/x86/mm/extable.c
>> @@ -49,6 +49,39 @@ bool ex_handler_ext(const struct exception_table_entry *fixup,
>> }
>> EXPORT_SYMBOL(ex_handler_ext);
>>
>> +bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
>> + struct pt_regs *regs, int trapnr)
>> +{
>> + WARN(1, "unsafe MSR access error: RDMSR from 0x%x",
>> + (unsigned int)regs->cx);
>
> Btw., instead of the safe/unsafe naming (which carries emotional and security
> connotations), shouldn't we move this over to the _check() (or _checking())
> naming that we use in other places in the kernel?
>
> I.e.:
>
> rdmsr(msr, l, h);
>
> and:
>
> if (rdmsr_check(msr, l, h)) {
> ...
> }
>
> and then we could name the helpers as _check() and _nocheck() - which is neutral
> naming.

Will do as a separate followup series.

At least with this series applied, the functions named _safe all point
to each other correctly.
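
For illustration only, the checked/unchecked split discussed above can be
sketched in plain userspace C. This is a sketch under stated assumptions, not
kernel code: mock_hw_rdmsr(), mock_rdmsr_check() and mock_rdmsr_nocheck() are
hypothetical names, and the single fake MSR (0x10) and its value are made up.
The only behavior taken from the patch description is that a failed unchecked
read warns and returns zero, while a checked read reports the error to its
caller.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the hardware: only MSR 0x10 "exists".
 * A nonzero return models the fault a real RDMSR would raise.
 */
static int mock_hw_rdmsr(uint32_t msr, uint64_t *val)
{
	if (msr == 0x10) {
		*val = 0x1122334455667788ULL;
		return 0;
	}
	return -1;
}

/* Checked variant: the failure is returned and the caller must handle it. */
static int mock_rdmsr_check(uint32_t msr, uint32_t *lo, uint32_t *hi)
{
	uint64_t val = 0;
	int err = mock_hw_rdmsr(msr, &val);

	/* On failure val stays 0, so both halves are well defined. */
	*lo = (uint32_t)val;
	*hi = (uint32_t)(val >> 32);
	return err;
}

/*
 * Unchecked variant: a failure is not expected. Warn loudly, then
 * carry on as if the read had succeeded and returned zero.
 */
static uint64_t mock_rdmsr_nocheck(uint32_t msr)
{
	uint64_t val;

	if (mock_hw_rdmsr(msr, &val)) {
		fprintf(stderr, "WARN: unchecked MSR access error: RDMSR from 0x%x\n",
			(unsigned int)msr);
		return 0;
	}
	return val;
}
```

A caller that can tolerate a missing MSR would test the return value of
mock_rdmsr_check(); a caller that believes the MSR must exist would use
mock_rdmsr_nocheck() and see a warning plus a zero value if that belief
turns out to be wrong.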

--Andy