Re: [PATCH v3 2/2] x86/fault: Decode and print #PF oops in human readable form

From: Dmitry Vyukov
Date: Mon Apr 29 2019 - 09:58:34 EST


On Fri, Dec 21, 2018 at 10:37 PM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> Linus pointed out that deciphering the raw #PF error code and printing
> a more human readable message are two different things, and also that
> printing the negative cases is mostly just noise[1]. For example, the
> USER bit doesn't mean the fault originated in user code and stating
> that an oops wasn't due to a protection keys violation isn't interesting
> since an oops on a keys violation is a one-in-a-million scenario.
>
> Remove the per-bit decoding of the error code and instead print:
> - the raw error code
> - why the fault occurred
> - the effective privilege level of the access
> - the type of access
> - whether the fault originated in user code or kernel code
>
> This provides the user with the information needed to triage 99.9% of
> oopses without polluting the log with useless information or conflating
> the error_code with the CPL.
>
> Sample output:
>
> BUG: kernel NULL pointer dereference, address = 0000000000000008
> #PF: supervisor-privileged instruction fetch from kernel code
> #PF: error_code(0x0010) - not-present page
>
> BUG: unable to handle page fault for address = ffffbeef00000000
> #PF: supervisor-privileged instruction fetch from kernel code
> #PF: error_code(0x0010) - not-present page
>
> BUG: unable to handle page fault for address = ffffc90000230000
> #PF: supervisor-privileged write access from kernel code
> #PF: error_code(0x000b) - reserved bit violation



As a note for future.
I see 3 recent changes that shuffle crash format forth and back:

ea2f8d60603efbd1cb4e193a593945a2fe24d264 x86/fault: Make fault
messages more succinct
f28b11a2abd9cf5535baa03e28fc63ad0e14f52a x86/fault: Reword initial BUG
message for unhandled page faults
18ea35c5ed9921867194a8efc2a0ac2d5a3c7d2a x86/fault: Decode and print
#PF oops in human readable form

Ideally, such changes are coordinated with kernel testing for gradual
rollout. Since kernel does not provide an official facility for crash
parsing, the actual output effectively becomes part of public API.
Such changes in format may lead to glues bugs, not duped bugs and
missed bugs (including incorrect bisection, when something stopped
being detected as a crash at all).
Ideally-ideally, kernel provides some kind official output parsing
facility (that can understand when kernel has bugged, when the crash
starts/end and some kind of stable identity for a crash) and includes
output parsing tests for all types of crashes because this types
kernel sabotaging testing happens periodically.

We already have samples for "address =" format, but not for "address:"
yet. So I still can't update parsing for the final format and we have
massively glued crash reports.






> [1] https://lkml.kernel.org/r/CAHk-=whk_fsnxVMvF1T2fFCaP2WrvSybABrLQCWLJyCvHw6NKA@xxxxxxxxxxxxxx
>
> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Borislav Petkov <bp@xxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> ---
> arch/x86/mm/fault.c | 42 +++++++++++-------------------------------
> 1 file changed, 11 insertions(+), 31 deletions(-)
>
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index 39dccdfef496..a4421cbd230b 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -603,24 +603,9 @@ static void show_ldttss(const struct desc_ptr *gdt, const char *name, u16 index)
> name, index, addr, (desc.limit0 | (desc.limit1 << 16)));
> }
>
> -/*
> - * This helper function transforms the #PF error_code bits into
> - * "[PROT] [USER]" type of descriptive, almost human-readable error strings:
> - */
> -static void err_str_append(unsigned long error_code, char *buf, unsigned long mask, const char *txt)
> -{
> - if (error_code & mask) {
> - if (buf[0])
> - strcat(buf, " ");
> - strcat(buf, txt);
> - }
> -}
> -
> static void
> show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address)
> {
> - char err_txt[64];
> -
> if (!oops_may_print())
> return;
>
> @@ -651,27 +636,22 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long ad
> pr_alert("BUG: unable to handle page fault for address = %px\n",
> (void *)address);
>
> - err_txt[0] = 0;
> -
> - /*
> - * Note: length of these appended strings including the separation space and the
> - * zero delimiter must fit into err_txt[].
> - */
> - err_str_append(error_code, err_txt, X86_PF_PROT, "[PROT]" );
> - err_str_append(error_code, err_txt, X86_PF_WRITE, "[WRITE]");
> - err_str_append(error_code, err_txt, X86_PF_USER, "[USER]" );
> - err_str_append(error_code, err_txt, X86_PF_RSVD, "[RSVD]" );
> - err_str_append(error_code, err_txt, X86_PF_INSTR, "[INSTR]");
> - err_str_append(error_code, err_txt, X86_PF_PK, "[PK]" );
> -
> - pr_alert("#PF error: %s\n", error_code ? err_txt : "[normal kernel read fault]");
> + pr_alert("#PF: %s-privileged %s from %s code\n",
> + (error_code & X86_PF_USER) ? "user" : "supervisor",
> + (error_code & X86_PF_INSTR) ? "instruction fetch" :
> + (error_code & X86_PF_WRITE) ? "write access" :
> + "read access",
> + user_mode(regs) ? "user" : "kernel");
> + pr_alert("#PF: error_code(0x%04lx) - %s\n", error_code,
> + !(error_code & X86_PF_PROT) ? "not-present page" :
> + (error_code & X86_PF_RSVD) ? "reserved bit violation" :
> + (error_code & X86_PF_PK) ? "protection keys violation" :
> + "permissions violation");
>
> if (!(error_code & X86_PF_USER) && user_mode(regs)) {
> struct desc_ptr idt, gdt;
> u16 ldtr, tr;
>
> - pr_alert("This was a system access from user code\n");
> -
> /*
> * This can happen for quite a few reasons. The more obvious
> * ones are faults accessing the GDT, or LDT. Perhaps
> --
> 2.19.2
>