Re: [PATCH] x86/entry: Treat BUG/WARN as NMI-like entries
From: Andy Lutomirski
Date: Fri Jun 12 2020 - 00:14:02 EST
On Thu, Jun 11, 2020 at 8:26 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> If we BUG or WARN in a funny RCU context, we cleverly optimize the
> BUG/WARN using the ud2 hack, which takes us through the
> idtentry_enter...() paths, which might helpfully WARN that the RCU
> context is invalid, which results in infinite recursion.
>
> Split the BUG/WARN handling into an nmi_enter()/nmi_exit() path in
> exc_invalid_op() to increase the chance that we survive the
> experience.
>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> ---
>
> This is not as well tested as I would like, but it does cause the splat
> I'm chasing to display a nice warning instead of causing an undebuggable
> stack overflow.
>
> (It would have been debuggable on x86_64, but it's a 32-bit splat, and
> x86_32 doesn't have ORC.)
>
> arch/x86/kernel/traps.c | 61 +++++++++++++++++++++++------------------
> arch/x86/mm/extable.c | 15 ++++++++--
> 2 files changed, 48 insertions(+), 28 deletions(-)
>
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index cb8c3d26cdf5..6340b12a6616 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -98,24 +98,6 @@ int is_valid_bugaddr(unsigned long addr)
>  	return ud == INSN_UD0 || ud == INSN_UD2;
>  }
>
> -int fixup_bug(struct pt_regs *regs, int trapnr)
> -{
> -	if (trapnr != X86_TRAP_UD)
> -		return 0;
> -
> -	switch (report_bug(regs->ip, regs)) {
> -	case BUG_TRAP_TYPE_NONE:
> -	case BUG_TRAP_TYPE_BUG:
> -		break;
> -
> -	case BUG_TRAP_TYPE_WARN:
> -		regs->ip += LEN_UD2;
> -		return 1;
> -	}
> -
> -	return 0;
> -}
> -
>  static nokprobe_inline int
>  do_trap_no_signal(struct task_struct *tsk, int trapnr, const char *str,
>  		  struct pt_regs *regs, long error_code)
> @@ -191,13 +173,6 @@ static void do_error_trap(struct pt_regs *regs, long error_code, char *str,
>  {
>  	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
>
> -	/*
> -	 * WARN*()s end up here; fix them up before we call the
> -	 * notifier chain.
> -	 */
> -	if (!user_mode(regs) && fixup_bug(regs, trapnr))
> -		return;
> -
>  	if (notify_die(DIE_TRAP, str, regs, error_code, trapnr, signr) !=
>  			NOTIFY_STOP) {
>  		cond_local_irq_enable(regs);
> @@ -242,9 +217,43 @@ static inline void handle_invalid_op(struct pt_regs *regs)
>  		      ILL_ILLOPN, error_get_trap_addr(regs));
>  }
>
> -DEFINE_IDTENTRY(exc_invalid_op)
> +DEFINE_IDTENTRY_RAW(exc_invalid_op)
>  {
> +	bool rcu_exit;
> +
> +	/*
> +	 * Handle BUG/WARN like NMIs instead of like normal idtentries:
> +	 * if we bugged/warned in a bad RCU context, for example, the last
> +	 * thing we want is to BUG/WARN again in the idtentry code, ad
> +	 * infinitum.
> +	 */
> +	if (!user_mode(regs) && is_valid_bugaddr(regs->ip)) {
> +		enum bug_trap_type type;
> +
> +		nmi_enter();
> +		instrumentation_begin();
> +		type = report_bug(regs->ip, regs);
> +		instrumentation_end();
> +		nmi_exit();

Hmm, maybe this should be:

	nmi_enter();
	instrumentation_begin();
	trace_hardirqs_off_finish();
	type = report_bug(regs->ip, regs);
	if (regs->flags & X86_EFLAGS_IF)
		trace_hardirqs_on_prepare();
	instrumentation_end();
	nmi_exit();

tglx or peterz, feel free to fix this up and apply it however you like.
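
To make the failure mode from the changelog concrete, here is a toy user-space model of it. This is only a sketch, not kernel code: toy_warn(), toy_invalid_op(), toy_run() and DEPTH_LIMIT are made-up names, the context_ok flag merely stands in for "RCU is watching", and the depth limit stands in for the finite kernel stack.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
enum { DEPTH_LIMIT = 8 };	/* stands in for the finite kernel stack */

static bool context_ok;		/* stands in for "RCU is watching" */
static int depth, max_depth;	/* nesting of the simulated #UD handler */
static int reports;		/* warnings that actually got reported */

static void toy_invalid_op(bool nmi_like);

/* Models a WARN(): executing ud2 traps straight into the handler. */
static void toy_warn(bool nmi_like)
{
	toy_invalid_op(nmi_like);
}

static void toy_invalid_op(bool nmi_like)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;

	if (depth >= DEPTH_LIMIT) {
		/* Simulated stack overflow: give up without reporting. */
		depth--;
		return;
	}

	if (!nmi_like && !context_ok) {
		/* Normal entry path: it checks the context and WARNs again. */
		toy_warn(nmi_like);
	} else {
		/* NMI-like path: no context check, so the report happens. */
		reports++;
	}
	depth--;
}

/* Fire one WARN from a bad context; return the deepest nesting reached. */
static int toy_run(bool nmi_like)
{
	context_ok = false;
	depth = 0;
	max_depth = 0;
	reports = 0;
	toy_warn(nmi_like);
	assert(depth == 0);	/* every entry was matched by an exit */
	return max_depth;
}
```

In this model, toy_run(true) reports the warning once at depth 1, while toy_run(false) nests until it hits DEPTH_LIMIT and never reports anything, which is the shape of the stack overflow described above.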