Re: [PATCH] riscv: entry: Fixup do_trap_break from kernel side

From: Peter Zijlstra

Date: Mon Jun 22 2026 - 04:30:04 EST


On Sun, Jun 21, 2026 at 02:52:46AM -0400, Guo Ren wrote:
> On Fri, Jun 19, 2026 at 04:54:53PM -0700, Kees Cook wrote:
> > *thread necromancy*
> >
> > On Sat, Jul 01, 2023 at 10:57:07PM -0400, guoren@xxxxxxxxxx wrote:
> > > From: Guo Ren <guoren@xxxxxxxxxxxxxxxxx>
> > >
> > > irqentry_nmi_enter/exit forces the current context into in_interrupt.
> > > That makes the kernel panic outright, but kdb still needs "ebreak" to
> > > debug the kernel.
> > >
> > > Replacing irqentry_nmi_enter/exit with exception_enter/exit corrects
> > > handle_break on the kernel side.
> > >
> > > Before the fixup:
> > > $echo BUG > /sys/kernel/debug/provoke-crash/DIRECT
> > > lkdtm: Performing direct entry BUG
> > > ------------[ cut here ]------------
> > > kernel BUG at drivers/misc/lkdtm/bugs.c:78!
> > > [...]
> > > Kernel panic - not syncing: Aiee, killing interrupt handler!
> >
> > This appears to still be unfixed. What's the blocker? The solutions in
> > this thread seem to work...
> >
> > I'd like to be exercising an Oops path via KUnit (for KCFI), and riscv
> > just instantly falls over instead of thread-killing on the exception.
> Thanks for reviving this thread. At the time I didn’t fully understand
> Peter’s point. We should only use the NMI path when the trap occurs with
> interrupts disabled.
> Here’s the updated fix:
>
>  do_trap_break(struct pt_regs *regs)
>  ...
>                 irqentry_exit_to_user_mode(regs);
>         } else {
> -               irqentry_state_t state = irqentry_nmi_enter(regs);
> +               if (regs->status & SR_IE) {
> +                       enum ctx_state prev_state = exception_enter();
>
> -               handle_break(regs);
> +                       handle_break(regs);
>
> -               irqentry_nmi_exit(regs, state);
> +                       exception_exit(prev_state);
> +               } else {
> +                       irqentry_state_t state = irqentry_nmi_enter(regs);
> +
> +                       handle_break(regs);
> +
> +                       irqentry_nmi_exit(regs, state);
> +               }
>         }
>  }
>
> If you & Peter have no objection, I’ll post a v2.

I still don't understand it. This cannot fix anything. Consider:

	EBREAK
	raw_spin_lock_irq(&your_lock)
	EBREAK

So now the first 'works', but the second will crash. Additionally,
having the EBREAK context differ so dramatically between invocations
seems like a very bad deal to me.
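
To make the sequence above concrete, here is a rough user-space sketch, not
kernel code. Every name in it (model_regs, MODEL_SR_IE, model_pick_path,
model_lock_irq, model_run_sequence) is hypothetical, and the bit value is just
a stand-in. It models only the branch condition from the proposed
do_trap_break() and the IRQ-disabling side effect of raw_spin_lock_irq():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_SR_IE 0x2UL	/* stand-in for the bit tested by the patch */

enum model_path {
	MODEL_PATH_EXCEPTION,	/* exception_enter()/exception_exit() branch */
	MODEL_PATH_NMI,		/* irqentry_nmi_enter()/irqentry_nmi_exit() branch */
};

struct model_regs {
	unsigned long status;
};

/* Same test as the proposed patch: IE set selects the exception branch. */
static enum model_path model_pick_path(const struct model_regs *regs)
{
	return (regs->status & MODEL_SR_IE) ? MODEL_PATH_EXCEPTION
					    : MODEL_PATH_NMI;
}

/* Models only the IRQ-disabling side effect of raw_spin_lock_irq(). */
static void model_lock_irq(struct model_regs *regs)
{
	regs->status &= ~MODEL_SR_IE;
}

/* EBREAK; raw_spin_lock_irq(); EBREAK, recording the branch of each EBREAK. */
static void model_run_sequence(enum model_path paths[2])
{
	struct model_regs regs = { .status = MODEL_SR_IE };

	assert(paths != NULL);
	paths[0] = model_pick_path(&regs);	/* first EBREAK, IRQs enabled */
	model_lock_irq(&regs);
	paths[1] = model_pick_path(&regs);	/* second EBREAK, IRQs disabled */
}
```

Calling model_run_sequence() on a two-element array leaves the exception
branch in paths[0] and the NMI branch in paths[1]: the same instruction ends
up in two different contexts, and the second one is the path that panics
today.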