Re: [PATCH] arm64: insn: Simulate nop and push instruction for better uprobe performance

From: Mark Rutland
Date: Mon Sep 09 2024 - 06:18:21 EST


On Fri, Sep 06, 2024 at 10:46:00AM -0700, Andrii Nakryiko wrote:
> On Fri, Sep 6, 2024 at 2:39 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> >
> > On Tue, Aug 27, 2024 at 07:33:55PM +0800, Liao, Chang wrote:
> > > Hi, Mark
> > >
> > > Would you like to discuss this patch further, or do you still believe emulating
> > > STP to push FP/LR into the stack in kernel is not a good idea?
> >
> > I'm happy with the NOP emulation in principle, so please send a new
> > version with *just* the NOP emulation, and I can review that.
>
> Let's definitely start with that, this is important for faster USDT tracing.
>
> > Regarding STP emulation, I stand by my earlier comments, and in addition
> > to those comments, AFAICT it's currently unsafe to use any uaccess
> > routine in the uprobe BRK handler anyway, so that's moot. The uprobe BRK
> > handler runs with preemption disabled and IRQs (and all other maskable
> > exceptions) masked, and faults cannot be handled. IIUC
> > CONFIG_DEBUG_ATOMIC_SLEEP should scream about that.
>
> This part I don't really get, and this might be some very
> ARM64-specific issue, so I'm sorry ahead of time.
>
> But in general, at the lowest level uprobes work in two logical steps.
> First, there is a breakpoint that user space hits, kernel gets
> control, and if VMA which hit breakpoint might contain uprobe, kernel
> sets TIF_UPROBE thread flag and exits. This is the only part that's in
> hard IRQ context. See uprobe_notify_resume() and comments around it.
>
> Then uprobe infrastructure gets called in user context on the way back
> to user space. This is where we confirm that this is uprobe/uretprobe
> hit, and, if supported, perform instruction emulation.
>
> So I'm wondering if your above comment refers to instruction emulation
> within the first part of uprobe handling? If yes, then, no, that's not
> where emulation will happen.

You're right -- I had misunderstood that the emulation happened during
handling of the breakpoint, rather than on the return-to-userspace path.
Looking at the arm64 entry code, the way uprobe_notify_resume() is
plumbed in is safe as it happens after we've re-enabled preemption and
unmasked other exceptions.

Sorry about that.

For the moment I'd still prefer to get the NOP case out of the way
first, so I'll review the NOP-only patch shortly.

Mark.