Re: [RFC perf/core 05/11] uprobes: Add mapping for optimized uprobe trampolines
From: Mark Rutland
Date: Thu Nov 21 2024 - 14:39:35 EST
[resending as I somehow messed up the 'From' header and got a tonne of
bounces]
On Thu, Nov 21, 2024 at 08:47:56AM -0800, Alexei Starovoitov wrote:
> On Thu, Nov 21, 2024 at 8:34 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > Elsewhere in the thread Mark Rutland already noted that arm64 really
> > doesn't need or want this.
>
> Doesn't look like you've read what you quoted above.
> On arm64 the _HW_ cost may be the same.
> The _SW_ difference in handling trap vs syscall is real.
> I bet once uprobe syscall is benchmarked on arm64 there will
> be a delta.
I already pointed out in [1] that on arm64 we can make the trap case
*faster* than the syscall. If that's not already the case, there's only
a small amount of rework needed (pulling BRK handling into
entry-common.c), which we want to do for other reasons anyway.
On arm64 I do not want the syscall; the trap is faster and simpler to
maintain.

Mark

[1] https://lore.kernel.org/lkml/ZzsRfhGSYXVK0mst@J2N7QTR9R3/