Re: [PATCH 0/3] arm64: perf: Make compat tracing better

From: Will Deacon
Date: Wed Jun 02 2021 - 13:56:08 EST


Hi Doug,

Thanks for posting this, and sorry for the delay in getting to it.

On Fri, May 07, 2021 at 01:55:10PM -0700, Douglas Anderson wrote:
> The goal for this series is to improve "perf" behavior when 32-bit
> userspace code is involved. This turns out to be fairly important for
> Chrome OS which still runs 32-bit userspace for the time being (long
> story there).

Watch out, your days are numbered! See [1].

> I won't repeat everything said in the individual patches since
> they are wordy enough as it is.
>
> Please enjoy and I hope this isn't too ugly/hacky for inclusion in
> mainline.
>
> Thanks to Nick Desaulniers for his early review of these patches and
> to Ricky for the super early prototype that some of this is based on.

I can see that you've put a lot of effort into this, but I'm not thrilled
with the prospect of maintaining these heuristics in the kernel. The
callchain behaviour is directly visible to userspace, and all we'll be able
to do is throw more heuristics at it if faced with any regression reports.
Every assumption made about userspace behaviour results in diminishing
returns where some set of programs no longer fall into the "supported"
bucket and, on balance, I don't think the trade-off is worth it.

If we were to do this in the kernel, then I'd like to see a spec for how
frame-pointer based unwinding should work for Thumb and have it agreed
upon and implemented by both GCC and LLVM. That way, we can implement
the unwinder according to that spec and file bug reports against the
compiler if it goes wrong.

In lieu of that, I think we must defer to userspace to unwind using DWARF.
Perf supports this via PERF_SAMPLE_STACK_USER and PERF_SAMPLE_REGS_USER,
which allows libunwind to be used to create the callchain. You haven't
mentioned that here, so I'd be interested to know why not.
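
For reference, a minimal sketch of how a userspace tool might request those
two sample types through perf_event_attr. The helper name, the sampling
period and the choice of a CPU-cycles event are illustrative assumptions,
not anything taken from this series; the register mask is arch-specific and
is simply passed in by the caller here.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill in an attr that asks the kernel to copy the
 * user registers and a chunk of the user stack into each sample, so that
 * a DWARF unwinder (e.g. libunwind) can build the callchain afterwards.
 */
static void init_dwarf_sample_attr(struct perf_event_attr *attr,
                                   uint64_t user_regs_mask,
                                   uint32_t stack_bytes)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;           /* assumption: cycles event */
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_period = 100000;              /* arbitrary example value */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_TID |
                        PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER;
    attr->sample_regs_user = user_regs_mask;   /* arch-specific bitmask */
    attr->sample_stack_user = stack_bytes;     /* bytes of stack to copy */
    attr->exclude_kernel = 1;
    attr->disabled = 1;
}
```

The filled-in attr would then be handed to the perf_event_open() syscall.
From the command line, "perf record --call-graph dwarf" selects the same
mechanism.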

Finally, you've probably noticed that our unwinding code for compat tasks
is basically identical to the code in arch/arm/. If the functionality is
going to be extended, it should be done there first and then we will follow
to be compatible.
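
For context, a rough userspace model of the frame-pointer walk that this
style of unwinder performs. It is a simplified sketch, not the kernel code:
it assumes a frame record of {fp, sp, lr} sitting immediately below the
address held in fp, simulates user memory with a byte buffer, and uses
hypothetical names throughout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed frame record layout, stored just below the frame pointer. */
struct frame_tail {
    uint32_t fp;   /* caller's frame pointer */
    uint32_t sp;   /* saved stack pointer (unused by the walk) */
    uint32_t lr;   /* return address recorded for this frame */
};

/*
 * Walk frame records in 'mem' (addresses are byte offsets into it),
 * starting from 'fp'. Stores up to 'max' return addresses in 'out'
 * and returns how many were stored.
 */
static size_t walk_frames(const uint8_t *mem, size_t mem_size,
                          uint32_t fp, uint32_t *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        struct frame_tail t;
        uint32_t tail;

        if (fp < sizeof(t))
            break;                      /* no room for a record below fp */
        tail = fp - (uint32_t)sizeof(t);
        if (tail & 0x3)
            break;                      /* record must be word aligned */
        if ((size_t)tail + sizeof(t) > mem_size)
            break;                      /* record lies outside memory */

        memcpy(&t, mem + tail, sizeof(t));
        out[n++] = t.lr;

        /* Each caller frame must sit at a strictly higher address. */
        if (t.fp <= fp)
            break;
        fp = t.fp;
    }
    return n;
}
```

The walk only works when the compiler lays frames out exactly this way,
which is the assumption that does not hold reliably for Thumb code.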

Cheers,

Will

[1] https://lore.kernel.org/lkml/20210602164719.31777-20-will@xxxxxxxxxx/T/#u