Re: [PATCH v1 3/3] perf auxtrace arm: Support compat_auxtrace_mmap__{read_head|write_tail}

From: Leo Yan
Date: Mon Aug 23 2021 - 09:30:54 EST


Hi James,

On Mon, Aug 23, 2021 at 01:23:42PM +0100, James Clark wrote:
>
>
> On 09/08/2021 12:27, Leo Yan wrote:
> > When the tool runs with compat mode on Arm platform, the kernel is in
> > 64-bit mode and user space is in 32-bit mode; the user space can use
> > instructions "ldrd" and "strd" for 64-bit value atomicity.
> >
> > This patch adds compat_auxtrace_mmap__{read_head|write_tail} for arm
> > building, it uses "ldrd" and "strd" instructions to ensure accessing
> > atomicity for aux head and tail. The file arch/arm/util/auxtrace.c is
> > built for arm and arm64 building, these two functions are not needed for
> > arm64, so check the compiler macro "__arm__" to only include them for
> > arm building.
> >
> > Signed-off-by: Leo Yan <leo.yan@xxxxxxxxxx>
> > ---
> > tools/perf/arch/arm/util/auxtrace.c | 32 +++++++++++++++++++++++++++++
> > 1 file changed, 32 insertions(+)
> >
> > diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
> > index b187bddbd01a..c7c7ec0812d5 100644
> > --- a/tools/perf/arch/arm/util/auxtrace.c
> > +++ b/tools/perf/arch/arm/util/auxtrace.c
> > @@ -107,3 +107,35 @@ struct auxtrace_record
> > *err = 0;
> > return NULL;
> > }
> > +
> > +#if defined(__arm__)
> > +u64 compat_auxtrace_mmap__read_head(struct auxtrace_mmap *mm)
> > +{
> > + struct perf_event_mmap_page *pc = mm->userpg;
> > + u64 result;
> > +
> > + __asm__ __volatile__(
> > +" ldrd %0, %H0, [%1]"
> > + : "=&r" (result)
> > + : "r" (&pc->aux_head), "Qo" (pc->aux_head)
> > + );
> > +
> > + return result;
> > +}
>
> Hi Leo,
>
> I see that this is a duplicate of the atomic read in arch/arm/include/asm/atomic.h

Exactly.

> For x86, it's possible to include tools/include/asm/atomic.h, but that doesn't
> include arch/arm/include/asm/atomic.h and there are some other #ifdefs that might
> make it not so easy for Arm. Just wondering if you considered trying to include the
> existing one? Or decided that it was easier to duplicate it?

Good catch!

Your reminder made me realize that the atomic operations for
arm/arm64 should be improved for user space programs. So far, the perf
tool simply uses the compiler's atomic implementations (from
asm-generic/atomic-gcc.h) for arm/arm64; for a more reliable
implementation, I think user space programs should use the
architecture's atomic instructions.
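
To illustrate the compiler-based style, here is a minimal sketch that
assumes the GCC/Clang __atomic builtins. It is not the content of
asm-generic/atomic-gcc.h; struct ring_ctl and both helper names are
hypothetical stand-ins for the aux_head/aux_tail fields of
struct perf_event_mmap_page.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the AUX ring buffer control fields;
 * this is NOT the real struct perf_event_mmap_page layout.
 */
struct ring_ctl {
	uint64_t aux_head;	/* written by the producer */
	uint64_t aux_tail;	/* written by the consumer */
};

/* Read the 64-bit head with a single atomic load (acquire ordering). */
static inline uint64_t ring_read_head(struct ring_ctl *rc)
{
	return __atomic_load_n(&rc->aux_head, __ATOMIC_ACQUIRE);
}

/* Publish the 64-bit tail with a single atomic store (release ordering). */
static inline void ring_write_tail(struct ring_ctl *rc, uint64_t tail)
{
	__atomic_store_n(&rc->aux_tail, tail, __ATOMIC_RELEASE);
}
```

How the compiler lowers these builtins for a 32-bit Arm target depends
on the selected architecture options (it may fall back to a library
call), which is part of why I think architecture-specific code would be
more predictable.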

So I think your question should be rephrased as: should we export the
arm/arm64 atomic operations to user space programs? This looks like
challenging work to me; we need to finish at least the items below:

- Support arm64 atomic operations and reuse the kernel's
arch/arm64/include/asm/atomic.h;
- Support arm atomic operations and reuse the kernel's
arch/arm/include/asm/atomic.h;
- For aarch32 builds, we need to use configurations to distinguish
different cases, like LPAE, Armv7, and Armv6 variants (so far I have
no idea how to gracefully distinguish these different builds in the
perf tool).
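
As a rough sketch of the last item, the compiler's predefined macros
can separate some of these cases at build time. This assumes the ACLE
macro __ARM_ARCH is provided by the toolchain; the function name and
the returned strings are hypothetical. As far as I know, LPAE is a
kernel configuration and is not visible through these macros, so this
does not solve that part.

```c
/*
 * Hypothetical helper: report which atomic implementation a build
 * would select, based only on compiler-predefined macros.
 */
static const char *atomic_build_flavor(void)
{
#if defined(__aarch64__)
	return "arm64";			/* native 64-bit loads and stores */
#elif defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
	return "arm-v7-or-later";	/* candidate for ldrd/strd */
#elif defined(__arm__)
	return "arm-pre-v7";		/* needs a different fallback */
#else
	return "other";			/* not an Arm build */
#endif
}
```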

I am not sure whether there is any ongoing effort in this area; if
anyone is working on it (or has started related work before), then we
should definitely look into how we can reuse the arch's atomic
headers.

Otherwise, I would prefer to first merge this patch with its dozen
lines of duplicated code; afterwards, we can send a separate patch set
to support arm/arm64 atomic operations in user space.

If any Arm/Arm64 maintainers could shed some light on this work,
I think it would be very helpful.

> Other than that, I have tested that the change works with a 32bit build with snapshot
> and normal mode.
>
> Reviewed by: James Clark <james.clark@xxxxxxx>
> Tested by: James Clark <james.clark@xxxxxxx>

Thanks for the testing and review!

Leo