Re: [PATCH v4 2/9] arm64: perf: Enable pmu counter direct access for perf event on armv8

From: Rob Herring
Date: Fri Nov 20 2020 - 15:04:10 EST


On Thu, Nov 19, 2020 at 07:15:15PM +0000, Will Deacon wrote:
> On Fri, Nov 13, 2020 at 06:06:33PM +0000, Mark Rutland wrote:
> > On Thu, Oct 01, 2020 at 09:01:09AM -0500, Rob Herring wrote:
> > > +static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
> > > +{
> > > + if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
> > > + return;
> > > +
> > > + if (atomic_dec_and_test(&mm->context.pmu_direct_access))
> > > + on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
> > > +}
> >
> > I didn't think we kept our mm_cpumask() up-to-date in all cases on
> > arm64, so I'm not sure we can use it like this.
> >
> > Will, can you confirm either way?
>
> We don't update mm_cpumask() as the cost of the atomic showed up in some
> benchmarks I did years ago and we've never had any need for the thing anyway
> because our TLB invalidation is one or all.

That's good, because we're also passing NULL instead of mm, which would
crash if the callback ever ran. So the mask must be not merely out of
date, but always 0. It looks like event_mapped on x86 uses
mm_cpumask(mm), which I guess was dropped when copying this code as it
didn't work... For reference, the x86 version of this originated in
commit 7911d3f7af14a6.

I'm not clear on why we need to update pmuserenr_el0 here anyway. To
get here, userspace has to mmap the event and then munmap it. If we did
nothing, then counter accesses would not fault until the next context
switch.
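
To make that concrete, here is a rough userspace model of the refcount
and the "do nothing on unmap" behaviour. It is only a sketch: mm_model,
cpu_model and the model_* helpers are made-up names, not kernel code,
and el0_access merely stands in for the PMUSERENR_EL0 enable bits. It
assumes the context-switch path reloads the enable state from the mm's
count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct mm_model {
	atomic_int pmu_direct_access;	/* mapped events that want EL0 reads */
};

struct cpu_model {
	bool el0_access;		/* stands in for PMUSERENR_EL0 enables */
};

/* mmap of an event: take a reference on the mm. */
static void model_event_mapped(struct mm_model *mm)
{
	atomic_fetch_add(&mm->pmu_direct_access, 1);
}

/* munmap of an event: drop the reference; true if it was the last one. */
static bool model_event_unmapped(struct mm_model *mm)
{
	return atomic_fetch_sub(&mm->pmu_direct_access, 1) == 1;
}

/* Context switch: the CPU reloads its enable state from the mm. */
static void model_switch_to(struct cpu_model *cpu, struct mm_model *mm)
{
	cpu->el0_access = atomic_load(&mm->pmu_direct_access) > 0;
}

/*
 * Returns true if EL0 access stays enabled after the last unmap when no
 * cross-CPU refresh is sent, and is only dropped at the next switch.
 */
static bool model_stale_until_switch(void)
{
	struct mm_model mm;
	struct cpu_model cpu = { .el0_access = false };

	atomic_init(&mm.pmu_direct_access, 0);

	model_event_mapped(&mm);
	model_switch_to(&cpu, &mm);		/* access now enabled */

	bool last = model_event_unmapped(&mm);	/* no refresh sent */
	bool stale = cpu.el0_access;		/* still enabled here */

	model_switch_to(&cpu, &mm);		/* cleared at next switch */

	return last && stale && !cpu.el0_access;
}
```

Calling model_stale_until_switch() from a test main() should return
true under these assumptions: the window between the last unmap and the
next switch is the only thing the cross-CPU refresh would close.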

If you all have any ideas, I'm all ears. I'm neither a scheduler nor a
perf hacker. ;)

Rob