Re: [PATCH v4 2/9] arm64: perf: Enable pmu counter direct access for perf event on armv8

From: Rob Herring
Date: Wed Dec 02 2020 - 09:58:38 EST


On Fri, Nov 20, 2020 at 02:03:45PM -0600, Rob Herring wrote:
> On Thu, Nov 19, 2020 at 07:15:15PM +0000, Will Deacon wrote:
> > On Fri, Nov 13, 2020 at 06:06:33PM +0000, Mark Rutland wrote:
> > > On Thu, Oct 01, 2020 at 09:01:09AM -0500, Rob Herring wrote:
> > > > +static void armv8pmu_event_unmapped(struct perf_event *event, struct mm_struct *mm)
> > > > +{
> > > > +	if (!(event->hw.flags & ARMPMU_EL0_RD_CNTR))
> > > > +		return;
> > > > +
> > > > +	if (atomic_dec_and_test(&mm->context.pmu_direct_access))
> > > > +		on_each_cpu_mask(mm_cpumask(mm), refresh_pmuserenr, NULL, 1);
> > > > +}
> > >
> > > I didn't think we kept our mm_cpumask() up-to-date in all cases on
> > > arm64, so I'm not sure we can use it like this.
> > >
> > > Will, can you confirm either way?
> >
> > We don't update mm_cpumask() as the cost of the atomic showed up in some
> > benchmarks I did years ago and we've never had any need for the thing anyway
> > because our TLB invalidation is one or all.
>
> That's good because we're also passing NULL instead of mm, which would
> crash. So it must be more than just not being up to date: it's always 0.
> It looks like event_mapped on x86 uses mm_cpumask(mm), which I guess was
> dropped when copying this code as it didn't work... For reference, the
> x86 version of this originated in commit 7911d3f7af14a6.
>
> I'm not clear on why we need to update pmuserenr_el0 here anyway. To
> get here userspace has to mmap the event and then munmap it. If we did
> nothing, then counter accesses would not fault until the next context
> switch.
>
> If you all have any ideas, I'm all ears. I'm not a scheduler nor perf
> hacker. ;)

Mark, Will, any thoughts on this?
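
To make the question concrete, here is a minimal user-space sketch of the
refcount pattern in the quoted hunk. It is not kernel code: struct mm_sim,
mm_sim_event_mapped() and mm_sim_event_unmapped() are hypothetical names,
refresh_calls only counts where the cross-CPU refresh would be issued, and
the mapped side is an assumption modelled on the x86 pattern rather than
taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm->context.pmu_direct_access. */
struct mm_sim {
	atomic_int pmu_direct_access;	/* number of live direct-access mappings */
	int refresh_calls;		/* times the simulated refresh would run */
};

/*
 * Assumed counterpart of event_mapped: only the 0 -> 1 transition
 * would need to enable user access. Returns true when it "refreshes".
 */
static bool mm_sim_event_mapped(struct mm_sim *mm)
{
	/* atomic_fetch_add() returns the previous value. */
	if (atomic_fetch_add(&mm->pmu_direct_access, 1) == 0) {
		mm->refresh_calls++;
		return true;
	}
	return false;
}

/*
 * Models the quoted event_unmapped hunk: only the 1 -> 0 transition
 * triggers the refresh, like atomic_dec_and_test() in the kernel.
 */
static bool mm_sim_event_unmapped(struct mm_sim *mm)
{
	assert(atomic_load(&mm->pmu_direct_access) > 0);

	/* Previous value 1 means the count just dropped to zero. */
	if (atomic_fetch_sub(&mm->pmu_direct_access, 1) == 1) {
		mm->refresh_calls++;
		return true;
	}
	return false;
}
```

With two mappings, only the first map and the last unmap reach the refresh
path. The open question above is whether that last-unmap refresh is needed
at all, or whether it could be left to the next context switch.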

Rob