Re: [PATCH v3 01/10] arm64: pmu: Add hook to handle pmu-related undefined instructions
From: Rob Herring
Date: Tue Sep 29 2020 - 16:46:54 EST

On Tue, Sep 29, 2020 at 12:49 PM Will Deacon <will@xxxxxxxxxx> wrote:
>
> On Tue, Sep 29, 2020 at 08:46:46AM -0500, Rob Herring wrote:
> > On Mon, Sep 28, 2020 at 1:26 PM Will Deacon <will@xxxxxxxxxx> wrote:
> > > On Fri, Sep 11, 2020 at 03:51:09PM -0600, Rob Herring wrote:
> > > > +static int emulate_pmu(struct pt_regs *regs, u32 insn)
> > > > +{
> > > > +	u32 rt;
> > > > +	u32 pmuserenr;
> > > > +
> > > > +	rt = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
> > > > +	pmuserenr = read_sysreg(pmuserenr_el0);
> > > > +
> > > > +	if ((pmuserenr & (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR)) !=
> > > > +	    (ARMV8_PMU_USERENR_ER|ARMV8_PMU_USERENR_CR))
> > > > +		return -EINVAL;
> > > > +
> > > > +
> > > > +	/*
> > > > +	 * Userspace is expected to only use this in the context of the scheme
> > > > +	 * described in the struct perf_event_mmap_page comments.
> > > > +	 *
> > > > +	 * Given that context, we can only get here if we got migrated between
> > > > +	 * getting the register index and doing the MSR read. This in turn
> > > > +	 * implies we'll fail the sequence and retry, so any value returned is
> > > > +	 * 'good', all we need is to be non-fatal.
> > > > +	 *
> > > > +	 * The choice of the value 0 is coming from the fact that when
> > > > +	 * accessing a register which is not counting events but is accessible,
> > > > +	 * we get 0.
> > > > +	 */
> > > > +	pt_regs_write_reg(regs, rt, 0);
> > >
> > > Hmm... this feels pretty fragile since, although we may expect userspace only
> > > to trigger this in the context of the specific perf use-case, we don't have
> > > a way to detect that, so the ABI we're exposing is that EL0 accesses to
> > > non-existent counters will return 0. I don't really think that's something
> > > we want to commit to.
> > >
> > > When restartable sequences were added to the kernel, one of the proposed
> > > use-cases was to allow PMU access on big/little systems, because the
> > > sequence will abort on preemption. Taking that approach removes the need
> > > for this emulation hook entirely. Is that something we can rely on instead
> > > of this emulation hook?
> >
> > So back to the RFC version[1]!? That would mean pulling librseq into
> > the kernel based on the prior discussion. It doesn't look like that
> > has happened yet.
>
> Yeah, or just don't bother supporting heterogeneous systems with this
> for now.

But the people are asking for it. :)

> > Why not just drop the undef hook? For heterogeneous systems, we
> > require userspace to pin itself to cores for a specific PMU. See patch
> > 9. If userspace fails to do that, then it gets to keep the pieces.
>
> Dropping it works too!

Great. I asked Mark R to comment in case I'm forgetting some other reason.

Rob
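
For reference, the pinning that the quoted text asks of userspace on
heterogeneous systems could look roughly like the sketch below. This is an
illustration only, not code from the patch series: pin_to_cpu() is a
hypothetical helper name, and picking a CPU that actually belongs to the PMU
being counted on is left to the caller.

```c
#define _GNU_SOURCE
#include <sched.h>

/*
 * Hypothetical helper (not part of the patch series): restrict the
 * calling thread to a single CPU so it cannot migrate to a core that
 * is served by a different PMU.
 *
 * Returns 0 on success, -1 on failure. For a valid cpu number the
 * failure comes from sched_setaffinity(), which sets errno.
 */
static int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	/* Reject CPU numbers that do not fit in a fixed-size cpu_set_t. */
	if (cpu < 0 || cpu >= CPU_SETSIZE)
		return -1;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* A pid of 0 applies the mask to the calling thread. */
	return sched_setaffinity(0, sizeof(set), &set);
}
```

A task would call this before it starts reading counters directly. If it
skips that step and migrates to a core with a different PMU, then with the
undef hook dropped it "gets to keep the pieces", as the quoted text puts it.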