Re: [PATCH] perf: arm_spe: Add barrier before enabling profiling buffer

From: James Clark

Date: Thu Feb 19 2026 - 07:08:43 EST




On 06/02/2026 9:50 am, James Clark wrote:


On 03/02/2026 11:07 am, Will Deacon wrote:
On Tue, Feb 03, 2026 at 10:46:37AM +0000, James Clark wrote:


On 02/02/2026 7:03 pm, Will Deacon wrote:
On Fri, Jan 23, 2026 at 04:03:53PM +0000, James Clark wrote:
The Arm ARM known issues document [1] states that the architecture will
be relaxed so that the profiling buffer must be correctly configured
when ProfilingBufferEnabled() && !SPEProfilingStopped() &&
PMBLIMITR_EL1.FM != DISCARD:

    R24557

    While the Profiling Buffer is enabled, profiling is not stopped, and
    Discard mode is not enabled, all of the following must be true:

    * The current write pointer must be at least one sample record below
      the write limit pointer.

The same relaxation also says that writes may be completely ignored:

    When the Profiling Buffer is enabled, profiling is not stopped, and
    Discard mode is not enabled, the PE might ignore a direct write to any
    of the following Profiling Buffer registers, other than a direct write
    to PMBLIMITR_EL1 that clears PMBLIMITR_EL1.E from 1 to 0:

    * The current write pointer, PMBPTR_EL1.
    * The Limit pointer, PMBLIMITR_EL1.
    * PMBSR_EL1.

Thinking about this some more, does that mean that the direct write to
PMBPTR_EL1 performs an indirect read of PMBLIMITR_EL1 so that it can
determine the write-ignore semantics? If so, doesn't that mean that
we'll get order against a subsequent direct write of PMBLIMITR_EL1
without an ISB thanks to table "D24-1 Synchronization requirements"
which says that an indirect read followed by a direct write doesn't
require synchronisation?

There's also a sentence above the table stating:

"Direct writes to System registers are not allowed to affect any
   instructions appearing in program order before the direct write."

so after all that, I'm not really sure why the ISB is required.

Will

We were under the impression that this was required for the SPU, as it is
treated as a separate entity from the PE.

In "D17.9 Synchronization and Statistical Profiling" there is:

   INDWCG

   A Context Synchronization event guarantees that a direct write to a
   System register made by the PE in program order before the Context
   synchronization event are observable by indirect reads and indirect
   writes of the same System register made by a profiling operation
   relating to a sampled operation in program order after the Context
   synchronization event.

That specifically mentions an indirect read following a direct write, which
seems to contradict D24-1, although I thought this was a special case for
SPE.

My reading of the text above is that it is covering the direct write
-> indirect read case, whereas I think the case in the SPE driver that
we're considering for your patch is when we have an indirect read
followed by a direct write.

Will

Yeah, and that text also only applies to "profiling operations", not writes to PMBPTR and PMBLIMITR.

Upon further investigation you are correct about the isb() not being required, even with the new relaxation. Seems like we just accepted that the relaxation required some change to the driver without really thinking about it. But yeah thanks for looking in detail and catching it.

So we can drop this now. Sorry for the noise.

James


Hi Will,

I'm back to drag this up again. I think all of the above discussion relies on the ordering given by the indirect read needed for the "might ignore a direct write..." part. But it's _might_ ignore a direct write, so it's possible for an implementation not to do that. That gives two possible implementations:

#1 Where there is an indirect read to give the write ignore outcome
#2 Where there is no write ignore outcome so it doesn't require an
indirect read

For #2 there's nothing to force the ordering. We're writing to two different registers (PMBPTR_EL1 and PMBLIMITR_EL1) and the PMBLIMITR_EL1 write has to come second for the buffer to be considered correctly configured. For example, if the old value of PMBPTR_EL1 is higher than the new PMBLIMITR_EL1 and the write to PMBLIMITR_EL1 takes effect first, then the buffer is misconfigured until the PMBPTR_EL1 write lands. That's why we think we need the isb() here.
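
To make the #2 concern concrete, here is a minimal userspace sketch, not driver code. The struct, the function names and the record_size parameter are all made up for illustration, and the check is only a simplified reading of the R24557 text quoted above. It models the two orders in which the register writes could take effect and reports whether the buffer is ever enabled while the write pointer is less than one record below the limit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the profiling buffer state; not the real registers. */
struct spe_buf_model {
	uint64_t ptr;	/* stands in for the PMBPTR_EL1 write pointer */
	uint64_t limit;	/* stands in for the PMBLIMITR_EL1 limit address */
	bool enabled;	/* stands in for PMBLIMITR_EL1.E */
};

/*
 * Simplified R24557 check (assumption): while enabled, the write pointer
 * must be at least one sample record below the limit pointer.
 */
bool model_is_misconfigured(const struct spe_buf_model *m, uint64_t record_size)
{
	return m->enabled && m->ptr + record_size > m->limit;
}

/* Pointer write takes effect first, then the limit write that sets E. */
bool ptr_then_limit_ever_misconfigured(struct spe_buf_model m, uint64_t new_ptr,
				       uint64_t new_limit, uint64_t record_size)
{
	bool bad = false;

	m.ptr = new_ptr;
	bad |= model_is_misconfigured(&m, record_size);
	m.limit = new_limit;
	m.enabled = true;
	bad |= model_is_misconfigured(&m, record_size);
	return bad;
}

/* Limit write (setting E) takes effect first, while the old pointer is live. */
bool limit_then_ptr_ever_misconfigured(struct spe_buf_model m, uint64_t new_ptr,
				       uint64_t new_limit, uint64_t record_size)
{
	bool bad = false;

	m.limit = new_limit;
	m.enabled = true;
	bad |= model_is_misconfigured(&m, record_size);
	m.ptr = new_ptr;
	bad |= model_is_misconfigured(&m, record_size);
	return bad;
}
```

Starting from a disabled buffer with a stale pointer of 0x9000, and programming a new pointer of 0x1000 with a new limit of 0x2000 and a 64 byte record, the first function never sees a misconfigured state and the second one does. That transient window in the second order is what the isb() between the two writes is meant to rule out.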

Thanks
James