Re: [PATCH] perf: arm_spe: Add barrier before enabling profiling buffer
From: James Clark
Date: Tue Feb 03 2026 - 05:47:03 EST
On 02/02/2026 7:03 pm, Will Deacon wrote:
> On Fri, Jan 23, 2026 at 04:03:53PM +0000, James Clark wrote:
>> The Arm ARM known issues document [1] states that the architecture will
>> be relaxed so that the profiling buffer must be correctly configured
>> when ProfilingBufferEnabled() && !SPEProfilingStopped() &&
>> PMBLIMITR_EL1.FM != DISCARD:
>>
>>   R24557
>>
>>   While the Profiling Buffer is enabled, profiling is not stopped, and
>>   Discard mode is not enabled, all of the following must be true:
>>
>>   * The current write pointer must be at least one sample record below
>>     the write limit pointer.
>>
>> The same relaxation also says that writes may be completely ignored:
>>
>>   When the Profiling Buffer is enabled, profiling is not stopped, and
>>   Discard mode is not enabled, the PE might ignore a direct write to any
>>   of the following Profiling Buffer registers, other than a direct write
>>   to PMBLIMITR_EL1 that clears PMBLIMITR_EL1.E from 1 to 0:
>>
>>   * The current write pointer, PMBPTR_EL1.
>>   * The Limit pointer, PMBLIMITR_EL1.
>>   * PMBSR_EL1.
>
> Thinking about this some more, does that mean that the direct write to
> PMBPTR_EL1 performs an indirect read of PMBLIMITR_EL1 so that it can
> determine the write-ignore semantics? If so, doesn't that mean that
> we'll get order against a subsequent direct write of PMBLIMITR_EL1
> without an ISB thanks to table "D24-1 Synchronization requirements"
> which says that an indirect read followed by a direct write doesn't
> require synchronisation?
>
> There's also a sentence above the table stating:
>
>   "Direct writes to System registers are not allowed to affect any
>    instructions appearing in program order before the direct write."
>
> so after all that, I'm not really sure why the ISB is required.
>
> Will

We were under the impression that the ISB was required for the SPU, as it is treated as a separate entity from the PE.
In "D17.9 Synchronization and Statistical Profiling" there is:
INDWCG
A Context Synchronization event guarantees that a direct write to a
System register made by the PE in program order before the Context
synchronization event are observable by indirect reads and indirect
writes of the same System register made by a profiling operation
relating to a sampled operation in program order after the Context
synchronization event.
That specifically mentions an indirect read following a direct write, which seems to contradict D24-1, although I thought this was a special case for SPE.
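
For anyone following along, here is a rough software model of the write-ignore rule quoted at the top of the thread. It is only a sketch and not driver code: the struct, its fields and the helper names are all made up for illustration, and it assumes the worst case where the PE ignores every write it is permitted to ignore. It cannot show whether an ISB is architecturally needed; it only shows why PMBPTR_EL1 has to take effect before PMBLIMITR_EL1.E is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the profiling buffer state; not kernel code. */
struct spe_buf_model {
	uint64_t ptr;	/* models PMBPTR_EL1 */
	uint64_t limit;	/* models the PMBLIMITR_EL1 limit field */
	bool enabled;	/* models PMBLIMITR_EL1.E */
	bool stopped;	/* models SPEProfilingStopped() */
	bool discard;	/* models PMBLIMITR_EL1.FM == DISCARD */
};

/* True when the relaxed rule allows buffer register writes to be ignored. */
static bool writes_may_be_ignored(const struct spe_buf_model *m)
{
	return m->enabled && !m->stopped && !m->discard;
}

/* Model a direct write to PMBPTR_EL1; returns false if it was ignored. */
static bool write_ptr(struct spe_buf_model *m, uint64_t val)
{
	if (writes_may_be_ignored(m))
		return false;	/* assumed worst case: the write is dropped */
	m->ptr = val;
	return true;
}

/* Model a direct write to PMBLIMITR_EL1; returns false if it was ignored. */
static bool write_limit(struct spe_buf_model *m, uint64_t limit, bool enable)
{
	/* Only a write that clears E from 1 to 0 is guaranteed to take effect. */
	if (writes_may_be_ignored(m) && enable)
		return false;
	m->limit = limit;
	m->enabled = enable;
	return true;
}

/* Program the pointer first, then enable. Expects a disabled buffer. */
static void program_buffer(struct spe_buf_model *m, uint64_t base,
			   uint64_t limit)
{
	bool ok = write_ptr(m, base);

	/* A real driver would put its barrier here; the model has no notion of one. */
	ok = ok && write_limit(m, limit, true);
	assert(ok);
}
```

In this model, calling write_ptr() after the buffer has been enabled leaves ptr unchanged, which is the failure mode that would result if the enable were observed before the pointer update.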