Re: [PATCH] perf: arm_spe: Add barrier before enabling profiling buffer
From: James Clark
Date: Thu Feb 19 2026 - 09:15:37 EST
On 19/02/2026 2:03 pm, Will Deacon wrote:
On Thu, Feb 19, 2026 at 01:51:26PM +0000, James Clark wrote:
On 19/02/2026 12:57 pm, Will Deacon wrote:
On Thu, Feb 19, 2026 at 12:08:27PM +0000, James Clark wrote:
I'm back to drag this up again. So I think all of the above discussion
relies on the ordering given by the indirect read needed for the "might
ignore a direct write..." part. But it only says it _might_ ignore a direct
write; an implementation is allowed not to, so there are two possible
implementations:
#1 Where there is an indirect read to give the write ignore outcome
#2 Where there is no write ignore outcome so it doesn't require an
indirect read
For #2 there's nothing to force the ordering. We're writing to two different
registers (PMBPTR_EL1 and PMBLIMITR_EL1) and we have to have the
PMBLIMITR_EL1 write come second for the buffer to be considered configured
correctly. For example if the old value of PMBPTR_EL1 is higher than the new
PMBLIMITR_EL1 and the write to PMBLIMITR_EL1 happens first then it's
misconfigured. That's why we think we need the isb() here.
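To make the case #2 scenario concrete, here is a minimal sketch in C. It is only a software model, not driver code: struct spe_buf_model and both helpers are hypothetical names, and it assumes that the PMBLIMITR_EL1 write is also what enables the buffer. It reports whether the state between the two register writes is misconfigured for a given write order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the buffer registers; not driver code. */
struct spe_buf_model {
	uint64_t ptr;    /* models PMBPTR_EL1, the current write pointer */
	uint64_t limit;  /* models the limit address in PMBLIMITR_EL1 */
	bool enabled;    /* assumption: set by the same write as the limit */
};

/* An enabled buffer whose write pointer is at or beyond the limit. */
static bool spe_buf_misconfigured(const struct spe_buf_model *m)
{
	assert(m != NULL);
	return m->enabled && m->ptr >= m->limit;
}

/*
 * Perform only the first of the two register writes, in the chosen
 * order, and report whether the resulting intermediate state is
 * misconfigured. The model is passed by value, so the caller's copy
 * is left untouched.
 */
static bool spe_buf_transient_misconfig(struct spe_buf_model m,
					uint64_t new_ptr, uint64_t new_limit,
					bool limit_first)
{
	if (limit_first) {
		m.limit = new_limit;
		m.enabled = true;
	} else {
		m.ptr = new_ptr;
	}
	return spe_buf_misconfigured(&m);
}
```

With a stale pointer of 0x9000 and a new buffer of 0x1000..0x2000, the model flags the limit-first order and accepts the pointer-first order, which is the order the isb() is meant to guarantee.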
I thought profiling was disabled in these cases, so why is it
misconfigured?
If it is misconfigured, what can go wrong given that we're either stopped
or pmscr is clear?
Will
It is stopped in the interrupt handler because PMBSR_EL1.S = 1. But in
arm_spe_pmu_start(), PMBSR_EL1.S = 0 so it's not stopped. And with
PMBPTR_EL1 still being set to the value from the last session it could be
higher than PMBLIMITR_EL1.
PMSCR_EL1 doesn't affect SPEProfilingStopped(), it's only:
boolean stopped = (PMBSR_EL1.S == '1');
The conditions for when the buffer needs to be configured correctly from
R24557 are:
ProfilingBufferEnabled() && !SPEProfilingStopped() &&
PMBLIMITR_EL1.FM != DISCARD
I think even if PMSCR_EL1 is clear you can still get a buffer management
error, even if no samples were going to be written into the buffer. It just
says:
While the Profiling Buffer is enabled, profiling is not stopped, and
Discard mode is not enabled, all of the following must be true:
* The current write pointer must be at least one sample record below
the write limit pointer.
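The condition and the rule quoted above can be sketched as a small C model. This is an illustration under assumptions, not the architecture pseudocode: struct spe_rule_state, the fill-mode enum and the record_size parameter are all hypothetical, and the enum values are not the real PMBLIMITR_EL1.FM encodings. The point it shows is that PMSCR_EL1 plays no part in deciding whether the rule applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fill modes; NOT the real PMBLIMITR_EL1.FM encodings. */
enum spe_fill_mode { SPE_FM_FILL, SPE_FM_DISCARD };

/* Hypothetical snapshot of the state the rule looks at. */
struct spe_rule_state {
	bool buffer_enabled;    /* stands in for ProfilingBufferEnabled() */
	bool pmbsr_s;           /* PMBSR_EL1.S */
	uint64_t pmscr;         /* PMSCR_EL1: deliberately never consulted */
	enum spe_fill_mode fm;  /* PMBLIMITR_EL1.FM */
	uint64_t ptr;           /* current write pointer */
	uint64_t limit;         /* write limit pointer */
};

/* Mirrors the quoted definition: only PMBSR_EL1.S matters. */
static bool spe_profiling_stopped(const struct spe_rule_state *s)
{
	assert(s != NULL);
	return s->pmbsr_s;
}

/* The three-part condition under which the buffer must be configured. */
static bool spe_rule_applies(const struct spe_rule_state *s)
{
	return s->buffer_enabled && !spe_profiling_stopped(s) &&
	       s->fm != SPE_FM_DISCARD;
}

/*
 * When the rule applies, the write pointer must be at least one sample
 * record (record_size bytes, an assumed parameter) below the limit.
 */
static bool spe_buffer_config_ok(const struct spe_rule_state *s,
				 uint64_t record_size)
{
	if (!spe_rule_applies(s))
		return true;
	return s->limit >= record_size && s->ptr <= s->limit - record_size;
}
```

In this model a state with pmscr == 0, the buffer enabled, PMBSR_EL1.S clear and a stale pointer above the limit still fails spe_buffer_config_ok(), which matches the reading that clearing PMSCR_EL1 alone does not avoid the requirement.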
... but doesn't that mean that R24557 is a breaking change to the
architecture? The current Arm ARM doesn't appear to require this,
existing software doesn't honour it so why should we hack extra barriers
into Linux?
Will
Yes, I suppose it is. The current Arm ARM doesn't require it, but R24557 is in the "known issues" document for the Arm ARM, so it's saying the Arm ARM is incorrect here and you shouldn't trust what it says.
Arm Architecture Reference Manual for A-profile architecture: Known
issues
This document includes the Known Issues for the following documents:
Arm Architecture Reference Manual for A-profile architecture
(DDI0487)
Presumably the fix will make it into the Arm ARM eventually. There is certainly a discussion to be had about whether this is a good change to make or not, but at this point I'm only concerned with making the driver correct using all of the information that is currently published.
I suppose there is a chance that this could be deleted from the known issues and not make it into a future Arm ARM. TBH I have no experience or feeling to say how likely that would be.
Purely from a software point of view, ignoring the new rule despite it being published here doesn't feel like it would be right.