Re: [PATCH v10 00/12] barrier: Add smp_cond_load_{relaxed,acquire}_timeout()

From: David Laight

Date: Wed Mar 25 2026 - 12:15:17 EST


On Wed, 25 Mar 2026 13:53:50 +0000
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:

> On Tue, Mar 17, 2026 at 09:17:05AM +0000, David Laight wrote:
> > On Mon, 16 Mar 2026 23:53:22 -0700
> > Ankur Arora <ankur.a.arora@xxxxxxxxxx> wrote:
> > > David Laight <david.laight.linux@xxxxxxxxx> writes:
> > > > On arm64 I think you could use explicit sev and wfe - but that will wake all
> > > > 'sleeping' cpu; and you may not want the 'thundering herd'.
> > >
> > > Wouldn't we still have the same narrow window where the CPU disregards the IPI?
> >
> > You need a 'sevl' in the interrupt exit path.
>
> No need to, see the rule below in
> https://developer.arm.com/documentation/ddi0487/maa/2983-beijhbbd:
>
> R_XRZRK
> The Event Register for a PE is set by any of the following:
> [...]
> - An exception return.
>

It is a shame the pages for the SEV and WFE instructions don't mention that.
And the copy I found doesn't have working hyperlinks to any other sections.
(Not even references to related instructions...)

You do need to at least comment that the "msr s0_3_c1_c0_0, %[ecycles]" is
actually WFET.
Is that using an absolute cycle count?
If so does it work if the time has already passed?
If it is absolute do you need to recalculate it every time around the loop?
__delay_cycles() contains guard(preempt_notrace()). I haven't looked at what
that does, but is it needed here given that preemption is already disabled?

Looking at the code I think the "sevl; wfe" pair should be higher up.
If they were before the evaluation of the condition then an IPI that set
need_resched() just after it was tested would cause a wakeup.
Clearly that won't help if evaluating the condition itself executes 'wfe'
(which would consume the event), and the loop won't sleep if the condition
sets the event - but I suspect both are unlikely.
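
To illustrate the ordering argument, here is a toy userspace model rather than
kernel code. It assumes the patch currently issues "sevl; wfe" after the
condition is evaluated, which is what I read from the code. struct pe, ipi(),
sleeps_in_iteration() and self_check() are made-up names, and the only
architectural behaviour modelled is that an exception return sets the event
register and that wfe consumes a pending event instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of one PE; all names here are hypothetical. */
struct pe {
	bool event;        /* models the Event Register */
	bool need_resched; /* flag the IPI handler sets */
	int sleeps;        /* times wfe found no event and would have slept */
};

static void sevl(struct pe *pe)
{
	pe->event = true;
}

/* WFE: consume a pending event and continue, otherwise go to sleep. */
static void wfe(struct pe *pe)
{
	if (pe->event)
		pe->event = false;
	else
		pe->sleeps++;
}

/* IPI: the handler sets the flag, the exception return sets the event. */
static void ipi(struct pe *pe)
{
	pe->need_resched = true;
	pe->event = true;
}

/*
 * One loop iteration in which the IPI lands just after the condition
 * was tested. Returns how many times a wfe would have slept.
 */
int sleeps_in_iteration(bool sevl_wfe_before_cond)
{
	struct pe pe = { false, false, 0 };
	bool cond;

	if (sevl_wfe_before_cond) {
		sevl(&pe);
		wfe(&pe);	/* clears any stale event */
	}

	cond = pe.need_resched;	/* evaluates false */
	ipi(&pe);		/* arrives in the window */

	if (!sevl_wfe_before_cond) {
		sevl(&pe);
		wfe(&pe);	/* also swallows the IPI's event */
	}

	if (!cond)
		wfe(&pe);
	return pe.sleeps;
}

void self_check(void)
{
	assert(sleeps_in_iteration(true) == 0);	/* IPI's event wakes the wfe */
	assert(sleeps_in_iteration(false) == 1);	/* event lost, wfe sleeps */
}
```

With the pair placed first, the event set by the exception return is still
pending when the final wfe runs. With the pair placed after the test, that
event is consumed together with the one from sevl.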

I also wonder how long it takes the cpu to leave any low power state.
We definitely found that was an issue on some x86 CPUs and had to both
disable the lowest low power state and completely rework some wakeup
code that really wanted a 'thundering herd' rather than the very gentle
'bring each cpu out of low power one at a time' that cv_broadcast()
gave it.

David