Re: [PATCH v8 01/11] cpuidle/poll_state: poll via smp_cond_load_relaxed()
From: Ankur Arora
Date: Tue Oct 15 2024 - 17:56:05 EST

Catalin Marinas <catalin.marinas@xxxxxxx> writes:

> On Tue, Oct 15, 2024 at 10:17:13AM -0700, Christoph Lameter (Ampere) wrote:
>> On Tue, 15 Oct 2024, Catalin Marinas wrote:
>> > > Setting of need_resched() from another processor involves sending an IPI
>> > > after that was set. I don't think we need to smp_cond_load_relaxed since
>> > > the IPI will cause an event. For ARM a WFE would be sufficient.
>> >
>> > I'm not worried about the need_resched() case, even without an IPI it
>> > would still work.
>> >
>> > The loop_count++ side of the condition is supposed to timeout in the
>> > absence of a need_resched() event. You can't do an smp_cond_load_*() on
>> > a variable that's only updated by the waiting CPU. Nothing guarantees to
>> > wake it up to update the variable (the event stream on arm64, yes, but
>> > that's generic code).
>>
>> Hmm... I have WFET implementation here without smp_cond modelled after
>> the delay() implementation on ARM64 (but it's not generic and there is
>> an additional patch required to make this work. Intermediate patch
>> attached)
>
> At least one additional patch ;). But yeah, I suggested hiding all this
> behind something like smp_cond_load_timeout() which would wait on
> current_thread_info()->flags but with a timeout. The arm64
> implementation would follow some of the logic in __delay(). Others may
> simply poll with cpu_relax().
>
> Alternatively, if we get an IPI anyway, we can avoid smp_cond_load() and
> rely on need_resched() and some new delay/cpu_relax() API that waits for
> a timeout or an IPI, whichever comes first. E.g. cpu_relax_timeout()
> which on arm64 is just a simplified version of __delay() without the
> 'while' loops.

AFAICT when polling (which we are since poll_idle() calls
current_set_polling_and_test()), the scheduler will elide the IPI
by remotely setting the need-resched bit via set_nr_if_polling().
Once we stop polling, the scheduler should take the IPI path, because
set_nr_if_polling() will fail and call_function_single_prep_ipi() will
then ask for the IPI to be sent.
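
To make the above concrete, here is a rough userspace model of that
decision, using C11 atomics in place of the kernel's try_cmpxchg(). This
is only a sketch of my reading of the logic, not the kernel code: the
model_*() names and the flag bit values are made up for illustration
(the real TIF_* values are per-arch), and the flags word stands in for
the idle task's thread_info flags.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit positions, for illustration only. */
#define TIF_NEED_RESCHED	(1UL << 0)
#define TIF_POLLING_NRFLAG	(1UL << 1)

/*
 * Model of the remote side: if the target is polling on its flags word,
 * set NEED_RESCHED atomically and report success so the caller can skip
 * the IPI. If the target is not polling, leave the word untouched and
 * report failure.
 */
static bool model_set_nr_if_polling(atomic_ulong *flags)
{
	unsigned long val = atomic_load_explicit(flags, memory_order_relaxed);

	do {
		if (!(val & TIF_POLLING_NRFLAG))
			return false;	/* not polling: caller must send an IPI */
		if (val & TIF_NEED_RESCHED)
			return true;	/* already set: nothing more to do */
		/* on CAS failure, val is reloaded and the checks are redone */
	} while (!atomic_compare_exchange_weak(flags, &val,
					       val | TIF_NEED_RESCHED));

	return true;
}

/* Model of the caller: an IPI is needed only if the remote set failed. */
static bool model_needs_ipi(atomic_ulong *flags)
{
	return !model_set_nr_if_polling(flags);
}
```

So with TIF_POLLING_NRFLAG set in the word, model_needs_ipi() returns
false and the poller sees NEED_RESCHED through its own load of the
flags. With the bit clear, it returns true and the word is left alone.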
--
ankur