Re: [PATCH] cpuidle: Deny idle entry when CPU already has IPI interrupt pending

From: Maulik Shah (mkshah)

Date: Wed Mar 25 2026 - 01:37:19 EST




On 3/24/2026 9:37 PM, Rafael J. Wysocki wrote:
> On Mon, Mar 23, 2026 at 1:13 PM Maulik Shah (mkshah)
> <maulik.shah@xxxxxxxxxxxxxxxx> wrote:
>>
>>
>>
>> On 3/20/2026 11:59 PM, Rafael J. Wysocki wrote:
>>> On Mon, Mar 16, 2026 at 8:38 AM Maulik Shah
>>> <maulik.shah@xxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> A CPU can receive an IPI from another CPU while it is executing
>>>> cpuidle_select(), or just before executing it. The selection does not
>>>> account for pending interrupts, so the CPU may enter the selected idle
>>>> state only to exit immediately.
>>>>
>>>> Example trace collected when there is a cross-CPU IPI:
>>>>
>>>> [000] 154.892148: sched_waking: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>>>> [000] 154.892148: ipi_raise: target_mask=00000000,00000080 (Function call interrupts)
>>>> [007] 154.892162: cpu_idle: state=2 cpu_id=7
>>>> [007] 154.892208: cpu_idle: state=4294967295 cpu_id=7
>>>> [007] 154.892211: irq_handler_entry: irq=2 name=IPI
>>>> [007] 154.892211: ipi_entry: (Function call interrupts)
>>>> [007] 154.892213: sched_wakeup: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>>>> [007] 154.892214: ipi_exit: (Function call interrupts)
>>>>
>>>> This impacts performance, and such premature exits increment the counts above.
>>>>
>>>> Commit ccde6525183c ("smp: Introduce a helper function to check for pending
>>>> IPIs") already introduced a helper function to check for pending IPIs. It
>>>> is used in the pmdomain governor to deny the cluster-level idle state when
>>>> there is a pending IPI on any of the cluster's CPUs.
>>>
>>> You seem to be overlooking the fact that resched wakeups need not be
>>> signaled via IPIs, but they may be updates of a monitored cache line.
>>>
>>>> This, however, does not stop a CPU from entering a CPU-level idle state.
>>>> Use the same helper in cpuidle to deny idle entry when an IPI is already
>>>> pending.
>>>>
>>>> With this change, glmark2 [1] offscreen scores improve by 25% to 30% on
>>>> the Qualcomm lemans-evk board, an arm64 platform with two clusters of
>>>> 4 CPUs each.
>>>>
>>>> [1] https://github.com/glmark2/glmark2
>>>>
>>>> Signed-off-by: Maulik Shah <maulik.shah@xxxxxxxxxxxxxxxx>
>>>> ---
>>>> drivers/cpuidle/cpuidle.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>>>> index c7876e9e024f9076663063ad21cfc69343fdbbe7..c88c0cbf910d6c2c09697e6a3ac78c081868c2ad 100644
>>>> --- a/drivers/cpuidle/cpuidle.c
>>>> +++ b/drivers/cpuidle/cpuidle.c
>>>> @@ -224,6 +224,9 @@ noinstr int cpuidle_enter_state(struct cpuidle_device *dev,
>>>> bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
>>>> ktime_t time_start, time_end;
>>>>
>>>> + if (cpus_peek_for_pending_ipi(drv->cpumask))
>>>> + return -EBUSY;
>>>> +
>>>
>>> So what if the driver handles all CPUs in the system and there are
>>> many of them (say ~500) and if IPIs occur rarely (because resched
>>> events are not IPIs)?
>>
>> I missed the case of a driver handling multiple CPUs.
>> In v2 I would fix this as below, which checks for a pending IPI only on
>> the single CPU trying to enter idle:
>>
>> if (cpus_peek_for_pending_ipi(cpumask_of(dev->cpu)))
>
> And the for_each_cpu() loop in cpus_peek_for_pending_ipi() would then
> become useless overhead, wouldn't it?

Given that the mask in the for_each_cpu(cpu, mask) loop will contain a single
CPU, the overhead should be minor. Alternatively, we could add a new API for
the single-CPU case.

>
>> I see IPIs do occur often, in the glmark2 offscreen case
>> mentioned in commit text, out of total ~12.2k IPIs across all 8 CPUs,
>> ~9.6k are function call IPIs, ~2k are IRQ work IPIs, ~560 Timer broadcast
>> IPIs while rescheduling IPIs are only 82.
>
> So how many of those IPIs actually wake up CPUs from idle prematurely?

282 out of the total ~12.2k IPIs (~2.3%) hit the newly added condition.

Thanks,
Maulik