Re: [RFC PATCH V2 1/3] sched/fair: Fixup-wake_up_sync-vs-DELAYED_DEQUEUE

From: Tianchen Ding
Date: Wed Mar 19 2025 - 05:06:16 EST


Hi Xuewen,

On 3/3/25 6:52 PM, Xuewen Yan wrote:
> Delayed dequeued feature keeps a sleeping task enqueued until its
> lag has elapsed. As a result, it stays also visible in rq->nr_running.
> So when in wake_affine_idle(), we should use the real running-tasks
> in rq to check whether we should place the wake-up task to
> current cpu.
> On the other hand, add a helper function to return the nr-delayed.
>
> Fixes: 152e11f6df29 ("sched/fair: Implement delayed dequeue")
> Signed-off-by: Xuewen Yan <xuewen.yan@xxxxxxxxxx>

We noticed that your patch can fix a regression introduced by DELAY_DEQUEUE in lmbench lat_ctx.

Here's the performance data running
`./lat_ctx -P $(nproc) 96`
on an Intel SPR server with 192 CPUs (smaller is better):

DELAY_DEQUEUE                 9.71
NO_DELAY_DEQUEUE              4.02
DELAY_DEQUEUE + this_patch    3.86

Also on an aarch64 server with 128 CPUs:

DELAY_DEQUEUE                 14.82
NO_DELAY_DEQUEUE               5.62
DELAY_DEQUEUE + this_patch     4.66
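
As a rough sanity check on the two tables, the relative slowdown can be worked out with a few lines of Python. This is only an illustrative sketch: the dictionary keys and the summarize() helper are made-up names, and the figures are copied from the results above.

```python
# Figures copied from the tables above (lat_ctx results, smaller is better).
# The key names and the summarize() helper are hypothetical, for illustration only.
RESULTS = {
    "Intel SPR, 192 CPUs": {"delay": 9.71, "no_delay": 4.02, "patched": 3.86},
    "aarch64, 128 CPUs": {"delay": 14.82, "no_delay": 5.62, "patched": 4.66},
}


def summarize(r):
    """Return (DELAY_DEQUEUE / NO_DELAY_DEQUEUE, patched / NO_DELAY_DEQUEUE)."""
    regression = round(r["delay"] / r["no_delay"], 2)
    patched = round(r["patched"] / r["no_delay"], 2)
    return regression, patched


if __name__ == "__main__":
    for machine, r in RESULTS.items():
        regression, patched = summarize(r)
        print(f"{machine}: DELAY_DEQUEUE is {regression}x NO_DELAY_DEQUEUE, "
              f"patched is {patched}x NO_DELAY_DEQUEUE")
```

So DELAY_DEQUEUE costs roughly 2.4x on the Intel machine and 2.6x on the aarch64 machine, and with the patch applied the result is slightly better than NO_DELAY_DEQUEUE on both.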


We found the lmbench lat_ctx regression when enabling DELAY_DEQUEUE: cpu-migrations increased by more than 100x, nr_wakeups_migrate, nr_wakeups_remote, nr_wakeups_affine and nr_wakeups_affine_attempts all went up, and nr_wakeups_local went down.

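We think this benchmark prefers the waker and wakee staying on the same CPU, but with DELAY_DEQUEUE enabled WA_IDLE fails to achieve this, because delayed-dequeue tasks are still counted in rq->nr_running and make the waker's CPU look busier than it is. So your patch does fix it.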
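
To make the reasoning concrete, here is a toy Python model of the sync-wakeup check as we understand it. It is a simplified sketch and not the kernel code: RunQueue, nr_delayed and sync_wakeup_stays_local() are hypothetical names, and only the "waker is the sole runnable task" condition is modelled.

```python
from dataclasses import dataclass


@dataclass
class RunQueue:
    """Toy stand-in for a per-CPU runqueue (hypothetical, not struct rq)."""
    nr_running: int  # enqueued tasks, including delayed-dequeue ones
    nr_delayed: int  # assumed counter: sleeping tasks that are still enqueued


def sync_wakeup_stays_local(rq: RunQueue, discount_delayed: bool) -> bool:
    """Return True if a sync wakeup would place the wakee on the waker's CPU.

    Models only the condition "the waker is the only runnable task here".
    With discount_delayed=False the raw nr_running is used; with
    discount_delayed=True the delayed tasks are subtracted first.
    """
    runnable = rq.nr_running
    if discount_delayed:
        runnable -= rq.nr_delayed
    return runnable == 1


# The waker is running, and the previous wakee has gone to sleep but is
# still enqueued because of delayed dequeue.
rq = RunQueue(nr_running=2, nr_delayed=1)
```

In this model, sync_wakeup_stays_local(rq, discount_delayed=False) is False, so the wakee is sent elsewhere and shows up as a migration, while sync_wakeup_stays_local(rq, discount_delayed=True) is True and the pair stays on one CPU.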

Feel free to add
Reviewed-and-tested-by: Tianchen Ding <dtcccc@xxxxxxxxxxxxxxxxx>

Thanks.