Re: [RFC PATCH] sched/fair: Choose the CPU where short task is running during wake up

From: Peter Zijlstra
Date: Fri Sep 16 2022 - 07:45:39 EST


On Fri, Sep 16, 2022 at 12:54:07AM +0800, Chen Yu wrote:
> And the rq lock bottleneck is composed of two paths (perf profile):
>
> (path1):
> raw_spin_rq_lock_nested.constprop.0;
> try_to_wake_up;
> default_wake_function;
> autoremove_wake_function;
> __wake_up_common;
> __wake_up_common_lock;
> __wake_up_sync_key;
> pipe_write;
> new_sync_write;
> vfs_write;
> ksys_write;
> __x64_sys_write;
> do_syscall_64;
> entry_SYSCALL_64_after_hwframe;write

Can you please addr2line -i the raw_spin_rq_lock callsite so we know which is
the one causing grief?

Specifically, I'm worried about PSI: psi_ttwu_dequeue() can cause ttwu()
to take _2_ rq->lock, which absolutely blows for this case.
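
To illustrate the concern, here is a toy userspace model of the lock
accounting, not kernel code. It assumes that on a migrating wakeup PSI takes
the old rq's lock when the task has PSI state to clear, and that the enqueue
then takes the destination rq's lock. All names (rq_model, task_model,
ttwu_model, psi_flagged) are made up for the sketch, and it only counts
acquisitions rather than taking real locks.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy stand-in for a runqueue: only counts lock acquisitions. */
struct rq_model {
	unsigned int lock_count;
};

/* Toy stand-in for a task; fields are illustrative, not task_struct. */
struct task_model {
	int  prev_cpu;     /* CPU whose rq the task was last on */
	bool psi_flagged;  /* assumption: models PSI state that must be cleared */
};

static struct rq_model rqs[NR_CPUS];

static void rq_lock_model(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	rqs[cpu].lock_count++;
}

/*
 * Model of one wakeup that places the task on dst_cpu.
 * Returns the number of rq locks taken for this wakeup.
 */
static unsigned int ttwu_model(struct task_model *p, int dst_cpu,
			       bool psi_enabled)
{
	unsigned int taken = 0;

	if (p->prev_cpu != dst_cpu) {
		/* Migration: PSI needs the old rq lock to clear task state. */
		if (psi_enabled && p->psi_flagged) {
			rq_lock_model(p->prev_cpu);
			taken++;
		}
		p->prev_cpu = dst_cpu;
	}

	/*
	 * In this model the enqueue always takes the destination rq lock;
	 * any lockless remote-queueing path is deliberately ignored.
	 */
	rq_lock_model(dst_cpu);
	taken++;

	return taken;
}
```

With a PSI-flagged task that last ran on CPU 0, ttwu_model(&p, 1, true)
counts two acquisitions (rq 0, then rq 1), while an unflagged task or a
same-CPU wakeup counts one. The addr2line -i output would show whether the
hot callsite is really the PSI one.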