Re: [RFC 1/1] sched: Remove unreliable wake_cpu check in proxy_needs_return

From: hupu
Date: Wed Apr 09 2025 - 02:56:47 EST


Hi Madadi Vineeth Reddy,
Thank you for your prompt review, and my sincere apologies for not
providing the repository and branch details clearly in my initial
submission. This was an oversight on my part.

The patch is based on the following repository:
https://github.com/johnstultz-work/linux-dev.git

Specifically, the changes are present in these Proxy Execution
development branches:
a) proxy-exec-WIP
b) proxy-exec-v15-WIP

The proxy_needs_return function resides in these branches as part of
the ongoing Proxy Execution feature work.
Please let me know if you need additional information or further
clarifications. Thank you again for your time and guidance.

Best regards,
hupu


Madadi Vineeth Reddy <vineethr@xxxxxxxxxxxxx> 于2025年4月9日周三 14:01写道:
>
> On 07/04/25 19:16, hupu wrote:
> > The (p->wake_cpu != cpu_of(rq)) check in proxy_needs_return() is unsafe
> > during early wakeup phase. When called via ttwu_runnable() path:
> >
> > |-- try_to_wake_up
> > |-- ttwu_runnable
> > |-- proxy_needs_return //we are here
> > |-- select_task_rq
> > |-- set_task_cpu //set p->wake_cpu here
> > |-- ttwu_queue
> >
> > The p->wake_cpu at this point reflects the CPU where donor last ran before
> > blocking, not the target migration CPU. During blocking period:
> > 1. CPU affinity may have been changed by other threads
> > 2. Proxy migrations might have altered the effective wake_cpu
> > 3. set_task_cpu() hasn't updated wake_cpu yet in this code path
> >
> > This makes the wake_cpu vs current CPU comparison meaningless and potentially
> > dangerous. Rely on find_proxy_task()'s later migration logic to handle CPU
> > placement based on up-to-date affinity and scheduler state.
> >
> > Signed-off-by: hupu <hupu.gm@xxxxxxxxx>
> > ---
> > kernel/sched/core.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 3c4ef4c71cfd..ca4ca739eb85 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4047,7 +4047,7 @@ static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p)
> >
> > raw_spin_lock(&p->blocked_lock);
> > if (__get_task_blocked_on(p) && p->blocked_on_state & BO_NEEDS_RETURN) {
> > - if (!task_current(rq, p) && (p->wake_cpu != cpu_of(rq))) {
> > + if (!task_current(rq, p)) {
> > if (task_current_donor(rq, p)) {
> > put_prev_task(rq, p);
> > rq_set_donor(rq, rq->idle);
>
> Which tree is this change based on? I don't see `proxy_needs_return` in tip/sched/core.
>
> Thanks,
> Madadi Vineeth Reddy
>