Re: [RFC PATCH v15 6/7] sched: Fix proxy/current (push,pull)ability

From: K Prateek Nayak
Date: Sat Mar 15 2025 - 12:07:03 EST


Hello John,

On 3/15/2025 10:40 AM, John Stultz wrote:
[..snip..]
@@ -6856,6 +6873,10 @@ static void __sched notrace __schedule(int sched_mode)
* changes to task_struct made by pick_next_task().
*/
RCU_INIT_POINTER(rq->curr, next);
+
+ if (!task_current_donor(rq, next))
+ proxy_tag_curr(rq, next);

I don't see any dependency on rq->curr for task_current_donor() check.
Could this check be moved outside of the if-else block to avoid
duplicating in both places since rq_set_donor() was called just after
pick_next_task() or am I missing something?

So this check is just looking to see if next is the same as the
selected rq->donor (what pick_next_task() chose).

If so, nothing to do, same as always.

But If not (so we are proxying in this case), we need to call
proxy_tag_curr() because we have to make sure both the donor and the
proxy are not on a sched-classes pushable list.

This is because the logic around pick_next_task() calls
set_next_task() on the returned donor task, and in the sched-class
code, (for example RT) that logic will remove the chosen donor task
from the pushable list.

But when we find a proxy task to run on behalf of the donor, the
problem is that the proxy might be on the sched-class' pushable list.
So if we are proxying, we do a dequeue and enqueue pair, which allows
us to re-evaluate if the task is rq->curr, which will prevent it from
being added to any such pushable list. This avoids the potential of
the balance callbacks trying to migrate the rq->curr under us.

Thanks so much for the review and the question! Let me know if that
makes any more sense, or if you have suggestions on how I could better
explain it in the commit message to help.

Thanks a ton for clarifying. I found the enqueue_task_rt() bits from
Patch 5 and then it made sense.

P.S. Could the enqueue_task_rt() bits be moved to this patch since it
fits here better?

I couldn't see the dependency for the enqueue bits in Patch 5 since on
finding a "blocked_on" task, the logic simply dequeues it and since
proxy_resched_idle() will nuke the rq->{curr,donor} reference before
that, it should be safe to move those bits here unless I missed
something again :)


Appreciate it!
-john

--
Thanks and Regards,
Prateek