Re: [RFC 1/1] sched: Skip redundant operations for proxy tasks needing return migration

From: hupu
Date: Thu Apr 10 2025 - 05:52:12 EST


Hi John,
Thank you for your feedback.

On Thu, Apr 10, 2025 at 10:41 AM John Stultz <jstultz@xxxxxxxxxx> wrote:
>
> Unfortunately this patch crashes pretty quickly in my testing. The
> first issue was proxy_needs_return() calls deactivate_task() w/
> DEQUEUE_NOCLOCK, which causes warnings when the update_rq_clock()
> hasn't been called. Preserving the update_rq_clock() line before
> checking proxy_needs_return() avoided that issue, but then I saw hangs
> during bootup, which I suspect is due to us shortcutting over the
> sched_delayed case.
>
> Moving the proxy_needs_return above the if(task_on_cpu())
> wakeup_preempt() logic booted ok, but I'm still a little hesitant of
> what side-effects that might cause.

I’m sorry for the confusion caused by this patch. Here is the
rationale behind my approach:

To ensure that donor tasks get a suitable CPU, and to keep proxy
execution from skewing load balancing, `proxy_needs_return()` in
`ttwu_runnable()` should return false for all donor tasks. This lets
`try_to_wake_up()` use `set_task_cpu()` to move the donor to a newly
selected CPU, unless the donor is already running on a CPU.
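
As a rough illustration of that intent, here is a toy user-space model,
not kernel code. Every `toy_*` name and every field is hypothetical,
and locking, rq clock updates and the sched_delayed case are left out
entirely. It only shows the decision I am after: a queued donor that is
not running gets dequeued and placed again, everything else is woken in
place.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical, not kernel APIs. */
struct toy_task {
	int cpu;      /* CPU whose runqueue currently holds the task */
	bool on_rq;   /* still queued, e.g. a blocked donor left on the rq */
	bool on_cpu;  /* currently executing on 'cpu' */
	bool donor;   /* kept queued only to donate its scheduling context */
};

/* Fast-path check: may the wakeup be completed without moving the task? */
static bool toy_can_wake_in_place(const struct toy_task *p)
{
	if (!p->on_rq)
		return false;
	/* A donor that is not running should be re-placed, not woken in place. */
	return !(p->donor && !p->on_cpu);
}

/* Stand-in for CPU selection: pick the least-loaded CPU, lowest index on ties. */
static int toy_select_cpu(const int *nr_running, int ncpus)
{
	int best = 0;

	assert(ncpus > 0);
	for (int i = 1; i < ncpus; i++)
		if (nr_running[i] < nr_running[best])
			best = i;
	return best;
}

/* Whole wakeup: try the fast path, otherwise dequeue, reselect and enqueue. */
static int toy_wake_up(struct toy_task *p, int *nr_running, int ncpus)
{
	assert(ncpus > 0 && p->cpu >= 0 && p->cpu < ncpus);

	if (toy_can_wake_in_place(p)) {
		p->donor = false;
		return p->cpu;
	}
	if (p->on_rq) {
		nr_running[p->cpu]--;	/* take the donor off its old runqueue */
		p->on_rq = false;
	}
	p->cpu = toy_select_cpu(nr_running, ncpus);	/* role of set_task_cpu() */
	nr_running[p->cpu]++;
	p->on_rq = true;
	p->donor = false;
	return p->cpu;
}
```

In this model a queued, non-running donor on a busy CPU ends up on the
least-loaded CPU, while a donor that is already running stays where it
is.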

This patch worked correctly on my QEMU-based test platform, so it seems
our testing methods differ. Could you please share the details of your
testing environment and methodology? I’ll try to reproduce the issue
using the same approach.

In the meantime, I will carefully revisit the logic in this patch to
ensure its correctness and consistency. Once I’ve completed the
review, I look forward to further discussing the details with you.

Thank you again for your valuable feedback!

Best regards,
hupu