Re: [PATCH v8 2/2] sched/rt: Trying to push current task when target disable migrating

From: Valentin Schneider
Date: Thu Apr 06 2023 - 08:00:05 EST


On 29/08/22 01:03, Schspa Shi wrote:
> When the task to push disable migration, retry to push the current
> running task on this CPU away, instead doing nothing for this migrate
> disabled task.
>
> CC: Valentin Schneider <vschneid@xxxxxxxxxx>
> Signed-off-by: Schspa Shi <schspa@xxxxxxxxx>
> Reviewed-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> ---
> kernel/sched/core.c | 13 ++++++++++++-
> kernel/sched/deadline.c | 9 +++++++++
> kernel/sched/rt.c | 8 ++++++++
> 3 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ee28253c9ac0c..056b336c29e70 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2503,8 +2503,19 @@ int push_cpu_stop(void *arg)
> if (p->sched_class->find_lock_rq)
> lowest_rq = p->sched_class->find_lock_rq(p, rq);
>
> - if (!lowest_rq)
> + if (!lowest_rq) {
> + /*
> + * The find_lock_rq function above could have released the rq
> + * lock and allow p to schedule and be preempted again, and
> + * that lowest_rq could be NULL because p now has the
> + * migrate_disable flag set and not because it could not find
> + * the lowest rq. So we must check task migration flag again.
> + */
> + if (unlikely(is_migration_disabled(p)))
> + p->migration_flags |= MDF_PUSH;
> +

Given that p has to be on this rq initially, p can only have become
migration-disabled if it was migrated away first (it *can't* be scheduled
here while the stopper is running). In that case it's not on this rq
anymore, so do we care?

> goto out_unlock;
> + }
>
> // XXX validate p is still the highest prio task
> if (task_rq(p) == rq) {
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index e7eea6cde5cb9..c8055b978dbc3 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -2340,6 +2340,15 @@ static int push_dl_task(struct rq *rq)
> */
> task = pick_next_pushable_dl_task(rq);
> if (task == next_task) {
> + /*
> + * If next task has now disabled migrating, see if we
> + * can do resched_curr().
> + */
> + if (unlikely(is_migration_disabled(task))) {
> + put_task_struct(next_task);
> + goto retry;
> + }
> +
> /*
> * The task is still there. We don't try
> * again, some other CPU will pull it when ready.
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 57e8cd5c9c267..381ec05eb2701 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2139,6 +2139,14 @@ static int push_rt_task(struct rq *rq, bool pull)
> */
> task = pick_next_pushable_task(rq);
> if (task == next_task) {
> + /*
> + * If next task has now disabled migrating, see if we
> + * can push the current task.
> + */
> + if (unlikely(is_migration_disabled(task))) {
> + put_task_struct(next_task);
> + goto retry;
> + }

Similarly here, if the task has been through a switch-in / switch-out
cycle, then at least for RT we'd have

  set_next_task_rt()
  `\
    rt_queue_push_tasks()

which will take care of it.

If the task is preempted by e.g. a DL task, then the retry would fail on

  (next_task->prio < rq->curr->prio)

and I'm thinking the same logic applies to deadline.c. IOW, it looks
like we're already doing the right thing here when the task gets scheduled
out, so I don't think we need any of this.

> /*
> * The task hasn't migrated, and is still the next
> * eligible task, but we failed to find a run-queue
> --
> 2.37.2