Re: [PATCH 2/2] sched/core: Consider afffinity constrain when yield to a task

From: Peter Zijlstra
Date: Mon Jun 11 2018 - 04:39:08 EST


On Mon, Jun 11, 2018 at 03:38:50PM +0800, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@xxxxxxxxxxx>
>
> Consider the task afffinity constrain when yield to a task.
>
> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Cc: Radim KrÄmÃÅ <rkrcmar@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Signed-off-by: Wanpeng Li <wanpengli@xxxxxxxxxxx>
> ---
> kernel/sched/core.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 092f7c4..11001ff 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5102,6 +5102,9 @@ int __sched yield_to(struct task_struct *p, bool preempt)
> if (task_running(p_rq, p) || p->state)
> goto out_unlock;
>
> + if (!cpumask_test_cpu(smp_processor_id(), &p->cpus_allowed))
> + goto out_unlock;
> +
> yielded = curr->sched_class->yield_to_task(rq, p, preempt);
> if (yielded) {
> schedstat_inc(rq->yld_count);

I'm confused... why?

So yield_to(@p) is documented as yielding @curr and getting @p to run
'sooner'. If they're on the same rq, yay, that means we'll likely switch
from @curr to @p, however if they're not on the same rq, it should still
work, except we'll reschedule 2 CPUs.

Look at the code, yield_to() will lock 2 rqs: rq and p_rq.

First if verifies p_rq == task_rq(p) (p could've been migrated while we
were waiting to acquire the lock) if this is not so, we unlock and try
again.

Then we check trivial things like actually having a ->yield_to and @curr
and @p being of the same class.

Then we check if @p is in fact already running or not runnable at all,
if either, obviously nothing to do.

So now, we have rq and p_rq locked, they need not be the same and we'll
call ->yield_to(), on success we'll force reschedule p_rq, unlock the
rqs and reschedule ourself.

So far, nothing requires @p to be runnable on the current CPU.

So let us look at yield_to_task_fair() the only yield_to implementation:

That again checks if @p is in fact runnable, if not, nothing to do.

Then it calls set_next_buddy(&p->se), which will mark @p more likely to
get picked on its rq (p_rq) _not_ our rq. Note how set_next_buddy() uses
cfs_rq_of().

Then it yields @curr on the current cpu.


Again, nothing cares if @curr and @p are on the same CPU and if @curr is
allowed to run on the current CPU -- there are no migrations.


So.. why?!