Re: [RFC][PATCH 17/18] sched: Move the second half of ttwu() tothe remote cpu

From: Peter Zijlstra
Date: Thu Jan 06 2011 - 10:09:29 EST


On Wed, 2011-01-05 at 22:07 +0100, Oleg Nesterov wrote:
> On 01/04, Peter Zijlstra wrote:
> >
> > Now that we've removed the rq->lock requirement from the first part of
> > ttwu() and can compute placement without holding any rq->lock, ensure
> > we execute the second half of ttwu() on the actual cpu we want the
> > task to run on.
>
> Damn. I am reading this patch back and forth, many times, and
> I am not able to find any problem. So sad!

:-)

> I'll try to read it once again with the fresh head, though ;)
> I also have a couple of very minor nits... In particular, perhaps
> TASK_WAKING can die...

I think it might.. I'll do a patch at the end removing it, lets see what
happens.

> Just one question for today,
>
> > try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
> > {
> > - int cpu, this_cpu, success = 0;
> > unsigned long flags;
> > - struct rq *rq;
> > -
> > - this_cpu = get_cpu();
> > + int cpu, success = 0;
> >
> > smp_wmb();
> > raw_spin_lock_irqsave(&p->pi_lock, flags);
> > if (!(p->state & state))
> > goto out;
> >
> > + success = 1; /* we're going to change ->state */
> > cpu = task_cpu(p);
> >
> > - if (p->on_rq) {
> > - rq = __task_rq_lock(p);
> > - if (p->on_rq)
> > - goto out_running;
> > - __task_rq_unlock(rq);
> > - }
> > + if (p->on_rq && ttwu_remote(p, wake_flags))
> > + goto stat;
> > +
> > + p->sched_contributes_to_load = !!task_contributes_to_load(p);
> > + p->state = TASK_WAKING;
> >
> > #ifdef CONFIG_SMP
> > + /*
> > + * If the owning (remote) cpu is still in the middle of schedule() with
> > + * this task as prev, wait until its done referencing the task.
> > + */
> > while (p->on_cpu)
> > cpu_relax();
>
> Don't we need rmb() after that?
>
> No, I am not saying it _is_ needed. I am asking.

OK, so I've been thinking and all I can say is that I've got a head-ache
and that I _think_ you're right.. while it doesn't matter for the
observance of p->on_cpu itself, we don't want other reads to be done
before we do indeed observe it.

> (but need_migrate_task() can avoid on_cpu+rmb afaics)

I think so too, I'll stick a patch to that effect at the end.

Current queue lives at:
http://programming.kicks-ass.net/sekrit/patches-ttwu.tar.bz2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/