Re: [PATCH] fix init_idle() locking problem

From: Bjorn Helgaas
Date: Tue May 11 2004 - 11:15:06 EST


On Tuesday 11 May 2004 3:39 am, Ingo Molnar wrote:
> * Bjorn Helgaas <bjorn.helgaas@xxxxxx> wrote:
> > init_idle() removes "idle" from its runqueue, but there's a window
> > between looking up the runqueue and locking it, where another CPU can
> > move "idle" to a different runqueue, i.e., via load_balance().
>
> the sched-domains patches that are now in BK solve this problem in a
> cleaner way: the rule is that no cross-CPU balancing is allowed until
> all CPUs have booted up.

Thanks for fixing this the right way. I'll try it out in the next couple
days. Nice coincidence that all the sched-domains stuff went in the
same day :-)

> [ btw., your patch does not seem to be correct anyway - if an online CPU
> 'steals' the new idle task then it will also most likely run it - and
> that is disastrous for any CLONE_IDLETASK task. (e.g. on x86 the EIP has
> random content, most likely crashing that CPU.) ]

I'm sure my patch wasn't complete, but it did address something
that still looks strange to me. The current code is:

void __init init_idle(task_t *idle, int cpu)
{
runqueue_t *idle_rq = cpu_rq(cpu), *rq = cpu_rq(task_cpu(idle));
unsigned long flags;

local_irq_save(flags);
double_rq_lock(idle_rq, rq);

idle_rq->curr = idle_rq->idle = idle;

The fact that we look up the runqueue, lock it, and use it without
rechecking that we got the correct one certainly is suspicious at
the micro point of view.

But, if you say this window is not a problem anymore because of the
larger design, I'll take your word for it.

Bjorn
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/