Re: 2.6.8.1-mm1

From: Rusty Russell
Date: Tue Aug 17 2004 - 02:22:27 EST


On Tue, 2004-08-17 at 15:59, Nathan Lynch wrote:
> I can consistently hit this BUG_ON in migration_call's CPU_DEAD handling
> by doing:
>
> while true ; do
> echo 0 > /sys/devices/system/cpu/cpu1/online
> echo 1 > /sys/devices/system/cpu/cpu1/online
> done
>
> and then starting a kernel build. It seems to take less than 20
> minutes. I can also recreate it on ppc64, but only with
> CONFIG_PREEMPT=y (I haven't tried without preempt on i386 yet).

Hmm, can you figure out what patch in -mm breaks it?

Looking through 2.6.8.1-mm1, I see this code which doesn't make sense:


if (likely(cpu == this_cpu)) {
...
} else {
runqueue_t *this_rq = cpu_rq(this_cpu);

/*
* Not the local CPU - must adjust timestamp. This should
* get optimised away in the !CONFIG_SMP case.
*/
p->timestamp = (p->timestamp - this_rq->timestamp_last_tick)
+ rq->timestamp_last_tick;
__activate_task(p, rq);
if (TASK_PREEMPTS_CURR(p, rq))
resched_task(rq->curr);

current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
schedstat_inc(rq, wunt_moved);
}

if (unlikely(cpu != this_cpu)) {
task_rq_unlock(rq, &flags);
rq = task_rq_lock(current, &flags);
}
current->sleep_avg = JIFFIES_TO_NS(CURRENT_BONUS(current) *
PARENT_PENALTY / 100 * MAX_SLEEP_AVG / MAX_BONUS);
task_rq_unlock(rq, &flags);
}

So, first off, the statements under "if (unlikely(cpu != this_cpu))" can
be folded into the previous block, since that's under the same test.
Secondly, why is sleep_avg being set twice to the same thing, and why
are we happy to adjust it the first time without holding the rq lock for
current, but the second time we make sure we are holding the rq lock?
recalc_task_prio seems happy to adjust a tasks ->sleep_avg without
holding any lock at all...

Rusty.
--
Anyone who quotes me in their signature is an idiot -- Rusty Russell

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/