Re: 2.6.21-rc7-mm1: BUG_ON in kthread_bind during _cpu_down

From: Oleg Nesterov
Date: Thu Apr 26 2007 - 07:11:23 EST


On 04/26, Gautham R Shenoy wrote:
>
> On Wed, Apr 25, 2007 at 04:54:10PM -0700, Andrew Morton wrote:
> > On Thu, 26 Apr 2007 01:10:21 +0200 "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > The BUG_ON in khthread_bind (line 165 in kthread.c) triggers for me during
> > > attempted suspend to disk, when disable_nonboot_cpus() calls _cpu_down()
> > > (on x86_64).
> >
> Caused due to Oleg's patch http://lkml.org/lkml/2007/4/13/93.
>
> Agreed that most of the time a kthread_create(p) is followed by a
> kthread_bind(p), in which case the assertion
> WARN_ON(p->state != TASK_UNINTERRUPTIBLE) makes sense.
>
> But, in cpu hotplug case, we need to rebind the stop_machine_run thread
> from the cpu which has just been offlined to any online cpu.
> (kernel/cpu.c line 180)

I can't understand why do we need to re-bind this thread. We are doing
kthread_stop()->wake_up() below, at this point move_task_off_dead_cpu()
has already cared about this task, no?

> We only need to ensure in kthread_bind that the task which is being
> bound is not running or exiting. Doesn't matter if it's sleeping in
> TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state.

We need to ensure that this task can't be woken after return from
wait_task_inactive(k), otherwise set_task_cpu() after that is not safe.

TASK_INTERRUPTIBLE doesn't protect us from freezing.

Couldn't we just remove this kthread_bind() ?

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/