Re: 2.6.21-rc7-mm1: BUG_ON in kthread_bind during _cpu_down

From: Eric W. Biederman
Date: Thu Apr 26 2007 - 06:23:03 EST


Gautham R Shenoy <ego@xxxxxxxxxx> writes:

> On Wed, Apr 25, 2007 at 04:54:10PM -0700, Andrew Morton wrote:
>> On Thu, 26 Apr 2007 01:10:21 +0200 "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:
>>
>> > Hi,
>> >
>> > The BUG_ON in khthread_bind (line 165 in kthread.c) triggers for me during
>> > attempted suspend to disk, when disable_nonboot_cpus() calls _cpu_down()
>> > (on x86_64).
>>
> Caused due to Oleg's patch http://lkml.org/lkml/2007/4/13/93.
>
> Agreed that most of the time a kthread_create(p) is followed by a
> kthread_bind(p), in which case the assertion
> WARN_ON(p->state != TASK_UNINTERRUPTIBLE) makes sense.
>
> But, in cpu hotplug case, we need to rebind the stop_machine_run thread
> from the cpu which has just been offlined to any online cpu.
> (kernel/cpu.c line 180)
> At this point, the thread would be in TASK_INTERRUPTIBLE waiting for us
> to call a kthread_stop on it.(kernel/kthread.c line 161)
>
> We only need to ensure in kthread_bind that the task which is being
> bound is not running or exiting. Doesn't matter if it's sleeping in
> TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state.

That will probably handle this problem.

However there is a weird interaction with process freezer.

The process freezer can come in and wake up a kernel thread
to encourage it to call try_to_freeze_process while it is
waiting to be bound.

How do we handle that evil race?

Eric


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/