Re: [BUG 2.6.25-rc3] scheduler/hotplug: some processes are deadlocked when cpu is set to offline

From: Yi Yang
Date: Mon Mar 03 2008 - 21:32:33 EST


On Mon, 2008-03-03 at 21:01 +0530, Gautham R Shenoy wrote:
> > This issue seems to be one such case, but when I changed the code to
> > follow this rule, the issue was still there.
> >
> > Why isn't the kernel thread [watchdog/1] reaped by its parent? Its state
> > is TASK_RUNNING with high priority (that is what R< means), so why hasn't
> > it finished?
> >
> > Has anyone run into this problem before? Any thoughts?
>
> Hi Yi,
>
> This is indeed strange. I am able to reproduce this problem on my 4-way
> box. From what I see in the past two runs, we're waiting in the
> cpu-hotplug callback path for the watchdog/1 thread to stop.
>
> During cpu-offline, once the cpu goes offline, in migration_call(),
> we migrate any tasks associated with the offline cpu
> to some other cpu. This also means breaking affinity for tasks which were
> affined to the cpu which went down. So watchdog/1 has been migrated to
> some other cpu.
No, [watchdog/1] exists only for CPU #1. Once CPU #1 has gone offline, the
thread should be killed rather than migrated to another CPU, because the
other CPUs already have their own watchdog kthreads.

Maybe migration_call() is doing exactly that bad thing. :-)
>
> However, it remains in R< state and has not reached the
> kthread_should_stop() check.
>
> I'm trying to probe further by inserting a few more printk's in there.
>
> Will post the findings in a couple of hours.
>
> Thanks for reporting the problem.
>
> Regards
> gautham.
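
For anyone following along, here is a rough userspace sketch of the stop
handshake being discussed. It is only an analogy built on pthreads, not
kernel code: struct worker, worker_fn(), worker_start(), worker_stop() and
demo() are all made-up names. The one kernel fact it leans on is that
kthread_stop() sets a flag, wakes the thread, and then waits for it to
exit, so the caller stays blocked for as long as the thread never gets to
re-check kthread_should_stop().

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a per-CPU kthread such as watchdog/N. */
struct worker {
    pthread_t tid;
    atomic_bool should_stop;  /* analogue of the flag behind kthread_should_stop() */
    atomic_int loops;         /* number of times the loop body has run */
};

/* Thread body: keep working until asked to stop, then return. */
static void *worker_fn(void *arg)
{
    struct worker *w = arg;

    /* Busy loop for brevity; a real kthread would sleep between checks. */
    while (!atomic_load(&w->should_stop))
        atomic_fetch_add(&w->loops, 1);
    return NULL;
}

/* Initialise the flag and counter, then create the thread. */
static int worker_start(struct worker *w)
{
    atomic_init(&w->should_stop, false);
    atomic_init(&w->loops, 0);
    return pthread_create(&w->tid, NULL, worker_fn, w);
}

/*
 * Analogue of kthread_stop(): set the flag, then wait for the thread to exit.
 * pthread_join() blocks for as long as the thread fails to re-check the flag.
 */
static int worker_stop(struct worker *w)
{
    atomic_store(&w->should_stop, true);
    return pthread_join(w->tid, NULL);
}

/* Start a worker, wait until it has run at least once, stop it. 0 on success. */
static int demo(void)
{
    struct worker w;

    if (worker_start(&w) != 0)
        return -1;
    while (atomic_load(&w.loops) == 0)
        ;  /* spin until the worker has made progress */
    if (worker_stop(&w) != 0)
        return -1;
    return atomic_load(&w.loops) > 0 ? 0 : -1;
}
```

Calling demo() from main() returns 0 because the worker is scheduled and
sees the flag. If the worker never ran again after the flag was set,
worker_stop() would sit in pthread_join() indefinitely, which resembles
the hang seen in the hotplug callback path.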

