Re: Help needed: Resume problems in 2.6.32-rc, perhaps related to preempt_count leakage in keventd

From: Linus Torvalds
Date: Mon Nov 09 2009 - 10:58:35 EST




On Mon, 9 Nov 2009, Thomas Gleixner wrote:
>
> I think it _IS_ related because the worker_thread is CPU affine and
> the debug_smp_processor_id() check does:

Hmm. We do know CPU affinity is destroyed by CPU hotplug. People have
complained about that before (for user-space processes that get moved
around due to hot-unplug/plug).

And the suspend/resume process does CPU hotplug to take down all but one
CPU. The workqueues should act on those events already, but maybe there's
a bug somewhere. None of that is new to 32-rc, though - is it?

And workqueue_cpu_callback() seems buggy. It loops over the 'workqueues'
list with no protection. Yes, we do 'stop_machine' for CPU hotplug events,
but only for the very internal one (CPU_DYING) will the CPU notifiers be
called with the machine stopped.
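
To illustrate the kind of protection being talked about here, a minimal
userspace sketch (not kernel code) of a callback that walks a global list
under a lock. Everything in it is an assumption made up for illustration:
'struct wq', 'wq_lock', wq_register(), wq_unregister() and
wq_cpu_callback() are hypothetical names, and a pthread mutex stands in for
whatever locking the kernel would actually want.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue; not the kernel's structure. */
struct wq {
    const char *name;
    int cpu_events_seen;   /* how many callbacks have visited this entry */
    struct wq *next;
};

static struct wq *workqueues;   /* global list head (assumed name) */
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Add an entry to the global list while holding the lock. */
void wq_register(struct wq *wq)
{
    assert(wq != NULL);
    pthread_mutex_lock(&wq_lock);
    wq->next = workqueues;
    workqueues = wq;
    pthread_mutex_unlock(&wq_lock);
}

/* Unlink an entry while holding the lock; returns 1 if it was on the list. */
int wq_unregister(struct wq *wq)
{
    int found = 0;

    pthread_mutex_lock(&wq_lock);
    for (struct wq **pp = &workqueues; *pp; pp = &(*pp)->next) {
        if (*pp == wq) {
            *pp = wq->next;
            wq->next = NULL;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&wq_lock);
    return found;
}

/*
 * Stand-in for a hotplug callback: visit every registered entry.
 * Holding wq_lock for the whole walk means a concurrent register or
 * unregister cannot change the list underneath the loop.
 * Returns the number of entries visited.
 */
int wq_cpu_callback(void)
{
    int visited = 0;

    pthread_mutex_lock(&wq_lock);
    for (struct wq *wq = workqueues; wq; wq = wq->next) {
        wq->cpu_events_seen++;
        visited++;
    }
    pthread_mutex_unlock(&wq_lock);
    return visited;
}
```

Without the lock in wq_cpu_callback(), an unregister running at the same
time could unlink the entry the loop is standing on, which is the sort of
unprotected walk described above.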

Hmm. I don't see any changes in kernel/cpu.c or kernel/workqueue.c that
look at all relevant. But scheduler changes could certainly matter.

Linus