On Mon, 2008-01-28 at 14:00 -0500, Steven Rostedt wrote:
> On Mon, 28 Jan 2008, Max Krasnyanskiy wrote:
> > > > [PATCH] [CPUISOL] Support for workqueue isolation
> > >
> > > The thing about workqueues is that they should only be woken on a CPU if
> > > something on that CPU accessed them. IOW, the workqueue on a CPU handles
> > > work that was called by something on that CPU. Which means that
> > > something that high-prio task did triggered a workqueue to do some work.
> > > But this can also be triggered by interrupts, so by keeping interrupts
> > > off the CPU no workqueue should be activated.
> >
> > No no no. That's what I thought too ;-). The problem is that things like NFS and friends
> > expect _all_ their workqueue threads to report back when they do certain things like
> > flushing buffers and stuff. The reason I added this is because my machines were getting
> > stuck because CPU0 was waiting for CPU1 to run NFS work queue threads even though no IRQs
> > or other things are running on it.
>
> This sounds more like we should fix NFS than add this for all workqueues.
> Again, we want workqueues to run on behalf of whatever is running on
> that CPU, including those tasks that are running on an isolcpu.

Agreed. Judging by my top output (and not the NFS code), it looks like
it just spawns a configurable number of active kernel threads which are
not CPU-bound in any way. I think just removing the isolated CPUs
from their runnable mask should take care of them.
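
To make the mask idea concrete, here is a rough user-space sketch, not the
in-kernel change discussed here. It assumes the threads in question accept
affinity changes from user space and that the caller has the privileges to
make them. The helper names (parse_cpu_list, mask_without_isolated,
steer_away) are made up for illustration; only os.sched_getaffinity() and
os.sched_setaffinity() are real calls (Linux only).

```python
import os


def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as "1,3-5" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list, e.g. no isolated CPUs
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def mask_without_isolated(current, isolated):
    """Return the allowed-CPU set with the isolated CPUs removed.

    If that would leave no CPU at all, return the mask unchanged,
    since an empty affinity mask cannot be applied.
    """
    remaining = set(current) - set(isolated)
    return remaining if remaining else set(current)


def steer_away(tid, isolated):
    """Hypothetical helper: apply the reduced mask to one thread ID."""
    new_mask = mask_without_isolated(os.sched_getaffinity(tid), isolated)
    os.sched_setaffinity(tid, new_mask)
    return new_mask
```

One would call steer_away(tid, parse_cpu_list("1")) for each thread ID of
interest, where "1" stands in for whatever the isolated set is on the
machine. Threads that are bound to a single CPU by design would need a
different approach.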

> > > > [PATCH] [CPUISOL] Isolated CPUs should be ignored by the "stop machine"
> > >
> > > This I find very dangerous. We are making an assumption that tasks on an
> > > isolated CPU won't be doing things that stopmachine requires. What stops
> > > a task on an isolated CPU from calling something into the kernel that
> > > stop_machine requires to halt?
> >
> > I agree in general. The thing is though that stop machine just kills any kind of latency
> > guarantees. Without the patch the machine just hangs waiting for the stop-machine to run
> > when module is inserted/removed. And running without dynamic module loading is not very
> > practical on general purpose machines. So I'd rather have an option with a big red warning
> > than no option at all :).
>
> Well, that's something one of the greater powers (Linus, Andrew, Ingo)
> must decide. ;-)

I'm in favour of a better-engineered method; that is, we really should try
to solve these problems in a proper way. Hacks like this might be fine
for custom kernels, but I think we should have a higher standard when it
comes to upstream - we all have to live for many years with whatever we
put in there, so we had better think it through.