CPU isolation and workqueues [was Re: [CPUISOL] CPU isolation extensions]

From: Max Krasnyanskiy
Date: Mon Feb 04 2008 - 19:33:26 EST



Peter Zijlstra wrote:
On Mon, 2008-01-28 at 14:00 -0500, Steven Rostedt wrote:
On Mon, 28 Jan 2008, Max Krasnyanskiy wrote:
[PATCH] [CPUISOL] Support for workqueue isolation
The thing about workqueues is that they should only be woken on a CPU if
something on that CPU accessed them. IOW, the workqueue on a CPU handles
work that was called by something on that CPU. Which means that
something that high prio task did triggered a workqueue to do some work.
But this can also be triggered by interrupts, so by keeping interrupts
off the CPU no workqueue should be activated.
No no no. That's what I thought too ;-). The problem is that things like NFS and friends
expect _all_ their workqueue threads to report back when they do certain things like
flushing buffers and stuff. The reason I added this is because my machines were getting
stuck because CPU0 was waiting for CPU1 to run NFS work queue threads even though no IRQs
or other things are running on it.
This sounds more like we should fix NFS than add this for all workqueues.
Again, we want workqueues to run on the behalf of whatever is running on
that CPU, including those tasks that are running on an isolcpu.

agreed, by looking at my top output (and not the nfs code) it looks like
it just spawns a configurable number of active kernel threads which are
not cpu bound in any way. I think just removing the isolated cpus
from their runnable mask should take care of them.

Peter, Steven,

I think I convinced you guys last time but I did not have a convincing example. So here is some
more info on why workqueues need to be aware of isolated cpus.

Here is how a work queue gets flushed.

static int flush_cpu_workqueue(struct cpu_workqueue_struct *cwq)
{
	int active;

	if (cwq->thread == current) {
		/*
		 * Probably keventd trying to flush its own queue. So simply run
		 * it by hand rather than deadlocking.
		 */
		run_workqueue(cwq);
		active = 1;
	} else {
		struct wq_barrier barr;

		active = 0;
		spin_lock_irq(&cwq->lock);
		if (!list_empty(&cwq->worklist) || cwq->current_work != NULL) {
			insert_wq_barrier(cwq, &barr, 1);
			active = 1;
		}
		spin_unlock_irq(&cwq->lock);

		if (active)
			wait_for_completion(&barr.done);
	}

	return active;
}

void fastcall flush_workqueue(struct workqueue_struct *wq)
{
	const cpumask_t *cpu_map = wq_cpu_map(wq);
	int cpu;

	might_sleep();
	lock_acquire(&wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
	lock_release(&wq->lockdep_map, 1, _THIS_IP_);
	for_each_cpu_mask(cpu, *cpu_map)
		flush_cpu_workqueue(per_cpu_ptr(wq->cpu_wq, cpu));
}

In other words, for each CPU in the workqueue's CPU map that has work queued or running, it inserts
a barrier and waits for that CPU's workqueue thread to run and trigger the completion. This is what
I meant when I said that _all_ threads are expected to report back, even if there is nothing else
running on that CPU.
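
To make the failure mode concrete, here is a rough userspace model of that loop. It is only a
sketch, not kernel code: struct model_cwq, its fields, and model_flush_stuck_cpu() are made-up
names, and "thread_can_run" simply stands for "the workqueue thread on that CPU actually gets
to run".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of one per-CPU workqueue; not kernel code. */
struct model_cwq {
	int pending;		/* nonzero if work is queued or currently running */
	int thread_can_run;	/* nonzero if the worker thread gets scheduled */
};

/*
 * Walk every CPU the way flush_workqueue() does and return the index of
 * the first CPU on which the flusher would wait forever, or -1 if the
 * flush would complete.
 */
int model_flush_stuck_cpu(const struct model_cwq *cwqs, size_t ncpus)
{
	size_t cpu;

	assert(cwqs != NULL || ncpus == 0);

	for (cpu = 0; cpu < ncpus; cpu++) {
		/* Idle queue: no barrier is inserted, nothing to wait for. */
		if (!cwqs[cpu].pending)
			continue;
		/* Barrier inserted, but its worker never runs to complete it. */
		if (!cwqs[cpu].thread_can_run)
			return (int)cpu;
	}
	return -1;
}
```

In this model, a single CPU with pending work whose thread never runs is enough to stall the
caller, no matter which CPU the caller is on.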

So my patch simply makes sure that isolated CPUs are ignored (if workqueue isolation is enabled)
and that workqueue threads are not started on the CPUs that are isolated.
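
The idea can be sketched with plain bitmasks. This is an illustration under assumptions, not the
patch itself: model_cpumask is a made-up stand-in for cpumask_t (bit N set means CPU N is in the
set), and model_flush_mask() and model_cpu_in_mask() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cpumask_t: bit N set means CPU N is in the set. */
typedef uint64_t model_cpumask;

/*
 * CPUs that get a workqueue thread and are visited by a flush: the
 * workqueue's own map, minus the isolated CPUs when isolation is enabled.
 */
model_cpumask model_flush_mask(model_cpumask wq_map, model_cpumask isolated,
			       int isolation_enabled)
{
	if (!isolation_enabled)
		return wq_map;
	return wq_map & ~isolated;
}

/* Return 1 if the given CPU is set in the mask, 0 otherwise. */
int model_cpu_in_mask(model_cpumask mask, unsigned int cpu)
{
	assert(cpu < 64);
	return (int)((mask >> cpu) & 1u);
}
```

With CPUs 0-3 in the workqueue map and CPU 1 isolated, the resulting mask covers CPUs 0, 2 and 3,
so a flush never waits on CPU 1.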

Max