Re: [PATCH] stack and rcu interaction bug in smp_call_function_mask()

From: Jeremy Fitzhardinge
Date: Mon Aug 11 2008 - 14:28:35 EST


Nick Piggin wrote:
> Well that's implemented with the optimized call-single code of course,
> so it could be used to implement the masked calls...
>
> I had wanted to look into finding a good cutoff point and use the
> percpu queues for light weight masks, and the single global queue for
> larger ones.
>
> Queue per cpu is not going to be perfect, though. In the current
> implementation, you would need a lot of data structures. You could
> alleviate this problem by using per CPU vectors rather than lists,
> but then you get the added problem of resource starvation at the
> remote end too.
>
> For heavy weight masks on large systems, the single queue I'd say
> will be a win. But I never did detailed measurements, so I'm open
> to be proven wrong.

Yeah, there are a lot of parameters there. And as I've mentioned before,
I wonder whether we should take NUMA topology into account when deciding
where and when to use queues. My intuition is that most cross-cpu calls
are going to be between cpus on the same node, on the grounds that most
are mm->cpu_vm_mask calls, and the rest of the system tries hard to
co-locate processes sharing memory on one node.
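
To make that a bit more concrete, here is a rough userspace sketch of one
way the two ideas could combine: a weight cutoff for choosing between the
percpu queues and the global queue, plus a check that the target mask
stays within one node. Everything in it is an assumption for illustration
only: the names (choose_queue, cpu_to_node_sketch, mask_single_node), the
16-cpu/4-per-node layout and the cutoff of 4 are made up, not measured,
and none of it is the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS        16   /* hypothetical machine size */
#define CPUS_PER_NODE  4    /* hypothetical: 4 cpus per NUMA node */
#define PERCPU_CUTOFF  4    /* hypothetical cutoff, not a measured value */

enum ipi_queue { QUEUE_PERCPU, QUEUE_GLOBAL };

/* Stand-in for a real topology lookup: cpus are numbered node by node. */
static int cpu_to_node_sketch(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return cpu / CPUS_PER_NODE;
}

/* Number of cpus set in the mask (bit N set means cpu N is a target). */
static int mask_weight(uint64_t mask)
{
	int weight = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			weight++;
	return weight;
}

/* Returns 1 if every cpu in the mask is on one node, 0 otherwise.
 * An empty mask returns 0. */
static int mask_single_node(uint64_t mask)
{
	int node = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (1ULL << cpu)))
			continue;
		if (node < 0)
			node = cpu_to_node_sketch(cpu);
		else if (cpu_to_node_sketch(cpu) != node)
			return 0;
	}
	return node >= 0;
}

/* Small, node-local masks go to the percpu queues; anything heavier or
 * spanning nodes falls back to the single global queue. */
static enum ipi_queue choose_queue(uint64_t mask)
{
	if (mask_weight(mask) <= PERCPU_CUTOFF && mask_single_node(mask))
		return QUEUE_PERCPU;
	return QUEUE_GLOBAL;
}
```

With this layout, a mask of cpus 0 and 1 would take the percpu path, while
cpus 0 and 4 (two different nodes) or a full 16-cpu mask would take the
global queue. Where the cutoff really belongs is exactly the thing that
needs measuring.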

Waffle, handwave.

J