On Sat, 2018-07-28 at 19:57 -0700, Andy Lutomirski wrote:
On Sat, Jul 28, 2018 at 2:53 PM, Rik van Riel <riel@xxxxxxxxxxx> wrote:
Introduce a variant of on_each_cpu_cond that iterates only over the
CPUs in a cpumask, in order to avoid making callbacks for every single
CPU in the system when we only need to test a subset.
Nice.
Although, if you want to be really fancy, you could optimize this (or
add a variant) that does the callback on the local CPU in parallel
with the remote ones. That would give a small boost to TLB flushes.
The test_func callbacks are not run remotely, but on the local CPU,
before deciding who to send callbacks to.
The actual IPIs are sent in parallel, if the cpumask allocation
succeeds (it always should in many kernel configurations, and almost
always in the rest).
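To make that order of operations concrete, here is a rough user-space sketch of the selection step. It is not the kernel code: a 64-bit word stands in for a cpumask, and select_targets, cond_fn and cpu_is_odd are hypothetical names made up for the illustration. The test callback runs only on the caller, and only for CPUs that are in the mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the test_func callback type. */
typedef bool (*cond_fn)(int cpu, void *info);

/*
 * Assumption: a uint64_t models a cpumask of up to 64 CPUs.
 * Runs cond() locally for each CPU in mask and returns the subset
 * that would then be sent the real callback.
 */
static uint64_t select_targets(uint64_t mask, cond_fn cond, void *info)
{
	uint64_t targets = 0;

	for (int cpu = 0; cpu < 64; cpu++) {
		/* CPUs outside the caller's mask are never tested. */
		if (!(mask & (UINT64_C(1) << cpu)))
			continue;
		if (cond(cpu, info))
			targets |= UINT64_C(1) << cpu;
	}
	return targets;
}

/* Example test callback: selects odd-numbered CPUs. */
static bool cpu_is_odd(int cpu, void *info)
{
	(void)info;
	return cpu & 1;
}
```

With mask 0x0F and cpu_is_odd as the test, only CPUs 1 and 3 end up in the target set (0x0A).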
What I meant is that on_each_cpu_mask does:
	smp_call_function_many(mask, func, info, wait);
	if (cpumask_test_cpu(cpu, mask)) {
		unsigned long flags;
		local_irq_save(flags);
		func(info);
		local_irq_restore(flags);
	}
So it IPIs all the remote CPUs in parallel, then waits, then does the local work. In principle, the local flush could be done after triggering the IPIs but before they all finish.
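The ordering suggested here can be sketched in user space, with threads standing in for remote CPUs. This is only an analogy under that assumption, not a patch against smp_call_function_many; run_on_all_overlapped, remote_call and count_call are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_REMOTE 8

typedef void (*work_fn)(void *info);

/* Hypothetical bundle handed to each "remote CPU" (a thread here). */
struct remote_call {
	work_fn func;
	void *info;
};

static void *remote_entry(void *arg)
{
	struct remote_call *call = arg;

	call->func(call->info);
	return NULL;
}

/*
 * Start the remote work, do the local work while it is in flight,
 * and only then wait for the remote side to finish.
 * Returns 0 on success or the pthread_create() error code.
 */
static int run_on_all_overlapped(int n_remote, work_fn func, void *info)
{
	pthread_t threads[MAX_REMOTE];
	struct remote_call call = { .func = func, .info = info };
	int started = 0, err = 0;

	assert(n_remote >= 0 && n_remote <= MAX_REMOTE);

	/* Step 1: trigger the remote work without waiting (the "IPI" step). */
	for (int i = 0; i < n_remote; i++) {
		err = pthread_create(&threads[i], NULL, remote_entry, &call);
		if (err)
			break;
		started++;
	}

	/* Step 2: local work overlaps with the remote work. */
	if (!err)
		func(info);

	/* Step 3: wait for every remote worker that was started. */
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	return err;
}

/* Example callback: counts how many times the work ran. */
static void count_call(void *info)
{
	atomic_fetch_add((atomic_int *)info, 1);
}
```

The difference from the snippet quoted above is only the position of the local call: it sits between triggering the remote work and waiting for it, rather than after the wait.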
-- All Rights Reversed.