And I think what's proposed is:
1. Change smp_call_function() semantics, to run the given function
on _all_ CPUs (thus getting rid of the on_each_cpu() "mistake")
2. Fall back on smp_call_function_mask() or suitable flags (most
probably by implementing another function?) to also serve
the use cases where we need to run a given function on all
_other_ CPUs
Does this pointless/gratuitous code-churn really make sense?
Definitely not to me ...
[ For the _single() case we now have on_cpu() as you originally
proposed, which I definitely like and which fills the other gap in the API. ]
So I still don't quite understand what the need is to change the existing
semantics of smp_call_function{_single}() in the first place.
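
To make the two semantics under discussion concrete, here is a toy
user-space model. It is only a sketch and not kernel code: "CPUs" are
plain bit positions, nothing runs in parallel, and every name in it
(MODEL_NR_CPUS, model_this_cpu, model_call_mask(), model_call_others(),
model_call_all(), count_hit()) is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model, NOT kernel code: a "CPU" is just a bit position. */
#define MODEL_NR_CPUS 4

static int hits[MODEL_NR_CPUS];	/* how often func "ran" on each CPU */
static int model_this_cpu;	/* hypothetical: the CPU the caller is on */

/* Example callback: record that we ran on the given CPU. */
static void count_hit(int cpu)
{
	hits[cpu]++;
}

/* Run func once for every CPU whose bit is set in mask. */
static void model_call_mask(unsigned int mask, void (*func)(int cpu))
{
	assert(func != NULL);
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			func(cpu);
}

/* Existing semantics: every CPU except the calling one. */
static void model_call_others(void (*func)(int cpu))
{
	unsigned int all = (1u << MODEL_NR_CPUS) - 1;

	model_call_mask(all & ~(1u << model_this_cpu), func);
}

/* Proposed semantics: every CPU, the calling one included. */
static void model_call_all(void (*func)(int cpu))
{
	model_call_mask((1u << MODEL_NR_CPUS) - 1, func);
}
```

With model_this_cpu set to 2, model_call_others(count_hit) leaves
hits[2] untouched, while model_call_all(count_hit) bumps every entry.
Both are one mask away from each other, which is the whole difference
being argued about.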