On 01/04/2012 06:47 PM, Rik van Riel wrote:
So it looks like the default is optimal, at least wrt the cases you
tested and your test workload.
It depends on the workload.
I believe ebizzy synchronously bounces messages around between
userland threads, and may benefit from lower latency preemption
and re-scheduling.
Workloads like AMQP do asynchronous messaging, and are likely
to benefit from having a lower number of switches.
I do not know which kind of workload is more prevalent.
Another worry with gang scheduling is scalability. One of
the reasons Linux scales well to larger systems is that a
lot of things are done CPU-local, without communicating
with other CPUs. Making the scheduling algorithm
system-global has the potential to add in a lot of overhead.
Likewise, removing the ability to migrate workloads to idle
CPUs is likely to hurt a lot of real world workloads.
Benchmarks don't care, because they run full-out. However,
users do not run benchmarks nearly as much as they run
actual workloads...
I think we can solve it at the guest level. The paravirt ticketlock
stuff introduces wait/wake calls (actually wait is just a HLT
instruction); we could spin for a while, then HLT until the other side
wakes us. We should do this for all sites that busy wait.
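
To make the spin-then-halt idea concrete, here is a rough user-space
sketch, not the paravirt ticketlock code itself. It assumes a
hypothetical SpinThenBlockWaiter class: a threading.Event stands in for
the wait/wake calls, blocking on it stands in for HLT, and spin_limit
is a made-up tuning knob for how long to busy wait before giving up
the CPU.

```python
import threading


class SpinThenBlockWaiter:
    """Hypothetical user-space analogue of spin-then-HLT (illustration only)."""

    def __init__(self, spin_limit=1000):
        # Assumed tuning knob: number of polls before we stop spinning.
        self.spin_limit = spin_limit
        # Stands in for the wake side of the wait/wake pair.
        self._ready = threading.Event()

    def wake(self):
        """Called by the other side, e.g. the lock holder on release."""
        self._ready.set()

    def wait(self):
        """Spin for a bounded number of polls, then block until woken."""
        # Phase 1: bounded busy wait, cheap if the wake arrives quickly.
        for _ in range(self.spin_limit):
            if self._ready.is_set():
                return "spun"
        # Phase 2: give up the CPU until wake() is called (the "HLT" step).
        self._ready.wait()
        return "blocked"
```

A waiter that is woken before or during the spin phase returns "spun";
one that exhausts its spin budget sleeps until wake() is called and
returns "blocked". Picking the spin budget is the hard part: too short
and we halt on locks that were about to be released, too long and we
burn the time slice that the holder needs.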