Re: [PATCH 3/3] sched: Implement interface for cgroup unified hierarchy

From: Austin S Hemmelgarn
Date: Mon Aug 24 2015 - 16:01:04 EST

On 2015-08-24 13:04, Tejun Heo wrote:
Hello, Austin.

On Mon, Aug 24, 2015 at 11:47:02AM -0400, Austin S Hemmelgarn wrote:
Just to learn more, what sort of hypervisor support threads are we
talking about? They would have to consume a considerable amount of cpu
cycles for problems like this to be relevant and be dynamic in number
in a way in which letting them compete against vcpus makes sense. Do
IO helpers meet these criteria?

Depending on the configuration, yes they can. VirtualBox has some rather
CPU intensive threads that aren't vCPU threads (their emulated APIC thread
immediately comes to mind), and so does QEMU depending on the emulated

And the number of those threads fluctuates widely and dynamically?
It depends. Usually there isn't dynamic fluctuation unless there is a lot of hot[un]plugging of virtual devices going on (which can be the case with tight host/guest integration), but the number of threads can vary widely between configurations (most of the VM's I run under QEMU have about 16 threads on average, but I've seen instances with more than 100 threads). The most likely cause of wide and dynamic fluctuation in the number of threads would be systems set up to dynamically hot[un]plug vCPU's based on system load (such systems have other issues to contend with as well, but they do exist).
hardware configuration (it gets more noticeable when the disk images are
stored on a SAN and served through iSCSI, NBD, FCoE, or ATAoE, which is
pretty typical usage for large virtualization deployments). I've seen cases
first hand where the vCPU's can make no reasonable progress because they are
constantly getting crowded out by other threads.

That alone doesn't require hierarchical resource distribution tho.
Setting nice levels reasonably is likely to alleviate most of the
In the cases I've dealt with myself, nice levels didn't cut it, and I had to resort to SCHED_RR, taking particular care to avoid priority inversions.
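
For illustration only, here is a minimal sketch of moving a single thread to SCHED_RR from Python on Linux. The helper names (clamp_rr_priority, set_round_robin) are hypothetical, and which threads get which priority is exactly the part that needs care to avoid priority inversions, so no specific values are suggested here. Changing the policy normally needs root or CAP_SYS_NICE.

```python
import os


def clamp_rr_priority(priority):
    """Clamp a requested priority into the range the kernel allows for SCHED_RR."""
    lowest = os.sched_get_priority_min(os.SCHED_RR)
    highest = os.sched_get_priority_max(os.SCHED_RR)
    return max(lowest, min(highest, priority))


def set_round_robin(tid, priority):
    """Switch one task to SCHED_RR at the given (clamped) priority.

    Assumption: on Linux the underlying syscall accepts a thread ID, so an
    individual vCPU or I/O thread can be targeted. Raises PermissionError
    when the caller lacks the privilege to use realtime policies.
    """
    param = os.sched_param(clamp_rr_priority(priority))
    os.sched_setscheduler(tid, os.SCHED_RR, param)
```

The thread IDs of a running VM process can be found under /proc/<pid>/task/, and each one would be passed to set_round_robin() separately.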
The use of the term 'hypervisor support threads' for this is probably not
the best way of describing the contention, as it's almost always a full
system virtualization issue, and the contending threads are usually storage
back-end access threads.

I would argue that there are better ways to deal properly with this (Isolate
the non vCPU threads on separate physical CPU's from the hardware emulation
threads), but such methods require large systems to be practical at any
scale, and many people don't have the budget for such large systems, and
this way of doing things is much more flexible for small scale use cases
(for example, someone running one or two VM's on a laptop under QEMU or

I don't know. "Someone running one or two VM's on a laptop under
QEMU" doesn't really sound like the use case which absolutely requires
hierarchical cpu cycle distribution.
It depends on the use case. I never have more than 2 VM's running on my laptop (always under QEMU; setting up Xen is kind of pointless on a quad core system with only 8G of RAM), and I take extensive advantage of the cpu cgroup to partition resources among various services on the host.
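
As a rough illustration of what hierarchical cpu cycle distribution means in practice, the sketch below computes the fraction of CPU time each group would receive when every group is fully busy. It is a simplified model, not the scheduler's actual algorithm: weights are modelled on cpu.shares-style values, idle groups are ignored, and the group names and numbers are hypothetical.

```python
def effective_fractions(tree, parent_fraction=1.0, prefix=""):
    """Return {path: fraction of total CPU} for a weighted group hierarchy.

    tree maps a group name to (weight, children), where children is another
    tree of the same shape. A group's fraction is its parent's fraction
    scaled by its weight relative to its siblings.
    """
    total = sum(weight for weight, _ in tree.values())
    result = {}
    for name, (weight, children) in tree.items():
        path = prefix + "/" + name
        fraction = parent_fraction * weight / total
        result[path] = fraction
        # Children split whatever their parent received.
        result.update(effective_fractions(children, fraction, path))
    return result


# Hypothetical layout: VMs and host services get equal weight at the top,
# and inside the VM group the vCPU threads are weighted 3:1 over I/O threads.
layout = {
    "vms": (1024, {"vcpu": (3072, {}), "io": (1024, {})}),
    "services": (1024, {}),
}
fractions = effective_fractions(layout)
```

The point of the hierarchy is that the split between vCPU and I/O threads is decided inside the VM group, without changing how much the host services get.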
