On Wed, 2010-06-16 at 08:25 -0700, Dave Hansen wrote:
> On Wed, 2010-06-16 at 12:24 +0300, Avi Kivity wrote:
> > On 06/15/2010 04:55 PM, Dave Hansen wrote:
> > > In a previous patch, we removed the 'nr_to_scan' tracking.
> > > It was not being used to track the number of objects
> > > scanned, so we stopped using it entirely. Here, we
> > > start using it again.
> > >
> > > The theory here is simple; if we already have the refcount
> > > and the kvm->mmu_lock, then we should do as much work as
> > > possible under the lock. The downside is that we're less
> > > fair about the KVM instances from which we reclaim. Each
> > > call to mmu_shrink() will tend to "pick on" one instance,
> > > after which it gets moved to the end of the list and left
> > > alone for a while.
> > That also increases the latency hit, as well as a potential fault storm,
> > on that instance. Spreading out is less efficient, but smoother.
> My suspicion is that, when memory fills up and this shrinker is getting
> called a lot, it will be naturally fair. That list gets shuffled around
> enough, and mmu_shrink() called often enough that no VMs get picked on
> too unfairly.
>
> I'll go back and see if I can quantify this a bit, though.
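
To make the quoted theory concrete, the loop ends up looking something
like this. This is a from-memory sketch, not the actual patch;
kvm_has_shadow_pages() and mmu_cache_count() are invented stand-ins,
and the rest follows the 2.6.34-era arch/x86/kvm/mmu.c naming:

static int mmu_shrink(int nr_to_scan, gfp_t gfp_mask)
{
	struct kvm *kvm;

	spin_lock(&kvm_lock);
	kvm = list_first_entry(&vm_list, struct kvm, vm_list);

	/* We already paid for the locks; spend the whole budget here. */
	spin_lock(&kvm->mmu_lock);
	while (nr_to_scan-- > 0 && kvm_has_shadow_pages(kvm))
		kvm_mmu_remove_some_alloc_mmu_pages(kvm);
	spin_unlock(&kvm->mmu_lock);

	/* This instance got picked on; rotate it to the back of the line. */
	list_move_tail(&kvm->vm_list, &vm_list);
	spin_unlock(&kvm_lock);

	return mmu_cache_count();	/* the API wants the cache size back */
}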
This is probably something that we need to go back and actually measure.

The shrink _query_ (mmu_shrink() with nr_to_scan=0) code is called
really, really often. Like 5,000-10,000 times a second during lots of
VM pressure. But, it's almost never called on to actually shrink
anything.
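
To be precise about "query" vs. "shrink": as I understand the shrinker
API of this era (sketched from memory, with invented helpers), the
callback is the same function for both, and nr_to_scan is the only
thing that distinguishes them:

/*
 * nr_to_scan == 0 is a pure query ("how big is your cache?").
 * Only a non-zero nr_to_scan asks us to actually free something.
 * Note that even the query path has to walk vm_list and take locks
 * to count pages, which matters for the contention question below.
 */
static int mmu_shrink(int nr_to_scan, gfp_t gfp_mask)
{
	if (nr_to_scan == 0)
		return mmu_cache_count();	/* invented counting helper */

	mmu_do_reclaim(nr_to_scan);		/* invented reclaim helper */
	return mmu_cache_count();
}

static struct shrinker mmu_shrinker = {
	.shrink	= mmu_shrink,
	.seeks	= 10 * DEFAULT_SEEKS,	/* kvm's .seeks, if I recall right */
};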
Over the 20 minutes or so that I tested, I saw about 700k calls to
mmu_shrink(). But, only 6 (yes, six) calls that had a non-zero
nr_to_scan. I'm not sure whether this is because of the .seeks argument
to the shrinker or what, but the slab code stays far, far away from
making mmu_shrink() do much real work.
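
For reference, the slab-side arithmetic that decides whether we ever
get a non-zero nr_to_scan, paraphrased from this era's shrink_slab()
in mm/vmscan.c as best I remember it (treat the details with
suspicion):

max_pass = (*shrinker->shrink)(0, gfp_mask);	/* the query         */

delta = (4 * scanned) / shrinker->seeks;	/* bigger .seeks     */
delta *= max_pass;				/*  => smaller delta */
do_div(delta, lru_pages + 1);
shrinker->nr += delta;

while (shrinker->nr >= SHRINK_BATCH) {		/* SHRINK_BATCH = 128 */
	(*shrinker->shrink)(0, gfp_mask);	/* yet another query */
	(*shrinker->shrink)(SHRINK_BATCH, gfp_mask);	/* real work */
	shrinker->nr -= SHRINK_BATCH;
}

If that's right, kvm's .seeks at 10 * DEFAULT_SEEKS makes delta a tenth
of what an ordinary shrinker would see, so shrinker->nr would rarely
accumulate to SHRINK_BATCH. That would square with six real calls out
of ~700k.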
That changes a few things. I bet all the contention we were seeing was
just from nr_to_scan=0 calls and not from actual shrink operations.
Perhaps we should just stop this set after patch 4.