Re: [PATCH v3 00/15] KVM: MMU: fast zap all shadow pages
From: Gleb Natapov
Date: Tue Apr 23 2013 - 03:33:25 EST
On Tue, Apr 23, 2013 at 03:20:28PM +0800, Xiao Guangrong wrote:
> On 04/23/2013 02:28 PM, Gleb Natapov wrote:
> > On Tue, Apr 23, 2013 at 08:19:02AM +0800, Xiao Guangrong wrote:
> >> On 04/22/2013 05:21 PM, Gleb Natapov wrote:
> >>> On Sun, Apr 21, 2013 at 10:09:29PM +0800, Xiao Guangrong wrote:
> >>>> On 04/21/2013 09:03 PM, Gleb Natapov wrote:
> >>>>> On Tue, Apr 16, 2013 at 02:32:38PM +0800, Xiao Guangrong wrote:
> >>>>>> This patchset is based on my previous two patchset:
> >>>>>> [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
> >>>>>> (https://lkml.org/lkml/2013/4/1/2)
> >>>>>>
> >>>>>> [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
> >>>>>> (https://lkml.org/lkml/2013/4/1/134)
> >>>>>>
> >>>>>> Changelog:
> >>>>>> V3:
> >>>>>> completely redesign the algorithm, please see below.
> >>>>>>
> >>>>> This looks pretty complicated. Is it still needed in order to avoid soft
> >>>>> lockups after "avoid potential soft lockup and unneeded mmu reload" patch?
> >>>>
> >>>> Yes.
> >>>>
> >>>> I discussed this point with Marcelo:
> >>>>
> >>>> ======
> >>>> BTW, to be honest, I do not think spin_needbreak is a good approach - it
> >>>> does not fix the hot-lock contention, it just burns more cpu time to avoid
> >>>> possible soft lockups.
> >>>>
> >>>> In particular, zap-all-shadow-pages causes other vcpus to fault and contend
> >>>> for mmu-lock; when zap-all-shadow-pages releases mmu-lock and waits, the
> >>>> other vcpus create page tables again. zap-all-shadow-pages then takes a long
> >>>> time to finish; in the worst case it never completes under intensive vcpu
> >>>> and memory usage.
> >>>>
> >>> So what about mixed approach: use generation numbers and reload roots to
> >>> quickly invalidate all shadow pages and then do kvm_mmu_zap_all_invalid().
> >>> kvm_mmu_zap_all_invalid() is a new function that invalidates only shadow
> >>> pages with stale generation number (and uses lock break technique). It
> >>> may traverse active_mmu_pages from tail to head since new shadow pages
> >>> will be added to the head of the list or it may use invalid slot rmap to
> >>> find exactly what should be invalidated.
> >>
> >> I prefer unmapping via the invalid slot's rmap to zapping stale shadow pages
> >> in kvm_mmu_zap_all_invalid(); the former is faster.
> >>
> > Not sure what you mean here. What is "unmapping the invalid rmap"?
>
> it is like you said below:
> ======
> kvm_mmu_zap_all_invalid(slot) will only zap shadow pages that are
> reachable from the slot's rmap
> ======
> My suggestion is to zap the sptes that are linked in the slot's rmap.
>
OK, so we are on the same page.
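To make the point above concrete, here is a toy sketch of "zap only the sptes reachable from the slot's rmap". All structure and function names here are invented for illustration; the real KVM code chains pte_list_desc structures and takes mmu-lock, none of which is modeled here.

```c
#include <stddef.h>

#define SLOT_PAGES 8

/* Toy memslot: one rmap entry per guest page in the slot, each pointing
 * at a single shadow pte (the real rmap is a chain of sptes per page). */
struct toy_memslot {
    unsigned long *rmap[SLOT_PAGES];
};

/* "Zap" every spte reachable from the slot's rmap: clear the spte and
 * drop the rmap link. Returns the number of sptes zapped. Work done is
 * proportional to the mappings of this slot only, not to the total
 * number of shadow pages. */
static int toy_zap_slot_rmap(struct toy_memslot *slot)
{
    int zapped = 0;

    for (size_t i = 0; i < SLOT_PAGES; i++) {
        if (slot->rmap[i]) {
            *slot->rmap[i] = 0;   /* clear the shadow pte */
            slot->rmap[i] = NULL; /* unlink it from the rmap */
            zapped++;
        }
    }
    return zapped;
}
```

The point of the sketch is the cost model: zapping via the slot's rmap touches only mappings of the slot being removed, which is why it beats walking every invalid shadow page.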
> >
> >> This way may help, but it is not good enough: after the mmu is reloaded with
> >> the new generation number, all of the vcpus will be faulting for a long time,
> >> and trying to hold mmu-lock is still painful even with the lock break technique.
> > If kvm_mmu_zap_all_invalid(slot) will only zap shadow pages that are
> > reachable from the slot's rmap, as opposed to zapping all invalid
> > shadow pages, it will have much less work to do. The slots that we
> > add/remove during hot plug are usually small. To guarantee reasonable
> > forward progress we can break the lock only after a certain amount of
> > shadow pages has been invalidated. All other invalid shadow pages will be
> > zapped in make_mmu_pages_available() and zapping will be spread between
> > page faults.
>
> Not interested in hot-removing memory?
>
I am, good point. Still, I think that with guaranteed forward progress the
slot removal time should be bounded by something reasonable. At least we
should have evidence to the contrary before optimizing for it. Hot
memory removal is not instantaneous from the guest's point of view either;
the guest needs to move memory around to make it possible.
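The "break the lock only after a certain amount of shadow pages are invalidated" idea can be sketched as below. This is an illustrative toy, not the actual KVM implementation: the names are invented, pages are modeled as plain ints, and the place where cond_resched_lock(&kvm->mmu_lock) would sit is only marked by a comment and a counter.

```c
#define ZAP_BATCH 4  /* illustrative batch size before breaking the lock */

/* Zap all n "shadow pages", walking from tail to head (new pages are
 * added at the head of active_mmu_pages, so the tail holds the stale
 * ones). After every ZAP_BATCH zaps we would drop and retake mmu-lock;
 * here we just count those break points in *lock_breaks. */
static int toy_zap_with_lock_break(int *pages, int n, int *lock_breaks)
{
    int zapped = 0, since_break = 0;

    *lock_breaks = 0;
    for (int i = n - 1; i >= 0; i--) {
        pages[i] = 0;            /* zap this shadow page */
        zapped++;
        if (++since_break == ZAP_BATCH) {
            /* cond_resched_lock(&mmu_lock) would go here: progress is
             * guaranteed because ZAP_BATCH pages are gone per hold. */
            (*lock_breaks)++;
            since_break = 0;
        }
    }
    return zapped;
}
```

The batching is what bounds slot-removal time: each lock hold retires a fixed number of pages, so contending vcpus cannot starve the zap loop indefinitely.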
> BTW, could you please review my previous patchsets and apply them if they
> look ok? ;)
>
I need Marcelo's acks on them too :)
> [PATCH 0/2] KVM: x86: avoid potential soft lockup and unneeded mmu reload
> (https://lkml.org/lkml/2013/4/1/2)
>
But you yourself are saying that with this patch slot removal may never
complete under a high memory usage workload.
> [PATCH v2 0/6] KVM: MMU: fast invalid all mmio sptes
> (https://lkml.org/lkml/2013/4/1/134)
>
Missed this one. Will review.
--
Gleb.