Re: [RFC V2 0/9] x86/mmu:Introduce parallel memory virtualization to boost performance

From: yulei zhang
Date: Mon Sep 28 2020 - 07:53:11 EST


On Sat, Sep 26, 2020 at 4:50 AM Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
> On 25/09/20 19:30, Ben Gardon wrote:
> > Oh, thank you for explaining that. I didn't realize the goal here was
> > to improve LM performance. I was under the impression that this was to
> > give VMs a better experience on startup for fast scaling or something.
> > In your testing with live migration how has this affected the
> > distribution of time between the phases of live migration? Just for
> > terminology (since I'm not sure how standard it is across the
> > industry) I think of a live migration as consisting of 3 stages:
> > precopy, blackout, and postcopy. In precopy we're tracking the VM's
> > working set via dirty logging and sending the contents of its memory
> > to the target host. In blackout we pause the vCPUs on the source, copy
> > minimal data to the target, and resume the vCPUs on the target. In
> > postcopy we may still have some pages that have not been copied to the
> > target and so request those in response to vCPU page faults via user
> > fault fd or some other mechanism.
> >
> > Does EPT pre-population preclude the use of a postcopy phase?
>
> I think so.
>
> As a quick recap, turn postcopy migration handles two kinds of
> pages---they can be copied to the destination either in background
> (stuff that was dirty when userspace decided to transition to the
> blackout phase) or on-demand (relayed from KVM to userspace via
> get_user_pages and userfaultfd). Normally only on-demand pages would be
> served through userfaultfd, while with prepopulation every missing page
> would be faulted in from the kernel through userfaultfd. In practice
> this would just extend the blackout phase.
>
> Paolo
>

Yep, you are right, based on current implementation it doesn't support the
postcopy. Thanks for the suggestion, we will try to fill the gap with proper
EPT population during the post-copy.

> > I would
> > expect that to make the blackout phase really long. Has that not been
> > a problem for you?
> >
> > I love the idea of partial EPT pre-population during precopy if you
> > could still handle postcopy and just pre-populate as memory came in.
> >
>