Re: [PATCH RFC 04/15] KVM: Implement ring-based dirty memory tracking

From: Peter Xu
Date: Mon Dec 16 2019 - 10:26:53 EST


On Mon, Dec 16, 2019 at 10:29:36AM +0100, Paolo Bonzini wrote:
> On 14/12/19 17:26, Peter Xu wrote:
> > On Sat, Dec 14, 2019 at 08:57:26AM +0100, Paolo Bonzini wrote:
> >> On 13/12/19 21:23, Peter Xu wrote:
> >>>> What is the benefit of using u16 for that? That means with 4K pages, you
> >>>> can share at most 256M of dirty memory each time? That seems low to me,
> >>>> especially since it's sufficient to touch one byte in a page to dirty it.
> >>>>
> >>>> Actually, this is not consistent with the definition in the code ;-)
> >>>> So I'll assume it's actually u32.
> >>> Yes it's u32 now. Actually I believe at least Paolo would prefer u16
> >>> more. :)
> >>
> >> It has to be u16, because it overlaps the padding of the first entry.
> >
> > Hmm, could you explain?
> >
> > Note that here what Christophe commented is on dirty_index,
> > reset_index of "struct kvm_dirty_ring", so imho it could really be
> > anything we want as long as it can store a u32 (which is the size of
> > the elements in kvm_dirty_ring_indexes).
> >
> > If you were instead talking about the previous union definition of
> > "struct kvm_dirty_gfns" rather than "struct kvm_dirty_ring", iiuc I've
> > moved those indices out of it and defined kvm_dirty_ring_indexes which
> > we expose via kvm_run, so we don't have that limitation as well any
> > more?
>
> Yeah, I meant that since the size has (had) to be u16 in the union, it
> need not be bigger in kvm_dirty_ring.
>
> I don't think having more than 2^16 entries in the *per-CPU* ring buffer
> makes sense; lagging in recording dirty memory by more than 256 MiB per
> CPU would mean a large pause later on resetting the ring buffers (your
> KVM_CLEAR_DIRTY_LOG patches found the sweet spot to be around 1 GiB for
> the whole system).

That's right, 1G could probably be a "common flavor" for guests in
that case.

Though I wanted to use u64 only because I wanted to prepare even
better for future potential changes as long as it won't hurt much.
Here I'm just afraid 16bit might not be big enough for this 64bit
world, at the meantime I'd confess some of the requirement could be
really unimaginable before we know it.. I'm trying to forge one here:
what if the customer wants to handle 4G burst dirtying workload during
a migration (besides the burst IOs, mostly idle guests), while the
customer also want good responsiveness during the burst dirtying? In
that case even if we use 256MiB ring we'll still need to freqently
pause for the harvesting, but actually this case really suites for a
8G ring size.

My example could be nonsense actually, just to show that if we can
extend something to u64 from u16 without paying much, then why not. :-)

>
> So I liked the union, but if you removed it you might as well align the
> producer and consumer indices to 64 bytes so that they are in separate
> cache lines.

Yeh that I can do. Thanks,

--
Peter Xu