Re: [PATCH v3 00/21] KVM: Dirty ring interface

From: Michael S. Tsirkin
Date: Thu Jan 09 2020 - 14:20:15 EST


On Thu, Jan 09, 2020 at 12:58:08PM -0500, Peter Xu wrote:
> On Thu, Jan 09, 2020 at 09:47:11AM -0700, Alex Williamson wrote:
> > On Thu, 9 Jan 2020 09:57:08 -0500
> > Peter Xu <peterx@xxxxxxxxxx> wrote:
> >
> > > Branch is here: https://github.com/xzpeter/linux/tree/kvm-dirty-ring
> > > (based on kvm/queue)
> > >
> > > Please refer to either the previous cover letters, or documentation
> > > update in patch 12 for the big picture. Previous posts:
> > >
> > > V1: https://lore.kernel.org/kvm/20191129213505.18472-1-peterx@xxxxxxxxxx
> > > V2: https://lore.kernel.org/kvm/20191221014938.58831-1-peterx@xxxxxxxxxx
> > >
> > > The major change in V3 is that we dropped the whole waitqueue and the
> > > global lock. With that, we have clean per-vcpu ring and no default
> > > ring any more. The two kvmgt refactoring patches were also included
> > > to show the dependency of the works.
> >
> > Hi Peter,
>
> Hi, Alex,
>
> >
> > Would you recommend this style of interface for vfio dirty page
> > tracking as well? This mechanism seems very tuned to sparse page
> > dirtying, how well does it handle fully dirty, or even significantly
> > dirty regions?
>
> That's truly the point why I think the dirty bitmap can still be used
> and should be kept. IIUC the dirty ring starts from COLO where (1)
> dirty rate is very low, and (2) sync happens frequently. That's a
> perfect ground for dirty ring. However it for sure does not mean that
> dirty ring can solve all the issues. As you said, I believe the full
> dirty is another extreme in that dirty bitmap could perform better.
>
> > We also don't really have "active" dirty page tracking
> > in vfio, we simply assume that if a page is pinned or otherwise mapped
> > that it's dirty, so I think we'd constantly be trying to re-populate
> > the dirty ring with pages that we've seen the user consume, which
> > doesn't seem like a good fit versus a bitmap solution. Thanks,
>
> Right, so I confess I don't know whether dirty ring is the ideal
> solution for vfio either. Actually if we're tracking by page maps or
> pinnings, then IMHO it also means that it could be more suitable to
> use a modified version of dirty ring buffer (as you suggested in the
> other thread), in that we can track dirty using (addr, len) range
> rather than a single page address. That could be hard for KVM because
> in KVM the page will be mostly trapped in 4K granularity in page
> faults, and it'll also be hard to merge continuous entries with
> previous ones because the userspace could be reading the entries (so
> after we publish the previous 4K dirty page, we should not modify the
> entry any more).

An easy way would be to keep a couple of entries around, not pushing
them into the ring until later. In fact, deferring queue writes until
there's a bunch of data to be pushed is a very handy optimization.
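
To illustrate the deferral idea, here is a minimal userspace sketch in C.
It is not code from the patch series: the struct layout, the names
(dirty_range, dirty_tracker, mark_dirty, flush_all) and the sizes are all
hypothetical. It holds a couple of (addr, len) entries in a private
buffer, merges adjacent 4K pages into them, and only copies an entry into
the ring once it has to make room, so a published entry is never modified.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ   4096u /* assumed 4K tracking granularity */
#define PENDING_N 2     /* "a couple of entries" held back */
#define RING_N    64    /* hypothetical ring size */

/* Hypothetical range entry; not the layout used by the patch series. */
struct dirty_range {
	uint64_t addr;
	uint64_t len;
};

struct dirty_tracker {
	struct dirty_range pending[PENDING_N]; /* private, still mutable */
	size_t npending;
	struct dirty_range ring[RING_N];       /* published, never modified */
	size_t nring;
};

/* Move the oldest pending entry into the ring. Returns -1 if the ring is full. */
static int publish_oldest(struct dirty_tracker *t)
{
	if (t->npending == 0)
		return 0;
	if (t->nring == RING_N)
		return -1;
	t->ring[t->nring++] = t->pending[0];
	for (size_t i = 1; i < t->npending; i++)
		t->pending[i - 1] = t->pending[i];
	t->npending--;
	return 0;
}

/* Record one dirty page, merging it into a pending range when adjacent. */
static int mark_dirty(struct dirty_tracker *t, uint64_t addr)
{
	for (size_t i = 0; i < t->npending; i++) {
		struct dirty_range *r = &t->pending[i];

		if (addr >= r->addr && addr < r->addr + r->len)
			return 0;                  /* already covered */
		if (addr == r->addr + r->len) {    /* extends the range upward */
			r->len += PAGE_SZ;
			return 0;
		}
		if (addr + PAGE_SZ == r->addr) {   /* extends the range downward */
			r->addr = addr;
			r->len += PAGE_SZ;
			return 0;
		}
	}
	/* No merge possible: make room by publishing the oldest entry. */
	if (t->npending == PENDING_N && publish_oldest(t) < 0)
		return -1;
	t->pending[t->npending++] = (struct dirty_range){ addr, PAGE_SZ };
	return 0;
}

/* Publish everything still pending, e.g. before userspace harvests the ring. */
static int flush_all(struct dirty_tracker *t)
{
	while (t->npending)
		if (publish_oldest(t) < 0)
			return -1;
	return 0;
}
```

With this, three faults on consecutive pages end up as one ring entry
instead of three. A real implementation would also need memory barriers
and a producer/consumer index shared with userspace, which the sketch
leaves out.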

When building UAPIs it makes sense to try to keep them generic
rather than tying them to a given implementation.

That's one of the reasons I called for using something
resembling vring_packed_desc.


> VFIO should not have this restriction because the
> marking of dirty page range can be atomic when the range of pages is
> mapped or pinned.
>
> Thanks,
>
> --
> Peter Xu